OpenCRAVAT is a new open-source, scalable decision support system for variant and gene prioritization. It includes a modular resource catalog designed to maximize community and developer involvement; as a result, the catalog is actively developed and grows every month. Resources made available via the store are well suited to the analysis of cancer as well as Mendelian and complex diseases.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=16g --gres=gpu:p100:1,lscratch:10 -c4 --tunnel
[user@cn2389 ~]$ module load OpenCRAVAT
[+] Loading annovar 2020-06-08 on cn2389
[+] Loading OpenCRAVAT 2.4.2 ...

To annotate and interpret variants, OpenCRAVAT (OC) uses a database made up of chunks of data called "modules" (not to be confused with the Biowulf modules). The currently installed OC modules are available in the folder $OC_MODULES.
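To see what is installed, one can list the contents of that folder. The sketch below is only an illustration: the helper name `list_oc_modules` is hypothetical, and it assumes nothing more than that `$OC_MODULES` points to a directory whose subdirectories group the installed modules. The actual layout on your system may differ.

```shell
# Hypothetical helper (not part of OpenCRAVAT): print each subdirectory of a
# modules folder together with the number of entries it contains.
list_oc_modules() {
    # Use the first argument if given, otherwise fall back to $OC_MODULES.
    local root="${1:-$OC_MODULES}"
    if [ -z "$root" ] || [ ! -d "$root" ]; then
        echo "modules folder not found: '$root'" >&2
        return 1
    fi
    local dir n
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$dir" ] || continue
        # Count the entries directly under this subdirectory.
        n=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$dir")" "$n"
    done
}
```

After `module load OpenCRAVAT`, calling `list_oc_modules` with no argument would inspect `$OC_MODULES`; passing a path inspects that directory instead.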
[user@cn2389 ~]$ cd /data/$USER
[user@cn2389 ~]$ mkdir OpenCRAVAT && cd OpenCRAVAT
[user@cn2389 ~]$ oc new example-input .

The last command creates a file "example_input" in your current directory:
[user@cn2389 ~]$ cat example_input | wc -l
373
[user@cn2389 ~]$ head -n 20 example_input
chr10 121593817 - A T s0
chr10 2987654 + T A s1
chr10 43077259 + A T s2
chr10 8055656 + A T s3
chr10 87864470 + A T s4
chr10 87864486 + A - s0
chr10 87864486 + AA - s1
chr10 87894027 + - CG s2
chr10 87894027 + - CT s3
chr1 100719861 + A T s4
chr1 10100 + C T s0
chr1 110340653 + CGGCTTT - s1
chr11 108227625 + A T s2
chr11 113789394 + G A s3
chr1 111762684 + G A s4
chr11 119206418 + A T s0
chr1 114713881 + TGGTC - s1
chr1 114713881 + TGGTCTC - s2
chr1 114716160 - A T s3
chr11 1584916 + - GCC s4

Run OpenCRAVAT on the test input file:
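Judging from the listing above, each input line has six whitespace-separated fields that appear to be chromosome, position, strand, reference allele, alternate allele, and sample ID, with `-` standing for the empty allele in insertions and deletions. This reading is inferred from the sample, not stated by the page. Under that assumption, a quick sanity check of your own input file before running OC could look like the sketch below; the helper name `check_oc_input` and the allowed allele alphabet (`A`, `C`, `G`, `T`, `N`) are assumptions made for illustration.

```shell
# Hypothetical helper: flag lines that do not look like
# "chrom pos strand ref alt sample" (field meanings are assumed, see above).
check_oc_input() {
    awk '
        NF != 6 {
            printf "line %d: expected 6 fields, found %d\n", NR, NF; bad++; next
        }
        $2 !~ /^[0-9]+$/ {
            printf "line %d: position is not an integer: %s\n", NR, $2; bad++; next
        }
        $3 != "+" && $3 != "-" {
            printf "line %d: strand is not + or -: %s\n", NR, $3; bad++; next
        }
        # Assumed allele alphabet; "-" marks an empty allele (indel).
        $4 !~ /^([ACGTN]+|-)$/ || $5 !~ /^([ACGTN]+|-)$/ {
            printf "line %d: unexpected allele: %s %s\n", NR, $4, $5; bad++; next
        }
        END {
            printf "%d line(s) checked, %d problem(s)\n", NR, bad + 0
            # Non-zero exit status when at least one line was flagged.
            exit (bad > 0)
        }
    ' "$1"
}
```

For example, `check_oc_input example_input` prints one message per suspicious line followed by a summary, and returns a non-zero status if anything was flagged.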
[user@cn2389 ~]$ oc run ./example_input -l hg38 --mp 1
Input file(s): /vf/users/denisovga/OpenCRAVAT/example_input
Genome assembly: hg38
Running converter...
        Converter (converter) finished in 1.329s
Running gene mapper... finished in 2.192s
Running annotators...
        annotator(s) finished in 1.028s
Running aggregator...
        Variants finished in 0.197s
        Genes finished in 0.150s
        Samples finished in 0.145s
        Tags finished in 0.276s
Indexing
        variant base__chrom finished in 0.061s
        variant base__coding finished in 0.011s
        variant base__so finished in 0.011s
Running postaggregators...
        Tag Sampler (tagsampler) finished in 0.138s
Finished normally. Runtime: 5.968s

Once the job is finished, the following files will be created:
example_input.log
example_input.sqlite
example_input.err

In particular, the file example_input.sqlite is the SQLite database with the results.
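A small check that all three output files are present can be convenient in batch scripts. The following is a minimal sketch; the helper name `check_oc_outputs` is hypothetical, and it relies only on the three file suffixes listed above.

```shell
# Hypothetical helper: verify that <input>.log, <input>.sqlite and
# <input>.err exist next to the given input file name.
check_oc_outputs() {
    local input="$1" missing=0 ext
    for ext in log sqlite err; do
        if [ -e "${input}.${ext}" ]; then
            echo "found:   ${input}.${ext}"
        else
            echo "missing: ${input}.${ext}"
            missing=1
        fi
    done
    # Status 0 only when every expected file exists.
    return "$missing"
}
```

For the run shown here, `check_oc_outputs example_input` would report on example_input.log, example_input.sqlite, and example_input.err.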
[user@cn2389 ~]$ wget https://github.com/KarchinLab/open-cravat/archive/refs/tags/2.4.2.tar.gz
[user@cn2389 ~]$ tar -xzf 2.4.2.tar.gz
[user@cn2389 ~]$ export PYTHONPATH=open-cravat-2.4.2:$PYTHONPATH
[user@cn2389 ~]$ python-oc open-cravat-2.4.2/cravat/oc.py gui --port $PORT1 example_input.sqlite
   ____                   __________  ___ _    _____  ______
  / __ \____  ___  ____  / ____/ __ \/   | |  / /   |/_  __/
 / / / / __ \/ _ \/ __ \/ /   / /_/ / /| | | / / /| | / /
/ /_/ / /_/ /  __/ / / / /___/ _, _/ ___ | |/ / ___ |/ /
\____/ .___/\___/_/ /_/\____/_/ |_/_/  |_|___/_/  |_/_/
    /_/
...

where $PORT1 is the tunneling port number you received when allocating the interactive session. Note this port number and the ID of the compute node you are using; in this example, node_id=cn2389.
ssh -t -L $PORT:localhost:$PORT1 biowulf.nih.gov "ssh -L $PORT1:localhost:$PORT1 $node_id"

where $PORT is the port that ssh will listen on at the machine where you run this command, and $PORT1 and $node_id should be replaced by the actual values you noted.
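Because the tunnel command nests one ssh invocation inside another, it is easy to mistype. The sketch below assembles the same command line from its three inputs and prints it for inspection without running it. The helper name `build_tunnel_cmd` is hypothetical, and the values in the usage note are placeholders, not real allocations.

```shell
# Hypothetical helper: print the two-hop ssh tunnel command for review.
#   $1 = listening port on the machine where ssh is started ($PORT)
#   $2 = tunneling port from the interactive session ($PORT1)
#   $3 = compute node ID ($node_id)
build_tunnel_cmd() {
    local local_port="$1" remote_port="$2" node="$3" p
    for p in "$local_port" "$remote_port"; do
        case "$p" in
            ''|*[!0-9]*) echo "port must be a number: '$p'" >&2; return 1 ;;
        esac
    done
    if [ -z "$node" ]; then
        echo "compute node ID is required" >&2
        return 1
    fi
    printf 'ssh -t -L %s:localhost:%s biowulf.nih.gov "ssh -L %s:localhost:%s %s"\n' \
        "$local_port" "$remote_port" "$remote_port" "$remote_port" "$node"
}
```

For instance, `build_tunnel_cmd 8080 45000 cn2389` (placeholder ports) prints the full command, which can then be copied into a terminal once it looks right.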
[user@cn2389 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$