Description
This application requires a graphical connection using NX.
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn4338 are ready for job
[user@cn4338 ~]$ module load transcript_clean
[+] Loading transcript_clean 2.0.3 on cn4338
[+] Loading singularity 3.10.5 on cn4338
[user@cn4338 ~]$ cd /usr/local/apps/transcript_clean/2.0.3/TranscriptClean-2.0.3
[user@cn4338 TranscriptClean-2.0.3]$ exit
salloc.exe: Relinquishing job allocation 46116226
Example using help command:
[user@cn4338 TranscriptClean-2.0.3]$ python TranscriptClean.py --help
Usage: TranscriptClean.py [options]

Options:
  -h, --help            show this help message and exit
  -s FILE, --sam=FILE   Input SAM file containing transcripts to correct. Must
                        contain a header.
  -g FILE, --genome=FILE
                        Reference genome fasta file. Should be the same one
                        used during mapping to generate the provided SAM file.
  -t N_THREADS, --threads=N_THREADS
                        Number of threads to run program with.
  -j FILE, --spliceJns=FILE
                        Splice junction file obtained by mapping Illumina
                        reads to the genome using STAR, or alternately,
                        extracted from a GTF using the accessory script. More
                        formats may be supported in the future.
  -v FILE, --variants=FILE
                        VCF formatted file of variants to avoid correcting
                        away in the data (optional).
  --maxLenIndel=MAXLENINDEL
                        Maximum size indel to correct (Default: 5 bp)
  --maxSJOffset=MAXSJOFFSET
                        Maximum distance from annotated splice junction to
                        correct (Default: 5 bp)
  -o FILE, --outprefix=FILE
                        Output file prefix. '_clean' plus a file extension
                        will be added to the end.
  -m CORRECTMISMATCHES, --correctMismatches=CORRECTMISMATCHES
                        If set to false, TranscriptClean will skip mismatch
                        correction. Default: true
  -i CORRECTINDELS, --correctIndels=CORRECTINDELS
                        If set to false, TranscriptClean will skip indel
                        correction. Default: true
  --correctSJs=CORRECTSJS
                        If set to false, TranscriptClean will skip splice
                        junction correction. Default: true, but you must
                        provide a splice junction annotation file in order
                        for it to work.
  --dryRun              If this option is set, TranscriptClean will read in
                        the sam file and record all insertions, deletions,
                        and mismatches, but it will skip correction. This
                        mode is useful for checking the distribution of
                        transcript errors in the data before running
                        correction.
  --primaryOnly         If this option is set, TranscriptClean will only
                        output primary mappings of transcripts (ie it will
                        filter out unmapped and multimapped lines from the
                        SAM input.
  --canonOnly           If this option is set, TranscriptClean will output
                        only canonical transcripts and transcripts containing
                        annotated noncanonical junctions to the clean SAM
                        file at the end of the run.
  --tmpDir=TMP_PATH     If you would like the tmp files to be written
                        somewhere different than the final output, provide
                        the path to that location here.
  --bufferSize=BUFFER_SIZE
                        Number of lines to output to file at once by each
                        thread during run. Default = 100
  --deleteTmp           If this option is set, the temporary directory
                        generated by TranscriptClean (TC_tmp) will be removed
                        at the end of the run.
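As a rough illustration of how the options in the help output fit together, the sketch below assembles a basic correction command and prints it for review. The file names (sample.sam, genome.fa) and the output prefix are placeholders, not files shipped with the application, and the thread count falls back to 1 when no Slurm allocation is present.

```shell
# Placeholder file names (assumptions for illustration); replace with your own data.
sam="sample.sam"
genome="genome.fa"
outprefix="sample"
# Use the Slurm allocation's CPU count when available, else 1 thread.
threads="${SLURM_CPUS_PER_TASK:-1}"

# Assemble the command from options listed in the help output.
cmd="python TranscriptClean.py --sam $sam --genome $genome --threads $threads --outprefix $outprefix"

# Print the command for review; run it with: eval "$cmd"
echo "$cmd"
```

Adding --dryRun to the same command would, per the help text, record insertions, deletions, and mismatches without correcting them, which may be a useful first pass on new data.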
Create a batch input file (e.g. TEMPLATE.sh). For example:
#!/bin/bash
set -e
module load TEMPLATE
TEMPLATE < TEMPLATE.in > TEMPLATE.out
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] TEMPLATE.sh
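For this application specifically, a batch script might look like the sketch below, which writes a hypothetical transcript_clean.sh and checks its syntax. The install path is taken from the interactive session above; invoking the script by its full path, the input names sample.sam and genome.fa, and the output prefix are assumptions for illustration.

```shell
# Write a hypothetical batch script named transcript_clean.sh.
cat > transcript_clean.sh <<'EOF'
#!/bin/bash
set -e
module load transcript_clean
# Install path as shown in the interactive session; adjust if the version differs.
TC_DIR=/usr/local/apps/transcript_clean/2.0.3/TranscriptClean-2.0.3
# sample.sam and genome.fa are placeholder input names.
python "$TC_DIR/TranscriptClean.py" \
    --sam sample.sam \
    --genome genome.fa \
    --threads "${SLURM_CPUS_PER_TASK:-1}" \
    --outprefix sample
EOF

# Check the script's syntax without running the program.
bash -n transcript_clean.sh && echo "syntax OK"
```

The script could then be submitted with, for example, sbatch --cpus-per-task=4 --mem=8g transcript_clean.sh, where the CPU and memory values are illustrative and should be sized to the data.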
Create a swarmfile (e.g. TEMPLATE.swarm). For example:
TEMPLATE < TEMPLATE.in > TEMPLATE.out
TEMPLATE < TEMPLATE.in > TEMPLATE.out
TEMPLATE < TEMPLATE.in > TEMPLATE.out
TEMPLATE < TEMPLATE.in > TEMPLATE.out
Submit this job using the swarm command.
swarm -f TEMPLATE.swarm [-g #] [-t #] --module TEMPLATE
where
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module TEMPLATE | Loads the TEMPLATE module for each subjob in the swarm |
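When many SAM files need the same treatment, the swarmfile can be generated by a loop instead of written by hand. The sketch below is one way to do that, assuming placeholder inputs sample1.sam through sample3.sam, a shared genome.fa, and the install path shown in the interactive session. The escaped $SLURM_CPUS_PER_TASK is written literally into the file so that each subjob picks up its own thread count at run time.

```shell
# Path to the script, based on the install location shown in the sample session.
TC=/usr/local/apps/transcript_clean/2.0.3/TranscriptClean-2.0.3/TranscriptClean.py

# Start with an empty swarmfile (hypothetical name).
: > transcript_clean.swarm

# Write one command line per input file; the file names are placeholders.
for sam in sample1.sam sample2.sam sample3.sam; do
    prefix="${sam%.sam}"   # strip the .sam extension to form the output prefix
    echo "python $TC --sam $sam --genome genome.fa --threads \$SLURM_CPUS_PER_TASK --outprefix $prefix" >> transcript_clean.swarm
done

# Show the generated swarmfile.
cat transcript_clean.swarm
```

A file produced this way could be submitted with, for example, swarm -f transcript_clean.swarm -g 8 -t 4 --module transcript_clean, where the memory and thread values are illustrative.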