The GenomeTools genome analysis system is a free collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named gt. It is based on a C library named libgenometools which contains a wide variety of classes for efficient and convenient implementation of sequence and annotation processing software.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive --mem=5g salloc.exe: Pending job allocation 46116226 salloc.exe: job 46116226 queued and waiting for resources salloc.exe: job 46116226 has been allocated resources salloc.exe: Granted job allocation 46116226 salloc.exe: Waiting for resource configuration salloc.exe: Nodes cn3144 are ready for job [user@cn3144 ~]$ module load genometools [user@cn3144 ~]$ gt -help Usage: bin/gt [option ...] [tool | script] [argument ...] The GenomeTools genome analysis system. -i enter interactive mode after executing 'tool' or 'script' -q suppress warnings -test perform unit tests and exit -seed set seed for random number generator manually. 0 generates a seed from current time and process id -help display help and exit -version display version information and exit Tools: bed_to_gff3 cds chain2dim chseqids clean ... ... [user@cn3144 ~]$ gt bed_to_gff3 -help Usage: bin/gt bed_to_gff3 [BED_file] Parse BED file and convert it to GFF3. -featuretype Set type of parsed BED features default: BED_feature -thicktype Set type of parsed thick BED features default: BED_thick_feature -blocktype Set type of parsed BED blocks default: BED_block -o redirect output to specified file default: undefined -gzip write gzip compressed output file default: no -bzip2 write bzip2 compressed output file default: no -force force writing to output file default: no -help display help and exit -version display version information and exit [user@cn3144 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$
Create a batch input file (e.g. gt.sh). For example:
#!/bin/bash set -e module load genometools cd /data/$USER gt bed_to_gff3 -force yes -o out.gff3 input.bed
Submit this job using the Slurm sbatch command.
sbatch --mem=5g gt.sh
Create a swarmfile (e.g. gt.swarm). For example:
cd dir1; gt bed_to_gff3 -force yes -o out.gff3 input.bed cd dir2; gt bed_to_gff3 -force yes -o out.gff3 input.bed ... cd dir10; gt bed_to_gff3 -force yes -o out.gff3 input.bed
Submit this job using the swarm command.
swarm -f gt.swarm -g 5 --module gtwhere
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module gt | Loads the gt module for each subjob in the swarm |