High-Performance Computing at the NIH

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly with Trinity or constructed from RNA-Seq alignments to the genome with TopHat and Cufflinks.

TransDecoder identifies likely coding sequences based on the following criteria:

There are multiple versions of TransDecoder available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail TransDecoder

To select a module, type

module load TransDecoder/[ver]

where [ver] is the version of choice.

Environment variables set:

Interactive use

module load TransDecoder
TransDecoder.LongOrfs -t target_transcripts.fasta
TransDecoder.Predict -t target_transcripts.fasta [ homology options ]
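
The [ homology options ] placeholder above can be filled in with the TransDecoder.Predict options --retain_pfam_hits and --retain_blastp_hits. The following is a minimal sketch, not a definitive recipe: the helper name build_predict_cmd and the hit file names are assumptions made for illustration, and the hit files are expected to come from separate hmmscan and blastp searches run on the longest_orfs.pep file that TransDecoder.LongOrfs writes. The function only prints the command line, so it can be inspected before anything is run.

```shell
#!/bin/bash
# Hypothetical helper: print a TransDecoder.Predict command line, adding a
# homology option only when the matching hit file exists and is non-empty.
#   $1 = transcript FASTA file
#   $2 = Pfam hits (hmmscan --domtblout output), optional
#   $3 = BLASTP hits (tabular, -outfmt 6), optional
build_predict_cmd() {
    local transcripts=$1
    local pfam_hits=${2:-}
    local blastp_hits=${3:-}
    local cmd="TransDecoder.Predict -t $transcripts"

    if [ -n "$pfam_hits" ] && [ -s "$pfam_hits" ]; then
        cmd="$cmd --retain_pfam_hits $pfam_hits"
    fi
    if [ -n "$blastp_hits" ] && [ -s "$blastp_hits" ]; then
        cmd="$cmd --retain_blastp_hits $blastp_hits"
    fi

    printf '%s\n' "$cmd"
}

# Example (file names are assumptions):
#   build_predict_cmd target_transcripts.fasta pfam.domtblout blastp.outfmt6
```

If neither hit file is present, the printed command falls back to the plain TransDecoder.Predict -t call shown above.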


Create a batch script (e.g. TransDecoder.sh). The example below converts a Cufflinks GTF file (transcripts.gtf) and a genome FASTA file (test.genome.fasta) into transcript sequences:

#!/bin/bash
module load TransDecoder
cufflinks_gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=1 TransDecoder.sh
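
The batch example above stops after the GTF-to-FASTA conversion. As a sketch of how the two TransDecoder steps from the interactive section might be chained onto it, the hypothetical helper below writes a complete batch script. The function name write_batch_script, its argument order, and the use of set -e are assumptions for illustration; only the three commands inside the generated script come from this page.

```shell
#!/bin/bash
# Hypothetical helper: write a batch script that converts a Cufflinks GTF to
# transcript FASTA and then runs both TransDecoder steps on the result.
#   $1 = Cufflinks GTF file
#   $2 = genome FASTA file
#   $3 = name of the batch script to write (default: TransDecoder.sh)
write_batch_script() {
    local gtf=$1
    local genome=$2
    local script=${3:-TransDecoder.sh}

    # Unquoted heredoc delimiter: $gtf and $genome are expanded now,
    # so the generated script contains the literal file names.
    cat > "$script" <<EOF
#!/bin/bash
set -e
module load TransDecoder
cufflinks_gtf_genome_to_cdna_fasta.pl $gtf $genome > transcripts.fasta
TransDecoder.LongOrfs -t transcripts.fasta
TransDecoder.Predict -t transcripts.fasta
EOF
}

# Example:
#   write_batch_script transcripts.gtf test.genome.fasta TransDecoder.sh
#   sbatch --cpus-per-task=1 TransDecoder.sh
```

The set -e line makes the generated job stop at the first failing step, so TransDecoder.LongOrfs is not run on an empty or partial transcripts.fasta.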


Create a swarmfile (e.g. TransDecoder.swarm). For example:

TransDecoder.LongOrfs -t target_transcripts1.fasta
TransDecoder.LongOrfs -t target_transcripts2.fasta
TransDecoder.LongOrfs -t target_transcripts3.fasta
TransDecoder.LongOrfs -t target_transcripts4.fasta

Submit this job using the swarm command.

swarm -f TransDecoder.swarm --module TransDecoder
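
With many input files, typing the swarmfile by hand becomes tedious. The sketch below generates one TransDecoder.LongOrfs line per FASTA file, matching the swarmfile layout shown above. The helper name make_swarmfile is hypothetical, and the sketch assumes file names without whitespace, since each name is written into the command line unquoted.

```shell
#!/bin/bash
# Hypothetical helper: write one TransDecoder.LongOrfs command per FASTA file.
#   $1     = swarmfile to create (overwritten if it already exists)
#   $2...  = transcript FASTA files
make_swarmfile() {
    local swarmfile=$1
    shift

    : > "$swarmfile"    # start from an empty file
    local fasta
    for fasta in "$@"; do
        printf 'TransDecoder.LongOrfs -t %s\n' "$fasta" >> "$swarmfile"
    done
}

# Example:
#   make_swarmfile TransDecoder.swarm target_transcripts*.fasta
```

The resulting file can then be submitted with the swarm command in the same way as a hand-written swarmfile.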


module load TransDecoder
cp $TRANSDECODERHOME/sample_data/* .