SOAPdenovo-Trans: a de novo transcriptome assembler for RNA-Seq
SOAPdenovo-Trans is a de novo transcriptome assembler designed specifically for RNA-Seq. Its performance was assessed on transcriptome datasets from rice and mouse, where it provided higher contiguity, lower redundancy and faster execution than other popular transcriptome assemblers.
References:
- Yinlong Xie, Gengxiong Wu, Jingbo Tang, Ruibang Luo, Jordan Patterson,
Shanlin Liu, Weihua Huang, Guangzhu He, Shengchang Gu, Shengkang Li,
Xin Zhou, Tak-Wah Lam, Yingrui Li, Xun Xu, Gane Ka-Shu Wong, and
Jun Wang.
SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads.
Bioinformatics, Vol. 30, no. 12, 2014, pages 1660–1666.
Documentation
Important Notes
- Module Name: SOAPdenovo-Trans (see the modules page for more information)
- Unusual environment variables set:
  - SOAPDENOVO_TRANS_HOME: installation directory
  - SOAPDENOVO_TRANS_BIN: executable directory
  - SOAPDENOVO_TRANS_SRC: source code directory
  - SOAPDENOVO_TRANS_DATA: sample data directory
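As a quick check after loading the module, the variables listed above can be printed from the shell. The following is a minimal sketch: the helper name `show_soap_env` is hypothetical, and the contents of the sample data directory are not described here, so the listing step runs only if that directory actually exists.

```shell
# Print each variable the module is documented to set, or "unset" if missing.
# show_soap_env is a hypothetical helper name, not part of the module.
show_soap_env() {
    for v in SOAPDENOVO_TRANS_HOME SOAPDENOVO_TRANS_BIN \
             SOAPDENOVO_TRANS_SRC SOAPDENOVO_TRANS_DATA; do
        # Look up the variable whose name is stored in $v (POSIX-compatible).
        eval "val=\${$v:-unset}"
        printf '%s=%s\n' "$v" "$val"
    done
}

show_soap_env

# List the bundled sample data only when the directory is defined and present.
if [ -d "${SOAPDENOVO_TRANS_DATA:-}" ]; then
    ls "$SOAPDENOVO_TRANS_DATA"
fi
```

If any line reports `unset`, the module has most likely not been loaded in the current shell.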
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g
[user@cn3316 ~]$ module load soapdenovo-trans
[+] Loading samtools 0.1.20 ...
[+] Loading SOAPdenovo-Trans 1.04 ...
[user@cn3316 ~]$ SOAPdenovo-Trans -h
Version 1.04
Usage: SOAPdenovo-Trans[option]
    pregraph    construction kmer-graph
    contig      eliminate errors and output contigs
    map         map reads to contigs
    scaff       scaffolding
    all         doing all the above in turn
[user@cn3316 ~]$ SOAPdenovo-Trans all
Version 1.04
SOAPdenovo-Trans all -s configFile -o outputGraph [-R -f -S -F] [-K kmer -p n_cpu -d kmerFreqCutoff -e EdgeCovCutoff -M mergeLevel -L minContigLen -t locusMaxOutput -G gapLenDiff]
    -s <string>  configFile: the config file of reads
    -o <string>  outputGraph: prefix of output graph file name
    -R (optional)  output assembly RPKM statistics
    -f (optional)  output gap related reads for SRkgf to fill gap, [NO]
    -S (optional)  scaffold structure exists, [NO]
    -F (optional)  fill gaps in scaffolds, [NO]
    -K <int>  kmer(min 13, max 31): kmer size, [23]
    -p <int>  n_cpu: number of cpu for use, [8]
    -d <int>  kmerFreqCutoff: kmers with frequency no larger than KmerFreqCutoff will be deleted, [0]
    -e <int>  EdgeCovCutoff: edges with coverage no larger than EdgeCovCutoff will be deleted, [2]
    -M <int>  mergeLevel(min 0, max 3): the strength of merging similar sequences during contiging, [1]
    -L <int>  minContigLen: shortest contig for scaffolding, [100]
    -t <int>  locusMaxOutput: output the number of transcripts no more than locusMaxOutput in one locus, [5]
    -G <int>  gapLenDiff: allowed length difference between estimated and filled gap, [50]
End the interactive session:
[user@cn3316 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
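The `all` usage shown in the session requires a read config file (`-s`) and an output prefix (`-o`), but the help text does not show what the config file contains. The sketch below writes a minimal paired-end config using the key names from the SOAPdenovo family's example config (`max_rd_len`, `[LIB]`, `avg_ins`, `reverse_seq`, `asm_flags`, `q1`, `q2`). The file names `reads.config`, `reads_1.fq`, `reads_2.fq` and the prefix `asm_out` are hypothetical, the read length and insert size are placeholder values to adapt to your library, and taking the thread count from `SLURM_CPUS_PER_TASK` is an assumption about the job environment. The assembler runs only if it is on the PATH and the reads exist; otherwise the command is printed as a dry run.

```shell
config=reads.config          # hypothetical config file name
prefix=asm_out               # hypothetical output prefix
threads="${SLURM_CPUS_PER_TASK:-2}"   # assumed: CPUs allocated to the job

# Write a minimal paired-end library description (values are placeholders).
cat > "$config" <<'EOF'
#maximal read length
max_rd_len=100
[LIB]
#average insert size
avg_ins=200
#0 = forward-reverse paired-end reads
reverse_seq=0
#3 = use reads for both contig and scaffold assembly
asm_flags=3
#paired FASTQ files (hypothetical paths)
q1=reads_1.fq
q2=reads_2.fq
EOF

# Run all steps with the default k-mer size (23) when the tool and reads are available.
if command -v SOAPdenovo-Trans >/dev/null 2>&1 && [ -s reads_1.fq ] && [ -s reads_2.fq ]; then
    SOAPdenovo-Trans all -s "$config" -o "$prefix" -K 23 -p "$threads"
else
    echo "dry run: SOAPdenovo-Trans all -s $config -o $prefix -K 23 -p $threads"
fi
```

Passing `-p` explicitly matters because the documented default is 8 threads, which may exceed what an interactive session has been allocated.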