Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. Trinotate makes use of a number of well-referenced methods for functional annotation, including homology search against known sequence data (BLAST+/SwissProt), protein domain identification (HMMER/PFAM), protein signal peptide and transmembrane domain prediction (signalP/tmHMM), and lookups in various annotation databases (eggNOG/GO/KEGG). All functional annotation data derived from the analysis of transcripts is integrated into a SQLite database, which allows fast, efficient searching for terms with specific qualities related to a desired scientific hypothesis, and provides a means to create a whole annotation report for a transcriptome.
Users MUST allocate lscratch to run Trinotate, because its dependency signalp requires a temporary directory that defaults to lscratch.
March 2023: The documentation below is for Trinotate v3.2.0. The latest version, 4.0.0, has many differences, primarily because it is run as a Singularity container. It is available on Biowulf for testing.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive --mem=10g --gres=lscratch:10g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load trinotate
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOBID
[user@cn3144 ~]$ cp -r $TRINOTATE_HOME/sample_data .
[user@cn3144 ~]$ cd sample_data
[user@cn3144 ~]$ ./runMe.sh
edgeR_trans/
edgeR_trans/Trinity_trans.counts.matrix.heatshock_vs_plateau.edgeR.DE_results.samples
edgeR_trans/Trinity_trans.counts.matrix.diauxic_shift_vs_log_growth.diauxic_shift.vs.log_growth.EdgeR.Rscript
edgeR_trans/diffExpr.P0.1_C1.matrix
edgeR_trans/Trinity_trans.counts.matrix.heatshock_vs_log_growth.edgeR.DE_results.P0.1_C1.heatshock-UP.subset
edgeR_trans/clusters_fixed_P_60.heatmap.heatmap.pdf
edgeR_trans/Trinity_trans.counts.matrix.heatshock_vs_log_growth.edgeR.DE_results.MA_n_Volcano.pdf
edgeR_trans/diffExpr.P0.1_C1.matrix.RData
edgeR_trans/Trinity_trans.counts.matrix.log_growth_vs_plateau.edgeR.DE_results.MA_n_Volcano.pdf
edgeR_trans/Trinity_trans.counts.matrix.log_growth_vs_plateau.edgeR.DE_results.samples
[...]
###########################
Generating report table
###########################
#########################################
Extracting Gene Ontology Mappings Per Gene
#########################################
##########################
done. See annotation summary file: Trinotate_report.xls
##########################

[user@cn3144 ~]$ ls -rtl
total 498272
drwxr-x--- 2 user user      4096 Mar  8  2016 edgeR_genes
drwxr-x--- 3 user user     12288 Mar  8  2016 edgeR_trans
-rwxr-x--- 1 user user       755 Aug 22 15:53 cleanme.pl
-rwxr-x--- 1 user user      5678 Aug 22 15:53 runMe.sh
drwxr-x--- 2 user user      4096 Aug 22 15:53 data
-rw-r----- 1 user user  18445845 Aug 22 15:53 Trinotate_report.xls
-rw-r----- 1 user user   6479239 Aug 22 15:54 Trinotate_report.xls.gene_ontology
-rw-r----- 1 user user 478552064 Aug 22 15:54 myTrinotate.sqlite
-rw-r----- 1 user user       544 Aug 22 15:54 Trinotate_report_stats.taxonomy_counts
-rw-r----- 1 user user       177 Aug 22 15:54 Trinotate_report_stats.species_counts
-rw-r----- 1 user user       232 Aug 22 15:54 Trinotate_report_stats.eggnog_counts
-rw-r----- 1 user user       170 Aug 22 15:54 Trinotate_report_stats.eggnog_counts.funcats
-rw-r----- 1 user user    130768 Aug 22 15:54 Trinotate_report_stats.kegg.counts
-rw-r----- 1 user user        11 Aug 22 15:54 Trinotate_report_stats.pfam.counts
-rw-r----- 1 user user   6479239 Aug 22 15:54 Trinotate_report_stats.GO
-rw-r----- 1 user user     47526 Aug 22 15:54 Trinotate_report_stats.GO.slim
-rw-r----- 1 user user     16367 Aug 22 15:54 Trinotate_report_stats.cXp_summary.html

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$

Trinotate produces Excel and html files. The easiest way to view them is to use hpcdrive to mount your Biowulf /home or /data area onto your desktop, then click on the file. For the test job above, since the output is in /lscratch/$SLURM_JOBID (temporary local disk on the node), you should copy the desired files back to your /data area before exiting the session.
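Copying the results out of lscratch before the session ends could look roughly like the sketch below. The helper name copy_trinotate_results and the destination directory under /data/$USER are assumptions made for illustration; they are not part of Trinotate or of the Biowulf setup. The file patterns follow the directory listing from the sample run.

```shell
# Hypothetical helper (not part of Trinotate): copy the report files and the
# SQLite database from a source directory to a destination directory.
copy_trinotate_results() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    copied=0
    for f in "$src_dir"/Trinotate_report* "$src_dir"/*.sqlite; do
        # Skip patterns that matched nothing
        [ -e "$f" ] || continue
        cp -p "$f" "$dest_dir"/ || return 1
        copied=$((copied + 1))
    done
    echo "copied $copied file(s) to $dest_dir"
}

# Example call inside the interactive session (destination path is an assumption):
# copy_trinotate_results "/lscratch/$SLURM_JOBID/sample_data" "/data/$USER/trinotate_results"
```

Run the call before typing exit, since the contents of /lscratch/$SLURM_JOBID are not kept once the job allocation is released.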
Create a batch input file (e.g. trinotate.sh). For example:
#!/bin/bash
set -e
module load trinotate
# Each option below takes a file argument; the file names are omitted in this example.
$TRINOTATE_HOME/Trinotate Trinotate.sqlite init --gene_trans_map --transcript_fasta --transdecoder_pep
# etc.
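Since the init step reads several input files, a batch script may benefit from failing early when one of them is missing. The following is only a sketch: check_inputs is a hypothetical helper, and the file names in the example call are placeholders rather than names Trinotate requires.

```shell
# Hypothetical pre-flight check: succeed only if every listed file
# exists and is non-empty; report each problem file on stderr.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use before the Trinotate commands (file names are placeholders):
# check_inputs gene_trans_map.txt transcripts.fasta transdecoder.pep || exit 1
```

With set -e already in the script, a failed check stops the job before any Trinotate step runs.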
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=32 --mem=20g --gres=lscratch:20 trinotate.sh

Note: these are suggested values for cpus-per-task and mem. Based on your initial runs, you may need to increase or decrease them.
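Because Trinotate needs lscratch, a batch script can guard against a submission that forgot the --gres=lscratch option. The sketch below assumes the job's scratch directory is /lscratch/$SLURM_JOBID, as in the interactive example; require_lscratch is a hypothetical helper name.

```shell
# Hypothetical guard: print the scratch directory if it exists and is
# writable, otherwise report the problem on stderr and fail.
require_lscratch() {
    # Default to the per-job directory; an explicit path may be passed instead
    scratch_dir="${1:-/lscratch/${SLURM_JOBID:-}}"
    if [ -d "$scratch_dir" ] && [ -w "$scratch_dir" ]; then
        echo "$scratch_dir"
    else
        echo "lscratch not available at $scratch_dir; resubmit with --gres=lscratch:N" >&2
        return 1
    fi
}

# Example use near the top of trinotate.sh:
# scratch=$(require_lscratch) || exit 1
# cd "$scratch"
```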
As of February 2020, certain dependencies of Trinotate are now available as separate modules. These include signalp v4.1, rnammer v1.2 and tmhmm v2.0c. You can load them as follows:
module load signalp/4.1
module load rnammer/1.2
module load tmhmm/2.0c
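After loading the modules, it can be useful to confirm that the tools are actually reachable. The sketch below uses the standard command -v lookup; check_tools is a hypothetical helper, and the executable names in the example call (signalp, rnammer, tmhmm) are assumed to match what the modules place on PATH.

```shell
# Hypothetical check: report whether each named executable is on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "not on PATH: $tool" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use after the module load commands (executable names are assumptions):
# check_tools signalp rnammer tmhmm
```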