The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets.
References:
- Hartley, S. W. & Mullikin, J. C. 2015. QoRTs: a comprehensive toolset for quality control and data processing of RNA-Seq experiments. BMC Bioinformatics 16:224.
- Module Name: QoRTs (see the modules page for more information)
- Multithreaded
- Environment variables set:
  - QORTS_JARFILE -- filepath for the essential QoRTs.jar jarfile
  - QORTS_JARPATH -- generic path to jar files
  - QORTS_EXAMPLES -- full example file set
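To confirm that the module defined these variables, each one can be tested after module load. The following is a minimal sketch, not something the module provides: the function name check_qorts_env is hypothetical, and the variable names are the ones used in the commands on this page.

```shell
# Report any of the named environment variables that are unset or empty.
# check_qorts_env is a hypothetical helper, not part of QoRTs or the module.
check_qorts_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use after 'module load QoRTs':
# check_qorts_env QORTS_JARFILE QORTS_JARPATH QORTS_EXAMPLES || exit 1
```

The function returns a non-zero status if any variable is missing, so a script can stop before it reaches the java command.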
QoRTs is a mixture of Java and R. The jarfile can be executed directly using the $QORTS_JARFILE environment variable:
java -Xmx1G -jar $QORTS_JARFILE --man
NOTE: Memory and scratch space are controlled through the Java options -Xmx and -Djava.io.tmpdir. Most genomes need 10-20 GB of memory. To allocate 10 GB of RAM to the Java virtual machine, include the option -Xmx10G.
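Both options can be derived from the job's Slurm allocation rather than typed by hand. The sketch below rests on several assumptions: that SLURM_MEM_PER_NODE holds the --mem request in MB, that local scratch was requested with --gres=lscratch, and that reserving about 10% of the allocation for non-heap JVM memory is adequate. The function name qorts_java_opts is hypothetical.

```shell
# Print Java options sized to the current Slurm allocation.
# Assumptions: SLURM_MEM_PER_NODE is the --mem request in MB, and
# /lscratch/$SLURM_JOB_ID exists because --gres=lscratch was requested.
qorts_java_opts() {
    mem_mb="${SLURM_MEM_PER_NODE:-10240}"   # assumed fallback: 10 GB outside Slurm
    heap_mb=$(( mem_mb * 9 / 10 ))          # assumed headroom: ~10% for the JVM itself
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        tmp="/lscratch/${SLURM_JOB_ID}"
    else
        tmp="${TMPDIR:-/tmp}"               # no job ID, so use the usual temp directory
    fi
    echo "-Xmx${heap_mb}M -Djava.io.tmpdir=${tmp}"
}

# Usage (word splitting of the output is intentional):
# java $(qorts_java_opts) -jar "$QORTS_JARFILE" --man
```

Keeping the heap slightly below the allocation is a design choice, not a QoRTs requirement; the commands on this page simply match -Xmx to --mem.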
The R steps are done as with any R job:
R
> library(QoRTs);
An example walkthrough is available; it uses the data in $QORTS_EXAMPLES.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive --cpus-per-task=16 --mem=10g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load QoRTs
[user@cn3144 ~]$ cp -R $QORTS_EXAMPLES/* .
[user@cn3144 ~]$ sh exampleScripts/step0/step0.reset.sh
[user@cn3144 ~]$ mkdir -p outputData/qortsQcData/SAMP1_RG1
[user@cn3144 ~]$ java -Xmx10G -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
    -jar $QORTS_JARFILE QC --stranded inputData/bamFiles/SAMP1_RG1.bam \
    inputData/annoFiles/anno.gtf.gz outputData/qortsQcData/SAMP1_RG1/
[user@cn3144 ~]$ while read line
do
    mkdir -p outputData/qortsData/$line/
    java -Xmx10G -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
        -jar $QORTS_JARFILE QC --stranded \
        inputData/bamFiles/$line.bam \
        inputData/annoFiles/anno.gtf.gz \
        outputData/qortsData/$line/
done < "inputData/annoFiles/uniqueID.list.txt"
[user@cn3144 ~]$ R
> library(QoRTs);
> res <- read.qc.results.data("outputData/qortsData/",
      decoder.files = "inputData/annoFiles/decoder.byUID.txt",
      calc.DESeq2 = TRUE, calc.edgeR = TRUE);
> makeMultiPlot.all(res,
      outfile.dir = "outputData/qortsPlots/summaryPlots/",
      plot.device.name = "png");

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
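Before looping over every sample, it can help to preview the commands the loop would run. The sketch below prints one QC command per sample ID and executes nothing. The function name print_qc_commands is hypothetical, and the paths simply mirror the example data set used in the session above.

```shell
# Print one QoRTs QC command per sample ID instead of running it.
# print_qc_commands is a hypothetical helper; paths follow the example data set.
print_qc_commands() {
    list="$1"
    # The '|| [ -n "$id" ]' clause keeps a final line that lacks a trailing newline.
    while IFS= read -r id || [ -n "$id" ]; do
        # Skip blank lines so no command is built with an empty sample name.
        if [ -z "$id" ]; then
            continue
        fi
        # \$ keeps the variable names literal; they expand when the command is run.
        echo "java -Xmx10G -Djava.io.tmpdir=/lscratch/\$SLURM_JOB_ID -jar \$QORTS_JARFILE QC --stranded inputData/bamFiles/${id}.bam inputData/annoFiles/anno.gtf.gz outputData/qortsData/${id}/"
    done < "$list"
}

# Usage:
# print_qc_commands inputData/annoFiles/uniqueID.list.txt
```

Reviewing the printed lines makes it easy to spot a malformed sample ID or a wrong output directory before any compute time is spent.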
Create a batch input file (e.g. QoRTs.sh). For example:
#!/bin/bash
module load QoRTs
java -Xmx10G \
    -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
    -jar $QORTS_JARFILE QC \
    --stranded \
    --chromSizes inputData/annoFiles/chrom.sizes \
    --trackTitlePrefix SAMP1_RG1_WIGGLE \
    inputData/bamFiles/SAMP1_RG1.bam \
    inputData/annoFiles/anno.gtf.gz \
    outputData/qortsQcData/SAMP1_RG1/
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=16 --mem=10g --gres=lscratch:100 QoRTs.sh
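A batch job that fails on a missing input still waits in the queue first, so a pre-flight check near the top of the script can save time. The following is a minimal sketch under that assumption; prepare_qc_run is a hypothetical helper, not part of QoRTs or Slurm.

```shell
# Fail early if an input file is missing, then create the output directory.
# prepare_qc_run is a hypothetical helper, not part of QoRTs or Slurm.
prepare_qc_run() {
    bam="$1"
    gtf="$2"
    outdir="$3"
    for f in "$bam" "$gtf"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable input: $f" >&2
            return 1
        fi
    done
    # -p creates parent directories and succeeds if the directory already exists.
    mkdir -p "$outdir"
}

# Usage inside QoRTs.sh, before the java command:
# prepare_qc_run inputData/bamFiles/SAMP1_RG1.bam \
#     inputData/annoFiles/anno.gtf.gz \
#     outputData/qortsQcData/SAMP1_RG1/ || exit 1
```

The same function can be run on the login node before sbatch, so a wrong path is caught before the job is queued.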