Biowulf High Performance Computing at the NIH
QoRTs on Biowulf

The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets.

Documentation
Important Notes

QoRTs is a mixture of Java and R. After loading the module, the jar file can be executed directly using the $QORTS_JARFILE environment variable:

java -Xmx1G -jar $QORTS_JARFILE --man

NOTE: Memory and scratch space are set through the Java options -Xmx and -Djava.io.tmpdir. Most genomes need 10-20 GB of memory. To allocate 10 GB of RAM to the Java virtual machine, include the option -Xmx10G.
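
As a rough illustration of how the two options fit together, the sketch below builds them in one place. The helper name qorts_java_opts is hypothetical (it is not part of QoRTs or Biowulf), and the sketch assumes that a job with local scratch allocated has its scratch directory at /lscratch/$SLURM_JOB_ID, as in the examples on this page. Outside a Slurm job it falls back to $TMPDIR or /tmp.

```shell
#!/bin/bash
# Sketch only: qorts_java_opts is a hypothetical helper, not part of QoRTs.
# It prints the JVM options for a given heap size in GB.
qorts_java_opts() {
  local heap_gb=${1:-10}   # heap size in GB; 10 is an illustrative default
  local tmp
  if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Assumption: the job was allocated local scratch (--gres=lscratch:N)
    tmp="/lscratch/${SLURM_JOB_ID}"
  else
    # Not inside a Slurm job: fall back to the usual temp directory
    tmp="${TMPDIR:-/tmp}"
  fi
  echo "-Xmx${heap_gb}G -Djava.io.tmpdir=${tmp}"
}

# Example (the unquoted expansion is intentional so the options split):
#   java $(qorts_java_opts 10) -jar "$QORTS_JARFILE" --man
```

Note that the JVM options come before -jar; anything placed after the jar file is passed to QoRTs itself.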

The R steps are done as with any R job:

R
> library(QoRTs);

An example walkthrough is available; it uses the data in $QORTS_EXAMPLES.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input in bold):

[user@biowulf ~]$ sinteractive --cpus-per-task=16 --mem=10g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load QoRTs
[user@cn3144 ~]$ cp -R $QORTS_EXAMPLES/* .
[user@cn3144 ~]$ sh exampleScripts/step0/step0.reset.sh
[user@cn3144 ~]$ mkdir -p outputData/qortsQcData/SAMP1_RG1
[user@cn3144 ~]$ java -Xmx10G -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
 -jar $QORTS_JARFILE QC --stranded inputData/bamFiles/SAMP1_RG1.bam \
 inputData/annoFiles/anno.gtf.gz outputData/qortsQcData/SAMP1_RG1/

[user@cn3144 ~]$ while read line
do
  mkdir -p outputData/qortsData/$line/
  java -Xmx10G -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
    -jar $QORTS_JARFILE QC --stranded \
    inputData/bamFiles/$line.bam \
    inputData/annoFiles/anno.gtf.gz \
    outputData/qortsData/$line/
done < "inputData/annoFiles/uniqueID.list.txt"

[user@cn3144 ~]$ R
library(QoRTs);
res <- read.qc.results.data("outputData/qortsData/",
  decoder.files = "inputData/annoFiles/decoder.byUID.txt",
  calc.DESeq2 = TRUE,
  calc.edgeR = TRUE);
makeMultiPlot.all(res,
  outfile.dir = "outputData/qortsPlots/summaryPlots/",
  plot.device.name = "png");

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
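
The per-sample loop in the session above can be made a little more defensive. The following is a minimal sketch, not the walkthrough's own script: the function name run_qc_for_list and the DRY_RUN switch are hypothetical, and it assumes the same layout as the example data (one sample ID per line, one <ID>.bam per sample). It skips blank lines and missing BAM files, and with DRY_RUN=1 it only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only: run_qc_for_list and DRY_RUN are hypothetical names.
# Usage: run_qc_for_list <id_list> <bam_dir> <gtf> <out_root>
run_qc_for_list() {
  local list_file=$1 bam_dir=$2 gtf=$3 out_root=$4
  local sample
  # "|| [ -n ... ]" also handles a last line without a trailing newline
  while IFS= read -r sample || [ -n "$sample" ]; do
    [ -z "$sample" ] && continue            # skip blank lines
    if [ ! -f "$bam_dir/$sample.bam" ]; then
      echo "SKIP $sample: missing $bam_dir/$sample.bam" >&2
      continue
    fi
    mkdir -p "$out_root/$sample"
    # JVM options must precede -jar; QoRTs arguments follow the jar file
    local cmd=(java -Xmx10G "-Djava.io.tmpdir=/lscratch/${SLURM_JOB_ID:-}"
               -jar "${QORTS_JARFILE:?load the QoRTs module first}"
               QC --stranded "$bam_dir/$sample.bam" "$gtf" "$out_root/$sample/")
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "${cmd[*]}"                      # print instead of running
    else
      "${cmd[@]}" < /dev/null               # keep java from reading the ID list
    fi
  done < "$list_file"
}

# Example, from the directory holding the copied example data:
#   DRY_RUN=1 run_qc_for_list inputData/annoFiles/uniqueID.list.txt \
#     inputData/bamFiles inputData/annoFiles/anno.gtf.gz outputData/qortsData
```

Running once with DRY_RUN=1 is a cheap way to check paths before committing the allocation to a full pass over all samples.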

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. QoRTs.sh). For example:

#!/bin/bash
module load QoRTs

java -Xmx10G \
  -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID \
  -jar $QORTS_JARFILE QC \
  --stranded \
  --chromSizes inputData/annoFiles/chrom.sizes \
  --trackTitlePrefix SAMP1_RG1_WIGGLE  \
  inputData/bamFiles/SAMP1_RG1.bam \
  inputData/annoFiles/anno.gtf.gz \
  outputData/qortsQcData/SAMP1_RG1/

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=16 --mem=10g --gres=lscratch:100 QoRTs.sh
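
The example above requests --mem=10g and also gives the JVM a 10 GB heap. A JVM uses some memory beyond its heap, so one option is to size the heap slightly below the allocation. The sketch below shows the arithmetic only; the helper name heap_from_alloc_mb is hypothetical and the 1024 MB default headroom is an arbitrary illustrative value, not a Biowulf or QoRTs recommendation.

```shell
#!/bin/bash
# Sketch only: heap_from_alloc_mb is a hypothetical helper.
# Prints an -Xmx option equal to the allocation minus some headroom (both in MB).
heap_from_alloc_mb() {
  local alloc_mb=$1
  local headroom_mb=${2:-1024}   # illustrative default, adjust as needed
  local heap_mb=$(( alloc_mb - headroom_mb ))
  if [ "$heap_mb" -le 0 ]; then
    echo "allocation of ${alloc_mb} MB is too small for ${headroom_mb} MB headroom" >&2
    return 1
  fi
  echo "-Xmx${heap_mb}m"
}

# Example inside a batch script, assuming Slurm exports the --mem value
# in MB as SLURM_MEM_PER_NODE:
#   java $(heap_from_alloc_mb "$SLURM_MEM_PER_NODE") \
#     -Djava.io.tmpdir=/lscratch/$SLURM_JOB_ID -jar $QORTS_JARFILE QC ...
```

With a 10240 MB allocation and the default headroom this yields -Xmx9216m; alternatively, keep -Xmx10G and request a somewhat larger --mem.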