High-Performance Computing at the NIH
StringTie on Biowulf & Helix

StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the alignments of raw reads used by other transcript assemblers, but also alignments of longer sequences that have been assembled from those reads. To identify differentially expressed genes between experiments, StringTie's output can be processed by either the Cuffdiff or Ballgown programs.
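As a rough sketch of the Ballgown route mentioned above, the helper below prints a StringTie command that also writes Ballgown input tables. The function name ballgown_cmd and all file names are placeholders invented for illustration. The flags -G (reference annotation), -e (restrict to reference transcripts), -B (write Ballgown table files) and -o (output GTF) are taken from the StringTie manual; check them against the installed version before relying on them.

```shell
# Print (dry run) a StringTie command that produces Ballgown input tables.
# ballgown_cmd and the file names are hypothetical placeholders.
#   -G  reference annotation to guide assembly
#   -e  only estimate abundance of the transcripts given with -G
#   -B  write Ballgown table files (*.ctab) next to the -o output file
ballgown_cmd() {
    local bam="$1" ref_gtf="$2" out_dir="$3"
    echo stringtie "$bam" -G "$ref_gtf" -e -B -o "$out_dir/transcripts.gtf"
}

# Example: show the command for one sample without running it.
ballgown_cmd sample1.bam annotation.gtf ballgown/sample1
```

To run the command for real, remove the leading echo inside the function.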

StringTie is free, open source software released under an Artistic License.

Running on Helix

Sample session:

helix$ module load stringtie

helix$ stringtie <aligned_reads.bam> [other options]

Running a single batch job on Biowulf

Set up a batch script along the following lines.

#!/bin/bash 

cd /data/$USER/mydir
module load stringtie
stringtie <aligned_reads.bam> -p $SLURM_CPUS_PER_TASK [other options]

Submit to the batch system with:

$ sbatch --cpus-per-task=4 myscript

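The command above allocates 4 CPUs to the job. Slurm exports that number as $SLURM_CPUS_PER_TASK, which the script passes to StringTie's -p (threads) option.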
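The thread handling in such a script can be sketched as follows. This is a minimal dry-run example, not the site's recommended script: the helper names threads_for_job and run_stringtie and the file names are placeholders, and the fallback to 1 thread when $SLURM_CPUS_PER_TASK is unset is an assumption added so the script also behaves sensibly outside a Slurm job.

```shell
#!/bin/bash
# Number of threads: the Slurm allocation if set, otherwise 1.
# (The fallback is an assumption for runs outside a batch job.)
threads_for_job() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Print the StringTie command that would be run (dry run).
# In a real batch script, drop the leading 'echo' to execute it.
run_stringtie() {
    local bam="$1" out_gtf="$2"
    echo stringtie "$bam" -p "$(threads_for_job)" -o "$out_gtf"
}

# Placeholder file names for illustration.
run_stringtie aligned_reads.bam transcripts.gtf
```

Submitted with sbatch --cpus-per-task=4, the printed command would carry -p 4; run outside a job, it would carry -p 1.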

Running a swarm of jobs on Biowulf

Set up a swarm command file (e.g. /data/$USER/cmdfile) with one command per line. Here is a sample file:

cd /data/$USER/mydir1; stringtie <aligned_reads.bam> -p $SLURM_CPUS_PER_TASK [other options]
cd /data/$USER/mydir2; stringtie <aligned_reads.bam> -p $SLURM_CPUS_PER_TASK [other options]
cd /data/$USER/mydir3; stringtie <aligned_reads.bam> -p $SLURM_CPUS_PER_TASK [other options]
[...]

Submit this job with:

$ swarm -f cmdfile -t 4 --module stringtie

The value given with '-t 4' is automatically assigned to $SLURM_CPUS_PER_TASK for each command in the swarm file, so every StringTie process runs with 4 threads.
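For many samples, the command file can be generated rather than typed by hand. The sketch below assumes one directory per sample, each containing a file named aligned_reads.bam; the function name make_swarm_lines and the output name transcripts.gtf are placeholders. The format string is single-quoted so that $SLURM_CPUS_PER_TASK is written literally and only expanded when the swarm subjob runs.

```shell
# Write one swarm command line per sample directory to stdout.
# make_swarm_lines and the file names are hypothetical placeholders.
make_swarm_lines() {
    local dir
    for dir in "$@"; do
        # Single quotes keep $SLURM_CPUS_PER_TASK literal in the output.
        printf 'cd %s; stringtie aligned_reads.bam -p $SLURM_CPUS_PER_TASK -o transcripts.gtf\n' "$dir"
    done
}

# Example: print the lines for three sample directories.
make_swarm_lines "/data/$USER/mydir1" "/data/$USER/mydir2" "/data/$USER/mydir3"
```

Redirecting the output to cmdfile gives a file that can be submitted with the swarm command shown above.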

Running an interactive job on Biowulf

Users may need to run jobs interactively sometimes. Such jobs should not be run on the Biowulf login node. Instead allocate an interactive node as described below, and run the interactive job there.

[user@biowulf]$ sinteractive --cpus-per-task=4

[user@pXXXX]$ cd /data/$USER/myruns

[user@pXXXX]$ module load stringtie

[user@pXXXX]$ stringtie <aligned_reads.bam> -p $SLURM_CPUS_PER_TASK [other options]

[user@pXXXX]$ exit

[user@biowulf]$ 

Documentation

http://ccb.jhu.edu/software/stringtie/