IsoQuant on Biowulf

IsoQuant analyzes long-read RNA sequencing data from PacBio or Oxford Nanopore platforms.

IsoQuant reconstructs and quantifies transcript models with high precision and decent recall. When a reference annotation is provided, IsoQuant also assigns reads to annotated isoforms based on their intron and exon structure, and quantifies annotated genes, isoforms, exons and introns. If reads are grouped (e.g. by cell type), counts are reported per group.


Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows each $ prompt):

[user@biowulf]$ sinteractive --mem=8G -c4
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load isoquant

[user@cn3144 ~]$ cd /data/$USER

[user@cn3144 ~]$ mkdir -p isoquant_test/output

[user@cn3144 ~]$ cd isoquant_test

[user@cn3144 ~]$ cp -r $ISOQUANT_TEST_DATA/toy_data .

[user@cn3144 ~]$ isoquant.py --reference toy_data/MAPT.Mouse.reference.fasta \
 --genedb toy_data/MAPT.Mouse.genedb.gtf \
 --fastq toy_data/MAPT.Mouse.ONT.simulated.fastq \
 --data_type nanopore \
 -o output --threads 4
2025-11-17 12:42:02,675 - INFO - Running IsoQuant version 3.10.0
2025-11-17 12:42:12,678 - INFO - Novel unspliced transcripts will not be reported, set --report_novel_unspliced true to discover them
2025-11-17 12:42:12,678 - INFO -  === IsoQuant pipeline started === 
2025-11-17 12:42:12,678 - INFO - Python version: 3.13.9 | packaged by conda-forge | (main, Oct 22 2025, 23:33:35) [GCC 14.3.0]
2025-11-17 12:42:12,678 - INFO - gffutils version: 0.13
2025-11-17 12:42:12,678 - INFO - pysam version: 0.23.3
2025-11-17 12:42:12,678 - INFO - pyfaidx version: 0.9.0.3
2025-11-17 12:42:12,678 - INFO - Reading reference genome from tests/toy_data/MAPT.Mouse.reference.fasta
2025-11-17 12:42:12,679 - INFO - Checking input gene annotation
2025-11-17 12:42:12,680 - INFO - Gene annotation seems to be correct
2025-11-17 12:42:12,680 - INFO - Converting gene annotation file to .db format (takes a while)...
...
2025-11-17 12:42:13,731 - INFO - Extended annotation is saved to output/OUT/OUT.extended_annotation.gtf
2025-11-17 12:42:13,731 - INFO - Counts for generated transcript models are saves to: output/OUT/OUT.discovered_transcript_counts.tsv
2025-11-17 12:42:13,734 - INFO - Processed experiment OUT
2025-11-17 12:42:13,735 - INFO - Processed 1 experiment
2025-11-17 12:42:13,735 - INFO -  === IsoQuant pipeline finished ===

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
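Once the toy run finishes, a quick sanity check of the two output files named in the log can be sketched as follows. The helper names `count_transcripts` and `count_data_rows` are hypothetical and not part of IsoQuant. The sketch assumes only that the GTF stores the feature type in its third tab-separated column (standard GTF layout) and that header or comment lines in the TSV begin with `#`.

```shell
#!/bin/bash
# Hypothetical helpers for a quick look at IsoQuant output; not part of IsoQuant.

# Count GTF records whose feature type (column 3) is "transcript".
count_transcripts() {
    awk -F'\t' '!/^#/ && $3 == "transcript" { n++ } END { print n + 0 }' "$1"
}

# Count non-empty TSV lines that do not start with '#'.
count_data_rows() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Example use after the toy run above (paths taken from the log):
#   count_transcripts output/OUT/OUT.extended_annotation.gtf
#   count_data_rows   output/OUT/OUT.discovered_transcript_counts.tsv
```

A count of zero from either helper would suggest the run produced no transcript models and that the log is worth re-reading.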

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. isoquant.sh). For example:

#!/bin/bash
set -e
module load isoquant
cd /data/$USER/analysis
isoquant.py -d pacbio_ccs --bam mapped_reads.bam --genedb annotation.db --threads $SLURM_CPUS_PER_TASK --output output_dir

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] isoquant.sh
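Because a batch job may wait in the queue before it starts, it can help to have the script fail immediately when an input is missing. The following is a minimal sketch of such a pre-flight check; `require_files` is a hypothetical helper (not part of IsoQuant or Slurm), and the file names are the placeholders from the batch script above.

```shell
#!/bin/bash
# Hypothetical pre-flight helper; not part of IsoQuant or Slurm.
# Returns non-zero if any listed file is missing or empty.
require_files() {
    local f rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "ERROR: missing or empty input: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Fall back to one thread when the script runs outside a Slurm allocation.
threads=${SLURM_CPUS_PER_TASK:-1}

# Example use in isoquant.sh, just before the isoquant.py line:
#   require_files mapped_reads.bam annotation.db || exit 1
#   isoquant.py -d pacbio_ccs --bam mapped_reads.bam --genedb annotation.db \
#       --threads "$threads" --output output_dir
```

The `${SLURM_CPUS_PER_TASK:-1}` fallback only matters when the script is tested by hand; under sbatch the variable is set when `--cpus-per-task` is given.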

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. isoquant.swarm). For example:

isoquant.py -d pacbio_ccs --bam reads1.bam --genedb ann1.db --output out1 --threads $SLURM_CPUS_PER_TASK
isoquant.py -d pacbio_ccs --bam reads2.bam --genedb ann2.db --output out2 --threads $SLURM_CPUS_PER_TASK
isoquant.py -d pacbio_ccs --bam reads3.bam --genedb ann3.db --output out3 --threads $SLURM_CPUS_PER_TASK
isoquant.py -d pacbio_ccs --bam reads4.bam --genedb ann4.db --output out4 --threads $SLURM_CPUS_PER_TASK

Submit this job using the swarm command.

swarm -f isoquant.swarm [-g #] [-t #] --module isoquant
where
  -g #               Number of gigabytes of memory required for each process (1 line in the swarm command file)
  -t #               Number of threads/CPUs required for each process (1 line in the swarm command file)
  --module isoquant  Loads the isoquant module for each subjob in the swarm
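
For more than a handful of samples, the swarmfile can be generated rather than typed. The sketch below assumes a hypothetical layout in which each sample has `<sample>.bam` and a matching `<sample>.db` in the same directory. The function name `make_isoquant_swarm` and the `out_<sample>` output naming are illustrative only, and the isoquant.py flags simply mirror the swarmfile lines shown above.

```shell
#!/bin/bash
# Hypothetical helper; not part of swarm or IsoQuant.
# Writes one isoquant.py command per sample into a swarmfile, assuming
# <sample>.bam and <sample>.db sit side by side in one directory.
make_isoquant_swarm() {
    local dir=$1 swarmfile=$2 bam sample
    : > "$swarmfile"                      # start with an empty swarmfile
    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue         # glob matched nothing
        sample=$(basename "$bam" .bam)
        # Single quotes keep $SLURM_CPUS_PER_TASK literal so it expands in each subjob.
        printf 'isoquant.py -d pacbio_ccs --bam %s --genedb %s --output %s --threads $SLURM_CPUS_PER_TASK\n' \
            "$bam" "$dir/$sample.db" "out_$sample" >> "$swarmfile"
    done
}

# Example: make_isoquant_swarm /data/$USER/analysis isoquant.swarm
```

The resulting file can then be submitted with the swarm command shown above. Paths containing spaces are not handled by this sketch.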