IsoQuant analyzes long-read RNA sequencing data from PacBio or Oxford Nanopore.
IsoQuant reconstructs and quantifies transcript models with high precision and decent recall. If a reference annotation is given, IsoQuant also assigns reads to the annotated isoforms based on their intron and exon structure, and quantifies annotated genes, isoforms, exons and introns. If reads are grouped (e.g. by cell type), counts are reported per group.
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load isoquant
[user@cn3144 ~]$ isoquant.py --test # run toy test
=== Running in test mode ===
Any other option is ignored
2024-11-19 13:06:49,350 - INFO - Running IsoQuant version 3.6.2
2024-11-19 13:06:49,355 - INFO - Novel unspliced transcripts will not be reported, set --report_novel_unspliced true to discover them
2024-11-19 13:06:49,355 - INFO - === IsoQuant pipeline started ===
2024-11-19 13:06:49,355 - INFO - gffutils version: 0.13
2024-11-19 13:06:49,355 - INFO - pysam version: 0.22.1
2024-11-19 13:06:49,355 - INFO - pyfaidx version: 0.8.1.3
2024-11-19 13:06:49,356 - INFO - Checking input gene annotation
2024-11-19 13:06:49,359 - INFO - Gene annotation seems to be correct
2024-11-19 13:06:49,359 - INFO - Converting gene annotation file to .db format (takes a while)...
...
2024-11-19 13:06:54,286 - INFO - Read assignment statistics
2024-11-19 13:06:54,286 - INFO - ambiguous: 30
2024-11-19 13:06:54,286 - INFO - inconsistent: 93
2024-11-19 13:06:54,286 - INFO - inconsistent_ambiguous: 10
2024-11-19 13:06:54,286 - INFO - noninformative: 1
2024-11-19 13:06:54,286 - INFO - unique: 139
2024-11-19 13:06:54,286 - INFO - unique_minor_difference: 150
2024-11-19 13:06:54,294 - INFO - Processed experiment TEST_DATA
2024-11-19 13:06:54,294 - INFO - Processed 1 experiment
2024-11-19 13:06:54,294 - INFO - === IsoQuant pipeline finished ===
2024-11-19 13:06:54,294 - INFO - === TEST PASSED CORRECTLY ===
[user@cn3144 ~]$ cd /data/$USER/analysis
[user@cn3144 ~]$ # real analysis example:
[user@cn3144 analysis]$ isoquant.py -d nanopore \
    --fastq ONT.cDNA.raw.fastq.gz \
    --reference reference.fasta \
    --output output_dir \
    --prefix My_ONT_cDNA
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. isoquant.sh). For example:
#!/bin/bash
set -e
module load isoquant
cd /data/$USER/analysis
isoquant.py -d pacbio_ccs --bam mapped_reads.bam --genedb annotation.db --output output_dir
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] isoquant.sh
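The CPUs requested with --cpus-per-task are not automatically passed on to the program. The sketch below writes the batch file from the example above with one addition: it assumes IsoQuant's --threads option and sets it from the Slurm variable SLURM_CPUS_PER_TASK, so the thread count follows the allocation. The 8 CPUs and 32g of memory in the commented sbatch line are hypothetical values, not recommendations.

```shell
#!/bin/bash
# Sketch: write isoquant.sh so that IsoQuant's thread count follows the
# Slurm allocation. Input and output names are taken from the example above.
set -euo pipefail

# The quoted 'EOF' keeps $USER and $SLURM_CPUS_PER_TASK unexpanded here,
# so they are evaluated when the job runs, not when the file is written.
cat > isoquant.sh <<'EOF'
#!/bin/bash
set -e
module load isoquant
cd /data/$USER/analysis
isoquant.py -d pacbio_ccs --bam mapped_reads.bam --genedb annotation.db \
    --output output_dir --threads "$SLURM_CPUS_PER_TASK"
EOF

# Syntax check only; nothing is executed.
bash -n isoquant.sh

# Submit with matching resources (hypothetical values):
# sbatch --cpus-per-task=8 --mem=32g isoquant.sh
```

SLURM_CPUS_PER_TASK is only set when --cpus-per-task is given, so the script as sketched should be submitted with that option.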
Create a swarmfile (e.g. isoquant.swarm). For example:
isoquant.py -d pacbio_ccs --bam reads1.bam --genedb ann1.db --output out1
isoquant.py -d pacbio_ccs --bam reads2.bam --genedb ann2.db --output out2
isoquant.py -d pacbio_ccs --bam reads3.bam --genedb ann3.db --output out3
isoquant.py -d pacbio_ccs --bam reads4.bam --genedb ann4.db --output out4
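For more than a handful of samples, the swarmfile can be generated rather than typed by hand. The following is a minimal sketch that reproduces the four lines above with a loop; the file naming scheme (readsN.bam, annN.db, outN) is simply the one used in this example and would need adapting to real data.

```shell
#!/bin/bash
# Sketch: write one IsoQuant command per sample to isoquant.swarm.
# The sample numbering mirrors the example swarmfile above.
set -euo pipefail

swarmfile=isoquant.swarm
: > "$swarmfile"   # create or truncate the swarmfile

for i in 1 2 3 4; do
    # Each printf call appends exactly one line, i.e. one swarm subjob.
    printf 'isoquant.py -d pacbio_ccs --bam reads%s.bam --genedb ann%s.db --output out%s\n' \
        "$i" "$i" "$i" >> "$swarmfile"
done

# Show the result for a quick visual check before submitting.
cat "$swarmfile"
```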
Submit this job using the swarm command.
swarm -f isoquant.swarm [-g #] [-t #] --module isoquant
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file) |
--module isoquant | Loads the isoquant module for each subjob in the swarm |