High-Performance Computing at the NIH
Qualimap on Biowulf & Helix

Qualimap is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data.

Qualimap was developed at the Max-Planck Institute for Infection Biology, Germany and the Bioinformatics Department of Centro de Investigación Príncipe Felipe (CIPF), Spain. Qualimap website.

Running Qualimap on Helix

You need to make an X-windows connection to Helix to allow the Qualimap GUI to display on your local desktop. Then type 'module load qualimap' to set up the environment, and then type 'qualimap'.
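
Before launching the GUI, it can help to confirm that X11 forwarding is actually active in the current session. The following is a minimal sketch: the function name `have_x_display` is hypothetical, and the check only tests whether the `DISPLAY` variable is set, not whether the X server behind it is reachable.

```shell
#!/bin/bash
# Hypothetical helper (not part of Qualimap or the module system):
# succeeds only if an X display appears to have been forwarded.
have_x_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; reconnect with X11 forwarding enabled" >&2
        return 1
    fi
    echo "Using X display $DISPLAY"
}

# Usage, after 'module load qualimap':
#   have_x_display && qualimap
```

If the function reports that `DISPLAY` is not set, the GUI has nowhere to draw and the connection should be re-established with X11 forwarding.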

Batch job on Biowulf

Set up a batch script along the following lines:

#!/bin/bash
# this file is called qualimap.sh

cd /data/$USER/mydir
module load qualimap
unset DISPLAY
qualimap bamqc -nt $SLURM_CPUS_PER_TASK -bam test_DNase_seq.hg19.bam -outfile result.pdf

Submit this job with:

sbatch --mem=10g  --cpus-per-task=8 qualimap.sh
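
The batch script relies on `$SLURM_CPUS_PER_TASK`, which is only set inside a Slurm allocation made with `--cpus-per-task`. As a rough sketch, the hypothetical helper `bamqc_command` below prints the same bamqc command with a fallback of one thread, so the command line can be inspected before it goes into a script. The report name derived from the BAM file name is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the bamqc command for one BAM file.
# Falls back to a single thread when SLURM_CPUS_PER_TASK is not set,
# for example when run outside a Slurm allocation.
bamqc_command() {
    local bam=$1
    local threads=${SLURM_CPUS_PER_TASK:-1}
    # Assumed naming scheme: sample.bam -> sample.pdf
    local report=${bam%.bam}.pdf
    echo "qualimap bamqc -nt $threads -bam $bam -outfile $report"
}

# Print the command that would be run for the example BAM file.
bamqc_command test_DNase_seq.hg19.bam
```

Inside a job submitted with `--cpus-per-task=8`, the printed command uses `-nt 8`, matching the number of CPUs requested from Slurm.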

Swarm of jobs on Biowulf

If you have a large number of independent Qualimap jobs to run, you will probably want to use the swarm utility.

Set up a swarm command script along the following lines:

# this file is called qualimap.swarm
unset DISPLAY; qualimap rnaseq -bam file1.bam -gtf Homo_sapiens.GRCh37.gtf -outdir rnaseq_qc_results_file1
unset DISPLAY; qualimap rnaseq -bam file2.bam -gtf Homo_sapiens.GRCh37.gtf -outdir rnaseq_qc_results_file2
[....]

Submit this swarm of commands with:

swarm -f qualimap.swarm -g 5 --module qualimap/2.2
where '-g 5' means that each qualimap command requires 5 GB of memory. You may need to adjust this value.
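
For many BAM files, the swarm command file can be generated rather than typed by hand. The sketch below assumes one `qualimap rnaseq` command per BAM file and a shared GTF annotation. The function name `make_qualimap_swarm` and the `rnaseq_qc_<sample>` directory naming are hypothetical choices, made so that each command writes to its own output directory.

```shell
#!/bin/bash
# Hypothetical helper: print one swarm line per BAM file.
# Usage: make_qualimap_swarm <annotation.gtf> <file1.bam> [file2.bam ...]
make_qualimap_swarm() {
    local gtf=$1
    shift
    local bam sample
    for bam in "$@"; do
        # Strip the directory and the .bam suffix to get a sample name.
        sample=$(basename "$bam" .bam)
        echo "unset DISPLAY; qualimap rnaseq -bam $bam -gtf $gtf -outdir rnaseq_qc_$sample"
    done
}

# Write the swarm file for two example BAM files.
make_qualimap_swarm Homo_sapiens.GRCh37.gtf file1.bam file2.bam > qualimap.swarm
```

The resulting `qualimap.swarm` can then be submitted with the swarm command shown above.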

Interactive command-line job on Biowulf

For debugging purposes, you may want to run your command-line Qualimap jobs interactively. Allocate an interactive session and run your Qualimap commands on the allocated node.

[susanc@biowulf ~]$ sinteractive --cpus-per-task=4 --mem=8g
salloc.exe: Granted job allocation 143331

[susanc@cn0124 ~]$ module load qualimap
[+] Loading Perl 5.8.9 ...
[+] Loading gcc 4.4.7 ...
[+] Loading OpenMPI 1.8.1 for GCC 4.4.7 (ethernet) ...
[+] Loading tcl_tk 8.6.1
[+] Loading LAPACK 3.5.0-gcc-4.4.7 libraries...
[+] Loading R 3.2.0 on cn0124

[susanc@cn0124 ~]$  qualimap bamqc -nt $SLURM_CPUS_PER_TASK -bam test_DNase_seq.hg19.bam -outfile result.pdf
Java memory size is set to 1200M
Launching application...

QualiMap v.2.2
Built on 2016-01-29 12:10

Selected tool: bamqc
Available memory (Mb): 33
Max memory (Mb): 1118
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 423
Chunk of reads size: 1000
Number of threads: 4
Processed 50 out of 423 windows...
Processed 100 out of 423 windows...
Processed 150 out of 423 windows...
Processed 200 out of 423 windows...
Processed 250 out of 423 windows...
Processed 300 out of 423 windows...
Processed 350 out of 423 windows...
Processed 400 out of 423 windows...
Total processed windows:423
Number of reads: 4830586
Number of valid reads: 4830586
Number of correct strand reads:0

Inside of regions...
Num mapped reads: 4830586
Num mapped first of pair: 0
Num mapped second of pair: 0
Num singletons: 0
Time taken to analyze reads: 111
Computing descriptors...
numberOfMappedBases: 173901096
referenceSize: 3036320417
numberOfSequencedBases: 173854103
numberOfAs: 46212432
Computing per chromosome statistics...
Computing histograms...
Overall analysis time: 112
end of bam qc
Computing report...
Writing PDF report...
PDF file created successfully

Finished

[susanc@cn0124 ~]$ exit
salloc.exe: Relinquishing job allocation 19807448
[susanc@biowulf ~]$
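
When the same command runs as a batch job, its messages end up in a log file instead of the terminal. The sketch below checks such a file for the messages seen in the session above. It assumes the output was saved to a file (for batch jobs Slurm writes to `slurm-<jobid>.out` by default); both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the saved log shows that the PDF report was written.
qualimap_report_ok() {
    local log=$1
    grep -q "PDF file created successfully" "$log"
}

# Hypothetical helper: print the total read count recorded in the log.
# Only the "Number of reads:" line matches, not "Number of valid reads:".
qualimap_read_count() {
    local log=$1
    sed -n 's/^Number of reads: *//p' "$log"
}

# Usage (the log file name is a placeholder):
#   qualimap_report_ok slurm-JOBID.out && qualimap_read_count slurm-JOBID.out
```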

Interactive GUI job on Biowulf

Qualimap uses some R packages, and R cannot be run on the Biowulf login node. To run the Qualimap GUI interactively, first make an X-windows connection to Biowulf, then allocate an interactive node.

To increase the memory available to Qualimap, use the '--java-mem-size' parameter, for example:

qualimap --java-mem-size=8000M
qualimap bamqc -bam very_large_alignment.bam --java-mem-size=4G
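
The Java memory size should stay below the memory allocated to the job. As a sketch, the value can be derived from the allocation instead of being hard-coded. This assumes the allocation is known in megabytes (inside a Slurm job requested with `--mem`, the `SLURM_MEM_PER_NODE` variable is expected to hold this value). The function name `java_mem_size` and the 1024 MB default headroom are arbitrary illustrative choices, not Qualimap recommendations.

```shell
#!/bin/bash
# Hypothetical helper: convert a memory allocation in MB into a value
# for --java-mem-size, leaving headroom for R and JVM overhead.
# Usage: java_mem_size <allocated_mb> [headroom_mb]
java_mem_size() {
    local alloc_mb=$1
    local headroom_mb=${2:-1024}   # assumed default, adjust as needed
    if [ "$alloc_mb" -le "$headroom_mb" ]; then
        echo "allocation of ${alloc_mb} MB is too small" >&2
        return 1
    fi
    echo "$((alloc_mb - headroom_mb))M"
}

# Example: an 8 GB allocation (8192 MB) with the default headroom.
java_mem_size 8192   # prints 7168M

# Usage inside a job:
#   qualimap bamqc -bam very_large_alignment.bam \
#       --java-mem-size="$(java_mem_size "$SLURM_MEM_PER_NODE")"
```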

Sample session:

[susanc@biowulf ~]$ sinteractive --mem=8g
salloc.exe: Granted job allocation 143331

[susanc@cn0124 ~]$ module load qualimap
[+] Loading Perl 5.8.9 ...
[+] Loading gcc 4.4.7 ...
[+] Loading OpenMPI 1.8.1 for GCC 4.4.7 (ethernet) ...
[+] Loading tcl_tk 8.6.1
[+] Loading LAPACK 3.5.0-gcc-4.4.7 libraries...
[+] Loading R 3.2.0 on cn0124

[susanc@cn0124 ~]$ qualimap --java-mem-size=8000M
Java memory size is set to 8000M
Launching application...

QualiMap v.2.1.1
Built on 2015-06-15 14:19
Qualimap home is /usr/local/apps/qualimap/qualimap_v2.1.1


[susanc@cn0124 ~]$ exit
exit
salloc.exe: Relinquishing job allocation 143331
salloc.exe: Job allocation 143331 has been revoked.
[susanc@biowulf ~]$

Documentation

Qualimap Manual (PDF)