High-Performance Computing at the NIH

SOAP3-dp, like its predecessor SOAP3, is GPU-based software for aligning short reads to a reference sequence. It improves on SOAP3 in both speed and sensitivity by coupling whole-genome indexing with dynamic programming on the GPU. SOAP3 can only find alignments with at most 4 mismatches, while SOAP3-dp can find alignments involving mismatches, INDELs, and small gaps. The number of reads aligned, especially for paired-end data, typically increases 5 to 10 percent from SOAP3 to SOAP3-dp. Moreover, SOAP3-dp's alignment time is much shorter than SOAP3's, because GPU-based dynamic programming coupled with indexing is considerably more efficient. For example, when aligning length-100 single-end reads against the human genome, SOAP3 typically requires tens of seconds per million reads, while SOAP3-dp takes only a few seconds.

Each SOAP3-dp invocation can run on only a single GPU. To use multiple GPUs, split your input data and submit multiple jobs, each requesting one GPU. You may, however, use multiple CPUs: the example commands substitute $SLURM_CPUS_PER_TASK into the run configuration file so that the program uses the number of CPUs you requested. Keep in mind that the higher this number, the more output files you will get, as the data are partitioned per thread.
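One way to split paired-end input for multiple single-GPU jobs is to break each mate file into equally sized chunks, keeping the two mates in sync. The sketch below uses GNU split on synthetic stand-in reads; the file names, read counts, and chunk size are all illustrative, not part of the SOAP3-dp distribution:

```shell
set -e
workdir=$(mktemp -d); cd "$workdir"

# Synthetic paired-end FASTQ files standing in for real data (4 reads per mate)
for mate in 1 2; do
  for i in 1 2 3 4; do
    printf '@read%s/%s\nACGTACGT\n+\nIIIIIIII\n' "$i" "$mate"
  done | gzip > reads_${mate}.fastq.gz
done

# Split each mate into chunks of CHUNK_READS reads; a FASTQ record is 4 lines.
# Using the same chunk size for both mates keeps chunk_1_NN paired with chunk_2_NN.
CHUNK_READS=2
LINES=$((CHUNK_READS * 4))
for mate in 1 2; do
  zcat reads_${mate}.fastq.gz \
    | split -l "$LINES" -d --additional-suffix=.fastq - chunk_${mate}_
done
ls chunk_*   # chunk_1_00.fastq chunk_1_01.fastq chunk_2_00.fastq chunk_2_01.fastq
```

Each chunk pair can then be gzipped and submitted as its own sbatch job requesting one GPU.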


There may be multiple versions of SOAP3-dp available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail soap3-dp

To select a module, type

module load soap/soap3-dp/[ver]

where [ver] is the version of choice.

Environment variables set:

SOAP3DP_HOME: the SOAP3-dp installation directory (used below to copy the soap3-dp.ini template)

Batch job on Biowulf

Create a batch input file (e.g. soap3-dp.sh). For example:

#!/bin/bash
set -e
module load soap/soap3-dp
sed -i "/NumOfCpuThreads/s/=.*/=$SLURM_CPUS_PER_TASK/" soap3-dp.ini
soap3-dp pair /fdb/soap3-dp/hg38.fa.index gcat_set_053_{1,2}.fastq.gz -b 3

Submit this job using the Slurm sbatch command.

mkdir /data/$USER/soap3-dp
cd /data/$USER/soap3-dp
ln -s /fdb/app_testdata/fastq/Homo_sapiens/gcat_set_053_*.fastq.gz .
cp $SOAP3DP_HOME/soap3-dp.ini . # make any configuration changes you'd like
sbatch --partition=gpu --gres=gpu:k80:1 --cpus-per-task=4 --mem=32g soap3-dp.sh
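The sed line in the batch script rewrites the NumOfCpuThreads setting in place to match the requested CPU count. A minimal stand-alone illustration on a synthetic ini file (the real file comes from $SOAP3DP_HOME/soap3-dp.ini; the second setting name here is made up):

```shell
# Stand-in for the real soap3-dp.ini; only the key=value shape matters here
cat > demo.ini <<'EOF'
NumOfCpuThreads=1
SomeOtherSetting=value
EOF

SLURM_CPUS_PER_TASK=4   # set by Slurm inside a real job; faked here for the demo
sed -i "/NumOfCpuThreads/s/=.*/=$SLURM_CPUS_PER_TASK/" demo.ini
grep NumOfCpuThreads demo.ini   # -> NumOfCpuThreads=4
```

Only the matching line is rewritten; other settings are left untouched.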
Swarm of Jobs on Biowulf

Suppose your data are organized as follows:

├── sample01
│   ├── reads_1.fastq.gz
│   └── reads_2.fastq.gz
├── sample02
│   ├── reads_1.fastq.gz
│   └── reads_2.fastq.gz
└── sample03
    ├── reads_1.fastq.gz
    └── reads_2.fastq.gz

Prepare your run configuration:

[teacher@biowulf ~]$ mkdir /data/$USER/soap3-dp
[teacher@biowulf ~]$ cd /data/$USER/soap3-dp
[teacher@biowulf ~]$ mkdir $(ls /data/$USER/seqdata)
[teacher@biowulf soap3-dp]$ module load soap/soap3-dp
[+] Loading CUDA Toolkit 8.0.44 ...
[+] Loading soap/soap3-dp, version 2.3.178+20170103...
[teacher@biowulf soap3-dp]$ cp $SOAP3DP_HOME/soap3-dp.ini .
[teacher@biowulf soap3-dp]$ NPROC=4
[teacher@biowulf soap3-dp]$ sed -i "/NumOfCpuThreads/s/=.*/=$NPROC/" soap3-dp.ini
[teacher@biowulf soap3-dp]$ for dir in */; do (cd $dir && ln -s ../soap3-dp.ini .); done

Create a swarmfile (e.g. soap3-dp.swarm). For example:

cd /data/$USER/soap3-dp/sample01 && soap3-dp pair /fdb/soap3-dp/hg38.fa.index reads_{1,2}.fastq.gz -b 3
cd /data/$USER/soap3-dp/sample02 && soap3-dp pair /fdb/soap3-dp/hg38.fa.index reads_{1,2}.fastq.gz -b 3
cd /data/$USER/soap3-dp/sample03 && soap3-dp pair /fdb/soap3-dp/hg38.fa.index reads_{1,2}.fastq.gz -b 3
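With many samples, the swarmfile can be generated rather than typed by hand. A sketch assuming the per-sample directory layout shown above (directory names here are stand-ins created for the demonstration):

```shell
set -e
workdir=$(mktemp -d); cd "$workdir"
mkdir sample01 sample02 sample03   # stand-ins for the sample directories above

# Emit one swarmfile line per sample directory; $USER is kept literal
# so the path is expanded when the swarm job runs.
: > soap3-dp.swarm
for dir in */; do
  sample=${dir%/}
  printf 'cd /data/$USER/soap3-dp/%s && soap3-dp pair /fdb/soap3-dp/hg38.fa.index reads_{1,2}.fastq.gz -b 3\n' \
    "$sample" >> soap3-dp.swarm
done
cat soap3-dp.swarm
```

This produces one line per sample, matching the hand-written swarmfile above.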

Submit this job using the swarm command:

[teacher@biowulf soap3-dp]$ swarm -f soap3-dp.swarm -g 32 -t $NPROC --module soap/soap3-dp --partition gpu --gres gpu:k80:1
Interactive job on Biowulf

[teacher@biowulf ~]$ sinteractive --constraint=gpuk80 --gres=gpu:k80:1 --cpus-per-task=4 --mem=32g
salloc.exe: Pending job allocation 39865508
salloc.exe: job 39865508 queued and waiting for resources
salloc.exe: job 39865508 has been allocated resources
salloc.exe: Granted job allocation 39865508
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn4176 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
[teacher@cn4176 ~]$ mkdir /data/$USER/soap3-dp
[teacher@cn4176 ~]$ cd /data/$USER/soap3-dp
[teacher@cn4176 soap3-dp]$ ln -s /fdb/app_testdata/fastq/Homo_sapiens/gcat_set_053_*.fastq.gz .
[teacher@cn4176 soap3-dp]$ module load soap/soap3-dp
[+] Loading CUDA Toolkit 8.0.44 ...
[+] Loading soap/soap3-dp, version 2.3.178+20170103...
[teacher@cn4176 soap3-dp]$ cp $SOAP3DP_HOME/soap3-dp.ini .
[teacher@cn4176 soap3-dp]$ sed -i "/NumOfCpuThreads/s/=.*/=$SLURM_CPUS_PER_TASK/" soap3-dp.ini
[teacher@cn4176 soap3-dp]$ soap3-dp pair /fdb/soap3-dp/hg38.fa.index gcat_set_053_{1,2}.fastq.gz -b 3

[Main] SOAP3-DP v2.3.178 (build)
[Main] Loading read files gcat_set_053_1.fastq.gz and gcat_set_053_2.fastq.gz
[Main] loading index into host...
[Main] Finished loading index into host.
[Main] Loading time :    6.9985 seconds

[Main] Finished loading index into host.
[Main] Loading time :    6.9986 seconds
[Main] Reference sequence length : 2934876730

[Main] Loaded 12582912 short reads from the query file.
[Main] Elapsed time on host :   31.0444 seconds

[Main] Finished copying index into device (GPU).
[Main] Loading time :    0.5444 seconds

[Main] Finished alignment with <= 2 mismatches
[Main] Number of pairs aligned: 5636545
[Main] Elapsed time :  100.2907 seconds

[... lots of output ....]

[Main] 17828 unaligned reads are proceeded to DP.
[Main] Finished copying index into device (GPU).
[Main] Loading time :    0.2960 seconds

[Main] Number of reads aligned by single-end DP: 4952
[Main] Elapsed time :    0.4681 seconds

[Main] Overall number of pairs of reads aligned: 44732647
[Main] Overall read load time :   38.0453 seconds
[Main] Overall alignment time (excl. read loading) :  926.7622 seconds
[Main] Free index from host memory..
[Main] Free host memory..
[teacher@cn4176 soap3-dp]$ ls
[teacher@cn4176 soap3-dp]$ exit
salloc.exe: Relinquishing job allocation 39865508
The -b 3 option specifies output in BAM format. The *.gout.* and *.dpout.1 files here are actually BAM files. See the program documentation for further details.