iSAAC is an ultrafast DNA sequence aligner (Isaac Genome Alignment Software) that takes advantage of high-memory hardware (>48 GB). It is paired with a variant caller (Isaac Variant Caller).
The following sample session uses the demo dataset provided with iSAAC. Allocate an interactive session with memory, CPUs, and local scratch (which will be used for temporary files).
[user@biowulf ~]$ sinteractive --mem=50g --cpus-per-task=32 --gres=lscratch:50
salloc.exe: Pending job allocation 36744333
salloc.exe: job 36744333 queued and waiting for resources
salloc.exe: job 36744333 has been allocated resources
salloc.exe: Granted job allocation 36744333
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn2493 are ready for job

[user@cn2493 ~]$ cd /data/$USER/testjob
[user@cn2493 ~]$ module load isaac
[user@cn2493 testjob]$ isaac-sort-reference \
    -g $ISAAC_HOME/share/iSAAC-02.16.03.09/data/examples/PhiX/iGenomes/PhiX/NCBI/1993-04-28/Sequence/Chromosomes/phix.fa \
    -o ./PhiX
make: Entering directory `/spin1/users/user/iSAAC/PhiX'
/usr/local/apps/iSAAC/02.16.03.09/bin/../share/iSAAC-02.16.03.09/makefiles/reference/../../../../share/iSAAC-02.16.03.09/makefiles/common/../../../../libexec/iSAAC-02.16.03.09/findNeighbors -r /spin1/users/user/iSAAC/PhiX/Temp/neighbor-positions-56.xml \
    --seed-length 56 \
    --neighborhood-distance 1 \
    --mask-width 0 \
    --mask 0 \
    --output-file /spin1/users/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz.tmp && mv /spin1/users/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz.tmp /spin1/users/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz
[2017-03-31 09:15:57] [biowulf.nih.gov] [ers/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz] make Target: /spin1/users/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz
[2017-03-31 09:15:57] [biowulf.nih.gov] [ers/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz] make Reason: /spin1/users/user/iSAAC/PhiX/Temp/neighbor-positions-56.xml Temp/.sentinel
[2017-03-31 09:15:57] [biowulf.nih.gov] [ers/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz] make Prereqs: /spin1/users/user/iSAAC/PhiX/Temp/neighbor-positions-56.xml Temp/.sentinel
[2017-03-31 09:15:57] [biowulf.nih.gov] [ers/user/iSAAC/PhiX/Temp/neighbors-1-56.16bpb.gz] make Cmd: /usr/local/apps/iSAAC/02.16.03.09/bin/../share/iSAAC-02.16.03.09/m
[...]
[2017-03-31 09:16:50] [biowulf.nih.gov] [all] make Target: all
[2017-03-31 09:16:50] [biowulf.nih.gov] [all] make Reason: sorted-reference.xml
[2017-03-31 09:16:50] [biowulf.nih.gov] [all] make Prereqs: sorted-reference.xml
[2017-03-31 09:16:50] [biowulf.nih.gov] [all] make Cmd: [[ 2 == 0 ]] || 1>&2 echo -e "INFO:" "All done!"
[2017-03-31 09:16:50] [biowulf.nih.gov] [all] INFO: All done!
make: Leaving directory `/spin1/users/user/iSAAC/PhiX'

[user@cn2493 testjob]$ isaac-align \
    -r ./PhiX/sorted-reference.xml \
    -b $ISAAC_HOME/share/*/data/examples/PhiX/Fastq \
    -f fastq \
    --use-bases-mask y150,y150 \
    --variable-read-length yes -m10 \
    -j $SLURM_CPUS_PER_TASK \
    -t /lscratch/$SLURM_JOBID
2017-03-31 09:18:18 [2aaaac264940] Forcing LC_ALL to C
2017-03-31 09:18:18 [2aaaac264940] Version: iSAAC-02.16.03.09
2017-03-31 09:18:18 [2aaaac264940] argc: 12 argv: isaac-align -r ./PhiX/sorted-reference.xml -b /usr/local/apps/iSAAC/02.16.03.09/share/iSAAC-02.16.03.09/data/examples/PhiX/Fastq -f fastq --use-bases-mask y150,y150 --variable-read-length yes -m10
2017-03-31 09:18:18 [2aaaac264940] FastqReader uncompressedBufferSize_=67108864
2017-03-31 09:18:18 [2aaaac264940] Opened fastq stream on /usr/local/apps/iSAAC/02.16.03.09/share/iSAAC-02.16.03.09/data/examples/PhiX/Fastq/lane1_read1.fastq and base Q0 !
2017-03-31 09:18:18 [2aaaac264940] FastqReader uncompressedBufferSize_=67108864
[...]
2017-03-31 09:35:00 [2aaaac264940] Generating Build statistics
2017-03-31 09:35:00 [2aaaac264940] Generating Build statistics done
2017-03-31 09:35:00 [2aaaac264940] Generating the BAM files done
2017-03-31 09:35:00 [2aaaac264940] md5 checksum for /spin1/users/user/iSAAC/./Aligned/Projects/default/default/sorted.bam:341a8efa2f33a7f486dd5c0a4cb34c0d
2017-03-31 09:35:00 [2aaaac264940] Saving workflow state to "/spin1/users/user/iSAAC/./Temp/AlignerState.txt"
2017-03-31 09:35:00 [2aaaac264940] Saving workflow state done to "/spin1/users/user/iSAAC/./Temp/AlignerState.txt"

[user@cn2493 testjob]$ exit
exit
salloc.exe: Relinquishing job allocation 36744333
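The final log lines of the session report where the sorted BAM file was written. As a quick sanity check after a run, a minimal sketch like the following confirms that the file exists and is non-empty. The function name check_isaac_bam is hypothetical (it is not part of iSAAC), and the default path assumes the Aligned/Projects/default/default/sorted.bam layout seen in the log, relative to the directory where isaac-align was run.

```shell
# Hypothetical helper (not part of iSAAC): report whether the BAM exists.
check_isaac_bam() {
    # Default path is an assumption taken from the log output of the
    # sample session; pass a different path as the first argument if needed.
    local bam="${1:-Aligned/Projects/default/default/sorted.bam}"
    if [ -s "$bam" ]; then
        # -s is true only for a file that exists and has a size above zero.
        echo "OK: $bam"
    else
        echo "missing or empty: $bam" >&2
        return 1
    fi
}
```

Run check_isaac_bam from the job directory before leaving the interactive session, or pass another path as its first argument.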
Create a batch input file (e.g. run.sh). The following example uses the demo data provided with iSAAC.
A large alignment involves significant I/O. It is recommended that you use local scratch on the allocated node for temporary files, as in the example below.
#!/bin/bash
# submit with: sbatch --mem=50g --cpus-per-task=32 --gres=lscratch:50 run.sh

cd /data/$USER/iSAAC
module load iSAAC/02.16.03.09

isaac-sort-reference \
    -g $ISAAC_HOME/share/iSAAC-02.16.03.09/data/examples/PhiX/iGenomes/PhiX/NCBI/1993-04-28/Sequence/Chromosomes/phix.fa \
    -o ./PhiX

isaac-align \
    -r ./PhiX/sorted-reference.xml \
    -b $ISAAC_HOME/share/*/data/examples/PhiX/Fastq \
    -f fastq \
    --use-bases-mask y150,y150 \
    --variable-read-length yes -m10 \
    -j $SLURM_CPUS_PER_TASK \
    -t /lscratch/$SLURM_JOBID
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=32 --mem=50g --gres=lscratch:50 run.sh
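The batch script passes /lscratch/$SLURM_JOBID to the -t option of isaac-align. Assuming that this directory is only created when the job is submitted with --gres=lscratch:N, a guard at the top of the script can catch a forgotten option before the alignment starts. The sketch below is one way to do this; the function name pick_isaac_tmp and its optional base-directory argument are hypothetical and are not part of iSAAC or Slurm.

```shell
# Hypothetical guard (not part of iSAAC or Slurm): print the scratch
# directory to pass to "isaac-align -t", or fail if it is unavailable.
pick_isaac_tmp() {
    # The optional first argument overrides the /lscratch base directory,
    # which is mainly useful for trying the function outside a job.
    local base="${1:-/lscratch}"
    local dir="$base/${SLURM_JOBID:-}"
    if [ -n "${SLURM_JOBID:-}" ] && [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "$dir"
    else
        echo "no writable local scratch at $dir; submit with --gres=lscratch:N" >&2
        return 1
    fi
}
```

In run.sh this could be used as tmpdir=$(pick_isaac_tmp) || exit 1, followed by -t "$tmpdir" in the isaac-align call.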
Create a swarmfile (e.g. isaac.swarm). For example:
cd /data/$USER/dir1; isaac-align \
    -r ./PhiX/sorted-reference.xml \
    -b Fastq1 \
    -f fastq \
    --use-bases-mask y150,y150 \
    --variable-read-length yes -m10 \
    -j $SLURM_CPUS_PER_TASK \
    -t /lscratch/$SLURM_JOBID
cd /data/$USER/dir2; isaac-align \
    -r ./PhiX/sorted-reference.xml \
    -b Fastq2 \
    -f fastq \
    --use-bases-mask y150,y150 \
    --variable-read-length yes -m10 \
    -j $SLURM_CPUS_PER_TASK \
    -t /lscratch/$SLURM_JOBID
cd /data/$USER/dir3; isaac-align \
    -r ./PhiX/sorted-reference.xml \
    -b Fastq3 \
    -f fastq \
    --use-bases-mask y150,y150 \
    --variable-read-length yes -m10 \
    -j $SLURM_CPUS_PER_TASK \
    -t /lscratch/$SLURM_JOBID
[...]
Submit this job using the swarm command.
swarm -f isaac.swarm -g 20 -t 24 --gres=lscratch:20
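Because the swarmfile entries differ only in the directory and FASTQ folder number, the file can be generated instead of typed by hand. The following is a minimal sketch under that assumption: the function name make_isaac_swarm is hypothetical, and the dirN/FastqN naming simply mirrors the example swarmfile, so adjust it to your own layout. Each command is written on a single line.

```shell
# Hypothetical helper: print one isaac-align command per sample directory.
# The dirN / FastqN naming mirrors the example swarmfile and is an assumption.
make_isaac_swarm() {
    local n="$1"
    local i=1
    while [ "$i" -le "$n" ]; do
        # Single quotes keep $USER, $SLURM_CPUS_PER_TASK and $SLURM_JOBID
        # literal, so they are expanded when the swarm job runs, not now.
        printf 'cd /data/$USER/dir%d; isaac-align -r ./PhiX/sorted-reference.xml -b Fastq%d -f fastq --use-bases-mask y150,y150 --variable-read-length yes -m10 -j $SLURM_CPUS_PER_TASK -t /lscratch/$SLURM_JOBID\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Print the commands for three samples.
make_isaac_swarm 3
```

Redirect the output to a file (make_isaac_swarm 3 > isaac.swarm) and submit it with the swarm command shown above.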