Sailfish quantifies the expression of a given set of transcripts using NGS reads. It is run in two stages: (1) the indexing step is run once per set of transcripts; (2) the quantification step is run once per sample.
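The two stages can be wrapped in a pair of small shell functions. This is a minimal sketch, not part of sailfish itself: the function names `build_index` and `quantify` and the fallback of 2 threads are assumptions made for illustration, and the sailfish options are the ones used in the examples on this page.

```shell
#!/bin/bash
# Hypothetical helper functions illustrating the two sailfish stages.

# Use the Slurm allocation if there is one; the fallback of 2 is an assumption.
threads=${SLURM_CPUS_PER_TASK:-2}

# Stage 1: build the index, once per set of transcripts.
# Does nothing if the index path already exists.
build_index() {
    local transcripts=$1 index=$2
    if [[ -e "$index" ]]; then
        echo "index $index already exists, skipping" >&2
        return 0
    fi
    sailfish index -t "$transcripts" -o "$index" -p "$threads"
}

# Stage 2: quantify one gzipped FASTQ file against an existing index.
# '-l U' is the library type used in the examples on this page.
quantify() {
    local index=$1 fastq=$2 outdir=$3
    sailfish quant -i "$index" -l U -r <(zcat "$fastq") \
        -o "$outdir" -p "$threads"
}
```

With these definitions, `build_index M9.fa M9.idx` would be called once, followed by one `quantify M9.idx sampleN.fq.gz quant_sampleN` call per sample.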
Example data is available in the directory given by the environment variable SAILFISH_TEST_DATA.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=8g --cpus-per-task=4 --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load sailfish
[user@cn3144]$ zcat $SAILFISH_TEST_DATA/gencode.vM9.transcripts.fa.gz > M9.fa
[user@cn3144]$ sailfish index -t M9.fa -o M9.idx -p $SLURM_CPUS_PER_TASK
[user@cn3144]$ cp $SAILFISH_TEST_DATA/ENCFF138LJO.fastq.gz .
[user@cn3144]$ sailfish quant -i M9.idx -r <(zcat ENCFF138LJO.fastq.gz) --libType U \
    -o quant -p $SLURM_CPUS_PER_TASK
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
Create a batch script (e.g. sailfish.sh). For example:
#! /bin/bash

module load sailfish/0.10.0 || exit 1
wd=$PWD
cd /lscratch/$SLURM_JOB_ID || exit 1

# get transcriptome from the example directory
# This is usually done once - not for each job. Only included
# here to show all steps involved in sailfish quantitation.
zcat $SAILFISH_TEST_DATA/gencode.vM9.transcripts.fa.gz \
    > gencode.vM9.transcripts.fa

# index the transcripts
sailfish index -t gencode.vM9.transcripts.fa -o gencode.vM9.idx \
    -p $SLURM_CPUS_PER_TASK

# quantify the transcripts
cp $SAILFISH_TEST_DATA/ENCFF138LJO.fastq.gz .
sailfish quant -i gencode.vM9.idx -l U \
    -r <(zcat ENCFF138LJO.fastq.gz) \
    -o quant -p $SLURM_CPUS_PER_TASK
cp -r quant $wd
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=8 --mem=8g --gres=lscratch:16 sailfish.sh
Create a swarmfile (e.g. sailfish.swarm). For example:
sailfish quant -i gencode.vM9.idx -l U -r <(zcat sample1.fq.gz) \
    -o quant_sample1 -p $SLURM_CPUS_PER_TASK
sailfish quant -i gencode.vM9.idx -l U -r <(zcat sample2.fq.gz) \
    -o quant_sample2 -p $SLURM_CPUS_PER_TASK
sailfish quant -i gencode.vM9.idx -l U -r <(zcat sample3.fq.gz) \
    -o quant_sample3 -p $SLURM_CPUS_PER_TASK
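For more than a handful of samples, the swarmfile can be generated rather than written by hand. The following is a minimal sketch under a few assumptions: the function name `make_swarmfile` is made up for illustration, input files are assumed to end in `.fq.gz` and to have no spaces in their paths, and each command is written on a single line instead of using a backslash continuation.

```shell
#!/bin/bash
# Sketch: write one 'sailfish quant' command per FASTQ file into a swarmfile.

index=gencode.vM9.idx        # index built beforehand with 'sailfish index'
swarmfile=sailfish.swarm

# Hypothetical helper: print one swarm command line per input file.
make_swarmfile() {
    local fq name
    for fq in "$@"; do
        # Output directory is named after the file, minus the .fq.gz suffix
        name=$(basename "$fq" .fq.gz)
        # \$SLURM_CPUS_PER_TASK is escaped so that it is expanded inside
        # each subjob, not while the swarmfile is being generated.
        echo "sailfish quant -i $index -l U -r <(zcat $fq) -o quant_$name -p \$SLURM_CPUS_PER_TASK"
    done
}

make_swarmfile sample1.fq.gz sample2.fq.gz sample3.fq.gz > "$swarmfile"
```

In practice the explicit file names in the last line would usually be replaced by a glob such as `*.fq.gz`.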
Submit this job using the swarm command.
swarm -f sailfish.swarm -g 8 -t 8 --module sailfish/0.10.0
where
| -g # | Number of gigabytes of memory required for each process (1 line in the swarm command file) |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file) |
| --module sailfish | Loads the sailfish module for each subjob in the swarm |