SalmonTE is an ultra-fast and scalable quantification pipeline for transposable element (TE) abundances. It is based on Snakemake, Salmon and R. Note that SalmonTE packages its own version of Salmon.
SalmonTE.py quant is multithreaded. Please match the number of threads to the number of allocated CPUs.
Example files are available in $SALMONTE_TEST_DATA.
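One way to keep the thread count in step with the allocation is to read it from Slurm rather than hard-coding it. The following is a minimal sketch, not part of SalmonTE: `pick_threads` is a hypothetical helper name, and the fallback value is an assumption for runs outside a Slurm job.

```shell
#!/bin/bash
# Hypothetical helper (not part of SalmonTE): choose a thread count.
# Uses the Slurm allocation when SLURM_CPUS_PER_TASK is set; otherwise
# falls back to the first argument, or to 1 if no argument is given.
pick_threads() {
    echo "${SLURM_CPUS_PER_TASK:-${1:-1}}"
}

threads=$(pick_threads 2)
# The value would then be passed on as --num_threads=$threads
echo "threads to use: ${threads}"
```

Inside a job started with `--cpus-per-task=4`, the helper returns 4; on a login node with no allocation it returns the fallback.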
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=6g --cpus-per-task=4 --gres=lscratch:10
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load salmonte
[user@cn3144]$ cp -r ${SALMONTE_TEST_DATA:-none}/data .
[user@cn3144]$ ls -lh data
total 5.0M
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_1_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_1_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_2_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_2_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_1_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_1_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_2_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_2_R2.fastq
[user@cn3144]$ SalmonTE.py quant --reference=hs --outpath=quant_out \
    --num_threads=$SLURM_CPUS_PER_TASK --exprtype=count data
2019-11-11 10:19:30,550 Starting quantification mode
2019-11-11 10:19:30,550 Collecting FASTQ files...
2019-11-11 10:19:30,553 The input dataset is considered as a paired-ends dataset.
2019-11-11 10:19:30,553 Collected 4 FASTQ files.
2019-11-11 10:19:30,553 Quantification has been finished.
2019-11-11 10:19:30,553 Running Salmon using Snakemake
...
[user@cn3144]$ ls -lh quant_out
total 68K
-rw-rw-r-- 1 user group  23K Nov 11 10:19 clades.csv
-rw-rw-r-- 1 user group   63 Nov 11 10:19 condition.csv
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 CTRL_1
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 CTRL_2
-rw-rw-r-- 1 user group  17K Nov 11 10:19 EXPR.csv
-rw-rw-r-- 1 user group  161 Nov 11 10:19 MAPPING_INFO.csv
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 TARDBP_1
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 TARDBP_2
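The sample session treats the input as paired-end data, so it can be worth confirming that every read file has its mate before starting a long run. The sketch below is an assumption-laden pre-flight check, not a SalmonTE feature: `check_pairs` is a hypothetical helper, and it assumes the `_R1.fastq` / `_R2.fastq` naming seen in the test data.

```shell
#!/bin/bash
# Hypothetical pre-flight check (not part of SalmonTE).
# Assumes mates are named <sample>_R1.fastq and <sample>_R2.fastq,
# as in the test data listing above.
check_pairs() {
    local dir=$1 r1 r2 missing=0
    for r1 in "$dir"/*_R1.fastq; do
        # If the glob matched nothing, the literal pattern does not exist.
        [ -e "$r1" ] || { echo "no *_R1.fastq files in $dir" >&2; return 1; }
        r2="${r1%_R1.fastq}_R2.fastq"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demonstration on a throwaway directory with empty placeholder files.
demo_dir=$(mktemp -d)
touch "$demo_dir/CTRL_1_R1.fastq" "$demo_dir/CTRL_1_R2.fastq"
check_pairs "$demo_dir" && echo "all pairs present"
```

On real data the call would be `check_pairs data` before running the quantification step.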
Notes:
Before running the differential expression test, it is necessary to update the file quant_out/condition.csv to include your experimental conditions.
[user@cn3144]$ mv quant_out/condition.csv quant_out/condition.csv.orig
[user@cn3144]$ cat <<EOF > quant_out/condition.csv
SampleID,condition
TARDBP_1,treatment
CTRL_1,control
TARDBP_2,treatment
CTRL_2,control
EOF
[user@cn3144]$ ### or just edit the condition.csv file with your favorite text editor
[user@cn3144]$ SalmonTE.py test --inpath=quant_out --outpath=test_out \
    --tabletype=csv --figtype=png --analysis_type=DE \
    --conditions=control,treatment
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
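With more than a handful of samples, typing condition.csv by hand gets error-prone. The following sketch generates the file from a list of sample IDs. It is only an illustration: `make_conditions` is a hypothetical helper, and the rule that IDs starting with `CTRL` are controls while everything else is treatment is an assumption taken from the test data names, so adapt it to your own naming scheme.

```shell
#!/bin/bash
# Hypothetical helper (not part of SalmonTE): write a condition.csv.
# Assumption: sample IDs starting with "CTRL" are controls and all
# other IDs are treatments, as in the test data.
make_conditions() {
    local outfile=$1 sample
    shift
    echo "SampleID,condition" > "$outfile"
    for sample in "$@"; do
        case "$sample" in
            CTRL*) echo "${sample},control" ;;
            *)     echo "${sample},treatment" ;;
        esac >> "$outfile"
    done
}

# Demonstration: write to a temporary file instead of quant_out/condition.csv.
tmp_csv=$(mktemp)
make_conditions "$tmp_csv" TARDBP_1 CTRL_1 TARDBP_2 CTRL_2
cat "$tmp_csv"
```

In a real run the first argument would be `quant_out/condition.csv`, and the labels written must match the values passed to `--conditions`.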
Create a batch script (e.g. salmonte.sh). For example:
#!/bin/bash
module load salmonte/0.4 || exit 1
SalmonTE.py quant --reference=hs --outpath=all_quant_out \
    --num_threads=$SLURM_CPUS_PER_TASK --exprtype=count data
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=8 --mem=10g salmonte.sh