snippy is a tool for rapid haploid variant calling and core genome alignment.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=16g --gres=lscratch:10 [user@cn3200 ~]$module load snippy [+] Loading snippy 4.4.1 ...Here is how snippy can be run on a test example:
[user@cn3200 ~]$ cp -a /usr/local/apps/snippy/4.6.0/sample_data . [user@cn3200 ~]$ ln -s /fdb/igenomes/Homo_sapiens/UCSC/hg19/hg19.fa [user@cn3200 ~]$ snippy --cpus 16 --outdir mysnps --ref hg19.fa --R1 test_R1.fastq.gz --R2 test_R2.fastq.gz ... [11:14:32] Written by Torsten SeemannAn output folder mysnps is created:[11:14:32] Obtained from https://github.com/tseemann/snippy [11:14:32] Detected operating system: linux [11:14:32] Enabling bundled linux tools. [11:14:32] Found bwa - /opt/conda/envs/snippy/bin/bwa [11:14:32] Found samtools - /opt/conda/envs/snippy/bin/samtools [11:14:32] Found tabix - /opt/conda/envs/snippy/bin/tabix [11:14:32] Found bgzip - /opt/conda/envs/snippy/bin/bgzip [11:14:32] Found snpEff - /opt/conda/envs/snippy/bin/snpEff [11:14:32] Found java - /opt/conda/envs/snippy/bin/java [11:14:32] Found gzip - /usr/bin/gzip [11:14:32] Found parallel - /opt/conda/envs/snippy/bin/parallel [11:14:32] Found freebayes - /opt/conda/envs/snippy/bin/freebayes [11:14:32] Found freebayes-parallel - /opt/conda/envs/snippy/bin/freebayes-parallel [11:14:32] Found fasta_generate_regions.py - /opt/conda/envs/snippy/bin/fasta_generate_regions.py [11:14:32] Found vcfstreamsort - /opt/conda/envs/snippy/bin/vcfstreamsort [11:14:32] Found vcfuniq - /opt/conda/envs/snippy/bin/vcfuniq [11:14:32] Found vcffirstheader - /opt/conda/envs/snippy/bin/vcffirstheader [11:14:32] Found vcf-consensus - /opt/conda/envs/snippy/bin/../binaries/noarch/vcf-consensus [11:14:32] Found snippy-vcf_to_tab - /opt/conda/envs/snippy/bin/snippy-vcf_to_tab [11:14:32] Found snippy-vcf_report - /opt/conda/envs/snippy/bin/snippy-vcf_report [11:14:32] Found snippy-vcf_filter - /opt/conda/envs/snippy/bin/snippy-vcf_filter [11:14:32] Checking version: samtools --version is >= 1.3 - ok, have 1.6 [11:14:32] Checking version: freebayes --version is >= 1.1 - ok, have 1.3 [11:14:32] Checking version: snpEff -version is >= 4.3 - ok, have 5.1 [11:14:32] Using reference: /gpfs/gsfs7/users/user/snippy/hg19.fa [11:14:50] Treating reference as 'fasta' format. [11:14:50] Will use 16 CPU cores. [11:14:50] Using read file: /gpfs/gsfs7/users/user/snippy/test_R1.fastq.gz [11:14:50] Using read file: /gpfs/gsfs7/users/user/snippy/test_R2.fastq.gz [11:14:50] Output folder mysnps3 already exists. [user@cn1113 snippy]$ snippy --cpus 16 --outdir mysnps4 --ref hg19.fa --R1 test_R1.fastq.gz --R2 test_R2.fastq.gz [11:15:07] This is snippy 3.2-dev [11:15:07] Written by Torsten Seemann [11:15:07] Obtained from https://github.com/tseemann/snippy [11:15:07] Detected operating system: linux [11:15:07] Enabling bundled linux tools. [11:15:07] Found bwa - /opt/conda/envs/snippy/bin/bwa [11:15:07] Found samtools - /opt/conda/envs/snippy/bin/samtools [11:15:07] Found tabix - /opt/conda/envs/snippy/bin/tabix [11:15:07] Found bgzip - /opt/conda/envs/snippy/bin/bgzip [11:15:07] Found snpEff - /opt/conda/envs/snippy/bin/snpEff [11:15:07] Found java - /opt/conda/envs/snippy/bin/java [11:15:07] Found gzip - /usr/bin/gzip [11:15:07] Found parallel - /opt/conda/envs/snippy/bin/parallel [11:15:07] Found freebayes - /opt/conda/envs/snippy/bin/freebayes [11:15:07] Found freebayes-parallel - /opt/conda/envs/snippy/bin/freebayes-parallel [11:15:07] Found fasta_generate_regions.py - /opt/conda/envs/snippy/bin/fasta_generate_regions.py [11:15:07] Found vcfstreamsort - /opt/conda/envs/snippy/bin/vcfstreamsort [11:15:07] Found vcfuniq - /opt/conda/envs/snippy/bin/vcfuniq [11:15:07] Found vcffirstheader - /opt/conda/envs/snippy/bin/vcffirstheader [11:15:07] Found vcf-consensus - /opt/conda/envs/snippy/bin/../binaries/noarch/vcf-consensus [11:15:07] Found snippy-vcf_to_tab - /opt/conda/envs/snippy/bin/snippy-vcf_to_tab [11:15:07] Found snippy-vcf_report - /opt/conda/envs/snippy/bin/snippy-vcf_report [11:15:07] Found snippy-vcf_filter - /opt/conda/envs/snippy/bin/snippy-vcf_filter [11:15:07] Checking version: samtools --version is >= 1.3 - ok, have 1.6 [11:15:07] Checking version: freebayes --version is >= 1.1 - ok, have 1.3 [11:15:08] Checking version: snpEff -version is >= 4.3 - ok, have 5.1 [11:15:08] Using reference: /gpfs/gsfs7/users/user/snippy/hg19.fa [11:15:19] Treating reference as 'fasta' format. [11:15:19] Will use 16 CPU cores. [11:15:19] Using read file: /gpfs/gsfs7/users/user/snippy/test_R1.fastq.gz [11:15:19] Using read file: /gpfs/gsfs7/users/user/snippy/test_R2.fastq.gz [11:15:19] Creating folder: mysnps4 [11:15:19] Changing working directory: mysnps4 [11:15:19] Creating reference folder: reference [11:15:19] Extracting FASTA and GFF from reference. [11:16:01] Wrote 25 sequences to ref.fa [11:16:01] Wrote 0 features to ref.gff [11:16:01] Freebayes will process 64 chunks of 49176391 bp, 16 chunks at a time. [11:16:01] Using BAM RG (Read Group) ID: snps [11:16:01] Running: samtools faidx reference/ref.fa 2>> snps.log [11:16:17] Running: bwa index reference/ref.fa 2>> snps.log [12:01:52] Running: mkdir reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log [12:01:54] Running: mkdir reference/ref && bgzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log [12:01:54] Running: (bwa mem -v 2 -M -R '@RG\tID:snps\tSM:snps' -t 16 reference/ref.fa /gpfs/gsfs7/users/user/snippy/test_R1.fastq.gz /gpfs/gsfs7/users/user/snippy/test_R2.fastq.gz | samtools view -u -T reference/ref.fa -F 3844 -q 60 | samtools sort --reference reference/ref.fa > snps.bam) 2>> snps.log [12:01:57] Running: samtools index snps.bam 2>> snps.log [12:01:58] Running: samtools depth -aa -q 20 snps.bam | bgzip > snps.depth.gz 2>> snps.log [12:07:08] Running: tabix -s 1 -b 2 -e 2 snps.depth.gz 2>> snps.log [12:11:06] Running: fasta_generate_regions.py reference/ref.fa.fai 49176391 > reference/ref.txt 2>> snps.log [12:11:07] Running: freebayes-parallel reference/ref.txt 16 -p 1 -q 20 -m 60 --min-coverage 10 -V -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log [12:11:25] Running: /opt/conda/envs/snippy/bin/snippy-vcf_filter --minqual 10 --mincov 10 --minfrac 0.9 snps.raw.vcf > snps.filt.vcf 2>> snps.log [12:11:25] Running: cp snps.filt.vcf snps.vcf 2>> snps.log [12:11:25] Running: bgzip -c snps.vcf > snps.vcf.gz 2>> snps.log [12:11:25] Running: tabix -p vcf snps.vcf.gz 2>> snps.log [12:11:25] Running: /opt/conda/envs/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab 2>> snps.log [12:11:49] Running: vcf-consensus snps.vcf.gz < reference/ref.fa > snps.consensus.fa 2>> snps.log [12:13:43] Running: /opt/conda/envs/snippy/bin/snippy-vcf_filter --subs snps.filt.vcf > snps.filt.subs.vcf 2>> snps.log [12:13:43] Running: bgzip -c snps.filt.subs.vcf > snps.filt.subs.vcf.gz 2>> snps.log [12:13:43] Running: tabix -p vcf snps.filt.subs.vcf.gz 2>> snps.log [12:13:43] Running: vcf-consensus snps.filt.subs.vcf.gz < reference/ref.fa > snps.consensus.subs.fa 2>> snps.log [12:15:40] Generating aligned/masked FASTA relative to reference: snps.aligned.fa ... [12:46:32] Creating extra output files: BED GFF CSV TXT HTML [12:46:32] Identified 3 variants. [12:46:32] Result folder: mysnps4 [12:46:32] Result files: [12:46:32] * mysnps4/snps.aligned.fa [12:46:32] * mysnps4/snps.bam [12:46:32] * mysnps4/snps.bam.bai [12:46:32] * mysnps4/snps.bed [12:46:32] * mysnps4/snps.consensus.fa [12:46:32] * mysnps4/snps.consensus.subs.fa [12:46:32] * mysnps4/snps.csv [12:46:32] * mysnps4/snps.depth.gz [12:46:32] * mysnps4/snps.depth.gz.tbi [12:46:32] * mysnps4/snps.filt.subs.vcf [12:46:32] * mysnps4/snps.filt.subs.vcf.gz [12:46:32] * mysnps4/snps.filt.subs.vcf.gz.tbi [12:46:32] * mysnps4/snps.filt.vcf [12:46:32] * mysnps4/snps.gff [12:46:32] * mysnps4/snps.html [12:46:32] * mysnps4/snps.log [12:46:32] * mysnps4/snps.raw.vcf [12:46:32] * mysnps4/snps.tab [12:46:32] * mysnps4/snps.txt [12:46:32] * mysnps4/snps.vcf [12:46:32] * mysnps4/snps.vcf.gz [12:46:32] * mysnps4/snps.vcf.gz.tbi [12:46:32] Walltime used: 1 hours, 31 minutes, 25 seconds [12:46:32] Found a bug? Post it at https://github.com/tseemann/snippy/issues [12:46:32] Done.
[user@cn3200 ~]$ tree mysnps mysnps ├ reference │ ├ genomes │ │ └ ref.fa │ ├ ref │ │ ├ genes.gff.gz │ │ └ snpEffectPredictor.bin │ ├ ref.fa │ ├ ref.fa.amb │ ├ ref.fa.ann │ ├ ref.fa.bwt │ ├ ref.fa.fai │ ├ ref.fa.pac │ ├ ref.fa.sa │ ├ ref.gff │ ├ ref.txt │ └ snpeff.config ├ ref.fa -> reference/ref.fa ├ ref.fa.fai -> reference/ref.fa.fai ├ snps.aligned.fa ├ snps.bam ├ snps.bam.bai ├ snps.bed ├ snps.consensus.fa ├ snps.consensus.subs.fa ├ snps.csv ├ snps.filt.vcf ├ snps.gff ├ snps.html ├ snps.log ├ snps.raw.vcf ├ snps.subs.vcf ├ snps.tab ├ snps.txt ├ snps.vcf ├ snps.vcf.gz └ snps.vcf.gz.csi 3 directories, 33 filesaEnd the interactive session:
[user@cn3200 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$