Bismark is a program to map bisulfite-treated sequencing reads to a genome of interest and perform methylation calls in a single step. The output can be easily imported into a genome viewer, such as SeqMonk, and enables a researcher to analyse the methylation levels of their samples straight away. Its main features are:
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=6g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ bismark --help
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ...
Full output from the help command
[user@cn3144 ~]$ mkdir -p /data/$USER/bismark_test/XandY
[user@cn3144 ~]$ cd /data/$USER/bismark_test
[user@cn3144 bismark_test]$ cp /fdb/genome/human-feb2009/chrX.fa ./XandY
[user@cn3144 bismark_test]$ cp /fdb/genome/human-feb2009/chrY.fa ./XandY
[user@cn3144 bismark_test]$ cp $BISMARK_HOME/test_data.fastq .
[user@cn3144 bismark_test]$ bismark_genome_preparation XandY
Writing bisulfite genomes out into a single MFA (multi FastA) file
Bisulfite Genome Indexer version v0.16.0 (last modified 25 August 2015)
Step I - Prepare genome folders - completed
Total number of conversions performed:
[....]
Full output from the prep command
[user@cn3144 ~]$ bismark XandY test_data.fastq
Path to Bowtie 2 specified as: bowtie2
Output format is BAM (default)
Alignments will be written out in BAM format. Samtools found here: '/usr/local/apps/samtools/1.3.1/bin/samtools'
Reference genome folder provided is XandY/ (absolute path is '/spin1/users/user/bismark_test/XandY/)'
FastQ format assumed (by default)
Files to be analysed:
test_data.fastq
Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored (i.e. not performed!)
...
Full output from the run command
[user@cn3144 bismark_test]$ bismark_methylation_extractor test_data_bismark_bt2.bam
*** Bismark methylation extractor version v0.16.0 ***
Trying to determine the type of mapping from the SAM header line of file test_data_bismark_bt2.bam
Treating file(s) as single-end data (as extracted from @PG line)
Setting core usage to single-threaded (default). Consider using --multicore <int> to speed up the extraction process.
Summarising Bismark methylation extractor parameters:
===============================================================
...

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Full output from the extract command
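The extractor output above suggests --multicore <int> to speed up extraction. The following is a minimal sketch of how the command could be assembled from the cores granted by Slurm. The helper name build_extractor_cmd is hypothetical, and the helper only prints the command rather than running it. Passing the whole allocation to --multicore is an assumption; the extractor may start additional helper processes, so a smaller value may be more appropriate.

```shell
#!/bin/bash
# Sketch: assemble the extractor command, adding --multicore only when
# more than one core is available. build_extractor_cmd is a hypothetical
# helper; it prints the command instead of executing it.

build_extractor_cmd() {
    local cores=$1 bam=$2
    local cmd=(bismark_methylation_extractor)
    if [ "$cores" -gt 1 ]; then
        cmd+=(--multicore "$cores")
    fi
    cmd+=("$bam")
    # Join the array with single spaces and print it on one line
    printf '%s\n' "${cmd[*]}"
}

# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# default to 1 (single-threaded) outside such an allocation.
build_extractor_cmd "${SLURM_CPUS_PER_TASK:-1}" test_data_bismark_bt2.bam
```

To actually run the command inside a job, the printed line can be reviewed first and then executed, for example after requesting cores with sinteractive --cpus-per-task.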
Create a batch input file (e.g. bismark.sh). For example:
#!/bin/bash
# this file is called bismark.sh
set -e
module load bismark
cd /data/$USER/bismark_test
bismark_genome_preparation XandY
bismark XandY test_data.fastq
bismark_methylation_extractor test_data_bismark_bt2.bam
Submit this job using the Slurm sbatch command.
sbatch bismark.sh
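The batch script hardcodes the BAM file name passed to the extractor. When adapting it to other single-end reads, the name can be derived from the FASTQ file instead. This is a sketch under the assumption that the naming seen in the sample session (test_data.fastq gives test_data_bismark_bt2.bam, written to the working directory) holds for other inputs. The function name bismark_bam_name is hypothetical, and only the .fastq suffix is handled.

```shell
#!/bin/bash
# Sketch: derive the single-end BAM name from the FASTQ name, following
# the pattern in the sample session:
#   test_data.fastq -> test_data_bismark_bt2.bam
# Assumption: other suffixes (.fq, .gz) are NOT handled here.

bismark_bam_name() {
    local reads
    reads=$(basename "$1")                       # drop any leading directories
    printf '%s_bismark_bt2.bam\n' "${reads%.fastq}"
}

reads=test_data.fastq
bam=$(bismark_bam_name "$reads")

# Print the two commands the batch script would run for this input
echo "bismark XandY $reads"
echo "bismark_methylation_extractor $bam"
```

In a real batch script the two echo lines would be replaced by the commands themselves, after checking that the derived name matches the file Bismark actually wrote.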
Sample swarm command file
# --------file myjobs.swarm----------
bismark directory1 test_data.fastq
bismark directory2 test_data.fastq
bismark directory3 test_data.fastq
....
bismark directoryN test_data.fastq
# -----------------------------------
Submit this set of runs to the batch system by typing:
[user@biowulf ~]$ swarm --module bismark -f myjobs.swarm
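For more than a handful of genome directories, the swarm command file can be generated rather than typed by hand. The sketch below writes one bismark line per directory, in the same form as the sample file. The function name make_swarm is hypothetical, and directory1 to directory3 are the placeholder names from the sample file, not real paths.

```shell
#!/bin/bash
# Sketch: write one "bismark <genome_dir> <reads>" line per genome
# directory into a swarm command file. make_swarm is a hypothetical helper.
set -euo pipefail

make_swarm() {
    local reads=$1 outfile=$2
    shift 2                      # remaining arguments are genome directories
    : > "$outfile"               # create or truncate the swarm file
    local genome_dir
    for genome_dir in "$@"; do
        printf 'bismark %s %s\n' "$genome_dir" "$reads" >> "$outfile"
    done
}

# Placeholder directory names taken from the sample swarm file
make_swarm test_data.fastq myjobs.swarm directory1 directory2 directory3
cat myjobs.swarm
```

The resulting myjobs.swarm can then be submitted with the swarm command shown above.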