High-Performance Computing at the NIH
methylQA

methylQA is a quality assessment tool for methylation sequencing data from MeDIP-seq and MRE-seq experiments. It reports basic mapping statistics for next-generation sequencing data, such as the total number of reads and the number of mapped reads. It also reports CpG coverage information, such as how many CpGs were covered by an experiment and how many times each CpG was covered. methylQA can also process general ChIP-seq data, such as histone or TF ChIP-seq data, generating read density and mapping statistics.

Supporting Files

Supporting files for this program (chromosome size files, CpG bed files, MRE fragments bed files) can be found in /fdb/methylQA.

References:

There may be multiple versions of methylQA available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail methylQA

To select a module, type

module load methylQA/[ver]

where [ver] is the version of choice.

Environment variables set:

On Helix

Sample session:

$ module load methylQA
$ mkdir /data/$USER/methylQA-helix
$ cd !$
$ methylQA medip -m /fdb/methylQA/CpG.bed.gz /fdb/methylQA/hg19_full.size /usr/local/apps/methylQA/data/Brain_MeDIP.bam
* CpG bed file /fdb/methylQA/CpG.bed.gz provided, will calculate CpG stats
* Reading the CpG bed file
* Parsing the SAM/BAM file
* Processed read ends: 142273429
* Single end data
* Skipped supplementary alignments: 0
null device 
          1 
* Generating fragments size stats
null device 
          1 
* fragments total base: 10843080126
* Generating CpG stats
null device 
          1 
null device 
          1 
* Sorting extended bed
* Generating bedGraph
* Generating bigWig
* Calculating genome coverage
null device 
          1 
* Preparing report file
* Done, time used 10453 seconds.

Batch job on Biowulf

Create a batch input file (e.g. methylQA.sh). For example:

#!/bin/bash
module load methylQA
methylQA medip -m /fdb/methylQA/CpG.bed.gz /fdb/methylQA/hg19_full.size /usr/local/apps/methylQA/data/Brain_MeDIP.bam

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=1 --mem=16g methylQA.sh

Swarm of Jobs on Biowulf

Create a swarmfile following the swarm guide using the example commands on this page.
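
A swarmfile is a plain text file with one command per line. As a minimal sketch, the script below writes one methylQA command per BAM file, reusing the reference files from the examples on this page. The BAM names (sample1.bam and so on) and the swarmfile name methylQA.swarm are hypothetical placeholders, and the swarm options shown in the closing comment should be checked against the swarm guide before use.

```shell
#!/bin/bash
# Sketch: build a swarmfile with one methylQA command per BAM file.
# The BAM file names and the swarmfile name are hypothetical placeholders.
set -euo pipefail

cpg=/fdb/methylQA/CpG.bed.gz
sizes=/fdb/methylQA/hg19_full.size
swarmfile=methylQA.swarm

# Start from an empty swarmfile, then append one command line per sample.
: > "$swarmfile"
for bam in sample1.bam sample2.bam sample3.bam; do
    echo "methylQA medip -m $cpg $sizes $bam" >> "$swarmfile"
done

# Show the result so it can be reviewed before submission.
cat "$swarmfile"

# Submission on Biowulf (not run by this script), mirroring the
# 16 GB / 1 CPU request used in the batch example:
#   swarm -f methylQA.swarm -g 16 -t 1 --module methylQA
```

Generating the file with a loop keeps every command identical except for the input BAM, which avoids copy-and-paste errors when there are many samples.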

Interactive job on Biowulf

$ sinteractive --mem=16g
salloc.exe: Pending job allocation 37529438
salloc.exe: job 37529438 queued and waiting for resources
salloc.exe: job 37529438 has been allocated resources
salloc.exe: Granted job allocation 37529438
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn2480 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
$ module load methylQA
[+] Loading GSL 2.2.1 ...
[+] Loading Graphviz v2.38.0 ...
[+] Loading LAPACK 3.6.1-gcc-6.2.0 libraries...
[+] Loading gdal 2.0 ...
[+] Loading proj 4.9.2 ...
[+] Loading gcc 6.2.0 ...
[+] Loading openmpi 2.0.1 for GCC 6.2.0
[+] Loading tcl_tk 8.6.3
[+] Loading pandoc 1.15.0.6 ...
[+] Loading Zlib 1.2.8 ...
[+] Loading Bzip2 1.0.6 ...
[+] Loading pcre 8.38 ...
[+] Loading liblzma 5.2.2 ...
[+] Loading curl 7.46.0 ...
[+] Loading R 3.3.2 on cn2480
[+] Loading methylQA, version 0.1.8...
$ mkdir /data/$USER/methylQA-biowulf-interactive
$ cd !$
$ methylQA medip -m /fdb/methylQA/CpG.bed.gz /fdb/methylQA/hg19_full.size /usr/local/apps/methylQA/data/Brain_MeDIP.bam
* CpG bed file /fdb/methylQA/CpG.bed.gz provided, will calculate CpG stats
* Reading the CpG bed file
* Parsing the SAM/BAM file
* Processed read ends: 142273429
* Single end data
* Skipped supplementary alignments: 0
null device 
          1 
* Generating fragments size stats
null device 
          1 
* fragments total base: 10843080126
* Generating CpG stats
null device 
          1 
null device 
          1 
* Sorting extended bed
* Generating bedGraph
* Generating bigWig
* Calculating genome coverage
null device 
          1 
* Preparing report file
* Done, time used 1300 seconds.
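
The final line of the output reports the elapsed time, which can help when choosing a walltime for later batch jobs. The sketch below assumes the methylQA output was captured to a file (methylQA.log is a hypothetical name; here it is filled with sample lines from the session above) and converts the reported seconds to the HH:MM:SS form accepted by the sbatch --time option.

```shell
#!/bin/bash
# Sketch: turn the elapsed time reported by methylQA into HH:MM:SS.
# methylQA.log is a hypothetical file name; the sample lines written here
# are copied from the interactive session shown above.
log=methylQA.log
printf '%s\n' '* Preparing report file' '* Done, time used 1300 seconds.' > "$log"

# Extract the number of seconds from the "Done" line.
secs=$(sed -n 's/^\* Done, time used \([0-9][0-9]*\) seconds\.$/\1/p' "$log")

# Convert seconds to hours, minutes, and seconds.
walltime=$(printf '%02d:%02d:%02d' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60)))
echo "$walltime"   # prints 00:21:40
```

Run times vary with the node and the input size, as the two sessions on this page show, so it is safer to request noticeably more time than a single measured run.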

Documentation