High-Performance Computing at the NIH
PileOMeth

A (mostly) universal methylation extractor for BS-seq experiments.

PileOMeth (a temporary name, derived from its use of a PILEup to extract METHylation metrics) processes a coordinate-sorted and indexed BAM or CRAM file containing BS-seq alignments and extracts per-base methylation metrics from it. PileOMeth also requires an indexed FASTA file containing the reference genome.

Web site

PileOMeth On Helix

Test data can be found in /usr/local/apps/PileOMeth/ver/tests, where ver is the installed version (0.1.13 in the session below). A Python script in that directory contains example commands. In this example, a user copies the directory to their data directory for testing. (User input follows the $ prompt.)

[user@helix ~]$ module load PileOMeth
[+] Loading PileOMeth 0.1.13 on helix.nih.gov

[user@helix ~]$ PileOMeth --help
PileOMeth: A tool for processing bisulfite sequencing alignments.
Version: 0.1.13 (using HTSlib version 1.2.1)
Usage: PileOMeth <command> [options]

Commands:
    mbias    Determine the position-dependent methylation bias in a dataset,
             producing diagnostic SVG images.
    extract  Extract methylation metrics from an alignment file in BAM/CRAM
             format.
    mergeContext   Combine single Cytosine metrics from 'PileOMeth extract' into
             per-CpG/CHG metrics.
             
[user@helix ~]$ PileOMeth --version
0.1.13 (using HTSlib version 1.2.1)

[user@helix ~]$ cp -r /usr/local/apps/PileOMeth/0.1.13/tests /data/$USER

[user@helix ~]$ cd /data/$USER/tests

[user@helix tests]$ PileOMeth extract ct100.fa ct_aln.bam -q 2
writing to prefix:'ct_aln'
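
Because PileOMeth expects indexed inputs, a quick pre-flight check before running extract on your own data can save a failed run. The following is a minimal sketch, not part of PileOMeth: the function name check_inputs is hypothetical, and it assumes the index files follow the usual samtools naming (.fai appended to the FASTA file name, .bai or .crai appended to the alignment file name). Indexes stored under other names will not be found.

```shell
#!/bin/bash
# Hypothetical helper (not part of PileOMeth): confirm that the reference
# and alignment files have index files next to them.
# Assumes samtools-style names: ref.fa.fai, aln.bam.bai, aln.cram.crai.
check_inputs() {
    local ref="$1" aln="$2" idx

    # FASTA index, as written by 'samtools faidx'
    if [ ! -f "${ref}.fai" ]; then
        echo "missing FASTA index: ${ref}.fai" >&2
        return 1
    fi

    # Alignment index, as written by 'samtools index'
    case "$aln" in
        *.bam)  idx="${aln}.bai" ;;
        *.cram) idx="${aln}.crai" ;;
        *)      echo "unrecognized alignment file: ${aln}" >&2; return 1 ;;
    esac
    if [ ! -f "$idx" ]; then
        echo "missing alignment index: ${idx}" >&2
        return 1
    fi
    return 0
}

# Example use (runs extract only if both indexes are present):
#   check_inputs ct100.fa ct_aln.bam && PileOMeth extract ct100.fa ct_aln.bam -q 2
```

The function only tests for the presence of the index files; it does not verify that the alignment file is coordinate-sorted.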

Running a single PileOMeth job on Biowulf

Set up a batch script along the following lines:

#!/bin/bash
# file called myjob.bat

module load PileOMeth
cd /data/$USER/tests
PileOMeth extract ct100.fa ct_aln.bam -q 2

Submit this job with:

[user@biowulf ~]$ sbatch myjob.bat

Running a swarm of PileOMeth jobs on Biowulf

Set up a swarm command file containing one line for each of your PileOMeth runs. Typically, only the input file name will change from line to line, but in the example below the same input is processed with a different -q value on each line.

Sample swarm command file

# --------file myjobs.swarm----------
PileOMeth extract ct100.fa ct_aln.bam -q 2
PileOMeth extract ct100.fa ct_aln.bam -q 3
PileOMeth extract ct100.fa ct_aln.bam -q 4
....
PileOMeth extract ct100.fa ct_aln.bam -q N
-------------------------------------
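
For more than a handful of runs, the command file can be generated with a short loop instead of being typed by hand. This is a minimal sketch: the -q values 2 through 4 and the file name myjobs.swarm mirror the sample above and are illustrative choices, not recommendations.

```shell
#!/bin/bash
# Write one 'PileOMeth extract' line per -q value into a swarm command file.
# The values 2, 3, 4 and the file name are illustrative only.
swarmfile=myjobs.swarm

: > "$swarmfile"    # create the file, or empty it if it already exists
for q in 2 3 4; do
    echo "PileOMeth extract ct100.fa ct_aln.bam -q ${q}" >> "$swarmfile"
done
```

Run the script in the directory from which the swarm will be submitted, then inspect myjobs.swarm before submitting it.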

Submit this set of runs to the batch system by typing:

[user@biowulf ~]$ swarm --module PileOMeth -f myjobs.swarm

For details on using swarm see Swarm on Biowulf.

Documentation

The source code for PileOMeth can be found in this GitHub repository.