High-Performance Computing at the NIH
mapDamage on Biowulf & Helix

Description

mapDamage profiles DNA damage patterns in next-generation sequencing analyses of ancient DNA samples.

Multiple versions of mapDamage may be installed. The easiest way to select a version is with environment modules. To list the available modules, type

module avail mapDamage 

To select a module, use

module load mapDamage/[version]

where [version] is the version of choice.

Environment variables set

Dependencies

Dependencies are automatically loaded by the environment module.

References

Documentation

On Helix

Set up the environment

helix$ module load mapDamage
[...snip...]
[+] Loading R 3.3.0 on helix.nih.gov
[+] Loading mapDamage 2.0.6

and run mapDamage

helix$ mapDamage -i alignments.bam -r reference.fa

Note that Helix should only be used for small data sets.

Batch job on Biowulf

Create a batch script similar to the following example:

#! /bin/bash
# this file is mapDamage.sh

module load mapDamage || exit 1
mapDamage -i alignments.bam -r reference.fa

Submit to the queue with sbatch:

biowulf$ sbatch mapDamage.sh
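
A batch job that starts with a missing input only fails after it has waited in the queue, so it can help to check the inputs first. The following is a minimal sketch, not part of the original example: the file name mapDamage_checked.sh and the helper inputs_ok are hypothetical, and the input names are the placeholders used above.

```shell
#!/bin/bash
# this file is mapDamage_checked.sh (hypothetical name)

# inputs_ok is a hypothetical helper: it returns 0 only if every
# file passed to it exists and is non-empty.
inputs_ok() {
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
}

# Load the module and run mapDamage only when both inputs are present.
if inputs_ok alignments.bam reference.fa; then
    module load mapDamage || exit 1
    mapDamage -i alignments.bam -r reference.fa
fi
```

As written, the script ends quietly when an input is missing. Adding an else branch that calls exit 1 would make the scheduler record such a job as failed.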

Swarm of jobs on Biowulf

Create a swarm command file similar to the following example:

# this file is mapDamage.swarm
mapDamage -i alignments_sample1.bam -r reference.fa
mapDamage -i alignments_sample2.bam -r reference.fa
mapDamage -i alignments_sample3.bam -r reference.fa

Submit to the queue with swarm:

biowulf$ swarm -f mapDamage.swarm --module mapDamage
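
With many samples, the command file can be generated rather than typed by hand. The following is a minimal sketch, assuming every sample is aligned against the same reference.fa; the function name make_swarm_file is hypothetical and is not part of swarm or mapDamage.

```shell
#!/bin/bash
# Sketch: write one mapDamage command per BAM file into a swarm command file.
# Assumption: all samples share a single reference FASTA.

# make_swarm_file REFERENCE OUTFILE BAM [BAM ...]
make_swarm_file() {
    local reference=$1 outfile=$2
    shift 2
    : > "$outfile"    # create the file, or empty it if it already exists
    local bam
    for bam in "$@"; do
        printf 'mapDamage -i %s -r %s\n' "$bam" "$reference" >> "$outfile"
    done
}

# Reproduces the three command lines of the example above.
make_swarm_file reference.fa mapDamage.swarm \
    alignments_sample1.bam alignments_sample2.bam alignments_sample3.bam
```

With real data, the list of BAM files could come from a shell glob instead, for example make_swarm_file reference.fa mapDamage.swarm alignments_*.bam.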

Interactive job on Biowulf

Allocate an interactive session with sinteractive, then load the module and run mapDamage as described above:

biowulf$ sinteractive 
node$ module load mapDamage
node$ mapDamage -i alignments.bam -r reference.fa
node$ exit
biowulf$