High-Performance Computing at the NIH
crossmap on Biowulf

CrossMap converts genome coordinates between different assemblies (e.g. mm9->mm10). It supports SAM, BAM, BED, GTF, GFF, Wiggle/BigWig, and VCF files.

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load crossmap

[user@cn3144 ~]$ td=/usr/local/apps/crossmap/TEST_DATA

[user@cn3144 ~]$ crossmap bed $td/hg18ToHg19.over.chain $td/test_input > test_output

[user@cn3144 ~]$ head -n2 test_output
chr1    142614848       142617697       ->      chr1    143903503       143906352
chr1    142617697       142623312       ->      chr1    143906355       143911970

[user@cn3144 ~]$ diff --ignore-all-space expected_output test_output

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
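
The sample output above pairs each original interval with its converted interval, separated by `->`. If only the converted coordinates are needed, they can be split off with awk. This is a minimal sketch, assuming mapped lines always contain a standalone `->` field as in the sample output and that lines without it should be dropped; the function name `mapped_to_bed` is hypothetical and not part of CrossMap.

```shell
# Hypothetical helper (not part of CrossMap): print only the converted
# interval from lines of the form "old interval -> new interval".
mapped_to_bed() {
    awk 'BEGIN { OFS = "\t" }
    {
        # Locate the "->" separator; everything after it is the new interval.
        for (i = 1; i < NF; i++) {
            if ($i == "->") {
                out = $(i + 1)
                for (j = i + 2; j <= NF; j++) out = out OFS $j
                print out
                break
            }
        }
        # Lines without "->" are skipped (assumed to be unmapped entries).
    }' "$@"
}

# Example use on the file written in the session above:
# mapped_to_bed test_output > test_output_converted.bed
```

The helper reads the files named on its command line, or standard input when none are given, so it can also be placed at the end of a pipe.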

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. crossmap.sh). For example:

#!/bin/bash
function fail() {
    echo "$@" >&2
    exit 1
}

module load crossmap || fail "could not load crossmap module"
if [[ ! -f hg19ToHg38.over.chain.gz ]]; then
    wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz
fi
crossmap bam hg19ToHg38.over.chain.gz hg19_example.bam out

Submit this job using the Slurm sbatch command.

sbatch crossmap.sh

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. crossmap.swarm). For example:

crossmap bam hg19ToHg38.over.chain.gz sample1.bam sample1_hg38.bam
crossmap bam hg19ToHg38.over.chain.gz sample2.bam sample2_hg38.bam
crossmap bam hg19ToHg38.over.chain.gz sample3.bam sample3_hg38.bam

Submit this job using the swarm command.

swarm -f crossmap.swarm --module crossmap
where

--module crossmap    Loads the crossmap module for each subjob in the swarm
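
For more than a handful of samples, the swarmfile can be generated rather than typed by hand. This is a minimal sketch, assuming all input BAM files sit in one directory and that outputs follow the `<sample>_hg38.bam` naming used in the example swarmfile; the function name `make_crossmap_swarm` is hypothetical.

```shell
# Hypothetical helper: print one crossmap command per BAM file in a directory.
# Usage: make_crossmap_swarm CHAIN_FILE BAM_DIR
make_crossmap_swarm() {
    local chain="$1" dir="$2" bam name
    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue            # no BAM files matched the pattern
        name=$(basename "$bam" .bam)
        case "$name" in
            *_hg38) continue ;;              # skip outputs from an earlier run
        esac
        printf 'crossmap bam %s %s %s_hg38.bam\n' "$chain" "$bam" "$dir/$name"
    done
}

# Example use:
# make_crossmap_swarm hg19ToHg38.over.chain.gz . > crossmap.swarm
# swarm -f crossmap.swarm --module crossmap
```

Each generated line has the same form as the hand-written swarmfile entries, so the result can be submitted with the swarm command shown above.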