High-Performance Computing at the NIH
bam-matcher

A simple tool for determining whether two BAM files contain reads sequenced from the same sample or patient by counting genotype matches at common SNPs. bam-matcher is most useful for comparing whole-genome-sequencing (WGS), whole-exome-sequencing (WES) and RNA-sequencing (RNA-seq) human data, but it can also be customised to compare panel data or non-human data.

References:

There are multiple versions of bam-matcher available. An easy way to select a version is to use modules. To see the modules available, type

module avail bam-matcher

To select a module, type

module load bam-matcher/[ver]

where [ver] is the version of choice.

Environment variables set:

Important Note

Example

$ module load bam-matcher
$ cp -Rp $BAM_MATCHER_HOME/test_data/* .
$ bam-matcher.py -B1 sample1.bam -B2 sample2.bam -NC --scratch-dir $(pwd)/$RANDOM --caller freebayes
...
________________________________________

Positions with same genotype:   139
     breakdown:    hom: 57
                   het: 82
________________________________________

Positions with diff genotype:   77
     breakdown:
                       BAM 1
               | het  | hom  | subset
        -------+------+------+-------
         het   |    1 |    0 |   21 |
        -------+------+------+-------
BAM 2    hom   |    0 |   12 |   -  |
        -------+------+------+-------
         subset|   43 |   -  |   -  |
________________________________________

Total sites compared: 216
Fraction of common: 0.643519 (139/216)
________________________________________
CONCLUSION:
LIKELY FROM DIFFERENT SOURCES
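
When many pairs are compared, it can help to pull the match fraction out of each report rather than reading it by eye. The following is a minimal sketch, assuming the report shown above has been captured to a file (for instance by redirecting standard output) and that it contains a "Fraction of common:" line in the format shown. The function name fraction_of_common and the file name report.txt are hypothetical, not part of bam-matcher.

```shell
#!/bin/bash
# Hypothetical helper: print the numeric value from the "Fraction of common"
# line of a saved bam-matcher report. Assumes a line of the form:
#   Fraction of common: 0.643519 (139/216)
fraction_of_common() {
    awk -F': ' '/^Fraction of common:/ {
        split($2, parts, " ")   # parts[1] is the fraction, parts[2] the counts
        print parts[1]
        exit
    }' "$1"
}

# Example usage (report.txt is an assumed file name):
#   fraction_of_common report.txt
```

The helper prints nothing if the line is absent, so an empty result signals a report that did not complete or has a different layout.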

On Helix

Sample session:

$ module load bam-matcher
$ bam-matcher.py -B1 sample1.bam -B2 sample2.bam

Batch job on Biowulf

Create a batch input file (e.g. bam-matcher.sh). For example:

#!/bin/bash
module load bam-matcher
bam-matcher.py -B1 sample1.bam -B2 sample2.bam

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=1 bam-matcher.sh
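
The batch file above hard-codes its two inputs. As a sketch of one possible variation, the script below takes the two BAM files as arguments, relying on the fact that sbatch passes any arguments after the script name through to the script. The function name run_pair and the exit codes are illustrative assumptions; only the module load and bam-matcher.py lines come from the example above.

```shell
#!/bin/bash
# bam-matcher.sh (sketch): compare the two BAM files given as arguments.
# Hypothetical usage:
#   sbatch --cpus-per-task=1 bam-matcher.sh sample1.bam sample2.bam

run_pair() {
    local bam
    # Require exactly two arguments.
    if [ "$#" -ne 2 ]; then
        echo "usage: bam-matcher.sh <first.bam> <second.bam>" >&2
        return 2
    fi
    # Refuse to start if either input is missing or unreadable.
    for bam in "$1" "$2"; do
        if [ ! -r "$bam" ]; then
            echo "cannot read $bam" >&2
            return 3
        fi
    done
    module load bam-matcher
    bam-matcher.py -B1 "$1" -B2 "$2"
}

# Run only when arguments were supplied, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    run_pair "$@"
fi
```

Checking the inputs before loading the module means a mistyped file name fails immediately with a clear message instead of partway through the job.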

Swarm of Jobs on Biowulf

Create a swarmfile (e.g. bam-matcher.swarm). For example:

bam-matcher.py -B1 sample1.bam -B2 sample2.bam
bam-matcher.py -B1 sample3.bam -B2 sample4.bam
bam-matcher.py -B1 sample5.bam -B2 sample6.bam
bam-matcher.py -B1 sample7.bam -B2 sample8.bam

Submit this job using the swarm command.

swarm -f bam-matcher.swarm 
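
For more than a handful of comparisons, the swarmfile can be generated rather than typed by hand. The sketch below assumes the BAM files are listed as consecutive pairs; the function name make_swarm_lines is hypothetical, and the generated lines use only the -B1 and -B2 options shown above.

```shell
#!/bin/bash
# Sketch: print one bam-matcher.py command per pair of BAM files.
# Arguments are consumed two at a time: first BAM, then second BAM.
# A trailing unpaired argument is ignored.
make_swarm_lines() {
    while [ "$#" -ge 2 ]; do
        printf 'bam-matcher.py -B1 %s -B2 %s\n' "$1" "$2"
        shift 2
    done
}

# Example: reproduces the first two lines of the swarmfile shown above.
make_swarm_lines sample1.bam sample2.bam sample3.bam sample4.bam > bam-matcher.swarm
```

File names are written unquoted, so this sketch is only suitable for paths without spaces or shell metacharacters.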

Interactive job on Biowulf

Once an interactive session is started, the steps are identical to those on Helix (above).

Documentation