High-Performance Computing at the NIH
Breakdancer on Biowulf & Helix

The BreakDancer package provides genome-wide detection of structural variants from next-generation paired-end sequencing reads.

The breakdancer-max program predicts five types of structural variants (insertions, deletions, inversions, and inter- and intra-chromosomal translocations) from next-generation short paired-end sequencing reads, using read pairs that map with unexpected separation distances or orientations.

Running on Helix

$ module load breakdancer
$ cd /data/$USER/dir
$ bam2cfg.pl bam_file1 bam_file2 %BreakdancerOptions% > config_file.cfg
$ breakdancer-max config_file.cfg > file.ctx
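
Once file.ctx exists, a quick tally of calls per variant type can serve as a sanity check. The sketch below is not part of BreakDancer. The helper name count_sv_types is hypothetical, and it assumes the output is tab-delimited, that header lines start with "#", and that the variant type (for example DEL, INS, INV, ITX, CTX) sits in column 7. Check these assumptions against the BreakDancer documentation for your version before relying on the counts.

```shell
#!/bin/bash
# Hypothetical helper: tally breakdancer-max calls by variant type.
# Assumptions (verify for your version): tab-delimited output,
# header lines begin with '#', variant type is in column 7.
count_sv_types() {
    local ctx="$1"
    awk -F'\t' '
        /^#/ { next }            # skip header/comment lines
        NF >= 7 { n[$7]++ }      # count each value seen in column 7
        END { for (t in n) print t "\t" n[t] }
    ' "$ctx" | sort              # sort for a stable, readable order
}
```

For example, count_sv_types file.ctx prints one line per variant type followed by its count.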

Running a single batch job on Biowulf

1. Create a script file containing lines similar to the following:

#!/bin/bash


module load breakdancer
cd /data/$USER/dir
bam2cfg.pl bam_file1 bam_file2 %BreakdancerOptions% > config_file.cfg
breakdancer-max config_file.cfg > file.ctx

2. Submit the script on Biowulf:

$ sbatch jobscript

If more memory is required (the default is 4 GB), specify --mem=Mg, where M is the number of gigabytes, for example --mem=10g:

$ sbatch --mem=10g jobscript
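
The memory request can also be recorded in the script itself with a Slurm #SBATCH directive, so it does not have to be repeated on the command line. The following is a minimal sketch that writes such a jobscript. The 10g value, the set -e line, and the omission of the BreakDancer options are illustrative choices, and bam_file1 and bam_file2 remain placeholders as above.

```shell
#!/bin/bash
# Sketch: write a jobscript that carries its own memory request.
# The quoted 'EOF' keeps $USER unexpanded until the job actually runs.
cat > jobscript <<'EOF'
#!/bin/bash
#SBATCH --mem=10g

# Stop at the first failing command so breakdancer-max
# does not run against a missing or partial config file.
set -e

module load breakdancer
cd /data/$USER/dir
bam2cfg.pl bam_file1 bam_file2 > config_file.cfg
breakdancer-max config_file.cfg > file.ctx
EOF
```

The resulting file can then be submitted with a plain "sbatch jobscript". A --mem value given on the sbatch command line takes precedence over the directive in the script.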

Running a swarm of jobs on Biowulf

Set up a swarm command file:

  cd /data/$USER/dir1; breakdancer-max config_file.cfg > file.ctx
  cd /data/$USER/dir2; breakdancer-max config_file.cfg > file.ctx
  cd /data/$USER/dir3; breakdancer-max config_file.cfg > file.ctx
	[......]
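
When there are many directories, the swarm command file can be generated rather than typed by hand. The sketch below assumes a hypothetical layout in which each directory matching dir* under a base directory already contains its own config_file.cfg. The function name make_swarmfile is illustrative and is not part of swarm or BreakDancer.

```shell
#!/bin/bash
# Hypothetical helper: write one breakdancer-max line per data directory.
# Assumes directory names contain no whitespace.
make_swarmfile() {
    local base="$1" out="$2"
    : > "$out"                                   # start with an empty swarm file
    for d in "$base"/dir*/; do
        [ -d "$d" ] || continue                  # glob matched nothing
        d="${d%/}"                               # drop the trailing slash
        [ -f "$d/config_file.cfg" ] || continue  # skip directories without a config
        printf 'cd %s; breakdancer-max config_file.cfg > file.ctx\n' "$d" >> "$out"
    done
}
```

For example, make_swarmfile "/data/$USER" swarmfile writes lines of the same form as those shown above. Directories that lack a config_file.cfg are skipped.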
  

Submit the swarm file. The -f option specifies the swarm file name, and --module breakdancer loads the breakdancer module for each command line in the file:

  $ swarm -f swarmfile --module breakdancer

If each command needs more memory, use the -g option. The example below allocates 10 GB for each command:

  $ swarm -f swarmfile -g 10 --module breakdancer

For more information regarding running swarm, see swarm.html

Running an interactive job on Biowulf

It may be useful for debugging purposes to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead allocate an interactive node as described below, and run the interactive job there.

biowulf$ sinteractive 
salloc.exe: Granted job allocation 16535

cn999$ module load breakdancer
cn999$ cd /data/$USER/dir
cn999$ breakdancer-max config_file.cfg > file.ctx
[...etc...]

cn999$ exit
exit

biowulf$

Make sure to exit the job once finished.

If more memory is needed, use --mem. For example:

biowulf$ sinteractive --mem=8g

Documentation

gmt.genome.wustl.edu/packages/breakdancer/documentation.html