High-Performance Computing at the NIH
Hotspot on Biowulf & Helix

Hotspot is a program for identifying regions of local enrichment of short-read sequence tags mapped to the genome, using a binomial distribution model. Regions flagged by the algorithm are called "hotspots." The algorithm uses a local background model that automatically normalizes for large regions of elevated tag levels caused by, for example, copy-number effects. Hotspot can still detect enriched regions of highly variable size, making it applicable to both broad and highly punctate signals.
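To make the binomial model concrete, here is a minimal sketch. It is not Hotspot's actual implementation. It assumes that each of the n tags in a large background window lands in a small target window with probability p = small/large, and scores the observed count k with a z-score against that expectation. The function name binomial_z and the window sizes in the usage line are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Illustrative only: z-score of an observed tag count under a binomial model.
# Arguments (all hypothetical names):
#   $1 = k, tags observed in the small window
#   $2 = n, tags observed in the surrounding large (background) window
#   $3 = small window size in bp
#   $4 = large window size in bp
binomial_z() {
    awk -v k="$1" -v n="$2" -v small="$3" -v large="$4" 'BEGIN {
        p     = small / large            # chance a background tag falls in the small window
        mu    = n * p                    # expected tag count in the small window
        sigma = sqrt(n * p * (1 - p))    # binomial standard deviation
        printf "%.2f\n", (k - mu) / sigma
    }'
}

# Example: 20 tags in a 250 bp window, 1000 tags in the surrounding 50 kb.
# About 5 tags are expected, so the z-score is strongly positive.
binomial_z 20 1000 250 50000
```

Because the expectation comes from the surrounding window rather than the whole genome, a region with uniformly elevated tag density (for example, an amplified locus) raises both k and n, which is what keeps the score from inflating there.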

Hotspot was developed by John et al. at the NIH and the University of Washington, Seattle. [Hotspot paper]

The examples on this page use the sample dataset provided with the program.

On Helix

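Use module load hotspot to load the appropriate version of hotspot, load the bedops, bedtools and R modules that the sample session uses, and then run the runhotspot script copied from the test data. Sample session: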

[helix]$ cd /data/$USER/hotspot

[helix]$ module load hotspot

[helix]$ cp $HOTSPOT_TEST_DIR/* .

[helix]$ module load bedops bedtools R

[helix]$ ./runhotspot

run_badspot
run_badspot: generating tile, small window...
run_badspot: bedmap & awk, small window...
run_badspot: generating tile, small window...
run_badspot: bedmap & awk, large window...
run_badspot: generating tile, small window...
run_badspot: bedmap & awk, small window...
[...etc...]
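The session above loads bedops, bedtools and R before running runhotspot. As a hedged sketch, a check like the following can confirm that the expected executables are on the PATH before a long run starts. The helper name missing_tools is hypothetical, and the executable names (bedmap from bedops, bedtools, Rscript from R) are assumptions about what those modules provide.

```shell
#!/bin/bash
# Hypothetical helper: print each command name that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed executables: bedmap (bedops), bedtools, Rscript (R).
missing=$(missing_tools bedmap bedtools Rscript)
if [ -n "$missing" ]; then
    echo "Not on PATH:" $missing >&2
fi
```

Placing this check just before ./runhotspot makes a forgotten module load visible immediately, rather than partway through the pipeline.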

Batch job on Biowulf

Set up a batch script along the following lines. This particular script copies the test data into the working directory and runs hotspot.

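#!/bin/bash

# Load hotspot and the other modules used in the sample session
module load hotspot bedops bedtools R

# Work in the user's data directory; stop if it cannot be entered
mkdir -p "/data/$USER/hotspot"
cd "/data/$USER/hotspot" || exit 1

# Copy the test data provided with the hotspot module
cp "$HOTSPOT_TEST_DIR"/* .

./runhotspot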

Submit this job with:

sbatch jobscript

Documentation

Hotspot website