Biowulf High Performance Computing at the NIH
fade on Biowulf

FADE (Fragmentase Artifact Detection and Elimination) is a method for identifying and removing enzymatic fragmentation artifacts.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session:

[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=4G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load fade
[user@cn3144 ~]$ mkdir -p /data/$USER/fade; cd /data/$USER/fade
[user@cn3144 fade]$ fade -h
Fragmentase Artifact Detection and Elimination
usage: ./fade [subcommand]
	annotate: marks artifact reads in bam tags (must be done first)
	out: eliminates artifact from reads(may require queryname sorted bam)
	stats: reports extended information about artifact reads
	stats-clip: reports extended information about all soft-clipped reads
	extract: extracts artifacts into a mapped bam
-h --help This help information.

[user@cn3144 fade]$ fade annotate
Fragmentase Artifact Detection and Elimination
annotate: performs re-alignment of soft-clips and annotates bam records with bitflag (rs) and realignment tags (am)
usage: ./fade annotate [BAM/SAM input] [Indexed fasta reference]

-t     --threads extra threads for parsing the bam file
    --min-length Minimum number of bases for a soft-clip to be considered for artifact detection
-w --window-size Number of bases considered outside of read or mate region for re-alignment
-b         --bam output bam
-u        --ubam output uncompressed bam
-h        --help This help information.

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
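
The help text above notes that annotate must be run first and that out may require a queryname-sorted BAM. A minimal sketch of how those steps could be chained for one sample follows. The function name fade_clean and the output file names are hypothetical. The sketch assumes that fade out accepts -b and writes to standard output the way fade annotate does, and that samtools is available on the path; confirm the options with fade out -h before relying on it.

```shell
#!/bin/bash
# Sketch only: annotate -> queryname sort -> out, for a single BAM file.
# ASSUMPTION: `fade out` takes -b and writes BAM to stdout like `fade annotate`
# (check `fade out -h`). ASSUMPTION: samtools is on PATH.

fade_clean() {
    local bam="$1" ref="$2"
    local base="${bam%.bam}"        # strip the .bam suffix for output names

    # Step 1: annotation must be done first (per the fade help text)
    fade annotate -b "$bam" "$ref" > "${base}.anno.bam" || return 1

    # Step 2: sort by read name, since `out` may require a queryname-sorted BAM
    samtools sort -n -o "${base}.anno.qsort.bam" "${base}.anno.bam" || return 1

    # Step 3: eliminate the annotated artifacts
    fade out -b "${base}.anno.qsort.bam" > "${base}.filtered.bam" || return 1

    # Report the final output file name
    echo "${base}.filtered.bam"
}

# Example call, after loading the fade and samtools modules:
#   fade_clean sam1.bam ref.fa
```

Each step stops the function on failure, so a failed annotation is not passed on to the later steps.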

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. fade.sh). For example:

#!/bin/bash
#SBATCH --job-name=S1_fade
#SBATCH --output=S1_fade.out
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=4g
#SBATCH --time=2:00:00
#SBATCH --partition=norm

set -e
module load fade
cd /data/$USER/fade
fade annotate -t 8 -b sam1.bam ref.fa > sam1.anno.bam

Submit the job:

sbatch fade.sh
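
The batch script above hard-codes -t 8 to match --cpus-per-task=8. One way to keep the two in step is to read the thread count from the SLURM_CPUS_PER_TASK environment variable, which Slurm sets when --cpus-per-task is given. The following is a sketch only; the helper name annotate_alloc is hypothetical, and it falls back to one thread when the variable is unset.

```shell
#!/bin/bash
# Sketch only: take the -t value from the Slurm allocation instead of
# repeating the number in the command line.

annotate_alloc() {
    local bam="$1" ref="$2"
    # SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
    # default to 1 thread outside of such an allocation.
    local threads="${SLURM_CPUS_PER_TASK:-1}"
    fade annotate -t "$threads" -b "$bam" "$ref"
}

# In fade.sh, after `module load fade`:
#   annotate_alloc sam1.bam ref.fa > sam1.anno.bam
```

With this in place, changing the #SBATCH --cpus-per-task line is enough to change the thread count passed to fade.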

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. fade.swarm). For example:

fade annotate -t 8 -b sam1.bam ref.fa > sam1.anno.bam
fade annotate -t 8 -b sam2.bam ref.fa > sam2.anno.bam
fade annotate -t 8 -b sam3.bam ref.fa > sam3.anno.bam
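
For more than a few samples, the swarmfile can be generated rather than typed by hand. The following sketch writes one fade annotate line per BAM file in the current directory, in the same form as the lines above. The function name make_fade_swarm is hypothetical, and the sketch assumes file names without spaces and skips any existing *.anno.bam outputs.

```shell
#!/bin/bash
# Sketch only: build a swarmfile with one `fade annotate` command per BAM.
# ASSUMPTION: BAM file names contain no whitespace.

make_fade_swarm() {
    local ref="$1"                  # indexed FASTA reference
    local out="${2:-fade.swarm}"    # swarmfile to write
    local threads="${3:-8}"         # value passed to -t on each line
    local bam

    : > "$out"                      # start with an empty swarmfile
    for bam in *.bam; do
        [ -e "$bam" ] || continue   # no BAM files matched the glob
        case "$bam" in
            *.anno.bam) continue ;; # skip outputs of an earlier run
        esac
        printf 'fade annotate -t %s -b %s %s > %s\n' \
            "$threads" "$bam" "$ref" "${bam%.bam}.anno.bam" >> "$out"
    done
}

# Example call:
#   make_fade_swarm ref.fa
```

Reviewing the generated file (for example with cat fade.swarm) before submission is a cheap way to catch unexpected inputs.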

Submit this job using the swarm command.

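swarm -f fade.swarm -g 8 -t 8 --module fade

where
-g # Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t # Number of threads/CPUs required for each process (1 line in the swarm command file); set to 8 here to match the -t 8 passed to fade annotate
--module fade Loads the fade module for each subjob in the swarm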