Additional information for users of the nci-dragen partition

As of January 2024, the nci-dragen partition includes one dragen server. It has been funded by NCI/CBIIT until the end of FY 2027.

Notes:

Efficient use of licenses

The Dragen license is metered. If you do not have access to the nci_dragen_turbo QOS, please do not do more than a few test runs without contacting staff@hpc.nih.gov. License usage can be optimized by creating all needed variant calls for a sample in a single run. For example, when running with a single bam file, i.e.

--bam-input /staging/${ID}/xxxxx.bam

SNV, CNV, and SV can all be called concurrently in a single run by enabling all three caller flags:

Flag                            Calls
--enable-variant-caller true    For Germline SNV
--enable-cnv true               For Germline CNV
--enable-sv true                For Germline SV

Regardless of how many of these three flags are used in a single run, the license will only be charged once.
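
The single-run approach described above can be sketched as follows. This is only an illustration, not a tested pipeline: ID, REF, and OUT are placeholder variables, the bam name is the same placeholder used above, and the output options are borrowed from the batch example on this page. The individual callers may need further options that are not shown here, so check the DRAGEN documentation before running. The sketch only prints the command so it can be reviewed first.

```bash
#!/bin/bash
# Sketch: one dragen invocation with the SNV, CNV, and SV callers enabled together.
# ID, REF, and OUT are placeholders (assumptions); adjust them to your data.
ID=${ID:-sample01}                      # hypothetical sample ID
REF=${REF:-/staging/human/hg38}         # reference directory used elsewhere on this page
OUT=${OUT:-/staging/${ID}-germline}     # hypothetical output directory

dragen_cmd=(
    dragen
    -r "${REF}"
    --bam-input "/staging/${ID}/xxxxx.bam"
    --output-dir "${OUT}"
    --output-file-prefix "${ID}"
    --enable-variant-caller true        # germline SNV
    --enable-cnv true                   # germline CNV
    --enable-sv true                    # germline SV
)

# Print the assembled command for review instead of running it.
printf '%q ' "${dragen_cmd[@]}"
echo

# To actually run it inside a batch job, uncomment the next line:
# "${dragen_cmd[@]}"
```

Because all three callers are enabled in the same invocation, this corresponds to a single license charge for the sample rather than three separate runs.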

The same applies to somatic variant calling (i.e. a run that includes a tumor bam with --tumor-bam-input). However, tumor-only and somatic variant calls cannot be combined into a single run. Therefore, in effect, a full tumor-normal run will charge the license for 2 samples (tumor/normal + germline).

Running a batch job

Create a batch script similar to the following:

#! /bin/bash

# set up paths etc
source /etc/profile.d/edico.sh

RUNPATH=/fdb/app_testdata/fastq/Homo_sapiens
RUNFOLDER=SRR24373805
ANALYSIS="/staging/${RUNFOLDER}-$(date +%s)"
METRICS=${ANALYSIS}/Results/MetricsOutput.tsv
RESULTPATH=${PWD}/${RUNFOLDER}-dragen-results

# clean up after run
trap 'rm -rf "/staging/${RUNFOLDER}" "${ANALYSIS}"' EXIT

cp -r "${RUNPATH}/${RUNFOLDER}" /staging || exit 100
mkdir -p "${ANALYSIS}" || exit 101

# load the reference for your analysis. For DNA, RNA, CNV, and HLA analysis,
# use ONE of the following (alternatives are commented out)
dragen -l -r /staging/human/hg38
# dragen -l -r /staging/human/hg19
# dragen -l -r /staging/human/GRCh37d5
# dragen -l -r /staging/human/chm13_v2


#for Methylation analysis, use
#dragen -l -r /staging/human/hg38/hg38.fa.single_pass.Methylation

# Run an RNA pipeline with dragen
dragen -r /staging/human/hg38 \
    -1 /staging/${RUNFOLDER}/SRR24373805_1.fastq.gz \
    -2 /staging/${RUNFOLDER}/SRR24373805_2.fastq.gz \
    -a /staging/human/hg38/genes.gtf \
    --output-dir ${ANALYSIS} \
    --output-file-prefix RNA_test \
    --enable-rna true \
    --enable-rna-gene-fusion true \
    --RGID rg \
    --RGSM sm \
    --enable-rna-quantification=true

# copy results back to working directory
cp -r "${ANALYSIS}" "${RESULTPATH}" || exit 103

Then submit it with:

[user@biowulf]$ sbatch --mem=0 --cpus-per-task=64 --partition nci-dragen --qos=nci_dragen_turbo dragen.sh
12345678

Note that the $ANALYSIS folder is larger than the input, with Logs_Intermediates taking up most of the space. The script above could be modified to only transfer a subset of files back to shared storage.
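
One way to transfer only a subset is sketched below. It is a minimal sketch, assuming that Logs_Intermediates sits directly under $ANALYSIS and that everything else should be kept. The function name copy_results is hypothetical and not part of dragen or the cluster setup.

```bash
#!/bin/bash
# Sketch: copy everything from the analysis folder except Logs_Intermediates.
# copy_results is a hypothetical helper; it assumes Logs_Intermediates is a
# top-level entry of the source directory.
copy_results() {
    local src=$1 dst=$2 entry
    mkdir -p "${dst}" || return 1
    # Note: the glob does not match hidden (dot) files, so those are not copied.
    for entry in "${src}"/*; do
        [ -e "${entry}" ] || continue                        # empty directory: nothing to copy
        [ "${entry##*/}" = "Logs_Intermediates" ] && continue # skip the bulky intermediates
        cp -r "${entry}" "${dst}/" || return 1
    done
}
```

In the batch script, the final copy step could then read copy_results "${ANALYSIS}" "${RESULTPATH}" || exit 103 in place of the cp -r line.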


Please send questions and comments to staff@hpc.nih.gov