Biowulf High Performance Computing at the NIH
Long Ranger on Biowulf

Long Ranger is a set of analysis pipelines that processes GemCode sequencing output to align reads and call and phase SNPs, indels, and structural variants. There are four pipelines:

These pipelines combine GemCode-specific algorithms with widely used components such as BWA, Freebayes, and GATK. Output is delivered in standard BAM, VCF, and BEDPE formats that are augmented with long range information.

Long Ranger supports the following GemCode sequencing workflows:

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the shell prompt):

[user@biowulf]$ sinteractive --cpus-per-task=16 --gres=lscratch:50
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ cd /lscratch/$SLURM_JOBID

[user@cn3144 ~]$ module load longranger/2.2.2

[user@cn3144 ~]$ tar xvzf /usr/local/apps/longranger/tiny-bcl.tar.gz

[user@cn3144 ~]$ longranger testrun --id=tiny
longranger testrun (2.2.2)
Copyright (c) 2018 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------

Running Long Ranger in test mode...

Martian Runtime - '2.2.2-2.3.2'
Serving UI at http://cn0860:39652?auth=Hele1mGjtYv5V3s0b24NT_bqOMfi-sOlCtiXTBz4_mM

Running preflight checks (please wait)...

[...]
Outputs:
- Run summary:  /lscratch/46116226/tiny/outs/summary.csv
- BAM barcoded: /lscratch/46116226/tiny/outs/possorted_bam.bam
- BAM index:    /lscratch/46116226/tiny/outs/possorted_bam.bam.bai
- VCF:          /lscratch/46116226/tiny/outs/variants.vcf.gz
- VCF index:    /lscratch/46116226/tiny/outs/variants.vcf.gz.tbi


Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!

Saving pipestance info to tiny/tiny.mri.tgz

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
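
Before leaving a session like the one above, it can be useful to confirm that the pipeline actually wrote the files it reported. The following is a minimal sketch, not part of Long Ranger: check_outs is a hypothetical helper name, and the file list is simply copied from the "Outputs:" section of the test-run log above. Other pipelines may write a different set of files.

```shell
#!/bin/bash
# Hypothetical helper (not a Long Ranger command): verify that a run
# directory contains the output files listed in the test-run log above.
check_outs() {
    local outs="$1/outs"
    local missing=0
    local f
    for f in summary.csv possorted_bam.bam possorted_bam.bam.bai \
             variants.vcf.gz variants.vcf.gz.tbi; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$outs/$f" ]; then
            echo "missing or empty: $outs/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In the session above this would be called as check_outs /lscratch/$SLURM_JOBID/tiny; a non-zero exit status means at least one expected file is absent or empty.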

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. lr.sh). For example:

#!/bin/bash
# this file is called lr.sh

set -e
module load longranger/2.2.2

cd /data/$USER
mkdir -p longranger && cd longranger
tar xzf /fdb/longranger/tiny-bcl-2.0.0.tar.gz

longranger mkfastq --run=./tiny-bcl-2.0.0/  \
      --samplesheet=/fdb/longranger/tiny-bcl-samplesheet-2.1.0.csv
      
# SLURM_MEM_PER_NODE is reported in MB, while --localmem expects GB
longranger wgs --id=sample345 --reference=/fdb/longranger/refdata-hg19-2.1.0 --sex=f \
      --localcores=$SLURM_CPUS_PER_TASK --localmem=$((SLURM_MEM_PER_NODE / 1024)) \
      --fastqs=./H77WWBBXX/outs/fastq_path --vcmode=freebayes

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=16 --mem=100g --time=8:00:00 lr.sh
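
Slurm reports SLURM_MEM_PER_NODE in megabytes, whereas --localmem is given in gigabytes, so the value has to be converted before it is handed to Long Ranger. The sketch below shows one way to do that while leaving some headroom for the rest of the job. The function name localmem_gb and the default 4 GB reserve are assumptions made for illustration; they are not Long Ranger or Biowulf recommendations.

```shell
#!/bin/bash
# Hypothetical helper: turn Slurm's per-node memory (in MB) into a whole
# number of GB suitable for --localmem, minus an optional reserve.
localmem_gb() {
    local mem_mb="$1"
    local reserve_gb="${2:-4}"   # assumed headroom; adjust or pass 0
    local gb=$(( mem_mb / 1024 - reserve_gb ))
    # never report less than 1 GB
    if [ "$gb" -lt 1 ]; then
        gb=1
    fi
    echo "$gb"
}
```

Inside a batch script this could be used as --localmem=$(localmem_gb "$SLURM_MEM_PER_NODE"). With the --mem=100g request shown above, the default reserve would yield 96.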