High-Performance Computing at the NIH
Long Ranger on Biowulf
Long Ranger is a set of analysis pipelines that process GemCode sequencing output to align reads and to call and phase SNPs, indels, and structural variants. Its four pipelines combine GemCode-specific algorithms with widely used components such as BWA, FreeBayes, and GATK. Output is delivered in standard BAM, VCF, and BEDPE formats, augmented with long-range information.

Long Ranger supports the following GemCode sequencing workflows:

Long Ranger is developed at 10x Genomics (see the Long Ranger website).

Note: The reference data for Long Ranger (hg19 and b37) is available in /fdb/longranger. By default, the program is set up to use the hg19 data. If you want to use the b37 data, add

export TENX_REFDATA=/fdb/longranger/refdata-b37-1.2.0

after the 'module load longranger' command.
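When scripts need to switch between the two reference sets, the choice can be wrapped in a small function. The following is only a sketch: select_refdata is a hypothetical helper (not part of Long Ranger or the module), and it assumes the two directory names shown on this page.

```shell
# Hypothetical helper: map a genome build name to its reference
# directory under /fdb/longranger (paths as listed on this page).
select_refdata() {
    case "$1" in
        hg19) echo "/fdb/longranger/refdata-hg19-1.2.0" ;;
        b37)  echo "/fdb/longranger/refdata-b37-1.2.0" ;;
        *)    echo "unknown build: $1" >&2; return 1 ;;
    esac
}

# Intended to run after 'module load longranger'.
TENX_REFDATA="$(select_refdata b37)"
export TENX_REFDATA
echo "$TENX_REFDATA"
```

An unrecognized build name makes the function return a non-zero status, so a script can stop before launching a pipeline with the wrong reference.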

 

On Helix

Long Ranger is fairly compute- and memory-intensive, so it is not appropriate for running on Helix.

Batch job on Biowulf

The following examples use the sample data that is provided with Long Ranger. This can be copied from /usr/local/apps/longranger, as is done in the following batch script.

Create a batch input file (e.g. lr.bat). For example:

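#!/bin/bash
#   this file is called lr.bat

module load longranger/2.1.3

# work in a dedicated directory under /data
cd /data/$USER
mkdir -p longranger && cd longranger || exit 1

# unpack the sample BCL data provided with Long Ranger
tar xzf /usr/local/apps/longranger/tiny-bcl-2.0.0.tar.gz

# demultiplex the BCL run folder into FASTQ files
longranger mkfastq --run=./tiny-bcl-2.0.0/ \
      --samplesheet=/usr/local/apps/longranger/tiny-bcl-samplesheet-2.1.0.csv

# run the whole-genome pipeline on the FASTQ files
# (SLURM_MEM_PER_NODE is in MB, while --localmem is given in GB)
longranger wgs --id=sample345 --reference=/fdb/longranger/refdata-hg19-1.2.0 --sex=f \
      --localcores=$SLURM_CPUS_PER_TASK --localmem=$((SLURM_MEM_PER_NODE / 1024)) \
      --fastqs=./H77WWBBXX/outs/fastq_path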

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=16 --mem=10g lr.bat

This job will run for about 7 minutes. The output of 'jobload' may indicate very low usage of CPU, but this is normal: the pipeline only occasionally uses multiple threads.
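The resource options passed to Long Ranger should match what Slurm actually allocated. The sketch below builds the two flags from the Slurm environment. resource_flags is a hypothetical helper, the fallback values (1 core, 4096 MB) are arbitrary assumptions for use outside a job, and the division by 1024 assumes that --localmem is given in GB while SLURM_MEM_PER_NODE is in MB.

```shell
# Hypothetical helper: print --localcores/--localmem flags derived from
# the Slurm allocation. Fallback values are assumptions, not site defaults.
resource_flags() {
    cores="${SLURM_CPUS_PER_TASK:-1}"
    mem_mb="${SLURM_MEM_PER_NODE:-4096}"
    # SLURM_MEM_PER_NODE is in MB; convert to whole GB (rounded down)
    echo "--localcores=${cores} --localmem=$(( mem_mb / 1024 ))"
}

# Simulate the allocation requested by the sbatch command above
# (16 CPUs, 10 GB) in a subshell so the variables do not leak.
( SLURM_CPUS_PER_TASK=16; SLURM_MEM_PER_NODE=10240; resource_flags )
# prints: --localcores=16 --localmem=10
```

Inside a batch script the output could be appended to the pipeline command, for example as `longranger wgs ... $(resource_flags)`.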

Interactive job on Biowulf

Allocate an interactive node and run Long Ranger there. Sample session:

[$USER@biowulf ~]$ sinteractive --cpus-per-task=8 --mem=20g 
salloc.exe: Pending job allocation 15813673
salloc.exe: job 15813673 queued and waiting for resources
salloc.exe: job 15813673 has been allocated resources
salloc.exe: Granted job allocation 15813673
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0198 are ready for job

[user@cn0198 ~]$ mkdir /data/$USER/longranger

[user@cn0198 ~]$  cd /data/$USER/longranger

[user@cn0198 longranger]$ tar xzf /usr/local/apps/longranger/tiny-bcl-2.0.0.tar.gz

[user@cn0198 longranger]$ module load longranger
[+] Loading cmake 3.0.2 ...
[+] Loading gcc 4.9.1 ...
[-] Unloading gcc 4.9.1 ...
[+] Loading gcc 4.9.1 ...
[+] Loading boost libraries v1.55 ...
[+] Loading LongRanger 2.1.3 ...

[user@cn0198 longranger]$ longranger mkfastq --run=./tiny-bcl-2.0.0/ --samplesheet=/usr/local/apps/longranger/tiny-bcl-samplesheet-2.1.0.csv
longranger mkfastq (2.1.3)
Copyright (c) 2017 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------

Martian Runtime - 2.1.3 (2.1.2)
Running preflight checks (please wait)...
Checking run folder...
Checking RunInfo.xml...
Checking system environment...
Checking barcode whitelist...
Checking read specification...
Checking samplesheet specs...
Checking for dual index flowcell...
2017-08-08 11:18:25 [runtime] (ready)           ID.H77WWBBXX.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
2017-08-08 11:18:28 [runtime] (split_complete)  ID.H77WWBBXX.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
2017-08-08 11:18:28 [runtime] (run:local)       ID.H77WWBBXX.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET.fork0.chnk0.main
2017-08-08 11:18:31 [runtime] (chunks_complete) ID.H77WWBBXX.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
[....]
2017-08-08 11:22:32 [runtime] (join_complete)   ID.H77WWBBXX.MAKE_FASTQS_CS.MAKE_FASTQS.MERGE_FASTQS_BY_LANE_SAMPLE

Outputs:
- Run QC metrics:        /spin1/users/$USER/longranger/H77WWBBXX/outs/qc_summary.json
- FASTQ output folder:   /spin1/users/$USER/longranger/H77WWBBXX/outs/fastq_path
- Interop output folder: /spin1/users/$USER/longranger/H77WWBBXX/outs/interop_path
- Input samplesheet:     /spin1/users/$USER/longranger/H77WWBBXX/outs/input_samplesheet.csv

Pipestance completed successfully!

Saving pipestance info to H77WWBBXX/H77WWBBXX.mri.tgz

[user@cn0198 longranger]$ longranger wgs --id=sample345 --reference=/fdb/longranger/refdata-hg19-1.2.0 --sex=f --fastqs=./H77WWBBXX/outs/fastq_path --localcores=$SLURM_CPUS_PER_TASK --localmem=$((SLURM_MEM_PER_NODE / 1024))
longranger wgs (2.1.3)
Copyright (c) 2017 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------

Martian Runtime - 2.1.3 (2.1.2)
Running preflight checks (please wait)...
2017-08-08 11:29:21 [runtime] (ready)           ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER._SNPINDEL_PHASER.SORT_GROUND_TRUTH
2017-08-08 11:29:21 [runtime] (ready)           ID.sample345.PHASER_SVCALLER_CS.PHASER_SVCALLER._LINKED_READS_ALIGNER.SETUP_CHUNKS
[...]

[user@cn0198 longranger]$ exit
exit
srun: error: cn0198: task 0: Exited with exit code 130
salloc.exe: Relinquishing job allocation 15813673
[$USER@biowulf ~]$
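
In the session above, the wgs step reads the FASTQ folder written by mkfastq. A script that chains the two steps may want to confirm that folder exists and is not empty before starting the longer pipeline. The following is a minimal sketch: check_fastq_dir is a hypothetical helper, and it assumes that mkfastq writes files ending in .fastq.gz.

```shell
# Hypothetical helper: succeed only if the directory exists and contains
# at least one *.fastq.gz file (the file suffix is an assumption).
check_fastq_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    first="$(find "$dir" -type f -name '*.fastq.gz' | head -n 1)"
    if [ -z "$first" ]; then
        echo "no *.fastq.gz files under: $dir" >&2
        return 1
    fi
    return 0
}

# Self-contained demonstration using a throwaway directory.
demo="$(mktemp -d)"
mkdir -p "$demo/outs/fastq_path"
touch "$demo/outs/fastq_path/example_S1_L001_R1_001.fastq.gz"
check_fastq_dir "$demo/outs/fastq_path" && echo "FASTQs found"
rm -rf "$demo"
```

With the sample data, the check would be pointed at ./H77WWBBXX/outs/fastq_path before the wgs command is run.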

Documentation