High-Performance Computing at the NIH
CRISPResso on Biowulf & Helix

Description

CRISPResso is a software pipeline for the analysis of targeted CRISPR-Cas9 sequencing data. It quantifies both non-homologous end joining (NHEJ) and homology-directed repair (HDR) events.

CRISPResso automates the following steps:

  1. filters low quality reads,
  2. trims adapters,
  3. aligns the reads to a reference amplicon,
  4. quantifies the proportion of HDR and NHEJ outcomes,
  5. quantifies frameshift/inframe mutations (if applicable) and identifies affected splice sites,
  6. produces a graphical report to visualize and quantify the distribution and position of indels.
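
Step 5 can be illustrated with a minimal sketch. It is not CRISPResso's own implementation, only the usual rule of thumb: within a coding sequence, a net indel length that is a multiple of 3 preserves the reading frame, and any other non-zero length shifts it. The function name classify_indel and the example lengths are hypothetical.

```shell
# Classify a net indel length in bp (insertions positive, deletions negative).
# Assumption: the edit lies entirely within a coding sequence.
classify_indel() {
    len=$1
    # Work with the absolute value so deletions are handled like insertions.
    [ "$len" -lt 0 ] && len=$(( -len ))
    if [ "$len" -eq 0 ]; then
        echo "unmodified"
    elif [ $(( len % 3 )) -eq 0 ]; then
        echo "in-frame"
    else
        echo "frameshift"
    fi
}

# Hypothetical indel lengths, one per line with their classification.
for n in -3 -1 0 2 6; do
    printf '%s\t%s\n' "$n" "$(classify_indel "$n")"
done
```

The real pipeline also considers the position of each indel relative to the coding sequence and splice sites, which this sketch ignores.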

There may be multiple versions of CRISPResso available. An easy way to select a version is to use modules. To see the available modules, type

module avail crispresso 

To select a module, use

module load crispresso/[version]

where [version] is the version of choice.

 

Environment variables set

  • $PATH
  • $CRISPRESSO_DEPENDENCIES_FOLDER
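
A quick way to confirm that the module has been loaded is to inspect these two variables. The helper below is a sketch; the function name check_crispresso_env is hypothetical, and it only reports what it finds without assuming any particular install path.

```shell
# Report whether the settings expected from the crispresso module are present.
check_crispresso_env() {
    # The module is documented to set this variable.
    if [ -n "${CRISPRESSO_DEPENDENCIES_FOLDER:-}" ]; then
        echo "CRISPRESSO_DEPENDENCIES_FOLDER=$CRISPRESSO_DEPENDENCIES_FOLDER"
    else
        echo "CRISPRESSO_DEPENDENCIES_FOLDER is not set"
    fi
    # The module is documented to extend $PATH, so the executable should resolve.
    if command -v CRISPResso >/dev/null 2>&1; then
        echo "CRISPResso found at $(command -v CRISPResso)"
    else
        echo "CRISPResso not found on PATH"
    fi
}

check_crispresso_env
```

If either check reports a missing value, run module load crispresso in the current shell and try again.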

 

References

  • Pinello L, Canver MC, Hoban MD, Orkin SH, Kohn DB, Bauer DE, Yuan GC. Analyzing CRISPR genome-editing experiments with CRISPResso. Nat Biotechnol. 2016 Jul 12;34(7):695-697. doi: 10.1038/nbt.3583. PubMed PMID: 27404874.

 

Documentation

 

Interactive job on Biowulf

Allocate an interactive session with sinteractive and run CRISPResso as shown below.

biowulf$ sinteractive --mem=10g
salloc.exe: Pending job allocation 30611331
salloc.exe: job 30611331 queued and waiting for resources
salloc.exe: job 30611331 has been allocated resources
salloc.exe: Granted job allocation 30611331
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn2694 are ready for job
srun: error: x11: no local DISPLAY defined, skipping

node$ module load crispresso

node$ CRISPResso -r1 seq1.fastq.gz -r2 seq2.fastq.gz \
		-a AATGTCCCCCAATGGGAAGTTCATCTGGCACTGCCCACAGGTGAGGAGGTCATGATCCCCTTCTGGAGCTCCCAACGGGCCGTGGTCTGGTTCATCATCTGTAAGAATGGCTTCAAGAGGCTCGGCTGTGGTT
...
...

node$ exit
biowulf$

 

Batch job on Biowulf

Create a batch script similar to the following example:

#!/bin/bash
# this file is crispresso.batch

module load crispresso
# command1 and command2 are placeholders for full CRISPResso command lines
crispresso command1
crispresso command2

....

Submit to the queue with sbatch:

biowulf$ sbatch --mem=5g crispresso.batch
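
When several samples share one amplicon, the batch script can be generated instead of written by hand. The sketch below uses only the -r1, -r2 and -a options shown in the interactive example. The function name write_batch, the sample names, the <sample>_R1/_R2.fastq.gz naming scheme and the amplicon placeholder are all assumptions to adapt to your data.

```shell
# Write a batch script with one CRISPResso command per sample.
# Usage: write_batch OUTPUT_FILE AMPLICON SAMPLE [SAMPLE ...]
write_batch() {
    out=$1
    amplicon=$2
    shift 2
    {
        echo '#!/bin/bash'
        echo 'module load crispresso'
        for s in "$@"; do
            # Assumed naming scheme: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
            echo "CRISPResso -r1 ${s}_R1.fastq.gz -r2 ${s}_R2.fastq.gz -a $amplicon"
        done
    } > "$out"
}

# Hypothetical samples and a placeholder amplicon sequence.
write_batch crispresso.batch REPLACE_WITH_AMPLICON_SEQUENCE sampleA sampleB
cat crispresso.batch
```

The generated file can then be submitted with the same sbatch command shown above.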