The mageck-vispr toolkit is used to QC, analyse, and visualize results from CRISPR/Cas9 screens. It includes the
The mageck workflow is implemented as a snakemake pipeline and runs automatically. Vispr, on the other hand, is a web application: it runs a temporary server on a compute node, and the user connects to it from a browser on their own computer through an ssh tunnel.
$MAGECK_VISPR_TEST_DATA
Allocate an interactive session with sinteractive and use mageck-vispr as shown below. The --tunnel option sets up the port that is used later to reach the vispr server. See our tunneling page for more details on setting up ssh tunnels.
biowulf$ sinteractive --cpus-per-task=8 --mem=16g --tunnel
salloc.exe: Pending job allocation 31864544
salloc.exe: job 31864544 queued and waiting for resources
salloc.exe: job 31864544 has been allocated resources
salloc.exe: Granted job allocation 31864544
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn2692 are ready for job
srun: error: x11: no local DISPLAY defined, skipping

Created 1 generic SSH tunnel(s) from this compute node to
biowulf for your use at port numbers defined
in the $PORTn ($PORT1, ...) environment variables.

Please create a SSH tunnel from your workstation to these ports on biowulf.
On Linux/MacOS, open a terminal and run:

    ssh -L 46763:localhost:46763 biowulf.nih.gov

For Windows instructions, see https://hpc.nih.gov/docs/tunneling

[user@cn3144]$ module load mageck-vispr
Copy test data and run mageck pipeline
[user@cn3144]$ cp -r ${MAGECK_VISPR_TEST_DATA}/esc-testdata .
[user@cn3144]$ cd esc-testdata
[user@cn3144]$ mageck-vispr init ./test_workflow --reads \
    reads/ERR376998.subsample.fastq \
    reads/ERR376999.subsample.fastq \
    reads/ERR377000.subsample.fastq
[user@cn3144]$ tree test_workflow
test_workflow/
|-- [user   2.9K]  config.yaml
|-- [user   7.3K]  README.txt
`-- [user   5.8K]  Snakefile

0 directories, 3 files
[user@cn3144]$ cd test_workflow
Before running the workflow it is necessary to edit the automatically generated config file. The generated file contains many comments. Here is the edited file with the comments stripped for readability:
library: ../yusa_library.csv
species: mus_musculus
assembly: mm10

targets:
    genes: true
sgrnas:
    update-efficiency: false
    trim-5: AUTO
    len: AUTO

samples:
    esc1:
        - ../reads/ERR376999.subsample.fastq
    esc2:
        - ../reads/ERR377000.subsample.fastq
    plasmid:
        - ../reads/ERR376998.subsample.fastq

experiments:
    "ESC-MLE":
        designmatrix: ../designmatrix.txt

# the following setting is required for 0.5.6
correct_cnv: false
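Because the config file refers to the library, reads, and design matrix by paths relative to the workflow directory, a mistyped path is an easy mistake to make. The following is a minimal sketch of a pre-flight check, not part of mageck-vispr itself. The function name check_config_paths is hypothetical, and the sketch assumes that every input path in the config starts with "../" and contains no spaces.

```shell
# Hypothetical helper: report relative paths ("../...") named in a config
# file that do not exist relative to the directory containing that file.
# Returns 0 if all paths exist, 1 if any are missing, 2 if the config
# file itself cannot be found.
check_config_paths() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "config file not found: $config" >&2
        return 2
    fi
    dir=$(dirname "$config")
    missing=0
    # Assumption: paths contain no whitespace, so word splitting is safe here
    for p in $(grep -o '\.\./[^[:space:]"]*' "$config" | sort -u); do
        if [ ! -e "$dir/$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, from inside test_workflow:
#   check_config_paths config.yaml && echo "all input paths found"
```

The check only confirms that the files exist; it does not validate the YAML or the contents of the design matrix.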
Once the config file has been modified to reflect the experimental
design, run the pipeline. Snakemake runs the workflow locally within the
allocated session rather than submitting tasks as cluster jobs. The snakemake
installation used for mageck-vispr has been renamed to mageck-vispr-snakemake
to avoid interfering with the general-use snakemake.
[user@cn3144]$ mageck-vispr-snakemake --cores=$SLURM_CPUS_PER_TASK
Next, start the vispr server for visualization. Note that sinteractive sets the $PORT1 variable to the port selected for tunneling.
[user@cn3144]$ cd test_workflow
[user@cn3144]$ vispr server --port $PORT1 --host=localhost results/ESC-MLE.vispr.yaml
Loading data.
Starting server.

Open:  go to localhost:39335 in your browser.
Note: Safari and Internet Explorer are currently unsupported.
Close: hit Ctrl-C in this terminal.
On your local workstation, create an ssh tunnel to biowulf as described in our tunneling documentation, then open the address printed by the vispr server in your browser.
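The port changes from session to session, so the ssh command has to be rebuilt each time from the value sinteractive reports in $PORT1. As a small sketch, the hypothetical helper below (tunnel_cmd is not part of any Biowulf tooling) prints the command in the same form that sinteractive shows for Linux/MacOS; it only prints the command and does not open a connection.

```shell
# Hypothetical helper: print the ssh command that forwards a local port
# to the same port on the login node.
#   $1 = port reported in $PORT1 on the compute node
#   $2 = login host (optional, defaults to biowulf.nih.gov)
tunnel_cmd() {
    printf 'ssh -L %s:localhost:%s %s\n' "$1" "$1" "${2:-biowulf.nih.gov}"
}

# Example with the port from the sinteractive output shown earlier
tunnel_cmd 46763
```

Run the printed command in a terminal on the workstation and keep that terminal open while using vispr.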
Create a batch script for an existing config file similar to the following example:
#! /bin/bash
# this file is mageck.batch

module load mageck-vispr/0.5.4 || exit 1
cd /path/to/workdir
mageck-vispr-snakemake --cores=$SLURM_CPUS_PER_TASK
Submit to the queue with sbatch:
biowulf$ sbatch --cpus-per-task=8 --mem=16g mageck.batch
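The batch script passes $SLURM_CPUS_PER_TASK directly to --cores. Slurm generally sets that variable only when --cpus-per-task is given to sbatch, so a script submitted without it would pass an empty value. One way to guard against this, sketched here with a hypothetical helper named snakemake_cmd, is to fall back to a single core:

```shell
# Hypothetical helper: build the pipeline command line, using one core
# when SLURM_CPUS_PER_TASK is unset or empty.
snakemake_cmd() {
    cores="${SLURM_CPUS_PER_TASK:-1}"
    echo "mageck-vispr-snakemake --cores=$cores"
}

# Print the command that would be run in the current environment
snakemake_cmd
```

In a batch script the same parameter expansion can be used inline, that is, --cores="${SLURM_CPUS_PER_TASK:-1}".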