High-Performance Computing at the NIH
MAGeCK: Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout

MAGeCK (Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout) is a method for prioritizing single-guide RNAs, genes, and pathways in genome-scale CRISPR/Cas9 knockout screens. It demonstrates better performance than other methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions.

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive 
[user@cn3316 ~]$ module load MAGeCK
[+] Loading MAGeCK  0.5.7  on cn3316
[+] Loading gcc  7.2.0  ... 
[+] Loading GSL 2.4 for GCC 7.2.0 ... 
[+] Loading openmpi 3.0.0  for GCC 7.2.0 
[+] Loading R 3.5.0_build2 
[+] Loading TeX 2018 
Prepare sample input data:
[user@cn3316 ~]$ cp -r $MAGECK_DATA/demo2/* .
Run MAGeCK and produce PDF report files:
[user@cn3316 ~]$ mageck count -l library.txt -n demo --sample-label L1,CTRL --fastq test1.fastq test2.fastq --pdf-report
INFO  @ Thu, 06 Dec 2018 14:29:21: Parameters: /usr/local/apps/MAGeCK/0.5.7/bin/mageck count -l library.txt -n demo --sample-label L1,CTRL --fastq test1.fastq test2.fastq --pdf-report 
INFO  @ Thu, 06 Dec 2018 14:29:21: Welcome to MAGeCK v0.5.7. Command: count 
INFO  @ Thu, 06 Dec 2018 14:29:21: Loading 2550 predefined sgRNAs. 
WARNING @ Thu, 06 Dec 2018 14:29:21: There are 0 sgRNAs with duplicated sequences. 
INFO  @ Thu, 06 Dec 2018 14:29:21: Parsing FASTQ file test1.fastq... 
INFO  @ Thu, 06 Dec 2018 14:29:21: Determining the trim-5 length of FASTQ file test1.fastq... 
INFO  @ Thu, 06 Dec 2018 14:29:21: Possible gRNA lengths:20 
INFO  @ Thu, 06 Dec 2018 14:29:22: Processing 0M reads ... 
INFO  @ Thu, 06 Dec 2018 14:29:22: Read length:30 
INFO  @ Thu, 06 Dec 2018 14:29:22: Total tested reads: 2500, mapped: 1453(0.5812) 
INFO  @ Thu, 06 Dec 2018 14:29:22: --trim-5 test data: (trim_length reads fraction) 
INFO  @ Thu, 06 Dec 2018 14:29:22: 0	1453	1.0 
INFO  @ Thu, 06 Dec 2018 14:29:22: Auto determination of trim5 results: 0 
INFO  @ Thu, 06 Dec 2018 14:29:22: Possible gRNA lengths:20 
INFO  @ Thu, 06 Dec 2018 14:29:22: Processing 0M reads .. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Total: 2500. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Mapped: 1453. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Parsing FASTQ file test2.fastq... 
INFO  @ Thu, 06 Dec 2018 14:29:22: Determining the trim-5 length of FASTQ file test2.fastq... 
INFO  @ Thu, 06 Dec 2018 14:29:22: Possible gRNA lengths:20 
INFO  @ Thu, 06 Dec 2018 14:29:22: Processing 0M reads ... 
INFO  @ Thu, 06 Dec 2018 14:29:22: Read length:30 
INFO  @ Thu, 06 Dec 2018 14:29:22: Total tested reads: 2500, mapped: 1471(0.5884) 
INFO  @ Thu, 06 Dec 2018 14:29:22: --trim-5 test data: (trim_length reads fraction) 
INFO  @ Thu, 06 Dec 2018 14:29:22: 0	1471	1.0 
INFO  @ Thu, 06 Dec 2018 14:29:22: Auto determination of trim5 results: 0 
INFO  @ Thu, 06 Dec 2018 14:29:22: Possible gRNA lengths:20 
INFO  @ Thu, 06 Dec 2018 14:29:22: Processing 0M reads .. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Total: 2500. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Mapped: 1471. 
WARNING @ Thu, 06 Dec 2018 14:29:22: Sample 0 has zero median count, so median normalization is not possible. Switch to total read count normalization. 
WARNING @ Thu, 06 Dec 2018 14:29:22: Sample 0 has too many zero-count sgRNAs (0.5003921568627451), and median normalization is unstable. Switch to total read count normalization. 
WARNING @ Thu, 06 Dec 2018 14:29:22: Sample 1 has too many zero-count sgRNAs (0.47019607843137257), and median normalization is unstable. Switch to total read count normalization. 
INFO  @ Thu, 06 Dec 2018 14:29:22: Final size factor: 1.006194081211287 0.9938817131203264 
INFO  @ Thu, 06 Dec 2018 14:29:22: Summary of file test1.fastq: 
INFO  @ Thu, 06 Dec 2018 14:29:22: label	L1 
INFO  @ Thu, 06 Dec 2018 14:29:22: reads	2500 
INFO  @ Thu, 06 Dec 2018 14:29:22: mappedreads	1453 
INFO  @ Thu, 06 Dec 2018 14:29:22: totalsgrnas	2550 
INFO  @ Thu, 06 Dec 2018 14:29:22: zerosgrnas	1276 
INFO  @ Thu, 06 Dec 2018 14:29:22: giniindex	0.5266899931488773 
INFO  @ Thu, 06 Dec 2018 14:29:22: Summary of file test2.fastq: 
INFO  @ Thu, 06 Dec 2018 14:29:22: label	CTRL 
INFO  @ Thu, 06 Dec 2018 14:29:22: reads	2500 
INFO  @ Thu, 06 Dec 2018 14:29:22: mappedreads	1471 
INFO  @ Thu, 06 Dec 2018 14:29:22: totalsgrnas	2550 
INFO  @ Thu, 06 Dec 2018 14:29:22: zerosgrnas	1199 
INFO  @ Thu, 06 Dec 2018 14:29:22: giniindex	0.4930917310247763 
INFO  @ Thu, 06 Dec 2018 14:29:22: Loading Rnw template file: /usr/local/Anaconda/envs_app/MAGeCK/0.5.7/lib/python3.6/site-packages/mageck/fastq_template.Rnw. 
INFO  @ Thu, 06 Dec 2018 14:29:23: Running command: cd ./; Rscript demo_countsummary.R 
INFO  @ Thu, 06 Dec 2018 14:29:25: Command message: 
INFO  @ Thu, 06 Dec 2018 14:29:25:   Writing to file demo_countsummary.tex 
INFO  @ Thu, 06 Dec 2018 14:29:25:   Processing code chunks with options ... 
...

[user@cn3316 ~]$ mageck test -k demo.count.txt -t L1 -c CTRL -n demo --pdf-report
INFO  @ Thu, 06 Dec 2018 14:32:45: Welcome to MAGeCK v0.5.7. Command: test 
INFO  @ Thu, 06 Dec 2018 14:32:45: Loading count table from demo.count.txt  
INFO  @ Thu, 06 Dec 2018 14:32:45: Processing 1 lines.. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Loaded 2550 records. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Loading R template file: /usr/local/Anaconda/envs_app/MAGeCK/0.5.7/lib/python3.6/site-packages/mageck/plot_template.RTemplate. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Loading R template file: /usr/local/Anaconda/envs_app/MAGeCK/0.5.7/lib/python3.6/site-packages/mageck/plot_template_indvgene.RTemplate. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Loading Rnw template file: /usr/local/Anaconda/envs_app/MAGeCK/0.5.7/lib/python3.6/site-packages/mageck/plot_template.Rnw. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Treatment samples:L1 
INFO  @ Thu, 06 Dec 2018 14:32:45: Treatment sample index:0 
INFO  @ Thu, 06 Dec 2018 14:32:45: Control samples:CTRL 
INFO  @ Thu, 06 Dec 2018 14:32:45: Control sample index:1 
WARNING @ Thu, 06 Dec 2018 14:32:45: Sample 0 has too many zero-count sgRNAs (0.47019607843137257), and median normalization is unstable. Switch to total read count normalization. 
WARNING @ Thu, 06 Dec 2018 14:32:45: Sample 1 has zero median count, so median normalization is not possible. Switch to total read count normalization. 
WARNING @ Thu, 06 Dec 2018 14:32:45: Sample 1 has too many zero-count sgRNAs (0.5003921568627451), and median normalization is unstable. Switch to total read count normalization. 
INFO  @ Thu, 06 Dec 2018 14:32:45: Final size factor: 0.9938817131203264 1.006194081211287 
INFO  @ Thu, 06 Dec 2018 14:32:45: Adjusted variance calculation: 0.0 for raw variance, 1.0 for modeling 
INFO  @ Thu, 06 Dec 2018 14:32:45: Before RRA, 0 sgRNAs are removed with zero counts in both group(s). 
INFO  @ Thu, 06 Dec 2018 14:32:45: Use qnorm to reversely calculate sgRNA scores ... 
INFO  @ Thu, 06 Dec 2018 14:32:45: Running command: RRA -i demo.plow.txt -o demo.gene.low.txt -p 0.05 --skip-gene NA --skip-gene na  
INFO  @ Thu, 06 Dec 2018 14:32:46: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Welcome to RRA v 0.5.7. 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Skipping gene NA for permutation ... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Skipping gene na for permutation ... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Reading input file... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Summary: 2550 sgRNAs, 2356 genes, 1 lists; skipped sgRNAs:0 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Computing lo-values for each group... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Computing false discovery rate... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Permuting genes with 1 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Permuting genes with 2 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Permuting genes with 3 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Permuting genes with 4 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Number of genes under FDR adjustment: 2356 
INFO  @ Thu, 06 Dec 2018 14:32:46:   Saving to output file... 
INFO  @ Thu, 06 Dec 2018 14:32:46:   RRA completed. 
INFO  @ Thu, 06 Dec 2018 14:32:46:    
INFO  @ Thu, 06 Dec 2018 14:32:46: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:46: Running command: RRA -i demo.phigh.txt -o demo.gene.high.txt -p 0.01019607843137255 --skip-gene NA --skip-gene na  
INFO  @ Thu, 06 Dec 2018 14:32:47: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Welcome to RRA v 0.5.7. 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Skipping gene NA for permutation ... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Skipping gene na for permutation ... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Reading input file... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Summary: 2550 sgRNAs, 2356 genes, 1 lists; skipped sgRNAs:0 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Computing lo-values for each group... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Computing false discovery rate... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Permuting genes with 1 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Permuting genes with 2 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Permuting genes with 3 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Permuting genes with 4 sgRNAs... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Number of genes under FDR adjustment: 2356 
INFO  @ Thu, 06 Dec 2018 14:32:47:   Saving to output file... 
INFO  @ Thu, 06 Dec 2018 14:32:47:   RRA completed. 
INFO  @ Thu, 06 Dec 2018 14:32:47:    
INFO  @ Thu, 06 Dec 2018 14:32:47: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:47: Loading top 10 genes from demo.gene.low.txt: LPHN3,SIRPB1,UNC119,CADPS,CACNA1G,ESM1,TMEM198,LDLR,CFH,OAS1 
INFO  @ Thu, 06 Dec 2018 14:32:47: Loading top 10 genes from demo.gene.high.txt: TRA2B,RUFY3,ST8SIA4,TMPRSS11E,FABP3,PPAP2C,DDX3Y,TNFSF12,NEURL4,TTLL6 
INFO  @ Thu, 06 Dec 2018 14:32:47: Running command: rm demo.plow.txt 
INFO  @ Thu, 06 Dec 2018 14:32:47: Running command: rm demo.phigh.txt 
INFO  @ Thu, 06 Dec 2018 14:32:47: Running command: rm demo.gene.low.txt 
INFO  @ Thu, 06 Dec 2018 14:32:47: Running command: rm demo.gene.high.txt 
INFO  @ Thu, 06 Dec 2018 14:32:47: Running command: cd ./; Rscript demo.R 
INFO  @ Thu, 06 Dec 2018 14:32:50: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:50:   null device  
INFO  @ Thu, 06 Dec 2018 14:32:50:             1  
INFO  @ Thu, 06 Dec 2018 14:32:50:   Writing to file demo_summary.tex 
INFO  @ Thu, 06 Dec 2018 14:32:50:   Processing code chunks with options ... 
INFO  @ Thu, 06 Dec 2018 14:32:50:    1 : keep.source term verbatim (label = funcdef, demo_summary.Rnw:27) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    2 : keep.source term tex (label = tab1, demo_summary.Rnw:37) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    3 : keep.source term verbatim (demo_summary.Rnw:77) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    4 : keep.source term verbatim pdf  (demo_summary.Rnw:83) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    5 : keep.source term verbatim pdf  (demo_summary.Rnw:201) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    6 : keep.source term verbatim pdf  (demo_summary.Rnw:345) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    7 : keep.source term verbatim pdf  (demo_summary.Rnw:489) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    8 : keep.source term verbatim (demo_summary.Rnw:567) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    9 : keep.source term verbatim pdf  (demo_summary.Rnw:573) 
INFO  @ Thu, 06 Dec 2018 14:32:50:   10 : keep.source term verbatim pdf  (demo_summary.Rnw:691) 
INFO  @ Thu, 06 Dec 2018 14:32:50:   11 : keep.source term verbatim pdf  (demo_summary.Rnw:835) 
INFO  @ Thu, 06 Dec 2018 14:32:50:   12 : keep.source term verbatim pdf  (demo_summary.Rnw:979) 
INFO  @ Thu, 06 Dec 2018 14:32:50:    
INFO  @ Thu, 06 Dec 2018 14:32:50:   You can now run (pdf)latex on ‘demo_summary.tex’ 
INFO  @ Thu, 06 Dec 2018 14:32:50:    
INFO  @ Thu, 06 Dec 2018 14:32:50: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:50: Running command: cd ./; rm -rf demo_summary-*.pdf 
INFO  @ Thu, 06 Dec 2018 14:32:50: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:50:    
INFO  @ Thu, 06 Dec 2018 14:32:50: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:50: Running command: cd ./; rm -rf demo_summary.aux 
INFO  @ Thu, 06 Dec 2018 14:32:50: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:50:    
INFO  @ Thu, 06 Dec 2018 14:32:50: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:50: Running command: cd ./; rm -rf demo_summary.tex 
INFO  @ Thu, 06 Dec 2018 14:32:51: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:51:    
INFO  @ Thu, 06 Dec 2018 14:32:51: End command message. 
INFO  @ Thu, 06 Dec 2018 14:32:51: Running command: cd ./; rm -rf demo_summary.toc 
INFO  @ Thu, 06 Dec 2018 14:32:51: Command message: 
INFO  @ Thu, 06 Dec 2018 14:32:51:    

[user@cn3316 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
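
The fractions printed in the mageck count log of the sample session can be recomputed from the per-sample summary counts, which is a quick way to sanity-check a run. The sketch below uses only numbers copied from that log. The helper names (ratio, size_factor) are hypothetical, and the size-factor formula (mean of the mapped totals divided by each sample's mapped total) is an inference that happens to reproduce the logged values after the switch to total read count normalization; it is not taken from the MAGeCK documentation.

```shell
# Counts copied from the sample "mageck count" log (test1.fastq = L1, test2.fastq = CTRL).
reads=2500; sgrnas=2550
mapped_l1=1453;  zero_l1=1276
mapped_ctrl=1471; zero_ctrl=1199

# ratio NUM DEN DIGITS: print NUM/DEN rounded to DIGITS decimal places.
ratio() {
    LC_ALL=C awk -v n="$1" -v d="$2" -v p="$3" 'BEGIN { printf("%." p "f\n", n / d) }'
}

# size_factor A B OWN: mean of the two mapped totals divided by this sample's total.
# Assumption: this matches what the log calls total read count normalization.
size_factor() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v own="$3" 'BEGIN { printf("%.6f\n", (a + b) / 2 / own) }'
}

map_l1=$(ratio "$mapped_l1" "$reads" 4)        # logged as 0.5812
map_ctrl=$(ratio "$mapped_ctrl" "$reads" 4)    # logged as 0.5884
zfrac_l1=$(ratio "$zero_l1" "$sgrnas" 4)       # logged as 0.5003921568627451
zfrac_ctrl=$(ratio "$zero_ctrl" "$sgrnas" 4)   # logged as 0.47019607843137257
sf_l1=$(size_factor "$mapped_l1" "$mapped_ctrl" "$mapped_l1")
sf_ctrl=$(size_factor "$mapped_l1" "$mapped_ctrl" "$mapped_ctrl")

echo "mapped fraction:     L1=$map_l1 CTRL=$map_ctrl"
echo "zero-count fraction: L1=$zfrac_l1 CTRL=$zfrac_ctrl"
echo "size factor:         L1=$sf_l1 CTRL=$sf_ctrl"
```

About half of the sgRNAs have zero counts in each demo sample, which is consistent with the warnings about median normalization in the log.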

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. mageck.sh). For example:

#!/bin/bash
module load MAGeCK
mageck test -k sample.txt -t HL60.final,KBM7.final -c HL60.initial,KBM7.initial -n demo
mageck run --fastq test1.fastq test2.fastq -l library.txt -n demo --sample-label L1,CTRL -t L1 -c CTRL
mageck count -l library.txt -n demo --sample-label L1,CTRL --fastq test1.fastq test2.fastq --pdf-report
mageck test -k demo.count.txt -t L1 -c CTRL -n demo --pdf-report
mageck mle -k leukemia.new.csv -d designmat.txt -n beta_leukemia --cnv-norm cnv_data.txt --permutation-round 2
mageck test -k sample.txt -t HL60.final,KBM7.final -c HL60.initial,KBM7.initial -n demo4 --cnv-norm cnv_data.txt --cell-line HL60_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] mageck.sh
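
As a minimal sketch of the two steps above, the following writes a one-command batch script, checks its syntax, and submits it only if sbatch is available. The temporary directory and the resource values (2 CPUs, 4 GB of memory) are illustrative assumptions rather than tuned recommendations. The mageck command is the test step from the interactive session and expects demo.count.txt in the directory the job is submitted from.

```shell
#!/bin/bash
# Write the batch script to a scratch location (hypothetical path chosen by mktemp).
workdir=$(mktemp -d)
script="$workdir/mageck.sh"

cat > "$script" <<'EOF'
#!/bin/bash
module load MAGeCK
mageck test -k demo.count.txt -t L1 -c CTRL -n demo --pdf-report
EOF

# bash -n parses the script without running it, which catches
# quoting mistakes and commands broken across lines.
bash -n "$script" && echo "syntax OK: $script"

# Submit only where Slurm is installed; resource values are assumptions.
if command -v sbatch >/dev/null 2>&1; then
    sbatch --cpus-per-task=2 --mem=4g "$script"
else
    echo "sbatch not found; submit from a Slurm login node instead"
fi
```

Checking the script with bash -n before submission is cheap and avoids waiting in the queue for a job that fails on its first line.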