ricopili on Biowulf
RICOPILI stands for Rapid Imputation and COmputational PIpeLIne for GWAS.
The software consists of four independent modules: Preimputation (QC), PCA, Imputation, and Meta-Analysis. These modules are not meant to be run in a strictly linear order. For example, you may need to repeat QC after PCA reveals that your data contain multiple populations.
References:
- Lam M, Awasthi S, Watson HJ, Goldstein J, Panagiotaropoulou G, Trubetskoy V, Karlsson R, Frei O, Fan CC, De Witte W, Mota NR, Mullins N, Brügger K, Lee SH, Wray NR, Skarabis N, Huang H, Neale B, Daly MJ, Mattheisen M, Walters R, Ripke S. RICOPILI: Rapid Imputation for COnsortias PIpeLIne. Bioinformatics. 2020 Feb 1;36(3):930-933. doi: 10.1093/bioinformatics/btz633. PubMed
Documentation
Important Notes
- Module Name: ricopili (see the modules page for more information)
- Multithreaded
- Environment variables set: $RICOPILI_TEST_DATA
- Before running the program, copy the config file and log to your home folder, then edit the config file.
  - config file in $RICOPILI_TEST_DATA/ricopili.conf
  - log file in $RICOPILI_TEST_DATA/log
- Example files in $RICOPILI_TEST_DATA
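The copy step in the notes above can be sketched as a small shell helper. This is only an illustration: the function name stage_ricopili_config is hypothetical and not part of RICOPILI, and it assumes $RICOPILI_TEST_DATA has been set by loading the module. It skips the copy when a config already exists, so that an edited ricopili.conf is not overwritten.

```shell
#!/bin/bash
# Sketch: stage the RICOPILI config and log into place without clobbering
# an existing (possibly already edited) copy.
# stage_ricopili_config is a hypothetical helper name, not part of RICOPILI.
stage_ricopili_config() {
    local src="${1:-${RICOPILI_TEST_DATA:-}}"   # assumed set by `module load ricopili`
    local home_dir="${2:-$HOME}"

    if [ -z "$src" ] || [ ! -f "$src/ricopili.conf" ]; then
        echo "ricopili.conf not found in '$src'; load the module first" >&2
        return 1
    fi

    # Keep an existing config, since it may already contain local edits.
    if [ -e "$home_dir/ricopili.conf" ]; then
        echo "keeping existing $home_dir/ricopili.conf"
    else
        cp "$src/ricopili.conf" "$home_dir/"
    fi

    # The log goes under ~/ricopili, matching the sample session.
    mkdir -p "$home_dir/ricopili"
    if [ -e "$src/log" ] && [ ! -e "$home_dir/ricopili/log" ]; then
        cp -r "$src/log" "$home_dir/ricopili/"
    fi
}

# Example usage, after `module load ricopili`:
#   stage_ricopili_config
```

Remember to edit ~/ricopili.conf after the first copy; the helper only stages the files.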
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program.
Sample session (user input follows each [user@...]$ prompt):
[user@biowulf]$ sinteractive --mem=12g --cpus-per-task=24
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load ricopili
[user@cn3144]$ cp $RICOPILI_TEST_DATA/ricopili.conf ~/
[user@cn3144]$ mkdir -p ~/ricopili; cp -r $RICOPILI_TEST_DATA/log ~/ricopili/
[user@cn3144]$ mkdir -p /data/$USER/ricopili
[user@cn3144]$ cd /data/$USER/ricopili/; tar -zxvf $RICOPILI_TEST_DATA/test_sample.tar.gz
[user@cn3144]$ cd test_sample
[user@cn3144]$ preimp_dir --disease sim --outname hapgen_5cohorts
switched on SERIAL mode because of configuration file
      _                 _ _ _
 _ __(_) ___ ___  _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| |  | | (_| (_) | |_) | | | |
|_|  |_|\___\___/| .__/|_|_|_|
                 |_|
#######################################################################
#######################################################################
##                                                                  ###
##    preimp_dir - module of ricopili pipeline                      ###
##    version: 2019_Jun_25.001                                      ###
##                                                                  ###
##    https://sites.google.com/a/broadinstitute.org/ricopili/home   ###
##    Stephan Ripke: sripke@broadinstitute.org                      ###
##                                                                  ###
#######################################################################
#######################################################################
Running job: plague
switched on SERIAL mode because of configuration file
      _                 _ _ _
 _ __(_) ___ ___  _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| |  | | (_| (_) | |_) | | | |
|_|  |_|\___\___/| .__/|_|_|_|
                 |_|
#######################################################################
#######################################################################
##                                                                  ###
##    preimp_dir - module of ricopili pipeline                      ###
##    version: 2019_Jun_25.001                                      ###
##                                                                  ###
##    https://sites.google.com/a/broadinstitute.org/ricopili/home   ###
##    Stephan Ripke: sripke@broadinstitute.org                      ###
##                                                                  ###
#######################################################################
#######################################################################
Running job: qc
-----------------------------------------------------
switched on SERIAL mode because of configuration file
      _                 _ _ _
 _ __(_) ___ ___  _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| |  | | (_| (_) | |_) | | | |
|_|  |_|\___\___/| .__/|_|_|_|
                 |_|
#######################################################################
#######################################################################
##                                                                  ###
##    preimp_dir - module of ricopili pipeline                      ###
##    version: 2019_Jun_25.001                                      ###
##                                                                  ###
##    https://sites.google.com/a/broadinstitute.org/ricopili/home   ###
##    Stephan Ripke: sripke@broadinstitute.org                      ###
##                                                                  ###
#######################################################################
#######################################################################
Running job: finished
##################################################################
##### CONGRATULATIONS:
##### preimp_pipeline finished successfully:
##### preimp_dir --disease sim --outname hapgen_5cohorts
##### now start with imputation pipeline (see README in subdir qc/)
##### have a look at the wiki page
##### https://sites.google.com/a/broadinstitute.org/ricopili/
##################################################################
[user@cn3144]$ cd qc
[user@cn3144]$ pcaer --prefercase --preferfam --out pca_hap1a sim_sim5_eur_yourini-qc1.bim
----------------------------------------------------
switched on SERIAL mode because of configuration file
      _                 _ _ _
 _ __(_) ___ ___  _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| |  | | (_| (_) | |_) | | | |
|_|  |_|\___\___/| .__/|_|_|_|
                 |_|
#######################################################################
#######################################################################
##                                                                  ###
##    pcaer - module of ricopili pipeline                           ###
##    version: 2019_Jun_25.001                                      ###
##                                                                  ###
##    https://sites.google.com/a/broadinstitute.org/ricopili/home   ###
##    Stephan Ripke: sripke@broadinstitute.org                      ###
##                                                                  ###
#######################################################################
#######################################################################
Running job: me300
-----------------------------------------------------
Running job: finished
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
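If the output of preimp_dir is captured to a file, the success banner shown in the session above can be checked before moving on to the PCA step. The following is a minimal sketch under that assumption: the helper name preimp_finished and the log file name preimp.log are hypothetical, and the check relies only on the "preimp_pipeline finished successfully" line printed in the sample session.

```shell
#!/bin/bash
# Sketch: report whether a captured preimp_dir run reached its final
# success banner. preimp_finished is a hypothetical helper, not a
# RICOPILI command; the log file is whatever you redirected output to.
preimp_finished() {
    local log_file="$1"
    # This string appears in the CONGRATULATIONS block of the sample session.
    grep -q "preimp_pipeline finished successfully" "$log_file"
}

# Example usage (assumes the run was captured with tee):
#   preimp_dir --disease sim --outname hapgen_5cohorts 2>&1 | tee preimp.log
#   if preimp_finished preimp.log; then cd qc; fi
```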
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. ricopili_run.sh). Remember you still need to set up the config files first. For example:
#!/bin/bash
set -e
module load ricopili
cd /data/$USER/ricopili/test_sample
preimp_dir --disease sim --outname hapgen_5cohorts
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=24 --mem=12g ricopili_run.sh
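As an alternative to passing resources on the command line, Slurm also reads #SBATCH directives placed at the top of the batch script, before the first command. The sketch below writes such a script with the same resource requests as above; the file name ricopili_sbatch.sh is a hypothetical choice, and the config files still need to be set up beforehand.

```shell
#!/bin/bash
# Sketch: write a batch script that carries its own resource requests.
# ricopili_sbatch.sh is a hypothetical file name.
# The quoted 'EOF' keeps $USER unexpanded until the job actually runs.
cat > ricopili_sbatch.sh <<'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=24
#SBATCH --mem=12g
set -e
module load ricopili
cd /data/$USER/ricopili/test_sample
preimp_dir --disease sim --outname hapgen_5cohorts
EOF

# Submit without extra flags:
#   sbatch ricopili_sbatch.sh
```

Options given on the sbatch command line take precedence over #SBATCH directives, so the script can still be resubmitted with different resources without editing it.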