MiXeR is a Causal Mixture Model for GWAS summary statistics. The version installed here (1.3) contains a Python port of MiXeR that wraps the same C/C++ core. The data preprocessing script sumstats.py is also included.
python /opt/mixer/precimed/mixer.py --help
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):
[user@biowulf]$ sinteractive --cpus-per-task=10 --mem=10G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load mixer
[user@cn3144]$ mkdir /data/$USER/mixer_test/
[user@cn3144]$ cd /data/$USER/mixer_test/
[user@cn3144]$ cp -r ${MIXER_TEST_DATA:-none}/* .
[user@cn3144]$ python /opt/python_convert/sumstats.py csv --sumstats GWAS_EA_excl23andMe.txt.gz --out SSGAC_EDU_2018_no23andMe.csv --force --auto --head 5 --n-val 766345
***********************************************************************
* sumstats.py: utilities for GWAS summary statistics
* Version 1.0.0
* (C) 2016-2018 Oleksandr Frei and Alexey A. Shadrin
* Norwegian Centre for Mental Disorders Research / University of Oslo
* GNU General Public License v3
***********************************************************************
Call:
./sumstats.py csv \
--sumstats GWAS_EA_excl23andMe.txt.gz \
--out SSGAC_EDU_2018_no23andMe.csv \
--force \
--auto \
--head 5 \
--n-val 766345.0
...
Analysis finished at Thu Jul 29 10:32:54 2021
Total time elapsed: 1.0m:57.59s
[user@cn3144]$ python /opt/python_convert/sumstats.py zscore --sumstats SSGAC_EDU_2018_no23andMe.csv | \
> python /opt/python_convert/sumstats.py qc --exclude-ranges 6:26000000-34000000 --out SSGAC_EDU_2018_no23andMe_noMHC.csv --force
***********************************************************************
* sumstats.py: utilities for GWAS summary statistics
* Version 1.0.0
* (C) 2016-2018 Oleksandr Frei and Alexey A. Shadrin
* Norwegian Centre for Mental Disorders Research / University of Oslo
* GNU General Public License v3
***********************************************************************
Call:
./sumstats.py qc \
--out SSGAC_EDU_2018_no23andMe_noMHC.csv \
--force \
--exclude-ranges ['6:26000000-34000000']
...
Analysis finished at Thu Jul 29 10:38:04 2021
Total time elapsed: 3.0m:5.710000000000008s
[user@cn3144]$ python /opt/mixer/precimed/mixer.py ld \
--lib /opt/mixer/src/build/lib/libbgmg.so \
--bfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.22 \
--out 1000G_EUR_Phase3_plink/1000G.EUR.QC.22.run4.ld \
--r2min 0.05 --ldscore-r2min 0.05 --ld-window-kb 30000
INFO:root:__init__(lib_name=/opt/mixer/src/build/lib/libbgmg.so, context_id=0)
INFO:root:init_log(1000G_EUR_Phase3_plink/1000G.EUR.QC.22.run4.ld.log)
INFO:root:log_message(***********************************************************************
* mixer.py: Univariate and Bivariate Causal Mixture for GWAS
* Version 1.2.0
* (c) 2016-2020 Oleksandr Frei, Alexey A. Shadrin, Dominic Holland
* Norwegian Centre for Mental Disorders Research / University of Oslo
* Center for Multimodal Imaging and Genetics / UCSD
* GNU General Public License v3
***********************************************************************
Call:
./mixer.py ld \
--out 1000G_EUR_Phase3_plink/1000G.EUR.QC.22.run4.ld \
--lib /opt/mixer/src/build/lib/libbgmg.so \
--bfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.22 \
--ldscore-r2min 0.05 \
--ld-window-kb 30000.0
)
INFO:root:__init__(lib_name=/opt/mixer/src/build/lib/libbgmg.so, context_id=0)
INFO:root:log_message(Done)
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
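The session above builds the LD file for chromosome 22 only. For a full dataset the same `ld` step would presumably be repeated for every chromosome. The following is a minimal sketch that prints one command per chromosome without running anything; the flags and paths are copied from the session, while the function name `print_ld_cmds` and the 1-22 range are assumptions made for illustration.

```shell
#!/bin/bash
# Print one "mixer.py ld" command per chromosome (dry run; nothing is executed).
# Flags and paths are copied from the chr22 session; the function name and the
# default 1-22 range are illustrative assumptions.
print_ld_cmds() {
    local first=${1:-1}
    local last=${2:-22}
    local prefix=1000G_EUR_Phase3_plink/1000G.EUR.QC
    local chr
    for chr in $(seq "$first" "$last"); do
        echo "python /opt/mixer/precimed/mixer.py ld" \
             "--lib /opt/mixer/src/build/lib/libbgmg.so" \
             "--bfile ${prefix}.${chr}" \
             "--out ${prefix}.${chr}.run4.ld" \
             "--r2min 0.05 --ldscore-r2min 0.05 --ld-window-kb 30000"
    done
}

print_ld_cmds 1 22
```

The printed lines can be reviewed first and then run one at a time or submitted as separate jobs.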
Create a batch input file (e.g. mixer.sh). For example:
#!/bin/bash
#SBATCH --job-name=mixer_run
#SBATCH --time=2:00:00
#SBATCH --partition=norm
#SBATCH --nodes=1
#SBATCH --array=1-20
cd /data/$USER/mixer_test
module load mixer
# This example uses only chr22; adjust accordingly for your own dataset.
python /opt/mixer/precimed/mixer.py snps \
--lib /opt/mixer/src/build/lib/libbgmg.so \
--bim-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim \
--ld-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.run4.ld \
--out 1000G_EUR_Phase3_plink/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep${SLURM_ARRAY_TASK_ID}.snps \
--chr2use 22 \
--maf 0.05 --subset 2000000 --r2 0.8 --seed ${SLURM_ARRAY_TASK_ID}
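# Note: --trait1-file below points at a .csv.gz file; if the qc step left an uncompressed .csv, gzip it first.
python /opt/mixer/precimed/mixer.py fit1 \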
--trait1-file SSGAC_EDU_2018_no23andMe_noMHC.csv.gz \
--out SSGAC_EDU_2018_no23andMe_noMHC.fit.rep${SLURM_ARRAY_TASK_ID} \
--extract 1000G_EUR_Phase3_plink/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep${SLURM_ARRAY_TASK_ID}.snps \
--bim-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim \
--ld-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.run4.ld \
--chr2use 22 \
--lib /opt/mixer/src/build/lib/libbgmg.so
python /opt/mixer/precimed/mixer.py test1 \
--trait1-file SSGAC_EDU_2018_no23andMe_noMHC.csv.gz \
--load-params-file SSGAC_EDU_2018_no23andMe_noMHC.fit.rep${SLURM_ARRAY_TASK_ID}.json \
--out SSGAC_EDU_2018_no23andMe_noMHC.test.rep${SLURM_ARRAY_TASK_ID} \
--bim-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim \
--ld-file 1000G_EUR_Phase3_plink/1000G.EUR.QC.@.run4.ld \
--chr2use 22 \
--lib /opt/mixer/src/build/lib/libbgmg.so
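Before submitting, it may help to confirm that the per-chromosome inputs named in the script exist. The sketch below assumes that `@` in the `--bim-file` and `--ld-file` arguments stands for the chromosome label, which is what the chr22 file names in the interactive session suggest. The helper name `missing_inputs` is hypothetical.

```shell
#!/bin/bash
# List per-chromosome input files that do not exist.
# Assumption: "@" in a template stands for the chromosome label, as in the
# --bim-file and --ld-file arguments of the batch script.
# Usage: missing_inputs TEMPLATE CHR...
missing_inputs() {
    local template=$1
    local chr path
    shift
    for chr in "$@"; do
        # Substitute the chromosome label for every "@" in the template.
        path=$(printf '%s\n' "$template" | sed "s/@/$chr/g")
        [ -e "$path" ] || echo "$path"
    done
}

missing_inputs '1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim' 22
missing_inputs '1000G_EUR_Phase3_plink/1000G.EUR.QC.@.run4.ld' 22
```

No output means every expected file was found. Any path that is printed needs to be created before the array job can use it.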
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=20 --mem=2g mixer.sh
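Once the array job has finished, a quick check can show whether every replicate produced its fit1 parameter file. This is a minimal sketch: the `.fit.repN.json` naming follows the `--load-params-file` argument in the batch script, while the helper name `missing_reps` and its arguments are made up for illustration.

```shell
#!/bin/bash
# Print the replicate numbers whose fit1 parameter file is missing or empty.
# The ".fit.repN.json" naming follows the --load-params-file argument in the
# batch script; the function name and arguments are illustrative.
# Usage: missing_reps PREFIX N_REPS
missing_reps() {
    local prefix=$1
    local n=$2
    local rep
    for rep in $(seq 1 "$n"); do
        # -s is true only for files that exist and are non-empty.
        [ -s "${prefix}.fit.rep${rep}.json" ] || echo "$rep"
    done
}

missing_reps SSGAC_EDU_2018_no23andMe_noMHC 20
```

Any replicate numbers it prints can be rerun by passing them to `--array` on the sbatch command line (for example `--array=3,7`), which takes precedence over the `#SBATCH --array` line in the script.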