TensorQTL leverages general-purpose libraries and graphics processing units (GPUs) to achieve high computational efficiency at low cost. Using PyTorch or TensorFlow, it allows >200-fold decreases in runtime and ~5–10-fold reductions in cost when running on GPUs relative to CPUs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=24g --gres=gpu:p100:1,lscratch:50 -c8
[user@cn4199 ~]$ module load TensorQTL
[+] Loading singularity 3.8.5-1
[+] Loading cuDNN/7.6.5/CUDA-10.2 libraries...
[+] Loading CUDA Toolkit 10.2.89 ...
[+] Loading TensorQTL 1.0.7 ...
Usage:
[user@cn4199 ~]$ tensorqtl -h
usage: tensorqtl [-h] [--mode {cis,cis_nominal,cis_independent,cis_susie,trans,trans_susie}] [--covariates COVARIATES]
[--paired_covariate PAIRED_COVARIATE] [--permutations PERMUTATIONS] [--interaction INTERACTION]
[--cis_output CIS_OUTPUT] [--phenotype_groups PHENOTYPE_GROUPS] [--window WINDOW]
[--pval_threshold PVAL_THRESHOLD] [--maf_threshold MAF_THRESHOLD]
[--maf_threshold_interaction MAF_THRESHOLD_INTERACTION] [--dosages] [--return_dense] [--return_r2]
[--best_only] [--output_text] [--batch_size BATCH_SIZE] [--chunk_size CHUNK_SIZE]
[--susie_loci SUSIE_LOCI] [--disable_beta_approx] [--warn_monomorphic] [--max_effects MAX_EFFECTS]
[--fdr FDR] [--qvalue_lambda QVALUE_LAMBDA] [--seed SEED] [-o OUTPUT_DIR]
genotype_path phenotypes prefix
tensorQTL: GPU-based QTL mapper
positional arguments:
genotype_path Genotypes in PLINK format
phenotypes Phenotypes in BED format (.bed, .bed.gz, .bed.parquet), or optionally for 'trans' mode, parquet or
tab-delimited.
prefix Prefix for output file names
options:
-h, --help show this help message and exit
--mode {cis,cis_nominal,cis_independent,cis_susie,trans,trans_susie}
Mapping mode. Default: cis
--covariates COVARIATES
Covariates file, tab-delimited, covariates x samples
--paired_covariate PAIRED_COVARIATE
Single phenotype-specific covariate. Tab-delimited file, phenotypes x samples
--permutations PERMUTATIONS
Number of permutations. Default: 10000
--interaction INTERACTION
Interaction term(s)
--cis_output CIS_OUTPUT
Output from 'cis' mode with q-values. Required for independent cis-QTL mapping.
--phenotype_groups PHENOTYPE_GROUPS
Phenotype groups. Header-less TSV with two columns: phenotype_id, group_id
--window WINDOW Cis-window size, in bases. Default: 1000000.
--pval_threshold PVAL_THRESHOLD
Output only significant phenotype-variant pairs with a p-value below threshold. Default: 1e-5 for
trans-QTL
--maf_threshold MAF_THRESHOLD
Include only genotypes with minor allele frequency >= maf_threshold. Default: 0
--maf_threshold_interaction MAF_THRESHOLD_INTERACTION
MAF threshold for interactions, applied to lower and upper half of samples
--dosages Load dosages instead of genotypes (only applies to PLINK2 bgen input).
--return_dense Return dense output for trans-QTL.
--return_r2 Return r2 (only for sparse trans-QTL output)
--best_only Only write lead association for each phenotype (interaction mode only)
--output_text Write output in txt.gz format instead of parquet (trans-QTL mode only)
--batch_size BATCH_SIZE
GPU batch size (trans-QTLs only). Reduce this if encountering OOM errors.
--chunk_size CHUNK_SIZE
For cis-QTL mapping, load genotypes into CPU memory in chunks of chunk_size variants, or by
chromosome if chunk_size is 'chr'.
--susie_loci SUSIE_LOCI
Table (parquet or tsv) with loci to fine-map (phenotype_id, chr, pos) with mode 'trans_susie'.
--disable_beta_approx
Disable Beta-distribution approximation of empirical p-values (not recommended).
--warn_monomorphic Warn if monomorphic variants are found.
--max_effects MAX_EFFECTS
Maximum number of non-zero effects in the SuSiE regression model.
--fdr FDR FDR for cis-QTLs
--qvalue_lambda QVALUE_LAMBDA
lambda parameter for pi0est in qvalue.
--seed SEED Seed for permutations.
-o OUTPUT_DIR, --output_dir OUTPUT_DIR
Output directory
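The options listed above can be combined into a single invocation. The following is a minimal sketch, not an official wrapper: it assembles a `cis`-mode command line using only flags that appear in the help text and prints it without running anything. The helper name `build_tensorqtl_cmd` and all input paths (`data/genotypes`, `data/phenotypes.bed.gz`, `data/covariates.txt`, `results`) are hypothetical placeholders.

```python
import shlex


def build_tensorqtl_cmd(genotype_path, phenotypes, prefix, mode="cis",
                        covariates=None, seed=None, output_dir=None):
    """Assemble a tensorqtl command line from flags shown in `tensorqtl -h`.

    Hypothetical helper for illustration; it only builds the argument list.
    """
    # Positional arguments come in the order given by the usage line:
    # genotype_path phenotypes prefix
    cmd = ["tensorqtl", genotype_path, phenotypes, prefix, "--mode", mode]
    if covariates is not None:
        cmd += ["--covariates", covariates]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


# All paths below are made-up placeholders, not files shipped with the module.
cmd = build_tensorqtl_cmd(
    "data/genotypes",
    "data/phenotypes.bed.gz",
    "my_study",
    mode="cis",
    covariates="data/covariates.txt",
    seed=123456,
    output_dir="results",
)
print(shlex.join(cmd))
```

To run the command rather than print it, the list could be passed to `subprocess.run(cmd, check=True)` from within the interactive session, after `module load TensorQTL`.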
Running the test example:
[user@cn4199 ~]$ git clone https://github.com/broadinstitute/tensorqtl
[user@cn4199 ~]$ cd tensorqtl/example
[user@cn4199 ~]$ module load jupyter
[user@cn4199 ~]$ jupyter nbconvert --to script tensorqtl_examples.ipynb
[user@cn4199 ~]$ python-tqtl tensorqtl_examples.py &
[user@cn4199 ~]$ nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  | 00000000:46:00.0 Off |                    0 |
| N/A   36C    P0             133W / 400W |   2492MiB / 81920MiB |     25%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1733667      C   /usr/bin/python3                           2478MiB |
+---------------------------------------------------------------------------------------+
torch: 2.1.2+cu121 (CUDA 12.1), device: cuda
pandas 2.1.4
cis-QTL mapping: nominal associations for all variant-phenotype pairs
* 445 samples
* 301 phenotypes
* 26 covariates
* 367759 variants
* cis-window: ±1,000,000
* checking phenotypes: 301/301
* Computing associations
Mapping chromosome chr18
processing phenotype 301/301
time elapsed: 0.02 min
* writing output
done.
cis-QTL mapping: empirical p-values for phenotypes
* 445 samples
* 301 phenotypes
* 26 covariates
* 367759 variants
* cis-window: ±1,000,000
* using seed 123456
* checking phenotypes: 301/301
* computing permutations
processing phenotype 301/301
Time elapsed: 0.19 min
done.
Computing q-values
* Number of phenotypes tested: 301
* Correlation between Beta-approximated and empirical p-values: 1.0000
* Calculating q-values with lambda = 0.850
* Proportion of significant phenotypes (1-pi0): 0.76
* QTL phenotypes @ FDR 0.05: 205
* min p-value threshold @ FDR 0.05: 0.135284
trans-QTL mapping
* 445 samples
* 19836 phenotypes
* 26 covariates
* 367759 variants
processing batch 37/37
elapsed time: 0.01 min
* 210838 variants passed MAF >= 0.05 filtering
done.
[user@cn4199 ~]$ exit
salloc.exe: Relinquishing job allocation 59748321
[user@biowulf ~]$
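The "Computing q-values" lines in the log above report a proportion `1-pi0` estimated at `lambda = 0.850`. As a rough illustration of what that number means, the sketch below implements the standard single-lambda Storey estimate, pi0 = #{p >= lambda} / (m * (1 - lambda)), capped at 1. It assumes this general method and is not taken from tensorQTL's own code. The function name `estimate_pi0` is hypothetical and the p-values are made up for the example.

```python
def estimate_pi0(pvalues, lam=0.85):
    """Single-lambda Storey estimate of the null proportion pi0, capped at 1.

    Illustrative sketch only; not tensorQTL's implementation.
    """
    if not 0 <= lam < 1:
        raise ValueError("lam must be in [0, 1)")
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    # Under the null, p-values are uniform, so the fraction landing in
    # [lam, 1] divided by the interval width (1 - lam) estimates pi0.
    n_above = sum(1 for p in pvalues if p >= lam)
    return min(1.0, n_above / (m * (1.0 - lam)))


# Toy p-values (made up): 97 strong signals plus 3 values above lambda.
toy_pvalues = [0.001] * 97 + [0.90, 0.95, 0.99]
pi0 = estimate_pi0(toy_pvalues, lam=0.85)
print(f"pi0 = {pi0:.3f}, 1 - pi0 = {1 - pi0:.3f}")
```

A value of `1-pi0` near 0.76, as in the log, would under this reading mean that roughly three quarters of the tested phenotypes are estimated to have a true cis-QTL signal.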