TensorQTL leverages general-purpose libraries and graphics processing units (GPUs) to achieve high computational efficiency at low cost. Using PyTorch or TensorFlow, it allows > 200-fold decreases in runtime and ~ 5–10-fold reductions in cost when running on GPUs relative to CPUs.
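The two figures quoted above are linked by simple arithmetic. As a minimal sketch, assume that cost is runtime multiplied by an hourly price (an assumed cost model, not something stated by TensorQTL); the cost reduction then equals the speedup divided by the GPU-to-CPU price ratio. The hypothetical helper below back-computes the price ratio implied by the quoted numbers.

```python
def implied_price_ratio(speedup: float, cost_reduction: float) -> float:
    """Return the GPU/CPU hourly price ratio implied by a speedup and a cost reduction.

    Assumed model: cost = runtime * hourly_price, hence
    cost_reduction = speedup / price_ratio.
    """
    if speedup <= 0 or cost_reduction <= 0:
        raise ValueError("speedup and cost_reduction must be positive")
    return speedup / cost_reduction


# Figures quoted in the text: 200-fold speedup, 5- to 10-fold cost reduction.
for reduction in (5, 10):
    ratio = implied_price_ratio(200, reduction)
    print(f"{reduction}-fold cheaper at 200-fold faster: "
          f"GPU time priced at {ratio:.0f}x CPU time")
```

Under this model, the quoted figures are consistent with GPU time being priced at roughly 20 to 40 times CPU time. Actual pricing depends on the platform.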
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=24g --gres=gpu:p100:1,lscratch:50 -c8
[user@cn4199 ~]$ module load TensorQTL
[+] Loading singularity  3.8.5-1
[+] Loading cuDNN/7.6.5/CUDA-10.2 libraries...
[+] Loading CUDA Toolkit  10.2.89  ...
[+] Loading TensorQTL  1.0.7  ...

Usage:
[user@cn4199 ~]$ tensorqtl -h
usage: tensorqtl [-h]
                 [--mode {cis,cis_nominal,cis_independent,cis_susie,trans,trans_susie}]
                 [--covariates COVARIATES]
                 [--paired_covariate PAIRED_COVARIATE]
                 [--permutations PERMUTATIONS] [--interaction INTERACTION]
                 [--cis_output CIS_OUTPUT]
                 [--phenotype_groups PHENOTYPE_GROUPS] [--window WINDOW]
                 [--pval_threshold PVAL_THRESHOLD]
                 [--maf_threshold MAF_THRESHOLD]
                 [--maf_threshold_interaction MAF_THRESHOLD_INTERACTION]
                 [--dosages] [--return_dense] [--return_r2] [--best_only]
                 [--output_text] [--batch_size BATCH_SIZE]
                 [--chunk_size CHUNK_SIZE] [--susie_loci SUSIE_LOCI]
                 [--disable_beta_approx] [--warn_monomorphic]
                 [--max_effects MAX_EFFECTS] [--fdr FDR]
                 [--qvalue_lambda QVALUE_LAMBDA] [--seed SEED]
                 [-o OUTPUT_DIR]
                 genotype_path phenotypes prefix

tensorQTL: GPU-based QTL mapper

positional arguments:
  genotype_path         Genotypes in PLINK format
  phenotypes            Phenotypes in BED format (.bed, .bed.gz,
                        .bed.parquet), or optionally for 'trans' mode,
                        parquet or tab-delimited.
  prefix                Prefix for output file names

options:
  -h, --help            show this help message and exit
  --mode {cis,cis_nominal,cis_independent,cis_susie,trans,trans_susie}
                        Mapping mode. Default: cis
  --covariates COVARIATES
                        Covariates file, tab-delimited, covariates x samples
  --paired_covariate PAIRED_COVARIATE
                        Single phenotype-specific covariate. Tab-delimited
                        file, phenotypes x samples
  --permutations PERMUTATIONS
                        Number of permutations. Default: 10000
  --interaction INTERACTION
                        Interaction term(s)
  --cis_output CIS_OUTPUT
                        Output from 'cis' mode with q-values. Required for
                        independent cis-QTL mapping.
  --phenotype_groups PHENOTYPE_GROUPS
                        Phenotype groups. Header-less TSV with two columns:
                        phenotype_id, group_id
  --window WINDOW       Cis-window size, in bases. Default: 1000000.
  --pval_threshold PVAL_THRESHOLD
                        Output only significant phenotype-variant pairs with
                        a p-value below threshold. Default: 1e-5 for
                        trans-QTL
  --maf_threshold MAF_THRESHOLD
                        Include only genotypes with minor allele frequency
                        >= maf_threshold. Default: 0
  --maf_threshold_interaction MAF_THRESHOLD_INTERACTION
                        MAF threshold for interactions, applied to lower and
                        upper half of samples
  --dosages             Load dosages instead of genotypes (only applies to
                        PLINK2 bgen input).
  --return_dense        Return dense output for trans-QTL.
  --return_r2           Return r2 (only for sparse trans-QTL output)
  --best_only           Only write lead association for each phenotype
                        (interaction mode only)
  --output_text         Write output in txt.gz format instead of parquet
                        (trans-QTL mode only)
  --batch_size BATCH_SIZE
                        GPU batch size (trans-QTLs only). Reduce this if
                        encountering OOM errors.
  --chunk_size CHUNK_SIZE
                        For cis-QTL mapping, load genotypes into CPU memory
                        in chunks of chunk_size variants, or by chromosome
                        if chunk_size is 'chr'.
  --susie_loci SUSIE_LOCI
                        Table (parquet or tsv) with loci to fine-map
                        (phenotype_id, chr, pos) with mode 'trans_susie'.
  --disable_beta_approx
                        Disable Beta-distribution approximation of empirical
                        p-values (not recommended).
  --warn_monomorphic    Warn if monomorphic variants are found.
  --max_effects MAX_EFFECTS
                        Maximum number of non-zero effects in the SuSiE
                        regression model.
  --fdr FDR             FDR for cis-QTLs
  --qvalue_lambda QVALUE_LAMBDA
                        lambda parameter for pi0est in qvalue.
  --seed SEED           Seed for permutations.
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Output directory

Running the test example:
[user@cn4199 ~]$ git clone https://github.com/broadinstitute/tensorqtl
[user@cn4199 ~]$ cd tensorqtl/example
[user@cn4199 ~]$ module load jupyter
[user@cn4199 ~]$ jupyter nbconvert --to script tensorqtl_examples.ipynb
[user@cn4199 ~]$ python-tqtl tensorqtl_examples.py &
[user@cn4199 ~]$ nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  | 00000000:46:00.0 Off |                    0 |
| N/A   36C    P0             133W / 400W |   2492MiB / 81920MiB |     25%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1733667      C   /usr/bin/python3                           2478MiB |
+---------------------------------------------------------------------------------------+
torch: 2.1.2+cu121 (CUDA 12.1), device: cuda
pandas 2.1.4
cis-QTL mapping: nominal associations for all variant-phenotype pairs
  * 445 samples
  * 301 phenotypes
  * 26 covariates
  * 367759 variants
  * cis-window: ±1,000,000
  * checking phenotypes: 301/301
  * Computing associations
    Mapping chromosome chr18
    processing phenotype 301/301
    time elapsed: 0.02 min
    * writing output
done.
cis-QTL mapping: empirical p-values for phenotypes
  * 445 samples
  * 301 phenotypes
  * 26 covariates
  * 367759 variants
  * cis-window: ±1,000,000
  * using seed 123456
  * checking phenotypes: 301/301
  * computing permutations
    processing phenotype 301/301
  Time elapsed: 0.19 min
done.
Computing q-values
  * Number of phenotypes tested: 301
  * Correlation between Beta-approximated and empirical p-values: 1.0000
  * Calculating q-values with lambda = 0.850
  * Proportion of significant phenotypes (1-pi0): 0.76
  * QTL phenotypes @ FDR 0.05: 205
  * min p-value threshold @ FDR 0.05: 0.135284
trans-QTL mapping
  * 445 samples
  * 19836 phenotypes
  * 26 covariates
  * 367759 variants
    processing batch 37/37
    elapsed time: 0.01 min
  * 210838 variants passed MAF >= 0.05 filtering
done.
[user@cn4199 ~]$ exit
salloc.exe: Relinquishing job allocation 59748321
[user@biowulf ~]$
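The q-value step in the log reports a lambda value and a "1-pi0" proportion, and the --qvalue_lambda option sets that lambda. As a rough illustration of what such a quantity means, the sketch below implements Storey's fixed-lambda estimate of pi0 (the proportion of true null hypotheses), which is the calculation the R qvalue package's pi0est performs for a single lambda value. Whether TensorQTL's internals match this exactly is an assumption; the function name and the toy p-values are hypothetical and are not taken from the example data.

```python
def estimate_pi0(p_values, lam=0.85):
    """Storey's fixed-lambda estimate of the proportion of true nulls (pi0).

    Illustrative sketch only; not TensorQTL's own code.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    if not p_values:
        raise ValueError("p_values must not be empty")
    # Null p-values are uniform on [0, 1], so the fraction at or above lambda,
    # rescaled by 1 / (1 - lambda), estimates the overall null proportion.
    frac_above = sum(p >= lam for p in p_values) / len(p_values)
    # Cap at 1 because a proportion cannot exceed 1.
    return min(frac_above / (1.0 - lam), 1.0)


# Toy p-values, made up for illustration.
toy_p = [0.001] * 15 + [0.2, 0.4, 0.6, 0.8, 0.9]
pi0 = estimate_pi0(toy_p, lam=0.85)
print(f"pi0 = {pi0:.3f}, 1 - pi0 = {1 - pi0:.3f}")
```

A larger lambda uses only the p-values closest to 1, which are the most likely to come from true nulls, at the price of a noisier estimate because fewer p-values contribute.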