High-Performance Computing at the NIH
GCTA on Biowulf & Helix

Description

GCTA (Genome-wide Complex Trait Analysis) was originally designed to estimate the proportion of phenotypic variance explained by genome- or chromosome-wide SNPs for complex traits (the GREML method), and has subsequently been extended for many other analyses to better understand the genetic architecture of complex traits.

There may be multiple versions of GCTA available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail GCTA 

To select a module use

module load GCTA/[version]

where [version] is the version of choice.

GCTA is a multithreaded application. Make sure the number of CPUs you request matches the number of threads GCTA runs, which is set with the --thread-num option.
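
As a minimal sketch of matching threads to CPUs, the snippet below reads the CPU count from SLURM_CPUS_PER_TASK (which Slurm sets when --cpus-per-task is given) and passes it to GCTA's --thread-num option. The helper names gcta_threads and gcta_cmd are hypothetical, and the command is only printed here, not run.

```shell
#!/bin/bash
# Sketch: derive GCTA's thread count from the Slurm CPU allocation.
# gcta_threads and gcta_cmd are hypothetical helper names.

# Print the allocated CPU count, falling back to 1 when the variable
# is unset (for example, outside a Slurm job).
gcta_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Build (but do not run) a GRM command for the given PLINK file prefix.
gcta_cmd() {
    local prefix="$1"
    echo "gcta --bfile ${prefix} --make-grm --thread-num $(gcta_threads) --out ${prefix}"
}

gcta_cmd test
```

Inside a batch script, the printed line would be replaced by the gcta call itself, and the job submitted with a --cpus-per-task value of the desired thread count.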

Environment variables set

References

Documentation

Batch job on Biowulf

Create a batch script similar to the following example:

#!/bin/bash
# this file is GCTA.batch
module load GCTA
gcta --bfile test --make-grm --out test
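
The script above assumes that the PLINK binary fileset behind --bfile test (test.bed, test.bim and test.fam) is present in the working directory. A pre-flight check along the following lines could catch a missing input before the job is queued. This is only a sketch: check_bfile is a hypothetical helper, not part of GCTA, and the gcta command is printed rather than run.

```shell
#!/bin/bash
# Sketch: confirm the three PLINK files behind --bfile exist and are non-empty.
# check_bfile is a hypothetical helper, not part of GCTA.
check_bfile() {
    local prefix="$1" ext missing=0
    for ext in bed bim fam; do
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Only print the go-ahead when every input is present.
if check_bfile test; then
    echo "inputs OK: gcta --bfile test --make-grm --out test"
else
    echo "inputs incomplete, not running gcta"
fi
```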

Submit to the queue with sbatch:

biowulf$ sbatch --cpus-per-task=1 --mem-per-cpu=2g GCTA.batch

Swarm of jobs on Biowulf

Create a swarm command file similar to the following example:

# this file is GCTA.swarm
gcta --bfile test2 --make-grm --out test2
gcta --bfile test3 --make-grm --out test3
gcta --bfile test4 --make-grm --out test4
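
For more than a handful of datasets, a command file like the one above can be generated rather than typed by hand. The following sketch writes one gcta line per PLINK prefix. The prefix list is taken from the example above, and the loop is an illustration rather than a Biowulf-provided tool.

```shell
#!/bin/bash
# Sketch: write one gcta command per PLINK prefix into GCTA.swarm.
# The prefixes are the ones used in the example swarm file.
prefixes=(test2 test3 test4)

{
    echo "# this file is GCTA.swarm"
    for p in "${prefixes[@]}"; do
        echo "gcta --bfile ${p} --make-grm --out ${p}"
    done
} > GCTA.swarm

# Show the generated file for inspection before submitting it.
cat GCTA.swarm
```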

Submit to the queue with swarm:

biowulf$ swarm -f GCTA.swarm --module GCTA -g 2

Interactive job on Biowulf

Allocate an interactive session with sinteractive and run GCTA as described above:

biowulf$ sinteractive --mem=4g --cpus-per-task=2
node$ module load GCTA
node$ cp $GCTAHOME/test.* .
node$ gcta --bfile test --make-grm --out test
node$ exit
biowulf$