High-Performance Computing at the NIH
GiniClust

GiniClust is a clustering method, implemented in Python and R, for detecting rare cell types in large-scale single-cell gene expression data.

GiniClust can be applied to datasets from different platforms, such as multiplex qPCR, traditional single-cell RNA-seq, and newer UMI-based single-cell RNA-seq methods (e.g. inDrops and Drop-seq).

GiniClust was created and is maintained by the GC Yuan Lab at Harvard University and the Dana-Farber Cancer Institute. It comes with a graphical user interface for convenience.

There are multiple versions of GiniClust available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail GiniClust

To select a module, type

module load GiniClust/[ver]

where [ver] is the version of choice.

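Environment variables set by the module include $GINICLUSTHOME, which the examples below use to locate the GiniClust scripts and sample data.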

Interactive job

GiniClust can be run directly from the command line:

[node]$ module load GiniClust
[node]$ ln -s $GINICLUSTHOME/sample_data/Data_GBM.csv .
[node]$ Rscript $GINICLUSTHOME/GiniClust_Main.R -f Data_GBM.csv -t RNA-seq -o GBM_results
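The command above fails with a fairly opaque error if the module has not been loaded or the input file is missing. The following is a minimal sketch of a guard around it, not part of GiniClust itself: `run_giniclust` is a hypothetical helper name, and the Rscript call simply mirrors the one shown above.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with GiniClust): check the environment
# and the input file before calling GiniClust_Main.R.
run_giniclust() {
    local input="$1" output="$2"

    # GINICLUSTHOME is set by 'module load GiniClust'.
    if [ -z "${GINICLUSTHOME:-}" ]; then
        echo "GINICLUSTHOME is not set; run 'module load GiniClust' first" >&2
        return 1
    fi

    # Fail early if the input file does not exist.
    if [ ! -f "$input" ]; then
        echo "input file not found: $input" >&2
        return 2
    fi

    # Same invocation as in the example above, with quoted arguments.
    Rscript "$GINICLUSTHOME/GiniClust_Main.R" -f "$input" -t RNA-seq -o "$output"
}
```

After `module load GiniClust`, the sample run would then be `run_giniclust Data_GBM.csv GBM_results`.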

GiniClust can also be run via an X11 GUI (see https://hpc.nih.gov/docs/connect.html for details about creating a graphical session on HPC systems):

[node]$ GiniClust.py

Batch job

Create a batch script (e.g. GiniClust.sh) that runs GiniClust on your input file (my_input_data.csv in this example). For example:

#!/bin/bash
module load GiniClust
Rscript $GINICLUSTHOME/GiniClust_Main.R -f my_input_data.csv -t RNA-seq -o GBM_results

Submit this job using the Slurm sbatch command.

[biowulf]$ sbatch --cpus-per-task=1 GiniClust.sh
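The batch script above hard-codes both the input file and the value passed to -o. If you want to reuse one script for several datasets, a possible variant is sketched below. It assumes the input file is passed as the first argument to the script; `results_name_for` and the "<name>_results" naming scheme are illustrative choices, not GiniClust conventions.

```shell
#!/bin/bash
# Sketch of a parameterized GiniClust.sh (assumption: the input CSV is given
# as the first positional argument).

# Derive the value passed to -o from the input file name,
# e.g. data/my_input_data.csv -> my_input_data_results
results_name_for() {
    local base
    base="$(basename "$1" .csv)"
    echo "${base}_results"
}

# Run GiniClust only when an input file was supplied.
if [ "$#" -ge 1 ]; then
    module load GiniClust
    Rscript "$GINICLUSTHOME/GiniClust_Main.R" -f "$1" -t RNA-seq -o "$(results_name_for "$1")"
fi
```

Arguments placed after the script name on the sbatch command line are passed to the script, so such a variant could be submitted as `sbatch --cpus-per-task=1 GiniClust.sh my_input_data.csv`.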
Documentation