Biowulf High Performance Computing at the NIH
vireoSNP on Biowulf

Vireo (Variational inference for reconstructing ensemble origins) is a Bayesian method for demultiplexing pooled scRNA-seq data without a genotype reference. It is primarily designed to assign cells to donors by modelling expressed alleles, and it supports a range of donor genotype settings, from entirely missing, to partially missing, to fully observed. As a general method for clustering cells by allelic ratio (equivalent to genotyping), Vireo also applies beyond donor demultiplexing, for example to reconstructing somatic clones; see vireoSNP_clones.ipynb for an example on mitochondrial mutations.

Documentation

https://vireosnp.readthedocs.io/en/latest/

Important Notes
Submitting an interactive job

Allocate an interactive session and run the interactive job there.

[biowulf]$ sinteractive --mem=5g
salloc.exe: Granted job allocation 789523
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0135 are ready for job

[cn0135]$ cd /data/$USER/

[cn0135]$ module load vireosnp

[cn0135]$ cp -r /usr/local/apps/vireo/examples .

[cn0135]$ cd examples

[cn0135]$ bash demo.sh

[cn0135]$ exit
salloc.exe: Job allocation 789523 has been revoked.
[biowulf]$
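
Once a run has finished, a quick per-donor tally can help confirm that the result looks plausible. The sketch below assumes that the output directory contains a tab-separated donor_ids.tsv file with a header row and the assigned donor in the second column; this layout is an assumption here and should be checked against your own output and the Vireo documentation. The function name count_donors is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count cells per assigned donor.
# Assumes a tab-separated file with one header line and the
# donor label in column 2 (an assumption, verify on your output).
count_donors() {
    local ids_file="$1"
    awk -F'\t' 'NR > 1 { n[$2]++ } END { for (d in n) print d "\t" n[d] }' "$ids_file" | sort
}

# Example use (OUT_DIR is whatever was passed to vireo -o):
#   count_donors "$OUT_DIR/donor_ids.tsv"
```

A large share of cells in a single category may be worth a second look before downstream analysis.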

Submitting a single batch job

1. Create a script file (myscript) similar to the one below

#!/bin/bash
# myscript
set -e

module load vireosnp || exit 1
cd /data/$USER/test/
# CELL_DIR and OUT_DIR must be set in the environment (or defined here) before vireo runs
vireo -c "$CELL_DIR" -N 4 -o "$OUT_DIR" --randSeed 2

2. Submit the script on biowulf:

[biowulf]$ sbatch --mem=5g myscript
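
Because the batch script relies on $CELL_DIR and $OUT_DIR, a job can fail only after it has waited in the queue if either variable is unset. A minimal sketch of a pre-flight check follows; check_inputs is a hypothetical helper, not part of vireo or the vireosnp module, and it only tests that both values are non-empty and that the input path exists.

```shell
#!/bin/bash
# Hypothetical pre-flight check; not part of vireo or the vireosnp module.
check_inputs() {
    local cell_dir="$1" out_dir="$2"
    # Both values must be provided.
    if [ -z "$cell_dir" ] || [ -z "$out_dir" ]; then
        echo "CELL_DIR and OUT_DIR must both be set" >&2
        return 1
    fi
    # The input path must exist (file or directory).
    if [ ! -e "$cell_dir" ]; then
        echo "input not found: $cell_dir" >&2
        return 1
    fi
    # Create the output directory if it is missing.
    mkdir -p "$out_dir"
}

# Example use inside the batch script, before calling vireo:
#   check_inputs "$CELL_DIR" "$OUT_DIR" || exit 1
```

Placing the call before the vireo line makes the job stop immediately with a readable message instead of an error from deeper in the run.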

Using Swarm

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g., /data/$USER/cmdfile).

cd /data/$USER/dir1; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
cd /data/$USER/dir2; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
cd /data/$USER/dir3; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
...
cd /data/$USER/dir20; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...

Submit the swarm job:

$ swarm -f cmdfile --module vireosnp -g 5
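
When the directories follow a regular naming pattern, the command file can be generated with a loop rather than typed by hand. The sketch below is one way to do this under stated assumptions: the BASE, NDIRS, and CMDFILE variables and the cellSNP and vireo_out subdirectory names are hypothetical placeholders, and only the vireo options already shown above are used.

```shell
#!/bin/bash
# Sketch: write one vireo command per directory into a swarm command file.
# BASE, NDIRS, CMDFILE and the cellSNP / vireo_out names are hypothetical.
BASE="${BASE:-/data/$USER}"
NDIRS="${NDIRS:-20}"
CMDFILE="${CMDFILE:-cmdfile}"

: > "$CMDFILE"   # create or truncate the command file
for i in $(seq 1 "$NDIRS"); do
    dir="$BASE/dir$i"
    # Each line is one independent swarm command.
    echo "cd $dir; vireo -c $dir/cellSNP -N 4 -o $dir/vireo_out --randSeed 2" >> "$CMDFILE"
done
echo "wrote $NDIRS commands to $CMDFILE"
```

The resulting file can then be passed to swarm with the same command as above.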

For more information on running swarm, see swarm.html.