vireoSNP on Biowulf

Vireo (Variational inference for reconstructing ensemble origins) is a Bayesian method for demultiplexing pooled scRNA-seq data without a genotype reference. It is primarily designed to assign cells to donors by modelling expressed alleles, and it supports a range of donor genotype settings, from entirely missing, to partially missing, to fully observed. As a general method for clustering cells by allelic ratio (equivalent to genotyping), Vireo also applies to settings beyond donor demultiplexing, including reconstruction of somatic clones; see vireoSNP_clones.ipynb for an example on mitochondrial mutations.


Submitting an interactive job

Allocate an interactive session and run the interactive job there.
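
A session of this kind might look like the sketch below. sinteractive is the Biowulf command for requesting an interactive node. The 5 GB memory request is an assumption carried over from the batch example on this page, the [node] prompt is a stand-in for whatever compute node is allocated, and CELL_DIR and OUT_DIR are placeholders that must be set to your own paths first.

```text
[biowulf]$ sinteractive --mem=5g
[node]$ module load vireosnp
[node]$ vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2
[node]$ exit
```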

Submitting a single batch job

1. Create a script file (myscript) similar to the one below

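#!/bin/bash
# myscript
# CELL_DIR (cell genotype data) and OUT_DIR (results directory)
# must be set in the environment before this script runs.
set -e

module load vireosnp || exit 1
cd "/data/$USER/test/"
# Demultiplex into 4 donors (-N 4) with a fixed random seed
vireo -c "$CELL_DIR" -N 4 -o "$OUT_DIR" --randSeed 2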

2. Submit the script on Biowulf:

[biowulf]$ sbatch --mem=5g myscript

Using Swarm

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. /data/$USER/cmdfile), with one command line per job:

cd /data/$USER/dir1; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
cd /data/$USER/dir2; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
cd /data/$USER/dir3; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
cd /data/$USER/dir20; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2 ...
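
For many directories, the command file can be generated rather than typed by hand. The following is a minimal sketch of one way to do that, not part of the vireo or swarm tooling: the function name make_cmdfile and the file name cmdfile.example are illustrative, and, like the lines above, it assumes CELL_DIR and OUT_DIR are defined in the environment where the swarm jobs run.

```shell
#!/bin/bash
set -euo pipefail

# make_cmdfile FILE DIR... : write one swarm command line per working directory.
# The format string is single-quoted, so $CELL_DIR and $OUT_DIR are written
# literally and are expanded only when each swarm job runs.
make_cmdfile() {
    local cmdfile="$1"
    shift
    : > "$cmdfile"   # create or truncate the command file
    local dir
    for dir in "$@"; do
        printf 'cd %s; vireo -c $CELL_DIR -N 4 -o $OUT_DIR --randSeed 2\n' "$dir" >> "$cmdfile"
    done
}

# Single quotes keep $USER literal too, matching the hand-written file.
make_cmdfile cmdfile.example '/data/$USER/dir1' '/data/$USER/dir2' '/data/$USER/dir3'
cat cmdfile.example
```

The generated file can be passed to swarm with -f in the same way as a hand-written one. Directory names containing spaces would need extra quoting, which this sketch does not handle.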

Submit the swarm job:

$ swarm -f cmdfile --module vireosnp -g 5

