split-pipe: high sensitivity single cell RNA sequencing with split pool barcoding
Split-pool combinatorial barcoding makes it possible to scale projects to hundreds of samples and millions of cells, overcoming limitations of previous droplet-based technologies. Spipe (split-pipe) is the data processing pipeline for a combinatorial barcoding method for single cell RNA sequencing (scRNA-seq) with dramatically improved sensitivity.
References:
- Vuong Tran, Efthymia Papalexi, Sarah Schroeder, Grace Kim, Ajay Sapre, Joey Pangallo,
  Alex Sova, Peter Matulich, Lauren Kenyon, Zeynep Sayar, Ryan Koehler, Daniel Diaz,
  Archita Gadkari, Kamy Howitz, Maria Nigos, Charles M. Roco, and Alexander B. Rosenberg.
  High sensitivity single cell RNA sequencing with split pool barcoding.
  bioRxiv preprint, doi: https://doi.org/10.1101/2022.08.27.505512
Documentation
Important Notes
- Module Name: spipe (see the modules page for more information)
- Unusual environment variables set
  - SPIPE_HOME: installation directory
  - SPIPE_BIN: executable directory
  - SPIPE_DATA: sample data directory
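Once the module is loaded, the three variables listed above can be checked from the shell. The following is a minimal sketch; the helper name `check_spipe_env` is hypothetical and is not part of spipe or the module system.

```shell
# Hypothetical helper (not part of spipe): report whether the variables
# that the spipe module is documented to set are present in the environment.
check_spipe_env() {
    _missing=0
    for _var in SPIPE_HOME SPIPE_BIN SPIPE_DATA; do
        # Look up the value of the variable whose name is stored in $_var.
        eval "_val=\${$_var-}"
        if [ -n "$_val" ]; then
            printf '%s=%s\n' "$_var" "$_val"
        else
            printf '%s is not set (is the spipe module loaded?)\n' "$_var" >&2
            _missing=1
        fi
    done
    # Exit status 0 if all three are set, 1 otherwise.
    return "$_missing"
}

# Typical use inside an interactive session:
#   module load spipe
#   check_spipe_env && ls "$SPIPE_DATA"
```

Returning a non-zero status when a variable is missing lets the check be chained with `&&` in front of commands that depend on those paths.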
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
[user@cn3335 ~]$ module load spipe
[+] Loading singularity 4.0.1 on cn3335
[+] Loading spipe 1.3.1
[user@cn3335 ~]$ split-pipe -h
usage: split-pipe [-h] [-m MODE] [-c CHEMISTRY] [--kit KIT] [-p PARFILE]
                  [--run_name RUN_NAME] [--fq1 FQ1] [--fq2 FQ2]
                  [--output_dir OUTPUT_DIR] [--genome_dir GENOME_DIR]
                  [--parent_dir PARENT_DIR] [--targeted_list TARGETED_LIST]
                  [--sample SAMPLE_NAME WELLS] [--samp_list SAMP_LIST]
                  [--samp_sltab SAMP_SLTAB] [--genome_name [GENOME_NAME ...]]
                  [--genes [GENES ...]] [--fasta [FASTA ...]]
                  [--gfasta GENOME_NAME FASTA]
                  [--sublibraries [SUBLIBRARIES ...]]
                  [--sublib_list SUBLIB_LIST] [--sublib_pref SUBLIB_PREF]
                  [--sublib_suff SUBLIB_SUFF] [--tscp_use TSCP_USE]
                  [--tscp_min TSCP_MIN] [--tscp_max TSCP_MAX]
                  [--cell_use CELL_USE] [--cell_est CELL_EST]
                  [--cell_xf CELL_XF] [--cell_min CELL_MIN]
                  [--cell_max CELL_MAX] [--cell_list CELL_LIST] [--crispr]
                  [--crsp_guides CRSP_GUIDES]
                  [--crsp_read_thresh CRSP_READ_THRESH]
                  [--crsp_tscp_thresh CRSP_TSCP_THRESH] [--crsp_max_mm]
                  [--crsp_use_star] [--immune_check] [--bcr_analysis]
                  [--tcr_analysis] [--immune_genome IMMUNE_GENOME]
                  [--use_imgt_db] [--immune_read_thresh IMMUNE_READ_THRESH]
                  [--no_save_anndata] [--kit_list] [--chem_list] [--bc_list]
                  [--bc_round_set ROUND NAME] [--rseed RSEED]
                  [--nthreads NTHREADS] [--no_keep_going] [--reuse]
                  [--keep_temps] [--one_step] [--until_step UNTIL_STEP]
                  [--clear_runproc] [--start_timeout START_TIMEOUT]
                  [--kit_score_skip] [--dryrun] [-e] [-V]

SplitPipe data processing pipeline v1.3.1

options:
  -h, --help
        show this help message and exit
  -m MODE, --mode MODE
        Mode dictates process(s) to run; REQUIRED; See -explain
  -c CHEMISTRY, --chemistry CHEMISTRY
        Set chemistry version for data
  --kit KIT
        Set kit and kit-specific parameters
  -p PARFILE, --parfile PARFILE
        Parameter file
  --run_name RUN_NAME
        Name for run / sublibrary
  --fq1 FQ1
        fastq1 - mRNA reads
  --fq2 FQ2
        fastq2 - Reads containing barcodes and polyN
  --output_dir OUTPUT_DIR
        Output dir (created as needed)
  --genome_dir GENOME_DIR
        Path containing reference genome
  --parent_dir PARENT_DIR
        Path to output_dir to use as parent; Use existing cell calls, etc
  --targeted_list TARGETED_LIST
        Target enrichment gene list; csv file withand/or
  --sample SAMPLE_NAME WELLS
        Add sample_name and well range; See '--explain' for format
  --samp_list SAMP_LIST
        Get samples from file with per line; See --explain
  --samp_sltab SAMP_SLTAB
        Get samples from SampleLoadingTable excel file
  --genome_name [GENOME_NAME ...]
        mkref name(s) of genome(s)/species
  --genes [GENES ...]
        mkref GTF file(s) with gene annotations
  --fasta [FASTA ...]
        mkref fasta file(s) for genome(s)
  --gfasta GENOME_NAME FASTA
        mkref genome-fasta file; Gene info taken from fasta header line
  --sublibraries [SUBLIBRARIES ...]
        Paths to output directories of each sublibrary (Combine mode only)
  --sublib_list SUBLIB_LIST
        File listing sublibrary paths, one per line (Combine mode only)
  --sublib_pref SUBLIB_PREF
        Sublibrary list paths prefix (Combine mode only)
  --sublib_suff SUBLIB_SUFF
        Sublibrary list paths suffix (Combine mode only)
  --tscp_use TSCP_USE
        Transcript cutoff to use (Not calculated; given)
  --tscp_min TSCP_MIN
        Transcript cutoff min value (Limit for filtered DGE)
  --tscp_max TSCP_MAX
        Transcript cutoff max value (Limit for filtered DGE)
  --cell_use CELL_USE
        Cell count to use (+/- X-fold for filtered DGE)
  --cell_est CELL_EST
        Cell count estimate (Min to X-fold for filtered DGE)
  --cell_xf CELL_XF
        Cell estimate X-fold factor (For filtered DGE)
  --cell_min CELL_MIN
        Cell count minimum (Lower limit for filtered DGE)
  --cell_max CELL_MAX
        Cell count maximum (Upper limit for filtered DGE)
  --cell_list CELL_LIST
        List of cell barcodes to use (No tscp cutoff calculated)
  --crispr
        Run CRISPR analysis, mapping guide RNA to parent dir cells
  --crsp_guides CRSP_GUIDES
        File with crispr guides and 5' 3' context sequences; csv
  --crsp_read_thresh CRSP_READ_THRESH
        Minimum reads to qualify crispr transcripts
  --crsp_tscp_thresh CRSP_TSCP_THRESH
        Minimum transcripts to qualify crispr guide
  --crsp_max_mm
        Maximum mismatch (Hamming distance) for crispr guide mapping
  --crsp_use_star
        Use STAR for crispr guide aligment
  --immune_check
        Check immune database (BCR / TCR) installation status
  --bcr_analysis
        Run BCR analysis
  --tcr_analysis
        Run TCR analysis
  --immune_genome IMMUNE_GENOME
        Immune (BCR / TCR) genome name
  --use_imgt_db
        Use IMGT databse for immune (BCR / TCR) analysis
  --immune_read_thresh IMMUNE_READ_THRESH
        Minimum reads to qualify immune transcripts
  --no_save_anndata
        Do not save anndata h5ad files
  --kit_list
        List valid kit names and chemistry versions
  --chem_list
        List valid kit names and chemistry versions
  --bc_list
        List installed barcode sets
  --bc_round_set ROUND NAME
        Specify barcode use as , where N = 1,2,3
  --rseed RSEED
        Random number seed
  --nthreads NTHREADS
        Number of threads to use (default = number of CPUs)
  --no_keep_going
        Turn off keep_going (Stop on any error)
  --reuse
        Reuse existing files if found (vs generate fresh)
  --keep_temps
        Keep temp files
  --one_step
        Do one step (mode) of pipeline, then stop
  --until_step UNTIL_STEP
        Run until this step (mode) then stop
  --clear_runproc
        Clear run process def files (Only); Need output_dir
  --start_timeout START_TIMEOUT
        Time for statup env check steps; Zero to skip
  --kit_score_skip
        Ignore kit score failure; WARNING Use with caution!
  --dryrun
        Dry run; Only setup and report status; Saves run process file
  -e, --explain
        Explain assumptions and usage details
  -V, --version
        show program's version number and exit

[user@cn3335 ~]$ exit
[user@biowulf]$
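Beyond `-h`, a processing run combines several of the options listed in the help output. The sketch below only assembles and prints such a command line; it does not run split-pipe. The helper name `build_spipe_cmd` is hypothetical, and the values `all` (for --mode) and `v2` (for --chemistry) are assumptions for illustration: the help text defers valid modes to --explain and valid chemistry versions to --chem_list, so check those first. The sketch also assumes the session runs under Slurm and that SLURM_CPUS_PER_TASK holds the allocated CPU count, falling back to 2 otherwise. --dryrun is included so that the printed command, if executed, would only perform setup and report status.

```shell
# Hypothetical helper (not part of spipe): print a split-pipe command line
# built from options that appear in the help output. Nothing is executed.
#   $1 = fastq1 (mRNA reads)      $2 = fastq2 (barcodes and polyN)
#   $3 = reference genome dir     $4 = output dir
build_spipe_cmd() {
    # "all" and "v2" are assumed values; confirm with --explain and --chem_list.
    # --nthreads is set explicitly because the documented default is the
    # number of CPUs, which may differ from the allocation (assumption).
    printf 'split-pipe --mode all --chemistry v2 --fq1 %s --fq2 %s --genome_dir %s --output_dir %s --nthreads %s --dryrun\n' \
        "$1" "$2" "$3" "$4" "${SLURM_CPUS_PER_TASK:-2}"
}

# Example (placeholder paths):
#   build_spipe_cmd reads_R1.fastq.gz reads_R2.fastq.gz /path/to/genome /path/to/output
```

Printing the command before running it makes it easy to review the arguments, and dropping --dryrun from the printed line gives the command for the actual run.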