Split-pool combinatorial barcoding makes it possible to scale projects to hundreds of samples and millions of cells, overcoming limitations of earlier droplet-based technologies. Spipe (split-pipe) implements the combinatorial barcoding method for single-cell RNA sequencing (scRNA-seq) with dramatically improved sensitivity.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
[user@cn3335 ~]$ module load spipe
[+] Loading singularity 4.0.1 on cn3335
[+] Loading spipe 1.3.1
[user@cn3335 ~]$ split-pipe -h
usage: split-pipe [-h] [-m MODE] [-c CHEMISTRY] [--kit KIT] [-p PARFILE]
                  [--run_name RUN_NAME] [--fq1 FQ1] [--fq2 FQ2]
                  [--output_dir OUTPUT_DIR] [--genome_dir GENOME_DIR]
                  [--parent_dir PARENT_DIR] [--targeted_list TARGETED_LIST]
                  [--sample SAMPLE_NAME WELLS] [--samp_list SAMP_LIST]
                  [--samp_sltab SAMP_SLTAB]
                  [--genome_name [GENOME_NAME ...]] [--genes [GENES ...]]
                  [--fasta [FASTA ...]] [--gfasta GENOME_NAME FASTA]
                  [--sublibraries [SUBLIBRARIES ...]]
                  [--sublib_list SUBLIB_LIST] [--sublib_pref SUBLIB_PREF]
                  [--sublib_suff SUBLIB_SUFF] [--tscp_use TSCP_USE]
                  [--tscp_min TSCP_MIN] [--tscp_max TSCP_MAX]
                  [--cell_use CELL_USE] [--cell_est CELL_EST]
                  [--cell_xf CELL_XF] [--cell_min CELL_MIN]
                  [--cell_max CELL_MAX] [--cell_list CELL_LIST] [--crispr]
                  [--crsp_guides CRSP_GUIDES]
                  [--crsp_read_thresh CRSP_READ_THRESH]
                  [--crsp_tscp_thresh CRSP_TSCP_THRESH] [--crsp_max_mm]
                  [--crsp_use_star] [--immune_check] [--bcr_analysis]
                  [--tcr_analysis] [--immune_genome IMMUNE_GENOME]
                  [--use_imgt_db] [--immune_read_thresh IMMUNE_READ_THRESH]
                  [--no_save_anndata] [--kit_list] [--chem_list] [--bc_list]
                  [--bc_round_set ROUND NAME] [--rseed RSEED]
                  [--nthreads NTHREADS] [--no_keep_going] [--reuse]
                  [--keep_temps] [--one_step] [--until_step UNTIL_STEP]
                  [--clear_runproc] [--start_timeout START_TIMEOUT]
                  [--kit_score_skip] [--dryrun] [-e] [-V]

SplitPipe data processing pipeline v1.3.1

options:
  -h, --help
        show this help message and exit
  -m MODE, --mode MODE
        Mode dictates process(s) to run; REQUIRED; See -explain
  -c CHEMISTRY, --chemistry CHEMISTRY
        Set chemistry version for data
  --kit KIT
        Set kit and kit-specific parameters
  -p PARFILE, --parfile PARFILE
        Parameter file
  --run_name RUN_NAME
        Name for run / sublibrary
  --fq1 FQ1
        fastq1 - mRNA reads
  --fq2 FQ2
        fastq2 - Reads containing barcodes and polyN
  --output_dir OUTPUT_DIR
        Output dir (created as needed)
  --genome_dir GENOME_DIR
        Path containing reference genome
  --parent_dir PARENT_DIR
        Path to output_dir to use as parent; Use existing cell calls, etc
  --targeted_list TARGETED_LIST
        Target enrichment gene list; csv file withand/or
  --sample SAMPLE_NAME WELLS
        Add sample_name and well range; See '--explain' for format
  --samp_list SAMP_LIST
        Get samples from file with per line; See --explain
  --samp_sltab SAMP_SLTAB
        Get samples from SampleLoadingTable excel file
  --genome_name [GENOME_NAME ...]
        mkref name(s) of genome(s)/species
  --genes [GENES ...]
        mkref GTF file(s) with gene annotations
  --fasta [FASTA ...]
        mkref fasta file(s) for genome(s)
  --gfasta GENOME_NAME FASTA
        mkref genome-fasta file; Gene info taken from fasta header line
  --sublibraries [SUBLIBRARIES ...]
        Paths to output directories of each sublibrary (Combine mode only)
  --sublib_list SUBLIB_LIST
        File listing sublibrary paths, one per line (Combine mode only)
  --sublib_pref SUBLIB_PREF
        Sublibrary list paths prefix (Combine mode only)
  --sublib_suff SUBLIB_SUFF
        Sublibrary list paths suffix (Combine mode only)
  --tscp_use TSCP_USE
        Transcript cutoff to use (Not calculated; given)
  --tscp_min TSCP_MIN
        Transcript cutoff min value (Limit for filtered DGE)
  --tscp_max TSCP_MAX
        Transcript cutoff max value (Limit for filtered DGE)
  --cell_use CELL_USE
        Cell count to use (+/- X-fold for filtered DGE)
  --cell_est CELL_EST
        Cell count estimate (Min to X-fold for filtered DGE)
  --cell_xf CELL_XF
        Cell estimate X-fold factor (For filtered DGE)
  --cell_min CELL_MIN
        Cell count minimum (Lower limit for filtered DGE)
  --cell_max CELL_MAX
        Cell count maximum (Upper limit for filtered DGE)
  --cell_list CELL_LIST
        List of cell barcodes to use (No tscp cutoff calculated)
  --crispr
        Run CRISPR analysis, mapping guide RNA to parent dir cells
  --crsp_guides CRSP_GUIDES
        File with crispr guides and 5' 3' context sequences; csv
  --crsp_read_thresh CRSP_READ_THRESH
        Minimum reads to qualify crispr transcripts
  --crsp_tscp_thresh CRSP_TSCP_THRESH
        Minimum transcripts to qualify crispr guide
  --crsp_max_mm
        Maximum mismatch (Hamming distance) for crispr guide mapping
  --crsp_use_star
        Use STAR for crispr guide aligment
  --immune_check
        Check immune database (BCR / TCR) installation status
  --bcr_analysis
        Run BCR analysis
  --tcr_analysis
        Run TCR analysis
  --immune_genome IMMUNE_GENOME
        Immune (BCR / TCR) genome name
  --use_imgt_db
        Use IMGT databse for immune (BCR / TCR) analysis
  --immune_read_thresh IMMUNE_READ_THRESH
        Minimum reads to qualify immune transcripts
  --no_save_anndata
        Do not save anndata h5ad files
  --kit_list
        List valid kit names and chemistry versions
  --chem_list
        List valid kit names and chemistry versions
  --bc_list
        List installed barcode sets
  --bc_round_set ROUND NAME
        Specify barcode use as , where N = 1,2,3
  --rseed RSEED
        Random number seed
  --nthreads NTHREADS
        Number of threads to use (default = number of CPUs)
  --no_keep_going
        Turn off keep_going (Stop on any error)
  --reuse
        Reuse existing files if found (vs generate fresh)
  --keep_temps
        Keep temp files
  --one_step
        Do one step (mode) of pipeline, then stop
  --until_step UNTIL_STEP
        Run until this step (mode) then stop
  --clear_runproc
        Clear run process def files (Only); Need output_dir
  --start_timeout START_TIMEOUT
        Time for statup env check steps; Zero to skip
  --kit_score_skip
        Ignore kit score failure; WARNING Use with caution!
  --dryrun
        Dry run; Only setup and report status; Saves run process file
  -e, --explain
        Explain assumptions and usage details
  -V, --version
        show program's version number and exit
[user@cn3335 ~]$ exit
[user@biowulf]$
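The help text lists the options but not a complete invocation, so the following is a minimal sketch of how a single-sublibrary run might be assembled from them. Only the option names come from the help output. Every value is an assumption made for illustration: the file and directory names, the sample name, the well range format, the mode name "all", and the chemistry name "v2" are placeholders that should be checked against split-pipe --explain, --kit_list, and --chem_list. The helper run_cmd and the PRINT_ONLY switch are hypothetical and are not part of spipe.

```shell
#!/bin/bash
# Sketch only: all values below are placeholders, not confirmed defaults.

FQ1="sublib1_R1.fastq.gz"     # hypothetical mRNA reads file (--fq1)
FQ2="sublib1_R2.fastq.gz"     # hypothetical barcode/polyN reads file (--fq2)
GENOME_DIR="genome_ref"       # hypothetical reference directory (--genome_dir)
OUTPUT_DIR="spipe_out"        # hypothetical output directory (--output_dir)
SAMPLE_NAME="sample1"         # hypothetical sample name (--sample)
WELLS="A1-A2"                 # assumed well range format; see split-pipe --explain
MODE="all"                    # assumed mode name; see split-pipe --explain
CHEMISTRY="v2"                # assumed chemistry name; see split-pipe --chem_list

# Use the CPUs granted by Slurm if the variable is set, otherwise 2.
NTHREADS="${SLURM_CPUS_PER_TASK:-2}"

# 1 = only print the assembled command; 0 = print it and then run it.
PRINT_ONLY="${PRINT_ONLY:-1}"

# Hypothetical helper: echo the command, and execute it unless PRINT_ONLY=1.
run_cmd() {
    echo "+ $*"
    if [ "$PRINT_ONLY" != "1" ]; then
        "$@"
    fi
}

# --dryrun asks split-pipe to set up and report status only;
# remove it once the reported setup looks right.
run_cmd split-pipe \
    --mode "$MODE" \
    --chemistry "$CHEMISTRY" \
    --fq1 "$FQ1" \
    --fq2 "$FQ2" \
    --genome_dir "$GENOME_DIR" \
    --output_dir "$OUTPUT_DIR" \
    --sample "$SAMPLE_NAME" "$WELLS" \
    --nthreads "$NTHREADS" \
    --dryrun
```

With PRINT_ONLY left at 1 the script only echoes the command line, which makes it easy to review before anything is executed. Setting PRINT_ONLY=0 inside an interactive session, after module load spipe, would execute the same command.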