Biowulf High Performance Computing at the NIH
Spatial Transcriptomics (ST) pipeline: processing and analyzing the raw files generated with the Spatial Transcriptomics method.

The Spatial Transcriptomics (ST) pipeline contains the tools and scripts needed to process and analyze raw FASTQ files generated with the Spatial Transcriptomics technology, producing datasets for downstream analysis. The ST pipeline can also process single-cell RNA-seq data, as long as a file with barcodes identifying each cell is provided. The ST pipeline has been optimized for speed and robustness, and it is easy to use, with many parameters for adjusting its settings. It is fully parallel and has constant memory use.

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=32g --cpus-per-task=14
[user@cn2396 ~]$ module load st_pipeline
[+] Loading STAR  2.7.2a
[+] Loading samtools 1.9  ...
[+] Loading st_pipeline  1.7.6
[user@cn2396 ~]$ git clone https://github.com/SpatialTranscriptomicsResearch/st_pipeline
[user@cn2396 ~]$ st_pipeline/tests/adaptors_test.py 
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

[user@cn2396 ~]$ st_pipeline/tests/clustering_test.py 
.....
----------------------------------------------------------------------
Ran 5 tests in 0.007s

OK

[user@cn2396 ~]$ st_pipeline/tests/pipeline_run_test.py 
ST Pipeline Test Temporary directory /tmp/st_pipeline_test_tempwkhmybsm
ST Pipeline Test Temporary output /tmp/st_pipeline_test_outputy3afhq3h
ST Pipeline Test Log file /tmp/st_pipeline_test_log355dpnfg
ST Pipeline Test Downloading genome files...
ST Pipeline Test Creating genome index...
Oct 24 09:21:21 ..... started STAR run
Oct 24 09:21:21 ... starting to generate Genome files
WARNING: --genomeSAindexNbases 14 is too large for the genome size=58720256, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 11              Oct 24 09:21:22
Oct 24 09:21:22 ... starting to sort Suffix Array. This may take a long time...
Oct 24 09:21:23 ... sorting Suffix Array chunks and saving them to disk...
Oct 24 09:21:36 ... loading chunks from disk, packing SA...
Oct 24 09:21:37 ... finished generating suffix array
Oct 24 09:21:37 ... generating Suffix Array index
Oct 24 09:21:57 ... completed Suffix Array index
Oct 24 09:21:57 ... writing Genome to disk ...
Oct 24 09:21:57 ... writing Suffix Array to disk ...
Oct 24 09:21:58 ... writing SAindex to disk
Oct 24 09:21:59 ..... finished successfully
ST Pipeline Test Creating contaminant genome index...
Oct 24 09:21:59 ..... started STAR run
Oct 24 09:21:59 ... starting to generate Genome files
WARNING: --genomeSAindexNbases 14 is too large for the genome size=524288, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 8                 Oct 24 09:21:59
Oct 24 09:21:59 ... starting to sort Suffix Array. This may take a long time...
Oct 24 09:21:59 ... sorting Suffix Array chunks and saving them to disk...
Oct 24 09:21:59 ... loading chunks from disk, packing SA...
Oct 24 09:21:59 ... finished generating suffix array
Oct 24 09:21:59 ... generating Suffix Array index
Oct 24 09:22:02 ... completed Suffix Array index
Oct 24 09:22:02 ... writing Genome to disk ...
Oct 24 09:22:02 ... writing Suffix Array to disk ...
Oct 24 09:22:02 ... writing SAindex to disk
Oct 24 09:22:03 ..... finished successfully
/usr/local/apps/st_pipeline/1.7.6/lib/python3.6/site-packages/stpipeline/common/utils.py:91: DeprecationWarning: 'U' mode is deprecated
  return open(filename, atrib)
[bam_sort_core] merging from 0 files and 55 in-memory blocks...
input_reads_forward: 100000
input_reads_reverse: 100000
reads_after_trimming_forward: 66202
reads_after_trimming_reverse: 66202
reads_after_rRNA_trimming: 0
reads_after_mapping: 0
reads_after_annotation: 8771
reads_after_demultiplexing: 8447
reads_after_duplicates_removal: 8420
unique_events: 5900
genes_found: 642
duplicates_found: 351
pipeline_version: 0.6.1
mapper_tool: STAR 2.4.0
annotation_tool: HTSeq 0.6.1
demultiplex_tool: TAGGD 0.2.2
input_parameters:
max_genes_feature: 78
min_genes_feature: 1
max_reads_feature: 192.0
min_reads_feature: 1.0
max_reads_unique_event: 34.0
min_reads_unique_event: 0.0
avergage_gene_feature: 6.203995793901156
average_reads_feature: 8.853838065194532
.ST Pipeline Test Remove temporary output /tmp/st_pipeline_test_outputy3afhq3h
ST Pipeline Test Remove temporary directory /tmp/st_pipeline_test_tempwkhmybsm

----------------------------------------------------------------------
Ran 1 test in 186.432s

OK

End the interactive session:
[user@cn2396 ~]$ exit
[user@biowulf ~]$
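
The summary that pipeline_run_test.py prints in the sample session is a list of plain "key: value" lines, so read retention at each stage can be checked with standard tools. The following is a minimal sketch, not part of st_pipeline: the function name summarize_stats and the temporary file are hypothetical, and the counts are copied from the sample session above. It reports each count as a percentage of input_reads_forward.

```shell
#!/bin/bash
# Hypothetical temporary file holding summary lines copied from the
# pipeline_run_test.py output shown in the sample session.
stats_file=$(mktemp)
cat > "$stats_file" <<'EOF'
input_reads_forward: 100000
reads_after_trimming_forward: 66202
reads_after_annotation: 8771
reads_after_demultiplexing: 8447
reads_after_duplicates_removal: 8420
EOF

# summarize_stats FILE
# Print "name <TAB> count <TAB> percent" for every "key: value" line in FILE,
# where percent is relative to the input_reads_forward count.
# Assumes FILE contains an input_reads_forward line with a non-zero count.
summarize_stats() {
    awk -F': ' '
        $1 == "input_reads_forward" { total = $2 }
        { name[NR] = $1; count[NR] = $2 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s\t%d\t%.2f%%\n", name[i], count[i], 100 * count[i] / total
        }
    ' "$1"
}

summarize_stats "$stats_file"
rm -f "$stats_file"
```

In practice the test output could be captured with a redirect and the numeric lines filtered into a file before being passed to the function; only numeric "key: value" lines should be included, since the sketch treats every value as a count.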