CytoTRACE: predicting differentiation state of cells from single-cell RNA-sequencing data.

CytoTRACE (Cellular (Cyto) Trajectory Reconstruction Analysis using gene Counts and Expression) is a computational method that predicts the differentiation state of cells from single-cell RNA-sequencing data. CytoTRACE leverages a simple, yet robust, determinant of developmental potential—the number of detectably expressed genes per cell, or gene counts. We have validated CytoTRACE on ~150K single-cell transcriptomes spanning 315 cell phenotypes, 52 lineages, 14 tissue types, 9 scRNA-seq platforms, and 5 species.
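
As a rough illustration of the "gene counts" idea only, the sketch below counts, for each cell, how many genes have non-zero expression in a small genes-by-cells text matrix. The toy matrix, its values, and the names cellA, cellB, cellC are made up for this example. This is not CytoTRACE's implementation: the package goes on to derive a gene counts signature and smooth the values, as the run log further down shows.

```shell
# Toy genes-by-cells matrix with hypothetical values. The header row has one
# fewer field than the data rows, the layout that R's read.table() treats as
# a header line plus row names.
matrix=$(mktemp)
cat > "$matrix" <<'EOF'
cellA cellB cellC
Gene1 5 0 2
Gene2 0 0 1
Gene3 3 4 0
Gene4 0 1 7
EOF

# For each cell (column), count the genes with expression greater than zero.
gene_counts=$(awk '
NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; n = NF; next }
{ for (i = 2; i <= NF; i++) if ($i + 0 > 0) count[i - 1]++ }
END { for (i = 1; i <= n; i++) printf "%s\t%d\n", name[i], count[i] }
' "$matrix")

echo "$gene_counts"
rm -f "$matrix"
```

Under the method's premise, a cell with more detectably expressed genes (cellC here) would be treated as less differentiated than one with fewer.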

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

[user@biowulf]$ sinteractive --mem=20g --gres=lscratch:20 -c4
[user@cn0861 ~]$ module load cytotrace   
[+] Loading singularity  3.10.5  on cn2372
[+] Loading cytotrace  0.3.4
[user@cn0861 ~]$ mkdir -p /data/$USER/CytoTrace && cd /data/$USER/CytoTrace 
Download sample data:
[user@cn0861 ~]$ wget https://cytotrace.stanford.edu/dataset_marrow10x.txt 
[user@cn0861 ~]$ wget https://cytotrace.stanford.edu/dataset_marrowplate.txt
Run CytoTRACE on these data:
[user@cn0861 ~]$ R-ct 
> library(CytoTRACE) 
Welcome to the CytoTRACE R package, a tool for the unbiased prediction of differentiation states in scRNA-seq data. For more information about this method, please visit https://cytotrace.stanford.edu or email us at cytotrace@gmail.com.
> CytoTRACE(read.table("dataset_marrow10x.txt")) 
The number of cells in your dataset exceeds 3,000. CytoTRACE will now be run in fast mode (see documentation). You can multi-thread this run using the 'ncores' flag. To disable fast mode, please indicate 'enableFast = FALSE'.
CytoTRACE will be run on 3 sub-sample(s) of approximately 1142 cells each using 1 / 1 core(s)
Pre-processing data and generating similarity matrix...
Calculating gene counts signature...
Smoothing values with NNLS regression and diffusion...
Calculating genes associated with CytoTRACE...
...
$gcsGenes
         Eif5a           Ybx1           Ppia          Rps17          Cox8a
  8.930324e-01   8.805865e-01   8.792047e-01   8.780918e-01   8.739148e-01
         Snrpg          Rpl28           Rps2         Atp5g2           Rpsa
  8.717692e-01   8.711641e-01   8.658732e-01   8.639576e-01   8.627667e-01
...
Counts
X10X_P7_2_AAACCTGCAGTAACGG X10X_P7_2_AAACGGGAGGACGAAA
                      2301                       2627
X10X_P7_2_AAACGGGAGGTACTCT X10X_P7_2_AAACGGGAGGTGCTTT
                      3183                       1034
...
Gm106                            0.0000000                  0.0000000
Rpl7                             2.8272534                  2.2531575
Rdh10                            0.0000000                  0.0000000
                X10X_P7_3_TTTGTCAAGCGCTCCA X10X_P7_3_TTTGTCAAGGCAGTCA
Mrpl15                           0.2984039                  0.0000000
Lypla1                           0.2984039                  0.0000000
Tcea1                            0.0000000                  0.0000000
Atp6v1h                          0.0000000                  0.0000000
Rb1cc1                           0.5455397                  0.1230472
Pcmtd1                           0.0000000                  0.0000000
Rrs1                             0.5455397                  0.0000000
Adhfe1                           0.0000000                  0.0000000
Mybl1                            0.0000000                  0.0000000
...
Terf1                            0.0000000
Gm106                            0.0000000
Rpl7                             3.9047688
Rdh10                            0.0000000
 [ reached getOption("max.print") -- omitted 13488 rows ]

Warning message:
In CytoTRACE(read.table("dataset_marrow10x.txt")) :
  9 genes have zero expression in the matrix and were filtered
> iCytoTRACE(list(read.table("dataset_marrow10x.txt"), read.table("dataset_marrowplate.txt")))
Would you like to create a default Python environment for the reticulate package? (Yes/no/cancel) no
Found 13453 genes among all datasets
[[0.         0.65976072]
 [0.         0.        ]]
Processing datasets (0, 1)
Found 13453 genes among all datasets
[[0.         0.65976072]
 [0.         0.        ]]
Processing datasets (0, 1)
The number of cells in your integrated dataset is less than 10,000. Fast mode has been disabled.
CytoTRACE will be run on 1 sub-sample(s) of approximately 7869 cells each using 1 / 1 core(s)
Calculating genes associated with iCytoTRACE...
$exprMatrix
...
X10X_P7_2_GATGAAACACATTCGA -2.893816e-03
X10X_P7_2_GATGAAAGTGACAAAT  3.646599e-03
X10X_P7_2_GATGAAAGTGCACTTA -3.304935e-03
X10X_P7_2_GATGAAAGTTACGTCA  5.292558e-03
X10X_P7_2_GATGAGGAGCACCGCT  4.832055e-03
X10X_P7_2_GATGAGGAGGTGCACA  5.018071e-03
X10X_P7_2_GATGAGGCAGTCGATT -2.733117e-04
X10X_P7_2_GATGAGGCATGGTTGT  4.190494e-03
X10X_P7_2_GATGAGGGTTGATTGC -1.724375e-02
X10X_P7_2_GATGAGGTCAACACCA  5.849512e-03
X10X_P7_2_GATGAGGTCCTCCTAG -5.341377e-03
X10X_P7_2_GATGCTAAGTCACGCC -4.335768e-03
X10X_P7_2_GATGCTACATGGGAAC  8.643952e-03
X10X_P7_2_GATGCTAGTACCTACA  3.564311e-03
X10X_P7_2_GATTCAGTCACTCCTG  1.188439e-02
X10X_P7_2_GCAAACTAGATGAGAG -2.990178e-03
X10X_P7_2_GCAAACTAGCCTTGAT  1.439598e-02
X10X_P7_2_GCAAACTGTTCTGTTT -1.817505e-02
X10X_P7_2_GCAAACTTCGACAGCC -5.349718e-03
X10X_P7_2_GCAATCACAGTCGTGC  7.415567e-03
X10X_P7_2_GCAATCATCGGAGCAA -2.271764e-02
X10X_P7_2_GCAATCATCTAACCGA -1.381307e-02
X10X_P7_2_GCACATACATGGATGG -1.717161e-02
X10X_P7_2_GCACATATCTGAGGGA  9.387268e-05
X10X_P7_2_GCACTCTAGTGCCAGA  5.458449e-03
X10X_P7_2_GCACTCTCAATGGATA -2.407448e-03
X10X_P7_2_GCACTCTGTACTTGAC  4.371174e-03
X10X_P7_2_GCAGCCAAGGCAAAGA -1.165523e-03
X10X_P7_2_GCAGCCACAAGTCATC -8.224130e-03
X10X_P7_2_GCAGCCACATGATCCA -1.164582e-02
X10X_P7_2_GCAGCCAGTAGATTAG -3.261392e-04
X10X_P7_2_GCAGCCATCGGAGCAA -2.124615e-03
X10X_P7_2_GCAGCCATCGGCTACG  1.063609e-02
X10X_P7_2_GCAGTTAAGGAGTACC -6.749588e-03
X10X_P7_2_GCAGTTACAATAACGA  4.061926e-03
X10X_P7_2_GCAGTTATCTCAACTT -5.714046e-03
X10X_P7_2_GCATACACATGGGACA  4.263237e-03
X10X_P7_2_GCATACAGTGACTACT -1.923917e-02
X10X_P7_2_GCATACATCGTGACAT  5.930570e-03
X10X_P7_2_GCATGATAGAGACTTA  2.220203e-02
X10X_P7_2_GCATGATCAGTTAACC -3.296641e-03
X10X_P7_2_GCATGATTCCGCGTTT -1.341521e-02
X10X_P7_2_GCATGATTCTGAGGGA  8.092239e-03
X10X_P7_2_GCATGCGAGAAACCAT  1.087689e-02
X10X_P7_2_GCATGCGAGGTTCCTA  3.337084e-02
X10X_P7_2_GCATGCGCACGAGAGT -4.072700e-02
X10X_P7_2_GCATGCGCATGAAGTA -5.383050e-03
X10X_P7_2_GCATGTAAGGTGACCA -8.580112e-03
X10X_P7_2_GCATGTACAAAGGCGT -7.766561e-03
X10X_P7_2_GCATGTACAGTCTTCC -3.473503e-03
X10X_P7_2_GCATGTAGTCCAGTAT -1.260405e-02
X10X_P7_2_GCATGTATCACTTATC  2.053231e-02
X10X_P7_2_GCCAAATAGATCGGGT  5.880604e-03
X10X_P7_2_GCCTCTACACTCAGGC -2.391912e-03
X10X_P7_2_GCCTCTACATAAGACA  1.152437e-02
X10X_P7_2_GCGAGAATCTTCCTTC -1.009419e-03
X10X_P7_2_GCGCAACGTAAAGGAG -5.744237e-03
X10X_P7_2_GCGCAACGTATAAACG -6.318104e-03
X10X_P7_2_GCGCAACGTGTGGTTT  3.011639e-03
X10X_P7_2_GCGCAGTAGGCTACGA  5.207578e-03
X10X_P7_2_GCGCAGTAGTCATCCA  1.798909e-02
X10X_P7_2_GCGCAGTTCCCTAATT  3.193344e-03
X10X_P7_2_GCGCCAACAGCCAATT  2.698768e-02
X10X_P7_2_GCGCCAACAGGTCTCG -1.462139e-02
X10X_P7_2_GCGCCAACATCACAAC  1.835279e-04
 [ reached getOption("max.print") -- omitted 6870 rows ]

$filteredCells
character(0)
[user@cn0861 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226