DeCiFer is an algorithm that simultaneously selects mutation multiplicities and clusters somatic single-nucleotide variants (SNVs) by their corresponding descendant cell fractions (DCF), a statistic that quantifies the proportion of cells which acquired the SNV or whose ancestors acquired the SNV. DCF is related to the commonly used cancer cell fraction (CCF) but further accounts for SNVs which are lost due to deleterious somatic copy-number aberrations (CNAs), identifying clusters of SNVs which occur in the same phylogenetic branch of tumour evolution.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=24g --cpus-per-task=24 --gres=lscratch:20 [user@cn3335 ~]$module load decifer [+] Loading singularity 3.10.3 on cn0802 [+] Loading decifer 2.1.3 ...Set up references:
[user@biowulf]$ decifer -h
usage: decifer [-h] -p PURITYFILE [--betabinomial] [-i SNPFILE] [-s SEGFILE] [-v SENSITIVITY]
[-R RESTARTS_BB] [-x SKIP] [--ccf] [-k MINK] [-K MAXK] [-r RESTARTS] [-t MAXIT]
[-e ELBOW] [--binarysearch] [--record] [-j JOBS] [-o OUTPUT]
[--statetrees STATETREES] [--seed SEED] [--debug] [--printallk] [--conservativeCIs]
[--vafdevfilter VAFDEVFILTER] [--silhouette]
INPUT
DeCiFer.
positional arguments:
INPUT Input file in DeCiFer format.
optional arguments:
-h, --help show this help message and exit
-p PURITYFILE, --purityfile PURITYFILE
File with purity of each sample (TSV file in two columns`SAMPLE PURITY`)
--betabinomial Use betabinomial likelihood to cluster mutations (default: binomial)
-i SNPFILE, --snpfile SNPFILE
File with precisions for betabinomial fit (default: binomial likelihood)
-s SEGFILE, --segfile SEGFILE
File with precisions for betabinomial fit (default: binomial likelihood)
-v SENSITIVITY, --sensitivity SENSITIVITY
Sensitivity E to exclude SNPs with 0.5 - E <= BAF < 0.5, for estimating
betabinomial parameters (default: 0.1)
-R RESTARTS_BB, --restarts_bb RESTARTS_BB
Maximum size of brute-force search, when fitting betabinomial parameters
(default: 1e4)
-x SKIP, --skip SKIP Numbers to skip in the brute-force search, when fitting betabinomial
parameters (default: 10)
--ccf Run with CCF instead of DCF (default: False)
-k MINK, --mink MINK Minimum number of clusters, which must be at least 2 (default: 2)
-K MAXK, --maxk MAXK Maximum number of clusters (default: 12)
-r RESTARTS, --restarts RESTARTS
Number of restarts (default: 20)
-t MAXIT, --maxit MAXIT
Maximum number of iterations per restart (default: 200)
-e ELBOW, --elbow ELBOW
Elbow sensitivity, lower values increase sensitivity (default: 0.06)
--binarysearch Use binary-search model selection (default: False, iterative is used; use
binary search when considering large numbers of clusters
--record Record objectives (default: False)
-j JOBS, --jobs JOBS Number of parallele jobs to use (default: equal to number of available
processors)
-o OUTPUT, --output OUTPUT
Output prefix (default: ./decifer)
--statetrees STATETREES
Filename of state-trees file (default: use state_trees.txt in the package)
--seed SEED Random-generator seed (default: None)
--debug single-threaded mode for development/debugging
--printallk Print all results for each value of K explored by DeCiFer
--conservativeCIs Beta: compute CIs using DCF point values assigned to cluster instead of
cluster likelihood function
--vafdevfilter VAFDEVFILTER
Filter poorly fit SNVs with VAFs that are more than this number of
standard deviations away from the cluster center VAF (default 1.5)
--silhouette Beta: select the number of clusters using a silhouette score
...
Downloading mutation input file:
[user@cn3335 ~]$ mkdir -p data [user@cn3335 ~]$ curl -L 'https://raw.githubusercontent.com/raphael-group/decifer-data/main/input/prostate/mutations/A12.decifer.input.tsv' > data/mutations.tsvDownloading purity input file:
[user@cn3335 ~]$ curl -L 'https://raw.githubusercontent.com/raphael-group/decifer-data/main/input/prostate/purity/A12.purity.txt' > data/purity.tsvRunning DeCiFer:
[user@cn3335 ~]$ decifer data/mutations.tsv -p data/purity.tsv -k 5 -K 8 -r 20 --seed 17 -j 24
Arguments:
input : data/mutations.tsv
mink : 5
maxk : 8
maxit : 200
purity : data/purity.tsv
restarts : 20
elbow : 0.06
iterative : True
record : False
J : 128
output : ./decifer
ccf : False
betabinomial : False
snpfile : None
segfile : None
restarts_bb : 10000
threshold : 0.1
skip : 10
statetrees : /opt/conda/lib/python3.9/site-packages/decifer/state_trees.txt
debug : False
printallk : False
conservativeCIs : False
vafdevfilter : 1.5
silhouette : False
0 D
1 C
2 A
Using iterative model selection
Progress: |------------------------------| 1.2% Complete [[2022-Nov-22 16:37:38]Completed 2 for k=5
Progress: |------------------------------| 2.5% Complete [[2022-Nov-22 16:37:38]Completed 1 for k=5
Progress: |█-----------------------------| 3.8% Complete [[2022-Nov-22 16:37:40]Completed 5 for k=5
Progress: |█-----------------------------| 5.0% Complete [[2022-Nov-22 16:37:40]Completed 4 for k=5
Progress: |█-----------------------------| 6.2% Complete [[2022-Nov-22 16:37:40]Completed 3 for k=5
Progress: |██----------------------------| 7.5% Complete [[2022-Nov-22 16:37:40]Completed 0 for k=5
Progress: |██----------------------------| 8.8% Complete [[2022-Nov-22 16:37:48]Completed 1 for k=6
Progress: |███---------------------------| 10.0% Complete [[2022-Nov-22 16:37:59]Completed 0 for k=7
Progress: |███---------------------------| 11.2% Complete [[2022-Nov-22 16:37:59]Completed 4 for k=7
Progress: |███---------------------------| 12.5% Complete [[2022-Nov-22 16:37:59]Completed 1 for k=7
Progress: |████--------------------------| 13.8% Complete [[2022-Nov-22 16:38:04]Completed 3 for k=6
Progress: |████--------------------------| 15.0% Complete [[2022-Nov-22 16:38:04]Completed 0 for k=6
Progress: |████--------------------------| 16.2% Complete [[2022-Nov-22 16:38:05]Completed 4 for k=6
Progress: |█████-------------------------| 17.5% Complete [[2022-Nov-22 16:38:05]Completed 2 for k=6
Progress: |█████-------------------------| 18.8% Complete [[2022-Nov-22 16:38:08]Completed 6 for k=5
...
Progress: |████████████████████████████--| 95.0% Complete [[2022-Nov-22 16:41:16]Completed 17 for k=7
Progress: |████████████████████████████--| 96.2% Complete [[2022-Nov-22 16:41:19]Completed 11 for k=6
Progress: |█████████████████████████████-| 97.5% Complete [[2022-Nov-22 16:41:33]Completed 9 for k=8
Progress: |█████████████████████████████-| 98.8% Complete [[2022-Nov-22 16:41:40]Completed 18 for k=6
Progress: |██████████████████████████████| 100.0% Complete [[2022-Nov-22 16:42:01]Completed 19 for k=8
[Iterations: 17]]
[user@cn3335 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$