Biowulf High Performance Computing at the NIH
Scramble: a tool for mobile element insertion detection

Scramble is a mobile element insertion (MEI) detection tool. It identifies clusters of soft-clipped reads in a BAM file, builds consensus sequences, aligns them to representative L1Ta, AluYa5, and SVA-E sequences, and outputs MEI calls.
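
To make "soft-clipped reads" concrete, here is a minimal sketch that lists reads whose CIGAR string (SAM column 6) contains an S operation. It is only an illustration of the concept, not Scramble's own cluster identification step; the function names and the toy SAM records are hypothetical.

```shell
# Toy illustration only: this is NOT how Scramble identifies clusters.
# It shows what "soft-clipped" means at the level of individual SAM records.

# Read SAM text on stdin; print read name, contig, position, and CIGAR
# for every non-header record whose CIGAR contains a soft clip (S).
list_soft_clipped() {
    awk -F'\t' '!/^@/ && $6 ~ /[0-9]+S/ { print $1 "\t" $3 "\t" $4 "\t" $6 }'
}

# Three made-up SAM records (hypothetical names, positions, and sequences).
toy_sam() {
    printf '@SQ\tSN:1\tLN:1000\n'
    printf 'readA\t0\t1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n'
    printf 'readB\t0\t1\t105\t60\t6M4S\t*\t0\t0\tACGTACTTTT\tIIIIIIIIII\n'
    printf 'readC\t16\t1\t111\t60\t3S7M\t*\t0\t0\tAAAGTACGTA\tIIIIIIIIII\n'
}

# Only readB and readC carry a soft clip, so only they are listed.
toy_sam | list_soft_clipped
```

On a real BAM file the same filter could be fed from samtools view, assuming samtools is available in your session (for example, samtools view sample.bam | list_soft_clipped).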


Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive --mem=4g
salloc.exe: Pending job allocation 56730292
salloc.exe: job 56730292 queued and waiting for resources
salloc.exe: job 56730292 has been allocated resources
salloc.exe: Granted job allocation 56730292
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3148 are ready for job
[user@cn3148 ~]$ module load Scramble 
[+] Loading singularity  3.5.3  on cn3148
[+] Loading Scramble 0.0.20190211.82c78b9  ...
Copy sample data into your current directory:
[user@cn3148 ~]$ cp $SCRAMBLE_DATA/* .
You can run Scramble in two different ways.

1) Running the scramble command without arguments opens a new shell inside a Singularity container:
[user@cn3148 ~]$ scramble
Your environment will change and you will have access to a different set of commands and executables. For example, you can run the command:
Singularity>  Rscript --vanilla /app/cluster_analysis/bin/SCRAMble.R \
                        --out-name ${PWD}/sample.mei.txt \
                        --cluster-file ${PWD}/sample_cluster.txt \
                        --install-dir /app/cluster_analysis/bin \
                        --mei-refs /app/cluster_analysis/resources/MEI_consensus_seqs.fa \
                        --ref /app/validation/test.fa \
                        --eval-dels \
                        --eval-meis

Running sample: /gpfs/gsfs8/users/apptest2/SCRAMBLE_TEST/sample_cluster.txt 
Running scramble with options:
blastRef : /app/validation/test.fa 
clusterFile : /gpfs/gsfs8/users/apptest2/SCRAMBLE_TEST/sample_cluster.txt 
deletions : TRUE 
indelScore : 80 
INSTALL.DIR : /app/cluster_analysis/bin 
mei.refs : /app/cluster_analysis/resources/MEI_consensus_seqs.fa 
meis : TRUE 
meiScore : 50 
minDelLen : 50 
nCluster : 5 
no.vcf : TRUE 
outFilePrefix : /gpfs/gsfs8/users/apptest2/SCRAMBLE_TEST/sample.mei.txt 
pctAlign : 90 
polyAdist : 100 
polyAFrac : 0.75 
Useful Functions Loaded
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges
Loading required package: XVector

Attaching package: ‘Biostrings’

The following object is masked from ‘package:base’:

    strsplit

Done analyzing l1 
Done analyzing sva 
Done analyzing alu 
Done analyzing l1 
Done analyzing sva 
Done analyzing alu 
Sample had 0 MEI(s)
Done analyzing MEIs
532 clusters out of 927 were removed due to simple sequence
Number of alignments meeting thresholds: 395 
Number of best alignments: 0 
[1] "Two-End-Deletions: Working on contig 1"
[1] "Two-End-Deletions: Working on contig 10"
[1] "Two-End-Deletions: Working on contig 11"
[1] "Two-End-Deletions: Working on contig 12"
[1] "Two-End-Deletions: Working on contig 13"
[1] "Two-End-Deletions: Working on contig 14"
[1] "Two-End-Deletions: Working on contig 15"
[1] "Two-End-Deletions: Working on contig 16"
[1] "Two-End-Deletions: Working on contig 17"
[1] "Two-End-Deletions: Working on contig 18"
[1] "Two-End-Deletions: Working on contig 19"
[1] "Two-End-Deletions: Working on contig 2"
[1] "Two-End-Deletions: Working on contig 20"
[1] "Two-End-Deletions: Working on contig 22"
[1] "Two-End-Deletions: Working on contig 3"
[1] "Two-End-Deletions: Working on contig 4"
[1] "Two-End-Deletions: Working on contig 5"
[1] "Two-End-Deletions: Working on contig 6"
[1] "Two-End-Deletions: Working on contig 7"
[1] "Two-End-Deletions: Working on contig 8"
[1] "Two-End-Deletions: Working on contig GL000220.1"
[1] "Two-End-Deletions: Working on contig hs37d5"
[1] "Two-End-Deletions: Working on contig X"
[1] "finished one end dels"
Sample had 0 deletions
Done analyzing deletions
Warning message:
In predict.BLAST(bl, seq, BLAST_args = "-dust no") :
  BLAST did not return a match!
Please remember to exit this new shell when you are finished with your session.
Singularity> exit
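
The sample output above reports the result of each stage on a single line ("Sample had 0 MEI(s)", "Sample had 0 deletions"). If you save the log to a file, a small helper can pull those counts out. The sketch below assumes only those two line formats; the function name summarize_scramble_log and the log file are hypothetical, and other Scramble versions may word these lines differently.

```shell
# Print the MEI and deletion counts found in a saved Scramble log.
# Relies only on the "Sample had N MEI(s)" and "Sample had N deletions"
# lines shown in the sample session; prints NA if a line is absent.
summarize_scramble_log() {
    awk '
        /^Sample had [0-9]+ MEI/      { mei = $3 }
        /^Sample had [0-9]+ deletion/ { del = $3 }
        END {
            printf "MEIs=%s deletions=%s\n", (mei == "" ? "NA" : mei), (del == "" ? "NA" : del)
        }
    ' "$1"
}

# Toy log assembled from lines of the sample session (temporary file).
log=$(mktemp)
cat > "$log" <<'EOF'
Done analyzing alu
Sample had 0 MEI(s)
Done analyzing MEIs
[1] "finished one end dels"
Sample had 0 deletions
Done analyzing deletions
EOF

summarize_scramble_log "$log"
```

One way to capture such a log is to append 2>&1 | tee scramble.log to the Rscript command, on the assumption that R writes these messages to standard output or standard error.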

2) Alternatively, you can run Rscript or other commands directly from the Linux shell, without opening a container shell, by prefixing each command with scramble. For example:
[user@cn3148 ~]$ scramble Rscript --vanilla /app/cluster_analysis/bin/SCRAMble.R \
                        --out-name ${PWD}/sample.mei.txt \
                        --cluster-file ${PWD}/sample_cluster.txt \
                        --install-dir /app/cluster_analysis/bin \
                        --mei-refs /app/cluster_analysis/resources/MEI_consensus_seqs.fa \
                        --ref /app/validation/test.fa \
                        --eval-dels \
                        --eval-meis
Exit the interactive session:
[user@cn3148 ~]$ exit
salloc.exe: Relinquishing job allocation 56730292
[user@biowulf ~]$
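
If you run the direct (option 2) invocation repeatedly, it can be convenient to wrap it in a small shell function that checks the input first. The following is a sketch under stated assumptions: run_scramble and DRY_RUN are hypothetical names, and the paths inside the container are simply the ones used in the sample session, including the test reference /app/validation/test.fa.

```shell
# run_scramble CLUSTER_FILE OUT_NAME
# Hypothetical helper: verifies that the cluster file exists and is not
# empty, then runs the Rscript command from the sample session through the
# scramble wrapper. Set DRY_RUN=1 to print the command without running it.
run_scramble() {
    cluster_file=$1
    out_name=$2
    if [ ! -s "$cluster_file" ]; then
        echo "error: cluster file '$cluster_file' is missing or empty" >&2
        return 1
    fi
    # Rebuild the positional parameters as the full command line.
    set -- scramble Rscript --vanilla /app/cluster_analysis/bin/SCRAMble.R \
        --out-name "$out_name" \
        --cluster-file "$cluster_file" \
        --install-dir /app/cluster_analysis/bin \
        --mei-refs /app/cluster_analysis/resources/MEI_consensus_seqs.fa \
        --ref /app/validation/test.fa \
        --eval-dels \
        --eval-meis
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"      # show what would be executed
    else
        "$@"           # execute the command
    fi
}

# Example (inside an interactive session, after "module load Scramble"):
#   run_scramble "${PWD}/sample_cluster.txt" "${PWD}/sample.mei.txt"
```

With DRY_RUN=1 set, the function only prints the command line, which is a cheap way to check the paths before spending allocation time.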