ImReP is a novel computational method for rapid and accurate profiling of the adaptive immune repertoire from regular RNA-Seq data. It efficiently extracts TCR- and BCR-derived reads from RNA-Seq data and accurately assembles the complementarity determining regions 3 (CDR3s), the most variable regions of B and T cell receptors, which determine their antigen specificity.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g
[user@cn3316 ~]$ module load imrep
[+] Loading singularity 4.0.1 on cn3316
[+] Loading imrep 0.8
[user@cn3316 ~]$ imrep.py -h
usage: python2 imrep.py [-h] [--fastq] [--bam] [--chrFormat2] [--hg38]
[-a ALLREADS] [--digGold] [-s SPECIES] [-o OVERLAPLEN]
[--noOverlapStep] [--extendedOutput] [-c CHAINS]
[--noCast] [-f FILTERTHRESHOLD]
[--minOverlap1 MINOVERLAP1]
[--minOverlap2 MINOVERLAP2] [--misMatch1 MISMATCH1]
[--misMatch2 MISMATCH2]
reads_file output_clones
optional arguments:
-h, --help show this help message and exit
Necessary Inputs:
reads_file unmapped reads in .fasta (default) or .fastq (if flag
--fastq is set) or .bam (if --bam or --digGold is set)
output_clones output file with CDR3 clonotypes
Optional Inputs:
--fastq a binary flag used to indicate that the input file
with unmapped reads is in fastq format
--bam a binary flag used to indicate that the input file is
a BAM file mapped and unmapped reads
...
Advanced Inputs:
--minOverlap1 MINOVERLAP1
minimal overlap between the reads and A) the left part
of V gene (before C amino acid) and B) the right part
of J gene (after W for IGH and F for all other
chains), default is 4
--minOverlap2 MINOVERLAP2
minimal overlap between the reads and A) the right
part of V gene (after C amino acid) and B) the left
part of J gene (before W for IGH and F for all other
chains), default is 1
...
[user@cn3316 ~]$ suffix_tree.py
SIMPLE TEST
leaf: "mississippi$" : "mississippi$"
leaf: "ississippi$" : "ssippi$"
leaf: "issippi$" : "ppi$"
leaf: "ippi$" : "ppi$"
leaf: "i$" : "$"
leaf: "ssissippi$" : "ssippi$"
leaf: "ssippi$" : "ppi$"
leaf: "sissippi$" : "ssippi$"
leaf: "sippi$" : "ppi$"
leaf: "ppi$" : "pi$"
leaf: "pi$" : "i$"
leaf: "$" : "$"
inner: "ssi"
inner: "i"
inner: "si"
inner: "i"
inner: "s"
inner: "p"
inner: ""
done.
GENERALISED TEST
----------------------------------------------------------------------
0 [0:1] x |x|abxa
0 [3:4] x xab|x|a
1 [3:4] x bab|x|ba
----------------------------------------------------------------------
0 [1:4] abx x|abx|a
1 [1:4] abx b|abx|ba
----------------------------------------------------------------------
0 [1:2] a x|a|bxa
1 [1:2] a b|a|bxba
0 [4:5] a xabx|a|
1 [5:6] a babxb|a|
----------------------------------------------------------------------
0 [2:4] bx xa|bx|a
1 [2:4] bx ba|bx|ba
----------------------------------------------------------------------
0 [2:3] b xa|b|xa
1 [2:3] b ba|b|xba
1 [0:1] b |b|abxba
1 [4:5] b babx|b|a
======================================================================
----------------------------------------------------------------------
0 [1:4] abx x|abx|a
1 [1:4] abx b|abx|ba
----------------------------------------------------------------------
0 [2:4] bx xa|bx|a
1 [2:4] bx ba|bx|ba
======================================================================
done.
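Each row of the GENERALISED TEST output appears to list a string index, a [start:end] slice, the matched substring, and the full string with the match set off by `|`. The following is a minimal sketch that reproduces that row format with a brute-force search, not a suffix tree. The two input strings are inferred from the listing, and the function names `find_occurrences` and `format_hit` are hypothetical, not part of `suffix_tree.py`.

```python
def find_occurrences(pattern, strings):
    """Yield (string_index, start, end) for every occurrence of pattern.

    Brute-force scan with str.find; a suffix tree answers the same
    question faster, but the reported positions are the same.
    """
    for idx, text in enumerate(strings):
        start = text.find(pattern)
        while start != -1:
            yield idx, start, start + len(pattern)
            # Step one character forward so overlapping matches are found.
            start = text.find(pattern, start + 1)


def format_hit(idx, start, end, text):
    """Format one hit as: index [start:end] match left|match|right."""
    match = text[start:end]
    return f"{idx} [{start}:{end}] {match} {text[:start]}|{match}|{text[end:]}"


if __name__ == "__main__":
    # Input strings inferred from the sample output (assumption).
    strings = ["xabxa", "babxba"]
    for idx, start, end in find_occurrences("abx", strings):
        print(format_hit(idx, start, end, strings[idx]))
```

For the pattern `abx` this prints the two rows shown in the listing for that substring. Rows are emitted in left-to-right order per string, which may differ from the order `suffix_tree.py` uses.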
End the interactive session:
[user@cn3316 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
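The help text shows only the option list, so the following is a minimal sketch of how the two required arguments and the input-format flags might be combined for an actual run. It uses only `--fastq`, `--bam`, `reads_file`, and `output_clones` from the usage message. The helper name `build_imrep_command` and the file names are hypothetical.

```python
import shlex


def build_imrep_command(reads_file, output_clones, input_format="fasta"):
    """Return the argument list for one imrep.py run.

    input_format selects the flag from the help text: FASTA is the
    default (no flag), FASTQ needs --fastq, BAM needs --bam.
    """
    format_flags = {"fasta": [], "fastq": ["--fastq"], "bam": ["--bam"]}
    if input_format not in format_flags:
        raise ValueError(f"unsupported input format: {input_format!r}")
    return ["imrep.py", *format_flags[input_format], reads_file, output_clones]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_imrep_command("sample.fastq", "sample.cdr3", "fastq")
    # Print the command so it can be checked before it is run.
    print(shlex.join(cmd))
```

The printed command can be pasted into an interactive session after `module load imrep`, or the list can be passed to `subprocess.run(cmd, check=True)` from a script running inside a job allocation.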