Tandem repeat copy number variations (TR-CNVs) are variations in the number of repeat units in tandem repeats (TRs). Using long reads, TRsv detects TR-CNVs within TR regions and, separately, structural variations (SVs) and short indels outside TR regions. The TR regions are defined by either a pre-built human TR BED file or a user-prepared TR BED file. TRsv improves TR-CNV detection by assembling fragmented insertion and deletion alignments within a TR region of a single long-read alignment and searching for the TR units in the inserted sequence. In addition, TRsv improves SV detection from non-HiFi reads (PacBio CLR and Nanopore long reads) through machine learning-based filtering.
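To make the idea of assembling fragmented indels concrete, here is a minimal sketch, not TRsv's actual implementation. It assumes a hypothetical tab-separated input (`indels.tsv`: read ID, reference position, `I` or `D`, length) and hypothetical region coordinates and repeat unit length. It sums the insertion and deletion lengths that fall inside one TR region for each read and divides the net change by the unit length to get an approximate copy number change. The step in which TRsv searches the inserted sequence for TR units is not shown.

```shell
# Hypothetical input: read_id <TAB> ref_pos <TAB> type (I or D) <TAB> length
printf 'read1\t1005\tI\t12\nread1\t1040\tI\t18\nread1\t1100\tD\t6\nread1\t5000\tI\t50\nread2\t1010\tD\t24\n' > indels.tsv

tr_start=1000   # hypothetical TR region start
tr_end=1200     # hypothetical TR region end (exclusive)
unit_len=6      # hypothetical repeat unit length in bp

# Per read: net length change inside the region, then change in unit copies.
net_changes=$(awk -F'\t' -v s="$tr_start" -v e="$tr_end" -v u="$unit_len" '
  $2 >= s && $2 < e {                      # keep only indels inside the TR region
    net[$1] += ($3 == "I" ? $4 : -$4)      # insertions add bases, deletions remove them
  }
  END {
    for (r in net) printf "%s\t%d\t%.1f\n", r, net[r], net[r] / u
  }' indels.tsv | sort)

echo "$net_changes"
```

In this toy input, the three fragmented indels of `read1` inside the region combine into a single net change, while its insertion at position 5000 lies outside the region and is ignored.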
Allocate an interactive session and run the program.
Sample session (user input follows each `$` prompt):
[user@biowulf]$ sinteractive --gres lscratch:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load trsv
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ cp -r $TRSV_HOME/test_data/* .
[user@cn3144 ~]$ TRsv call -c config_hifi.txt
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. trsv.sh). For example:
#!/bin/bash
set -e
module load trsv
cp -r $TRSV_HOME/test_data/* .
TRsv call -c config_hifi.txt
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] trsv.sh
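The two steps above can be sketched as follows. This is an illustrative sketch: it writes the example batch script with a quoted heredoc, checks it for syntax errors with `bash -n`, and prints a submit command instead of running it. The values `8` and `32g` are hypothetical placeholders, not recommended settings for TRsv.

```shell
# Write the example batch script; the quoted 'EOF' keeps $TRSV_HOME unexpanded.
cat > trsv.sh <<'EOF'
#!/bin/bash
set -e
module load trsv
cp -r $TRSV_HOME/test_data/* .
TRsv call -c config_hifi.txt
EOF

# Parse-only check: reports syntax errors without executing the script.
bash -n trsv.sh

# Hypothetical resource values; adjust them to your data.
cpus=8
mem=32g
submit_cmd="sbatch --cpus-per-task=$cpus --mem=$mem trsv.sh"

# Print the command for review; run it yourself on the cluster to submit.
echo "$submit_cmd"
```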
Create a swarmfile (e.g. trsv.swarm). For example:
TRsv call -c config1.txt
TRsv call -c config2.txt
TRsv call -c config3.txt
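When there are many config files, a swarmfile like this one can be generated with a loop rather than typed by hand. A minimal sketch, assuming the config files follow the hypothetical naming pattern `config<number>.txt` in the current directory (the `touch` line only creates empty stand-ins for the demonstration):

```shell
# Hypothetical setup: empty stand-ins for real config files.
touch config1.txt config2.txt config3.txt

# Write one swarm line per config file; the glob expands in sorted order.
for cfg in config[0-9]*.txt; do
  printf 'TRsv call -c %s\n' "$cfg"
done > trsv.swarm

cat trsv.swarm
```

The pattern `config[0-9]*.txt` is chosen so that files such as `config_hifi.txt` are not picked up by accident.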
Submit this job using the swarm command.
swarm -f trsv.swarm [-g #] [-t #] --module trsv
where
| Option | Description |
| --- | --- |
| -g # | Number of gigabytes of memory required for each process (1 line in the swarm command file). |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
| --module trsv | Loads the TRsv module for each subjob in the swarm. |
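Because `-g` and `-t` apply to each line of the swarmfile, the nominal total request grows with the number of lines. The sketch below illustrates that arithmetic under stated assumptions: the values `32` and `8` are hypothetical, lines starting with `#` are treated as comments, and any bundling or packing that swarm may apply is ignored. It prints the swarm command rather than running it.

```shell
# Sample swarmfile with three commands, one per line.
printf 'TRsv call -c config%s.txt\n' 1 2 3 > trsv.swarm

gb_per_proc=32    # hypothetical value for -g
cpus_per_proc=8   # hypothetical value for -t

# Count lines that are neither blank nor comments: one process each.
n_jobs=$(grep -c -v -e '^[[:space:]]*$' -e '^#' trsv.swarm)

# Nominal totals if every process ran at the same time.
total_gb=$((n_jobs * gb_per_proc))
total_cpus=$((n_jobs * cpus_per_proc))

echo "processes: $n_jobs, total memory: ${total_gb} GB, total CPUs: $total_cpus"
echo "swarm -f trsv.swarm -g $gb_per_proc -t $cpus_per_proc --module trsv"
```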