ROADIES is a fully automated pipeline to infer species trees starting from raw genome assemblies. It incorporates a unique strategy of randomly sampling segments of the input genomes to generate gene trees. This eliminates the need for predefining a set of loci, limiting the analyses to a fixed number of genes, and performing the cumbersome gene annotation and/or whole genome alignment steps. ROADIES also eliminates the need to infer orthology by leveraging existing discordance-aware methods that allow multicopy genes.
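To make the random-sampling idea concrete, here is a minimal shell sketch that draws fixed-length windows uniformly at random from a sequence of known length. It is an illustration only, not the sampling executable shipped with ROADIES: the function name sample_regions, its arguments, and the default seed are hypothetical, and any filtering or resampling the real pipeline performs is not modeled.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; this is NOT the ROADIES sampling executable.
# Usage: sample_regions SEQ_LEN REGION_LEN COUNT [SEED]
# Prints COUNT lines of "start<TAB>end" (1-based, inclusive). Each line is a
# window of REGION_LEN bases placed uniformly at random within SEQ_LEN bases.
sample_regions() {
    local seq_len=$1 region_len=$2 count=$3 seed=${4:-42}
    awk -v n="$seq_len" -v len="$region_len" -v k="$count" -v seed="$seed" '
        BEGIN {
            if (len > n) {
                print "region is longer than the sequence" > "/dev/stderr"
                exit 1
            }
            srand(seed)
            for (i = 0; i < k; i++) {
                # rand() is in [0, 1), so start falls in [1, n - len + 1]
                start = int(rand() * (n - len + 1)) + 1
                printf "%d\t%d\n", start, start + len - 1
            }
        }'
}

# Example: five 500-base windows from a hypothetical 100,000-base sequence
sample_regions 100000 500 5
```

The windows may overlap, and the exact coordinates depend on the awk implementation's random number generator, so only the count and bounds are guaranteed.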
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=20g -c 16 --gres=lscratch:50
[user@cn4278 ~]$ module load roadies
[+] Loading singularity 4.2.2 on cn0094
[+] Loading roadies 0.1.10 ...
[user@cn4278 ~]$ wget https://github.com/TurakhiaLab/ROADIES/archive/refs/tags/v0.1.10.tar.gz
[user@cn4278 ~]$ tar -zxf v0.1.10.tar.gz && rm -f v0.1.10.tar.gz && cd ROADIES-0.1.10
[user@cn4278 ~]$ chmod +x ./workflow/scripts/*
[user@cn4278 ~]$ mkdir -p output_files/genetrees

Download test data for 11 Drosophila genomes:
[user@cn4278 ~]$ mkdir -p test/test_data && cat test/input_genome_links.txt | xargs -I {} sh -c 'wget -O test/test_data/$(basename {}) {}'

Compile sampling executable:
[user@cn4278 ~]$ mkdir -p ./workflow/scripts/sampling/build
[user@cn4278 ~]$ cd ./workflow/scripts/sampling/build
[user@cn4278 ~]$ cmake .. -DZLIB_LIBRARY=/usr/local/apps/roadies/0.1.10/conda/lib/libz.so
[user@cn4278 ~]$ make
[user@cn4278 ~]$ cd ../../../..

Run the ROADIES pipeline:
[user@cn4278 ~]$ python run_roadies.py --cores 16 --noconverge
Unlocking working directory.
snakemake --cores 16 --config mode=accurate config_path=config/config.yaml num_threads=4 deep_mode=Fal plete
Config file config/config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Job stats:
job                count
---------------  -------
all                    1
filtermsa            250
lastz                 11
lastz2fasta            1
mergeTrees             1
pasta                250
raxmlng              250
sequence_merge         1
sequence_select       11
total                776

Select jobs to execute...
Failed to solve scheduling problem with ILP solver. Falling back to greedy solver. Run Snakemake with for debugging the problem.

[Wed Aug 27 07:36:41 2025]
rule sequence_select:
    input: test/test_data/droAna2.fa.gz
    output: output_files/samples/droAna2_temp.fa
    jobid: 9
    benchmark: output_files/benchmarks/droAna2.sample.txt
    reason: Missing output files: output_files/samples/droAna2_temp.fa; Params have changed since last
    wildcards: sample=droAna2
    threads: 4
    resources: tmpdir=/tmp
...
./workflow/scripts/sampling/build/sampling -i test/test_data/droWil1.fa.gz -o output_files/samples/dro
/usr/bin/bash: /opt/conda/envs/roadies_env/lib/libtinfo.so.6: no version information available (requir
We are starting to sample test/test_data/droMoj3.fa.gz
./workflow/scripts/sampling/build/sampling -i test/test_data/droMoj3.fa.gz -o output_files/samples/dro
Number of regions: 26
ID START: 181, ID END: 206
Region length: 500
Input file: test/test_data/droVir3.fa.gz
Output file: output_files/samples/droVir3_temp.fa
Number of resampling: 38

real 0m0.657s
user 0m0.573s
sys 0m0.074s
Number of regions: 21
ID START: 92, ID END: 112
Region length: 500
Input file: test/test_data/droMoj3.fa.gz
Output file: output_files/samples/droMoj3_temp.fa
Number of resampling: 20

real 0m0.645s
user 0m0.568s
sys 0m0.058s
[Wed Aug 27 07:36:42 2025]
Finished job 16.
...
ASTRAL for PaRalogs and Orthologs III (ASTRAL-Pro3)
*** NOW with integrated CASTLES-Pro ***
Version: v1.23.3.6
#Genetrees: 52
#Duploss: 245
#Species: 11
#Rounds: 4
#Samples: 4
#Threads: 16
#NNI moves:0/42
((((((droSec1,droSim1),(droEre2,droYak2)),droAna2),(((droMoj3,droVir3),droGri2),droWil1)),dp4),droPer1);
#NNI moves:0/42
((((((((droVir3,droMoj3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droYak2,droEre2)),droSec1),droSim1);
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droSec1,droSim1)),droEre2),droYak2);
#NNI moves:0/42
(((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),droWil1),(droMoj3,droVir3)),droGri2);
Initial score: 7923
Initial tree: ((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),((droMoj3,droVir3),droGri2)),droWil1);
*** Subsample Process ***
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droEre2,droYak2)),droSim1),droSec1);
#NNI moves:0/42
((((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(droPer1,dp4)),droWil1),droGri2),droVir3),droMoj3);
#NNI moves:0/42
(((((((droYak2,droEre2),(droSec1,droSim1)),droAna2),(droPer1,dp4)),droWil1),(droMoj3,droVir3)),droGri2);
#NNI moves:0/42
((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(((droVir3,droMoj3),droGri2),droWil1)),dp4),droPer1);
Current score: 7923
Current tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
Final Tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
#EqQuartets: 8625
Score: 7923
Species tree created
[user@cn4278 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
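
If the ASTRAL-Pro messages from a run like the one above have been saved to a text file, the final species tree and its taxon labels can be pulled out with standard tools. The following is a sketch under stated assumptions: final_tree and tree_taxa are hypothetical helper names, and tree_taxa assumes a Newick string without branch lengths or support values, as in the sample output.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a saved ASTRAL-Pro log; not part of ROADIES.

# final_tree: read log text on stdin and print the Newick string reported
# after the last "Final Tree:" label (same line, or the following line).
final_tree() {
    awk '/^Final Tree:/ {
             sub(/^Final Tree:[ \t]*/, "")
             if ($0 == "") getline   # tree printed on the next line
             tree = $0
         }
         END { if (tree != "") print tree }'
}

# tree_taxa: read a Newick string (no branch lengths or support values)
# on stdin and print its leaf labels, one per line, sorted.
tree_taxa() {
    tr -d '();' | tr ',' '\n' | sed '/^[ \t]*$/d' | sort
}

# Example using the final tree from the sample session
printf '%s\n' \
  'Final Tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);' \
  | final_tree | tree_taxa
```

With a saved log, the same helpers could be chained as final_tree < astral.log | tree_taxa, where astral.log is a hypothetical file name; a quick count of the resulting lines is a simple check that every input genome appears in the species tree.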