ROADIES: Reference-free Orthology-free Annotation-free DIscordance aware Estimation of Species tree

ROADIES is a fully automated pipeline that infers species trees directly from raw genome assemblies. It generates gene trees from randomly sampled segments of the input genomes, which removes the need to predefine a set of loci, to limit the analysis to a fixed number of genes, or to perform cumbersome gene annotation and/or whole-genome alignment. ROADIES also avoids orthology inference by leveraging existing discordance-aware methods that allow multicopy genes.
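
The random-sampling idea can be illustrated with a small shell sketch. This is not the ROADIES sampling executable: the function name `sample_windows`, the toy sequence, the window length, and the seed are all hypothetical and chosen only to show how fixed-length regions can be drawn at random start positions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only (not ROADIES code): draw fixed-length windows
# at random start positions from a toy sequence.
set -euo pipefail

sample_windows() {
    # $1 = sequence string, $2 = window length, $3 = number of windows, $4 = RNG seed
    awk -v seq="$1" -v len="$2" -v n="$3" -v seed="$4" 'BEGIN {
        srand(seed)                            # fixed seed makes the draw repeatable
        max_start = length(seq) - len + 1      # last valid 1-based start position
        if (max_start < 1) exit 1              # sequence shorter than one window
        for (i = 1; i <= n; i++) {
            start = int(rand() * max_start) + 1
            printf ">region_%d start=%d\n%s\n", i, start, substr(seq, start, len)
        }
    }'
}

# Hypothetical toy input; a real run samples from the genome assemblies.
toy_genome="ACGTTGCAAGGCTTAACCGGTTACGATCGATCGGCTAAGCTTAGGCATCGA"
windows=$(sample_windows "$toy_genome" 10 3 42)
printf '%s\n' "$windows"
```

Each sampled window is written as a FASTA-style record, so the result could in principle be passed to an aligner. Windows drawn this way may overlap; the sketch makes no attempt to prevent that.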

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=20g -c 16 --gres=lscratch:50
[user@cn4278 ~]$ module load roadies    
[+] Loading singularity  4.2.2  on cn0094
[+] Loading roadies 0.1.10  ...
[user@cn4278 ~]$ wget https://github.com/TurakhiaLab/ROADIES/archive/refs/tags/v0.1.10.tar.gz
[user@cn4278 ~]$ tar -zxf v0.1.10.tar.gz && rm -f v0.1.10.tar.gz && cd ROADIES-0.1.10
[user@cn4278 ~]$ chmod +x ./workflow/scripts/*
[user@cn4278 ~]$ mkdir -p output_files/genetrees
Download test data for 11 Drosophila genomes:
[user@cn4278 ~]$ mkdir -p test/test_data && cat test/input_genome_links.txt | xargs -I {} sh -c 'wget -O test/test_data/$(basename {}) {}'
Compile sampling executable:
[user@cn4278 ~]$ mkdir -p ./workflow/scripts/sampling/build 
[user@cn4278 ~]$ cd ./workflow/scripts/sampling/build
[user@cn4278 ~]$ cmake .. -DZLIB_LIBRARY=/usr/local/apps/roadies/0.1.10/conda/lib/libz.so 
[user@cn4278 ~]$ make 
[user@cn4278 ~]$ cd ../../../..
Run the ROADIES pipeline:
[user@cn4278 ~]$ python run_roadies.py --cores 16 --noconverge
Unlocking working directory.
snakemake --cores 16 --config mode=accurate config_path=config/config.yaml num_threads=4 deep_mode=Fal        plete
Config file config/config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Job stats:
job                count
---------------  -------
all                    1
filtermsa            250
lastz                 11
lastz2fasta            1
mergeTrees             1
pasta                250
raxmlng              250
sequence_merge         1
sequence_select       11
total                776

Select jobs to execute...
Failed to solve scheduling problem with ILP solver. Falling back to greedy solver. Run Snakemake with         for debugging the problem.

[Wed Aug 27 07:36:41 2025]
rule sequence_select:
    input: test/test_data/droAna2.fa.gz
    output: output_files/samples/droAna2_temp.fa
    jobid: 9
    benchmark: output_files/benchmarks/droAna2.sample.txt
    reason: Missing output files: output_files/samples/droAna2_temp.fa; Params have changed since last
    wildcards: sample=droAna2
    threads: 4
    resources: tmpdir=/tmp
...
./workflow/scripts/sampling/build/sampling -i test/test_data/droWil1.fa.gz -o output_files/samples/dro
/usr/bin/bash: /opt/conda/envs/roadies_env/lib/libtinfo.so.6: no version information available (requir
We are starting to sample test/test_data/droMoj3.fa.gz
./workflow/scripts/sampling/build/sampling -i test/test_data/droMoj3.fa.gz -o output_files/samples/dro
Number of regions: 26
ID START: 181, ID END: 206
Region length: 500
Input file: test/test_data/droVir3.fa.gz
Output file: output_files/samples/droVir3_temp.fa
Number of resampling: 38

real    0m0.657s
user    0m0.573s
sys     0m0.074s
Number of regions: 21
ID START: 92, ID END: 112
Region length: 500
Input file: test/test_data/droMoj3.fa.gz
Output file: output_files/samples/droMoj3_temp.fa
Number of resampling: 20

real    0m0.645s
user    0m0.568s
sys     0m0.058s
[Wed Aug 27 07:36:42 2025]
Finished job 16.
...
[user@cn4278 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
...
ASTRAL for PaRalogs and Orthologs III (ASTRAL-Pro3)
*** NOW with integrated CASTLES-Pro ***
Version: v1.23.3.6
#Genetrees: 52
#Duploss: 245
#Species: 11
#Rounds: 4
#Samples: 4
#Threads: 16
#NNI moves:0/42
((((((droSec1,droSim1),(droEre2,droYak2)),droAna2),(((droMoj3,droVir3),droGri2),droWil1)),dp4),droPer1);
#NNI moves:0/42
((((((((droVir3,droMoj3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droYak2,droEre2)),droSec1),droSim1);
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droSec1,droSim1)),droEre2),droYak2);
#NNI moves:0/42
(((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),droWil1),(droMoj3,droVir3)),droGri2);
Initial score: 7923
Initial tree: ((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),((droMoj3,droVir3),droGri2)),droWil1);
*** Subsample Process ***
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droEre2,droYak2)),droSim1),droSec1);
#NNI moves:0/42
((((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(droPer1,dp4)),droWil1),droGri2),droVir3),droMoj3);
#NNI moves:0/42
(((((((droYak2,droEre2),(droSec1,droSim1)),droAna2),(droPer1,dp4)),droWil1),(droMoj3,droVir3)),droGri2);
#NNI moves:0/42
((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(((droVir3,droMoj3),droGri2),droWil1)),dp4),droPer1);
Current score: 7923
Current tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
Final Tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
#EqQuartets: 8625
Score: 7923
Species tree created
[user@biowulf ~]$
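
As a quick sanity check on the result, the taxa in the final Newick tree can be listed and counted with standard shell tools. The sketch below copies the "Final Tree" string from the sample session above; the variable names are hypothetical, and the approach assumes a tree without branch lengths or support values, as printed in that log line.

```shell
#!/usr/bin/env bash
# Illustrative sketch: list and count the taxa in a Newick string.
# The tree is copied from the "Final Tree" line of the sample session.
final_tree='((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);'

# Strip parentheses and the trailing semicolon, then put one taxon per line.
taxa=$(printf '%s\n' "$final_tree" | tr -d '();' | tr ',' '\n' | sort)

# Count non-empty lines; this should match the "#Species" value in the log.
n_taxa=$(printf '%s\n' "$taxa" | grep -c .)

printf '%s\n' "$taxa"
echo "Taxa in final tree: $n_taxa"
```

For the tree shown above, the count is 11, which agrees with the `#Species: 11` line reported by ASTRAL-Pro3 and with the 11 Drosophila genomes downloaded as test data.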