ROADIES is a fully automated pipeline that infers species trees directly from raw genome assemblies. Its distinguishing strategy is to sample segments of the input genomes at random and build gene trees from them. This removes the need to predefine a set of loci, to restrict the analysis to a fixed number of genes, or to perform cumbersome gene annotation and/or whole-genome alignment. ROADIES also removes the need to infer orthology, because it relies on existing discordance-aware methods that allow multicopy genes.
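The random-sampling idea described above can be illustrated with a short sketch. This is not the ROADIES sampling code (the pipeline uses its own compiled `sampling` executable). The function name `sample_regions`, the `max_n_frac` rejection threshold, and the in-memory `genome` dictionary are all assumptions made for illustration. Only the region length of 500 is taken from the log output further down.

```python
import random


def sample_regions(genome, num_regions, region_length=500,
                   max_n_frac=0.05, seed=0, max_attempts=10_000):
    """Draw fixed-length windows uniformly at random from a genome.

    genome: dict mapping sequence name -> sequence string (hypothetical input).
    Windows whose fraction of ambiguous 'N' bases exceeds max_n_frac are
    rejected and resampled (an assumed filter, for illustration only).
    """
    rng = random.Random(seed)
    # Only sequences long enough to hold one full window are eligible.
    eligible = [(name, seq) for name, seq in genome.items()
                if len(seq) >= region_length]
    if not eligible:
        raise ValueError("no sequence is at least region_length long")
    # Weight each sequence by its number of valid start positions so that
    # every possible window in the genome is equally likely.
    weights = [len(seq) - region_length + 1 for _, seq in eligible]

    regions = []
    attempts = 0
    while len(regions) < num_regions:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("too many rejected windows; relax max_n_frac")
        name, seq = rng.choices(eligible, weights=weights, k=1)[0]
        start = rng.randrange(len(seq) - region_length + 1)
        window = seq[start:start + region_length]
        if window.upper().count("N") / region_length > max_n_frac:
            continue  # rejected: resample
        regions.append((name, start, window))
    return regions
```

Because every window is drawn independently, no locus list or annotation is needed up front; the cost is that some windows are rejected and drawn again.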
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=20g -c 16 --gres=lscratch:50
[user@cn4278 ~]$ module load roadies
[+] Loading singularity 4.2.2 on cn0094
[+] Loading roadies 0.1.10 ...
[user@cn4278 ~]$ wget https://github.com/TurakhiaLab/ROADIES/archive/refs/tags/v0.1.10.tar.gz
[user@cn4278 ~]$ tar -zxf v0.1.10.tar.gz && rm -f v0.1.10.tar.gz && cd ROADIES-0.1.10
[user@cn4278 ~]$ chmod +x ./workflow/scripts/*
[user@cn4278 ~]$ mkdir -p output_files/genetrees
Download test data for 11 Drosophila genomes:
[user@cn4278 ~]$ mkdir -p test/test_data && cat test/input_genome_links.txt | xargs -I {} sh -c 'wget -O test/test_data/$(basename {}) {}'
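Before compiling and running, it can be worth confirming that all 11 genomes arrived intact, since a failed `wget` still leaves a file behind. The following is a minimal sketch, not part of ROADIES: the helper names `check_fasta_gz` and `check_downloads` are hypothetical, and the `*.fa.gz` naming is assumed from the file names that appear in the log output below. It only inspects the start of each file, so it will not catch a download truncated further in.

```python
import gzip
import zlib
from pathlib import Path


def check_fasta_gz(path):
    """Return True if path opens as gzip and its first line is a FASTA header."""
    try:
        with gzip.open(path, "rt") as handle:
            return handle.readline().startswith(">")
    except (OSError, EOFError, UnicodeDecodeError, zlib.error):
        # Not gzip, empty, truncated at the start, or not text.
        return False


def check_downloads(data_dir="test/test_data", expected=11):
    """Return (ok, all_file_names, bad_file_names) for *.fa.gz in data_dir."""
    files = sorted(Path(data_dir).glob("*.fa.gz"))
    bad = [f.name for f in files if not check_fasta_gz(f)]
    ok = len(files) == expected and not bad
    return ok, [f.name for f in files], bad


if __name__ == "__main__":
    ok, names, bad = check_downloads()
    print(f"{len(names)} files found, {len(bad)} unreadable: {bad}")
```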
Compile sampling executable:
[user@cn4278 ~]$ mkdir -p ./workflow/scripts/sampling/build
[user@cn4278 ~]$ cd ./workflow/scripts/sampling/build
[user@cn4278 ~]$ cmake .. -DZLIB_LIBRARY=/usr/local/apps/roadies/0.1.10/conda/lib/libz.so
[user@cn4278 ~]$ make
[user@cn4278 ~]$ cd ../../../..
Run the ROADIES pipeline:
[user@cn4278 ~]$ python run_roadies.py --cores 16 --noconverge
Unlocking working directory.
snakemake --cores 16 --config mode=accurate config_path=config/config.yaml num_threads=4 deep_mode=Fal plete
Config file config/config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Job stats:
job                count
---------------  -------
all                    1
filtermsa            250
lastz                 11
lastz2fasta            1
mergeTrees             1
pasta                250
raxmlng              250
sequence_merge         1
sequence_select       11
total                776
Select jobs to execute...
Failed to solve scheduling problem with ILP solver. Falling back to greedy solver. Run Snakemake with for debugging the problem.
[Wed Aug 27 07:36:41 2025]
rule sequence_select:
input: test/test_data/droAna2.fa.gz
output: output_files/samples/droAna2_temp.fa
jobid: 9
benchmark: output_files/benchmarks/droAna2.sample.txt
reason: Missing output files: output_files/samples/droAna2_temp.fa; Params have changed since last
wildcards: sample=droAna2
threads: 4
resources: tmpdir=/tmp
...
./workflow/scripts/sampling/build/sampling -i test/test_data/droWil1.fa.gz -o output_files/samples/dro
/usr/bin/bash: /opt/conda/envs/roadies_env/lib/libtinfo.so.6: no version information available (requir
We are starting to sample test/test_data/droMoj3.fa.gz
./workflow/scripts/sampling/build/sampling -i test/test_data/droMoj3.fa.gz -o output_files/samples/dro
Number of regions: 26
ID START: 181, ID END: 206
Region length: 500
Input file: test/test_data/droVir3.fa.gz
Output file: output_files/samples/droVir3_temp.fa
Number of resampling: 38
real 0m0.657s
user 0m0.573s
sys 0m0.074s
Number of regions: 21
ID START: 92, ID END: 112
Region length: 500
Input file: test/test_data/droMoj3.fa.gz
Output file: output_files/samples/droMoj3_temp.fa
Number of resampling: 20
real 0m0.645s
user 0m0.568s
sys 0m0.058s
[Wed Aug 27 07:36:42 2025]
Finished job 16.
...
[user@cn4278 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
...
ASTRAL for PaRalogs and Orthologs III (ASTRAL-Pro3)
*** NOW with integrated CASTLES-Pro ***
Version: v1.23.3.6
#Genetrees: 52
#Duploss: 245
#Species: 11
#Rounds: 4
#Samples: 4
#Threads: 16
#NNI moves:0/42
((((((droSec1,droSim1),(droEre2,droYak2)),droAna2),(((droMoj3,droVir3),droGri2),droWil1)),dp4),droPer1);
#NNI moves:0/42
((((((((droVir3,droMoj3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droYak2,droEre2)),droSec1),droSim1);
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droSec1,droSim1)),droEre2),droYak2);
#NNI moves:0/42
(((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),droWil1),(droMoj3,droVir3)),droGri2);
Initial score: 7923
Initial tree: ((((((droSim1,droSec1),(droYak2,droEre2)),droAna2),(dp4,droPer1)),((droMoj3,droVir3),droGri2)),droWil1);
*** Subsample Process ***
#NNI moves:0/42
((((((((droMoj3,droVir3),droGri2),droWil1),(dp4,droPer1)),droAna2),(droEre2,droYak2)),droSim1),droSec1);
#NNI moves:0/42
((((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(droPer1,dp4)),droWil1),droGri2),droVir3),droMoj3);
#NNI moves:0/42
(((((((droYak2,droEre2),(droSec1,droSim1)),droAna2),(droPer1,dp4)),droWil1),(droMoj3,droVir3)),droGri2);
#NNI moves:0/42
((((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(((droVir3,droMoj3),droGri2),droWil1)),dp4),droPer1);
Current score: 7923
Current tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
Final Tree: ((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);
#EqQuartets: 8625
Score: 7923
Species tree created
[user@biowulf ~]$
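The trees printed by ASTRAL-Pro3 above are Newick strings, and the same topology can be written with its children in a different order, which makes the lines hard to compare by eye. The sketch below is an illustration only, not a ROADIES or ASTRAL utility: `newick_leaves` and `newick_clades` are hypothetical helpers that handle plain Newick with optional branch lengths (no quoted labels). Comparing the clade sets of two rooted trees shows whether they differ in topology or only in the order in which children are written.

```python
import re


def newick_leaves(newick):
    """Return the leaf labels of a Newick string, in order of appearance."""
    # A leaf label directly follows "(" or ","; labels after ")" are internal.
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def newick_clades(newick):
    """Return the clades of a rooted Newick tree as a set of frozensets."""
    clades = set()
    stack = []   # one list of leaf names per currently open parenthesis
    prev = ""
    for token in re.findall(r"[(),;]|[^(),;]+", newick):
        if token == "(":
            stack.append([])
        elif token == ")":
            leaves = stack.pop()
            clades.add(frozenset(leaves))
            if stack:
                stack[-1].extend(leaves)  # pass leaves up to the parent
        elif token not in (",", ";") and prev != ")":
            # Leaf token, possibly "name:length"; keep only the name.
            name = token.split(":")[0].strip()
            if name and stack:
                stack[-1].append(name)
        # Tokens right after ")" are internal labels or lengths: ignored.
        prev = token
    return clades


if __name__ == "__main__":
    final_tree = ("((((droVir3,droMoj3),droGri2),((((droSim1,droSec1),"
                  "(droEre2,droYak2)),droAna2),(dp4,droPer1))),droWil1);")
    print(len(newick_leaves(final_tree)), "species")
    print(frozenset({"dp4", "droPer1"}) in newick_clades(final_tree))
```

Applied to the `Initial tree` and `Final Tree` lines in the log, this comparison yields identical clade sets, which is consistent with the initial and final scores both being 7923.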