RAxML-NG is a phylogenetic tree inference tool that uses the maximum-likelihood (ML) optimality criterion. Its search heuristic iteratively performs a series of Subtree Pruning and Regrafting (SPR) moves, which allows it to quickly navigate to the best-known ML tree. RAxML-NG is the successor of RAxML (Stamatakis 2014) and leverages the highly optimized likelihood computation implemented in libpll (Flouri et al. 2014). It offers improvements in speed, flexibility and user-friendliness over previous RAxML versions. It also implements some of the features previously available in ExaML (Kozlov et al. 2015), including checkpointing and efficient load balancing for partitioned alignments (Kobert et al. 2014).
RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Alexey M Kozlov, Diego Darriba, Tomás Flouri, Benoit Morel, Alexandros Stamatakis. Bioinformatics, Volume 35, Issue 21, 1 November 2019, Pages 4453-4455, https://doi.org/10.1093/bioinformatics/btz305
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive --cpus-per-task=4 --time=3:00:00
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load raxml-ng
[user@cn3144 ~]$ cp $RAXMLNG_EXAMPLE_DATA/myoglobins597.phy .
[user@cn3144 ~]$ raxml-ng --msa myoglobins597.phy --model GTR+G --threads $SLURM_CPUS_PER_TASK
RAxML-NG v. 2.0.1 released on 24.04.2026 by The Exelixis Lab.
Developed by: Oleksiy M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth,
Julia Haag, Anastasis Togkousidis, Julius Wiegert, Christoph Stelz.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml
...
Analysis options:
run mode: ML tree search (adaptive)
start tree(s): adaptive
random seed: 1779138192
tip-inner: OFF
pattern compression: ON
per-rate scalers: OFF
site repeats: ON
logLH epsilon: general: 10.000000, brlen-triplet: 1000.000000
stopping rule: OFF
fast spr radius: AUTO
spr subtree cutoff: 1.000000
fast CLV updates: ON
branch lengths: proportional (ML estimate, algorithm: NR-FAST)
FreeRate optimization method: AUTO
SIMD kernels: AVX2
parallelization: coarse-grained (auto), PTHREADS (4 threads), thread pinning: OFF
[00:00:00] Reading alignment from file: myoglobins597.phy
[00:00:00] Loaded alignment with 593 taxa and 6608 sites
Alignment comprises 1 partitions and 6561 patterns
Partition 0: noname
Model: GTR+FO+G4m
Alignment sites / patterns: 6608 / 6561
Gaps: 80.46 %
Invariant sites: 2.35 %
[00:00:00] Adaptive mode: Predicting difficulty of the MSA ...
[00:00:00] Parallel parsimony: 24 trees with 4 threads
[00:00:02] Predicted difficulty: 0.38
NOTE: Binary MSA file created: myoglobins597.phy.raxml.rba
[00:00:02] Generating 4 random starting tree(s) with 593 taxa
[00:00:02] Generating 9 parsimony starting tree(s) with 593 taxa
Parallelization scheme: 1 worker(s) x 4 thread(s)
Parallel reduction/worker buffer size: 1 KB / 0 KB
[00:00:02] Data distribution: max. partitions/sites/weight per thread: 1 / 1641 / 26256
[00:00:02] Data distribution: max. searches per worker: 13
Starting ML tree search with 13 distinct starting trees
[00:00:02 -960718.286592] Heuristic: adaptive, stopping rules: off, epsilon = 10.000000
[00:00:02 -960718.286592] Initial branch length optimization
[00:00:05 -820998.403654] Model parameter optimization (eps = 10.000000)
[00:00:30 -803175.442517] FAST spr round 1 (radius: 25)
...
Optimized model parameters:
Partition 0: noname
Rate heterogeneity: GAMMA (4 cats, mean), alpha: 12.814487 (ML), weights&rates: (0.250000,0.670584) (0.250000,0.888457) (0.250000,1.067170) (0.250000,1.373789)
Base frequencies (ML): 0.326955 0.174933 0.201353 0.296759
Substitution rates (ML): 1.025417 1.702307 2.008450 0.624967 1.311043 1.000000
Final LogLikelihood: -495348.559145
AIC score: 993081.118291 / AICc score: 993606.346730 / BIC score: 1001181.993579
Free parameters (model + branch lengths): 1192
WARNING: Best ML tree contains 17 near-zero branches!
...
Elapsed time: 9006.222 seconds
Consumed energy: 875.736 Wh (= 4 km in an electric car, or 22 km with an e-scooter!)
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
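The session above ends with a "Final LogLikelihood" line, which is usually the value you want to record or compare across runs. The following is a minimal sketch of a helper that extracts it from a saved log. The function name `final_loglh` is hypothetical, and the log file name in the usage note is an assumption based on the `.raxml.` prefix seen in the output above; adjust it to wherever your run's output was saved.

```shell
#!/bin/bash
# Sketch: print the final log-likelihood recorded in a raxml-ng log.
# final_loglh is a hypothetical helper name, not part of raxml-ng.
final_loglh() {
    local logfile=$1
    # Lines look like: "Final LogLikelihood: -495348.559145"
    # If the log holds several such lines, report the last one.
    awk '/^Final LogLikelihood:/ { v = $3 } END { if (v != "") print v }' "$logfile"
}
```

For example, `final_loglh myoglobins597.phy.raxml.log` would print the value, assuming the log was written under that name.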
Create a batch input file (e.g. raxml-ng.sh). For example:
#!/bin/bash
set -e
module load raxml-ng
raxml-ng --msa myoglobins597.phy --model GTR+G --threads $SLURM_CPUS_PER_TASK
Submit this job using the SLURM sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] [--time=DD-HH:MM:SS] raxml-ng.sh
Checkpointing is built into raxml-ng. If a long-running raxml-ng job is terminated because the requested walltime was too short, or if the job needs more than the Biowulf 10-day maximum walltime, you can resubmit the job and it will restart from the last checkpoint. See the advanced tutorial for details.
You can take advantage of this feature to run a chain of jobs, each of which will pick up where the previous one terminated, using job dependencies. For example:
biowulf% sbatch --cpus-per-task=16 --mem=20g --time=8-00:00:00 myjobscript
1111
biowulf% sbatch --depend=afterany:1111 --cpus-per-task=16 --mem=20g --time=8-00:00:00 myjobscript
2222
biowulf% sbatch --depend=afterany:2222 --cpus-per-task=16 --mem=20g --time=8-00:00:00 myjobscript
3333
Each of these jobs will run for 8 days. When job 1111 terminates, job 2222 will start up from the last checkpoint file, and likewise for job 3333. The 3 jobs will utilize a total walltime of 24 days.
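Typing each job ID by hand gets tedious for longer chains. The sketch below submits a chain of dependent jobs in a loop, reusing the resource values from the example above. The function name `submit_chain` and the `SBATCH_CMD` override are hypothetical conveniences introduced here (the override lets you dry-run the loop with a stub instead of the real `sbatch`). It assumes `sbatch --parsable`, which prints only the job ID, optionally followed by `;cluster`.

```shell
#!/bin/bash
# Sketch: submit N copies of a job script, each starting after the previous ends.
# SBATCH_CMD is a hypothetical override for dry runs; it defaults to sbatch.
SBATCH_CMD=${SBATCH_CMD:-sbatch}

submit_chain() {
    local script=$1 njobs=$2 prev="" jobid i
    for ((i = 1; i <= njobs; i++)); do
        if [ -z "$prev" ]; then
            # First job in the chain has no dependency.
            jobid=$($SBATCH_CMD --parsable --cpus-per-task=16 --mem=20g \
                    --time=8-00:00:00 "$script")
        else
            # Later jobs wait until the previous job has terminated.
            jobid=$($SBATCH_CMD --parsable --dependency=afterany:"$prev" \
                    --cpus-per-task=16 --mem=20g --time=8-00:00:00 "$script")
        fi
        jobid=${jobid%%;*}   # drop a trailing ";cluster" if present
        echo "$jobid"
        prev=$jobid
    done
}
```

Calling `submit_chain myjobscript 3` would reproduce the three-job chain above and print one job ID per line.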
Create a swarmfile (e.g. raxml-ng.swarm). For example:
raxml-ng --msa file1.phy --model GTR+G --threads $SLURM_CPUS_PER_TASK
raxml-ng --msa file2.phy --model GTR+G --threads $SLURM_CPUS_PER_TASK
raxml-ng --msa file3.phy --model GTR+G --threads $SLURM_CPUS_PER_TASK
[...]
Submit this job using the swarm command.
swarm -f raxml-ng.swarm -g 20 -t 8 --module raxml-ng
where
| -g 20 | 20 Gigabytes of memory required for each process (1 line in the swarm command file) |
| -t 8 | 8 threads/CPUs required for each process (1 line in the swarm command file). |
| --module raxml-ng | Loads the raxml-ng module for each subjob in the swarm |
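When there are many alignments, the swarmfile can be generated rather than written by hand. The following is a minimal sketch that writes one raxml-ng command per `.phy` file in a directory, using the same model and thread settings as the example above. The function name `make_swarmfile` is hypothetical, and the sketch assumes file paths contain no spaces.

```shell
#!/bin/bash
# Sketch: write one raxml-ng command per alignment into a swarmfile.
# make_swarmfile is a hypothetical helper name.
make_swarmfile() {
    local dir=$1 out=$2 f
    : > "$out"                       # start with an empty swarmfile
    for f in "$dir"/*.phy; do
        [ -e "$f" ] || continue      # skip the literal pattern if nothing matches
        # Single quotes keep $SLURM_CPUS_PER_TASK unexpanded, so it is
        # resolved inside each subjob rather than when the file is written.
        printf 'raxml-ng --msa %s --model GTR+G --threads $SLURM_CPUS_PER_TASK\n' "$f" >> "$out"
    done
}
```

For example, `make_swarmfile . raxml-ng.swarm` would produce a file that can be submitted with the swarm command shown above.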