Bioinformatics. 2022 Aug 2;38(15):3725-3733.
doi: 10.1093/bioinformatics/btac396.

NetRAX: accurate and fast maximum likelihood phylogenetic network inference

Sarah Lutteropp et al. Bioinformatics. 2022.

Abstract

Motivation: Phylogenetic networks can represent non-treelike evolutionary scenarios. Current, actively developed approaches for phylogenetic network inference jointly account for non-treelike evolution and incomplete lineage sorting (ILS). Unfortunately, this induces a very high computational complexity and current tools can only analyze small datasets.

Results: We present NetRAX, a tool for maximum likelihood (ML) inference of phylogenetic networks in the absence of ILS. Our tool leverages state-of-the-art methods for efficiently computing the phylogenetic likelihood function on trees, and extends them to phylogenetic networks via the notion of 'displayed trees'. NetRAX can infer ML phylogenetic networks from partitioned multiple sequence alignments and returns the inferred networks in Extended Newick format. On simulated data, our results show a very low relative difference in Bayesian Information Criterion (BIC) score and a near-zero unrooted softwired cluster distance to the true, simulated networks. With NetRAX, a network inference on a partitioned alignment with 8000 sites, 30 taxa and 3 reticulations completes within a few minutes on a standard laptop.
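As a rough illustration of the score mentioned above, the textbook form of the Bayesian Information Criterion is k·ln(n) − 2·ln(L), where lower values are better. The sketch below is only that generic formula: how NetRAX counts free parameters and the sample size is not stated in the abstract, and the `relative_difference` helper is an assumed definition, not necessarily the one used in the paper.

```python
import math


def bic(log_likelihood: float, num_free_parameters: int, sample_size: int) -> float:
    """Generic BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return num_free_parameters * math.log(sample_size) - 2.0 * log_likelihood


def relative_difference(inferred: float, reference: float) -> float:
    """Assumed definition: signed difference scaled by the reference score."""
    return (inferred - reference) / abs(reference)


# Hypothetical numbers, chosen only to exercise the functions.
inferred_bic = bic(log_likelihood=-1000.0, num_free_parameters=12, sample_size=8000)
true_bic = bic(log_likelihood=-1001.0, num_free_parameters=10, sample_size=8000)
print(relative_difference(inferred_bic, true_bic))
```

A value near zero would mean the inferred network scores about as well as the reference network under this criterion.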

Availability and implementation: Our implementation is available under the GNU General Public License v3.0 at https://github.com/lutteropp/NetRAX.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Left: A phylogenetic network with two reticulation nodes. Right: A displayed tree of the phylogenetic network on the left. The probability of displaying the highlighted tree is the product p*q of the respective reticulation probabilities
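The product rule in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration, assuming each reticulation node has exactly two parent arcs and that the choices are independent; the function name and the example probabilities are hypothetical and not taken from NetRAX.

```python
from itertools import product
from math import prod


def displayed_tree_probabilities(first_parent_probs):
    """Return {choice tuple: probability} over all displayed trees.

    first_parent_probs[i] is the probability that reticulation i keeps its
    first parent arc; choice[i] is 0 for the first parent, 1 for the second.
    """
    result = {}
    for choice in product((0, 1), repeat=len(first_parent_probs)):
        result[choice] = prod(
            p if c == 0 else 1.0 - p
            for p, c in zip(first_parent_probs, choice)
        )
    return result


# Two reticulations with hypothetical probabilities p = 0.3 and q = 0.6.
probs = displayed_tree_probabilities([0.3, 0.6])
for choice, probability in probs.items():
    print(choice, probability)
```

With two reticulations there are four displayed trees, and their probabilities sum to one.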
Fig. 2.
Two displayed trees in a phylogenetic network. Both displayed trees induce the same topology after collapsing single-child nodes. They only differ in some branch lengths. For example, in the left tree, the branch length between the root node and leaf A is b3+b5, and in the right tree it is b1+b5. Under the unlinked branches model, NetRAX would simply return a tree. For phylogenetic tree likelihood computation, simple paths are collapsed and the tree is transformed into an unrooted tree. Therefore, b2 is not considered in the left displayed tree
Fig. 3.
We reroot the displayed trees at node u before optimizing branch (u, v). In the first rerooted displayed tree, the node y is a parent of node x and we need to recompute the CLVs on the path (u, w, y, x). In the second rerooted displayed tree, the node y is a child of node x and we need to recompute the CLVs on the path (u, w, z, x)
Fig. 4.
Overview of the NetRAX network search algorithm for a single start network. The table on the bottom left shows which move type is tried next when a wave for the current move type did not yield a better network. We loop through the move waves in the order arc removal -> rNNI -> rSPR -> arc insertion and repeat this as long as we find a better network. We terminate the search when the waves for arc removal, rNNI, rSPR and arc insertion all fail to find a network with an improved BIC score
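The loop described in the Fig. 4 caption can be sketched as follows. This is a simplified illustration, not the NetRAX implementation: the `waves` mapping and the `bic` callable are hypothetical stand-ins, and the sketch assumes that after a successful wave the search simply continues with the next move type in the cycle (the caption's table, which is not reproduced here, governs the actual transitions).

```python
MOVE_ORDER = ("arc_removal", "rNNI", "rSPR", "arc_insertion")


def search(start_network, waves, bic):
    """Cycle through move waves until all four fail in a row.

    waves maps a move-type name to a function network -> candidate network;
    bic maps a network to its score (lower is better).
    """
    best, best_bic = start_network, bic(start_network)
    consecutive_failures = 0
    step = 0
    while consecutive_failures < len(MOVE_ORDER):
        move_type = MOVE_ORDER[step % len(MOVE_ORDER)]
        candidate = waves[move_type](best)
        candidate_bic = bic(candidate)
        if candidate_bic < best_bic:
            # Improvement found: accept it and reset the failure counter.
            best, best_bic = candidate, candidate_bic
            consecutive_failures = 0
        else:
            consecutive_failures += 1
        step += 1
    return best, best_bic


# Toy stand-in: a "network" is an integer and the score is minimal at 5.
toy_waves = {
    "arc_removal": lambda x: x - 1,
    "rNNI": lambda x: x,
    "rSPR": lambda x: x,
    "arc_insertion": lambda x: x + 1,
}
print(search(2, toy_waves, lambda x: (x - 5) ** 2))
```

In the toy run only the stand-in for arc insertion ever improves the score, so the loop keeps cycling until a full round of four waves brings no improvement.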
Fig. 5.
Number of inferred reticulations, unrooted SCD and relative BIC difference for 50 datasets with 30 taxa and 3 reticulations each. Top: starting from 3 random and 3 maximum parsimony trees. Bottom: starting from a RAxML-NG ML tree
Fig. 6.
Number of inferred reticulations, unrooted SCD and relative BIC difference for 50 simulated datasets with 40 taxa and 4 reticulations each, starting from the RAxML-NG ML tree
