nT4X and nT4M: Novel Time Non-reversible Mixture Amino Acid Substitution Models
- PMID: 39832000
- DOI: 10.1007/s00239-024-10230-8
Abstract
One of the most important and difficult challenges in molecular evolution research is modeling the process of amino acid substitution. Although single-matrix models, such as the LG model, are popular, their ability to capture the heterogeneity of the substitution process across sites is still questioned. Several mixture models with multiple matrices have been introduced and shown to offer advantages over single-matrix models. Current general mixture models assume that the evolutionary process is time-reversible, meaning that at equilibrium the frequency-weighted substitution rate between any two amino acids is the same in the forward and backward directions. This assumption is motivated by computational simplicity rather than by biological properties. A widely held hypothesis is that more realistic models yield more accurate evolutionary inferences; our aim is therefore to estimate more biologically realistic models. To this end, we relax the assumption of reversibility and introduce two new general non-reversible 4-matrix mixture models, called nT4M and nT4X. On alignments from the HSSP and TreeBASE databases, the newly estimated models outperformed all single-matrix models and almost all reversible mixture models. Moreover, the new non-reversible mixture models make it possible to infer rooted trees.
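The reversibility assumption discussed above can be made concrete with a small check. The sketch below is illustrative only and is not taken from the paper: it uses a toy 3-state alphabet instead of the 20 amino acids, and the helper names (`build_rate_matrix`, `is_stationary`, `is_reversible`) and all numeric values are assumptions chosen for the example. It tests the standard detailed-balance condition, pi[i] * Q[i][j] == pi[j] * Q[j][i], which a time-reversible rate matrix satisfies and a non-reversible one generally does not.

```python
def build_rate_matrix(off_diag):
    """Copy the off-diagonal rates and set each diagonal so that rows sum to zero."""
    n = len(off_diag)
    Q = [row[:] for row in off_diag]
    for i in range(n):
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q


def is_stationary(Q, pi, tol=1e-9):
    """Check that pi is a stationary distribution of Q, i.e. pi * Q = 0."""
    n = len(pi)
    return all(
        abs(sum(pi[i] * Q[i][j] for i in range(n))) <= tol
        for j in range(n)
    )


def is_reversible(Q, pi, tol=1e-9):
    """Check detailed balance: pi[i] * Q[i][j] == pi[j] * Q[j][i] for all i != j."""
    n = len(pi)
    return all(
        abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Toy reversible model (hypothetical values): Q[i][j] = S[i][j] * pi[j]
# with a symmetric exchangeability matrix S.
pi_rev = [0.5, 0.3, 0.2]
S = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
Q_rev = build_rate_matrix(
    [[S[i][j] * pi_rev[j] if i != j else 0.0 for j in range(3)] for i in range(3)]
)

# Toy non-reversible model (hypothetical values): substitutions only flow
# around the cycle 0 -> 1 -> 2 -> 0, so forward and backward fluxes differ.
pi_cyc = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
Q_cyc = build_rate_matrix([[0.0, 1.0, 0.0],
                           [0.0, 0.0, 1.0],
                           [1.0, 0.0, 0.0]])

print(is_reversible(Q_rev, pi_rev))  # detailed balance holds
print(is_reversible(Q_cyc, pi_cyc))  # detailed balance fails
```

Both toy matrices have a valid stationary distribution, but only the first satisfies detailed balance. Under a reversible model the likelihood does not depend on where the root is placed, which is why relaxing reversibility is what makes rooted-tree inference possible.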
Keywords: Amino acid substitution models; Maximum likelihood estimation method; Mixture models; Time non-reversible models.
© 2025. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Conflict of interest statement
Declarations. Conflict of interest: The authors have no financial or non-financial interests that are directly or indirectly related.