Integr Comp Biol. 2024 Nov 21;64(5):1513-1525.
doi: 10.1093/icb/icae056.

Practical Guidance and Workflows for Identifying Fast Evolving Non-Coding Genomic Elements Using PhyloAcc

Gregg W C Thomas et al. Integr Comp Biol. 2024.

Abstract

Comparative genomics provides ample ways to study genome evolution and its relationship to phenotypic traits. By developing and testing alternate models of evolution across a phylogeny, one can estimate rates of molecular evolution along different lineages and link these rates with observations in extant species, such as convergent phenotypes. Pipelines for such work can help identify when and where genomic changes may be associated with, or possibly influence, phenotypic traits. We recently developed a set of models called PhyloAcc, which use a Bayesian framework to estimate rates of nucleotide substitution on different branches of a phylogenetic tree and evaluate their association with pre-defined or estimated phenotypic traits. PhyloAcc-ST and PhyloAcc-GT both allow users to define a set of target lineages a priori and then compare different models to identify loci accelerating in one or more target lineages. Whereas PhyloAcc-ST considers only one species tree across all input loci, PhyloAcc-GT considers alternate topologies for every locus. PhyloAcc-C simultaneously models molecular rates and rates of continuous trait evolution, allowing the user to ask whether the two are associated. Here, we describe these models and provide tips and workflows on how to prepare the input data and run PhyloAcc.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
The PhyloAcc model family and workflows. (A) Graphical representation of the three PhyloAcc models. PhyloAcc-ST assumes the species tree (outline) and gene tree (inner topology) are identical and compares models in which substitution rates are estimated and acceleration is restricted to specified target lineages (highlighted, dashed branches on the species tree) against those in which acceleration is not allowed or allowed on every branch. PhyloAcc-GT does the same but finds the best fitting gene tree topology to test, represented by the multiple topologies within the species tree. PhyloAcc-C assumes identical species and gene trees but co-estimates rates of molecular and continuous trait evolution without specifying target lineages in the species tree. (B) An outline of the workflow for preparing input data and running PhyloAcc on conserved non-coding elements.
Fig. 2.
The topology of the 241-way Zoonomia mammal alignment with branch lengths re-estimated using four-fold degenerate sites and branches highlighted based on their classification for the PhyloAcc runs. Orange branches labeled with asterisks are target lineages for convergence of echolocation between toothed whales and bats. This plot is generated automatically by PhyloAcc and displayed in an HTML summary file for the user (see Fig. 1B).
Fig. 3.
Relative support for the three PhyloAcc substitution rate estimates (M0, M1, and M2) on conserved elements predicted from the Zoonomia data from human chromosome 1. log BF1 (x-axis) compares marginal likelihoods between M1 (acceleration allowed only on target lineages) and M0 (no acceleration allowed), and log BF2 (y-axis) compares marginal likelihoods between M1 and M2 (acceleration allowed on any lineage). Loci above user-defined cutoffs (dashed lines; in this case, 5 for both BFs) are considered to be accelerated among echolocating lineages. This plot is generated automatically by the PhyloAcc post-processing script and displayed in an HTML summary file for the user (see Fig. 1B).
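
The cutoff rule described in the Fig. 3 caption can be illustrated with a minimal Python sketch. This is not PhyloAcc's post-processing script: the function name `accelerated_loci`, the dictionary layout, the locus names, and the numeric values are all hypothetical and chosen only for illustration. The sketch assumes that "above" means strictly greater than the cutoff and that a locus must pass both cutoffs to be called accelerated on the target lineages.

```python
from typing import Dict, List, Tuple


def accelerated_loci(
    log_bfs: Dict[str, Tuple[float, float]],
    bf1_cutoff: float = 5.0,
    bf2_cutoff: float = 5.0,
) -> List[str]:
    """Return the IDs of loci whose log BF1 and log BF2 both exceed their cutoffs.

    log_bfs maps a locus ID to (log BF1, log BF2), where log BF1 compares
    M1 against M0 and log BF2 compares M1 against M2.
    """
    return sorted(
        locus
        for locus, (log_bf1, log_bf2) in log_bfs.items()
        if log_bf1 > bf1_cutoff and log_bf2 > bf2_cutoff
    )


# Made-up values for illustration only; these are not results from the paper.
example = {
    "locus_a": (12.3, 8.1),   # passes both cutoffs
    "locus_b": (7.0, 1.2),    # M1 beats M0, but M2 fits about as well as M1
    "locus_c": (-0.5, 6.4),   # no support for acceleration over M0
    "locus_d": (5.0, 9.9),    # log BF1 equals the cutoff, so not strictly above it
}

print(accelerated_loci(example))  # prints ['locus_a']
```

Requiring both Bayes factors to pass separates loci accelerated specifically on the target lineages (M1 favored over both M0 and M2) from loci that are simply conserved everywhere or accelerated on arbitrary branches.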
