Mol Biol Evol. 2015 May;32(5):1342-53.
doi: 10.1093/molbev/msv022. Epub 2015 Feb 19.

Less is more: an adaptive branch-site random effects model for efficient detection of episodic diversifying selection

Martin D Smith et al. Mol Biol Evol. 2015 May.

Abstract

Over the past two decades, comparative sequence analysis using codon-substitution models has been honed into a powerful and popular approach for detecting signatures of natural selection from molecular data. A substantial body of work has focused on developing a class of "branch-site" models which permit selective pressures on sequences, quantified by the ω ratio, to vary among both codon sites and individual branches in the phylogeny. We develop and present a method in this class, adaptive branch-site random effects likelihood (aBSREL), whose key innovation is variable parametric complexity chosen with an information theoretic criterion. By applying models of different complexity to different branches in the phylogeny, aBSREL delivers statistical performance matching or exceeding best-in-class existing approaches, while running an order of magnitude faster. Based on simulated data analysis, we offer guidelines for what extent and strength of diversifying positive selection can be detected reliably and suggest that there is a natural limit on the optimal parametric complexity for "branch-site" models. An aBSREL analysis of 8,893 Euteleostomes gene alignments demonstrates that over 80% of branches in typical gene phylogenies can be adequately modeled with a single ω ratio model, that is, current models are unnecessarily complicated. However, there are a relatively small number of key branches, whose identities are derived from the data using a model selection procedure, for which it is essential to accurately model evolutionary complexity.

Keywords: branch-site model; episodic selection; evolutionary model; model complexity; random effects model; variable selection.
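
The abstract's central idea, choosing the parametric complexity of each branch with an information theoretic criterion, can be illustrated with a minimal sketch. It assumes the criterion is the small-sample corrected AIC (AICc) and that ω rate classes are added one at a time until the score stops improving; the abstract names neither detail. The function names, the parameter count per class, and the toy log-likelihoods are hypothetical and are not taken from any aBSREL implementation.

```python
import math
from typing import Callable


def aicc(log_likelihood: float, n_params: int, sample_size: int) -> float:
    """Small-sample corrected AIC; lower values indicate a better trade-off."""
    if sample_size - n_params - 1 <= 0:
        return math.inf  # correction term is undefined; treat the model as too complex
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / (sample_size - n_params - 1)
    return aic + correction


def step_up_rate_classes(
    fit_log_likelihood: Callable[[int], float],
    sample_size: int,
    max_classes: int = 5,
) -> int:
    """Return the number of omega classes selected for one branch.

    Assumption: a branch with k classes has k omega values and k - 1 free
    mixture weights, i.e. 2k - 1 parameters.
    """
    def n_params(k: int) -> int:
        return 2 * k - 1

    best_k = 1
    best_score = aicc(fit_log_likelihood(1), n_params(1), sample_size)
    for k in range(2, max_classes + 1):
        score = aicc(fit_log_likelihood(k), n_params(k), sample_size)
        if score >= best_score:
            break  # the extra class is not justified by the gain in fit
        best_k, best_score = k, score
    return best_k


# Toy (made-up) log-likelihoods for one branch fitted with 1..4 classes.
toy_fits = {1: -1000.0, 2: -990.0, 3: -989.5, 4: -989.4}
selected = step_up_rate_classes(lambda k: toy_fits[k], sample_size=300)
```

In this toy case the second class improves the fit enough to pay for its two extra parameters, while the third does not, so the procedure stops at two classes. A branch whose fit barely changes would stay at a single ω class, which is the situation the abstract reports for most branches.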

Figures

Fig. 1.
The power of aBSREL to correctly detect branches with diversifying positive selection from the simulated alignments as a function of selection strength (ω) and proportion of sites subject to selection (A), or selection strength and the length of the simulated branch (B).
Fig. 2.
Selection analyses of the extracellular domain of the mammalian CD2 receptor with the standard BSREL and the aBSREL models. Each branch b is annotated according to the inferred ωb distribution; the total length of the branch is partitioned according to the proportion of sites in a particular class (fkb), and the color of the segment depicts the magnitude of the corresponding ωkb. Thicker branches are those with P < 0.05 (corrected for multiple testing) for rejecting the null hypothesis that all ωkb ≤ 1 on that branch, that is, branches identified as having experienced diversifying positive selection.
Fig. 3.
Selection analyses of exon 11 of the BRCA1 gene with BSREL and the aBSREL models. Annotation is the same as in figure 2.
Fig. 4.
Correlates of signal for evolutionary process complexity in the selectome data sets. Each panel depicts the fraction of all alignments reported by aBSREL as having more than one ω rate class selected by the step-up procedure (Kb), as a function of (A) the length of the alignment (codons), censored at 2,000 due to sparse sampling afterwards (binned in increments of 50 codons); (B) branch length (expected substitutions per site [binned in increments of 0.01]); (C) the number of sequences (binned in increments of 2 sequences); (D) uncorrected P value for episodic positive selection (binned in increments of 0.005). Each point represents an average over at least 100 individual branches. Lowess smoothing polynomials (smoothing span 0.25) are shown in solid light gray.
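
The figure 2 caption flags branches with P < 0.05 after correction for multiple testing, without naming the correction. As a rough illustration only, the sketch below assumes a Holm-Bonferroni step-down correction applied across the branches of one tree; the function name, branch labels, and p-values are hypothetical.

```python
from typing import Dict


def holm_bonferroni(p_values: Dict[str, float]) -> Dict[str, float]:
    """Return Holm-adjusted p-values keyed by branch name."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    adjusted: Dict[str, float] = {}
    running_max = 0.0
    for rank, (branch, p) in enumerate(ordered):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p)
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[branch] = running_max
    return adjusted


# Made-up per-branch p-values for the test of all omega <= 1.
raw = {"branch_A": 0.001, "branch_B": 0.02, "branch_C": 0.03, "branch_D": 0.5}
corrected = holm_bonferroni(raw)
selected_branches = sorted(b for b, p in corrected.items() if p < 0.05)
```

With these toy values only `branch_A` stays below 0.05 after correction, even though three raw p-values were below that threshold.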

