Mol Biol Evol. 2015 Apr;32(4):1109-12.
doi: 10.1093/molbev/msu411. Epub 2015 Jan 9.

CodABC: a computational framework to coestimate recombination, substitution, and molecular adaptation rates by approximate Bayesian computation

Miguel Arenas et al. Mol Biol Evol. 2015 Apr.

Abstract

The estimation of substitution and recombination rates can provide important insights into the molecular evolution of protein-coding sequences. Here, we present a new computational framework, called "CodABC," to jointly estimate recombination, substitution and synonymous and nonsynonymous rates from coding data. CodABC uses approximate Bayesian computation with and without regression adjustment and implements a variety of codon models, intracodon recombination, and longitudinal sampling. CodABC can provide accurate joint parameter estimates from recombining coding sequences, often outperforming maximum-likelihood methods based on more approximate models. In addition, CodABC allows for the inclusion of several nuisance parameters such as those representing codon frequencies, transition matrices, heterogeneity across sites or invariable sites. CodABC is freely available from http://code.google.com/p/codabc/, includes a GUI, extensive documentation and ready-to-use examples, and can run in parallel on multicore machines.

Keywords: approximate Bayesian computation; coding data; molecular adaptation; recombination; substitution rate.
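The rejection step that underlies approximate Bayesian computation can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not CodABC's implementation: `toy_simulator`, the single summary statistic (a count of changed sites), the observed value of 354, and the tolerance are all hypothetical stand-ins for a real coalescent and codon-model simulator and its summary statistics. Only the idea is taken from the abstract: draw parameters from a prior, simulate data, and keep the draws whose simulated summary lies close to the observed one.

```python
import math
import random
import statistics


def toy_simulator(rate, n_sites, rng):
    """Hypothetical simulator: number of sites with at least one change.

    Each site changes with probability 1 - exp(-rate). A real analysis
    would simulate recombining coding sequences instead.
    """
    p_hit = 1.0 - math.exp(-rate)
    return sum(rng.random() < p_hit for _ in range(n_sites))


def abc_rejection(observed, n_sites, prior_low, prior_high,
                  n_sims, tolerance, seed=0):
    """Plain ABC rejection sampling with a uniform prior on the rate."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        # 1. Draw a candidate parameter value from the prior.
        rate = rng.uniform(prior_low, prior_high)
        # 2. Simulate a summary statistic under that value.
        simulated = toy_simulator(rate, n_sites, rng)
        # 3. Keep the draw if the simulated summary is close to the observed one.
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted


# Assumed observation: 354 changed sites out of 900 (illustrative only).
observed_count = 354
accepted = abc_rejection(observed_count, n_sites=900,
                         prior_low=0.0, prior_high=2.0,
                         n_sims=2000, tolerance=15)
estimate = statistics.mean(accepted)
print(len(accepted), "draws accepted; posterior mean rate:", round(estimate, 3))
```

The accepted draws approximate the posterior distribution of the rate. A regression adjustment, as mentioned in the abstract, would further correct the accepted values for the remaining distance between simulated and observed summaries; that step is not shown here.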

Figures

Fig. 1.
Accuracy of CodABC using simulated data. For each combination of ρ, θ, and ω, we present the corresponding estimates for ρ (top), ω (middle), and θ (bottom). Dashed lines indicate the true value. Points represent the mode of the prior distributions and error bars indicate the 95% CI.

Fig. 2.
CodABC computing times. The simulated data contain 15 sequences with 900 nucleotides. The first real data set contains 22 sequences with 864 nucleotides. The second real data set contains 20 sequences with 894 nucleotides. The third real data set is the largest and contains 55 sequences with 1,449 nucleotides. Prior distributions: ρ: U(0,50), θ: U(0,300), and ω: U(0,2). The analyses were run on an Intel Xeon CPU 2.33 GHz with 24 cores.
