Mol Biol Evol. 2012 Apr;29(4):1115-23.
doi: 10.1093/molbev/msr268. Epub 2011 Dec 8.

ALF--a simulation framework for genome evolution

Daniel A Dalquen et al. Mol Biol Evol. 2012 Apr.

Abstract

In computational evolutionary biology, verification and benchmarking are challenging tasks because the evolutionary history of the biological entities under study is usually not known. Computer programs for simulating sequence evolution in silico have been shown to be viable test beds for verifying newly developed methods and for comparing different algorithms. However, current simulation packages tend to focus either on gene-level aspects of genome evolution, such as character substitutions and insertions and deletions (indels), or on genome-level aspects, such as genome rearrangement and speciation events. Here, we introduce Artificial Life Framework (ALF), which aims at simulating the entire range of evolutionary forces that act on genomes: nucleotide, codon, or amino acid substitution (under simple or mixture models), indels, GC-content amelioration, gene duplication, gene loss, gene fusion, gene fission, genome rearrangement, lateral gene transfer (LGT), and speciation. The other distinctive feature of ALF is its user-friendly yet powerful web interface. We illustrate the utility of ALF with two possible applications: 1) we reanalyze data from a study of selection after globin gene duplication and test the statistical significance of the original conclusions, and 2) we demonstrate that LGT can dramatically decrease the accuracy of two well-established orthology inference methods. ALF is available as a stand-alone application or via a web interface at http://www.cbrg.ethz.ch/alf.
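
To make the idea of evolving sequences along a species tree concrete, here is a minimal toy sketch. It is not ALF's implementation or interface: the tree encoding, the function names `mutate` and `evolve`, and the per-site substitution rule (each site changes with probability equal to the branch length) are all assumptions made for illustration, and only substitutions are modelled (no indels, duplications, or LGT).

```python
import random

ALPHABET = "ACGT"

def mutate(seq, branch_length, rng):
    """Toy substitution model (assumption): each site changes with
    probability min(branch_length, 1.0) to a different nucleotide."""
    p = min(branch_length, 1.0)
    out = []
    for base in seq:
        if rng.random() < p:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def evolve(tree, seq, rng):
    """Evolve seq down a tree given as (name, branch_length, children).
    Returns a dict mapping each leaf name to its simulated sequence."""
    name, length, children = tree
    seq = mutate(seq, length, rng)
    if not children:
        return {name: seq}
    leaves = {}
    for child in children:
        leaves.update(evolve(child, seq, rng))
    return leaves

# Hypothetical three-species tree: ((B, C), A)
species_tree = ("root", 0.0, [
    ("A", 0.1, []),
    ("BC", 0.05, [("B", 0.05, []), ("C", 0.05, [])]),
])

rng = random.Random(42)
root_seq = "".join(rng.choice(ALPHABET) for _ in range(60))
leaves = evolve(species_tree, root_seq, rng)
for leaf_name, leaf_seq in sorted(leaves.items()):
    print(leaf_name, leaf_seq)
```

Because the true tree and root sequence are known by construction, the leaf sequences produced this way can serve as a benchmark with a known evolutionary history, which is the motivation the abstract describes.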

Figures

Fig. 1.
Overview of the ALF simulation process. A root genome is evolved along a species tree. Events at the site, sequence and genome level are simulated iteratively.
Fig. 2.
The distribution of ML estimates for ω2γ from simulation with ALF (a) for one run with sequence length matching the real data (144 codons; other data shown in supplementary fig. 1, Supplementary Material online), and (b) for sequences of 10,000 codons. Data were simulated under MD with ω2γ = 1; all other parameters are as in table 1.
Fig. 3.
Precision/recall of orthology predictions with different proportions of genes with a history of duplications and/or LGT. Each data point corresponds to the mean of five independent runs using the same parameters (with 95% confidence interval in both dimensions).
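
The precision/recall summary described for Fig. 3 can be sketched as follows. This is an illustrative sketch only: the representation of ortholog pairs as unordered `frozenset`s, the function names, the example gene identifiers, and the use of a Student t interval (critical value 2.776 for five runs, i.e. 4 degrees of freedom) are assumptions, since the page does not state how the confidence intervals were computed.

```python
from math import sqrt
from statistics import mean, stdev

def precision_recall(predicted, truth):
    """Precision and recall of predicted ortholog pairs against the
    true pairs known from the simulation (both are sets of pairs)."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall

def mean_ci95(values, t_crit=2.776):
    """Mean and 95% CI half-width; t_crit=2.776 assumes five runs."""
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return mean(values), half_width

# Hypothetical gene pairs; unordered, so (a, b) equals (b, a).
truth = {frozenset(p) for p in [("a1", "b1"), ("a2", "b2"),
                                ("a3", "b3"), ("a4", "b4")]}
predicted = {frozenset(p) for p in [("a1", "b1"), ("b2", "a2"),
                                    ("a3", "b4")]}

p, r = precision_recall(predicted, truth)
print(f"precision={p:.3f} recall={r:.3f}")

# Hypothetical precision values from five independent runs.
run_precisions = [0.80, 0.82, 0.78, 0.81, 0.79]
m, hw = mean_ci95(run_precisions)
print(f"mean={m:.3f} +/- {hw:.3f}")
```

Applying `mean_ci95` separately to the precision and recall values of the runs gives an interval in each dimension, as in the figure.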
