BMC Bioinformatics. 2010 Jul 15;11:379.
doi: 10.1186/1471-2105-11-379.

MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics

Raphaël Helaers et al. BMC Bioinformatics. 2010.

Abstract

Background: Over the last decade, the development of stochastic heuristics implemented in robust software applications has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of phylogeny inference software is often dictated not by the raw performance of the implemented algorithm(s) but by practical issues such as ergonomics and/or the availability of specific functionalities.

Results: Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA), together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of the substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. MetaPIGA v2.0 also offers an extensive graphical user interface for setting parameters, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs on 32- and 64-bit systems, and takes advantage of multiprocessor and multicore computers.
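
As a rough illustration of the three model-selection criteria named above, the following minimal Python sketch applies their textbook definitions to a set of candidate models. The log-likelihoods, the alignment length, and the function names are hypothetical values chosen for the example and do not come from MetaPIGA; branch lengths are left out of the parameter counts for simplicity, which adds the same constant to every model and does not change the ranking.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic for nested models: 2 (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)

# Hypothetical (log-likelihood, free substitution parameters) per model.
candidates = {
    "JC": (-5250.0, 0),
    "HKY85": (-5120.0, 4),
    "GTR": (-5115.0, 8),
}
n_sites = 1000  # hypothetical alignment length

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))

# HKY85 is nested in GTR with 4 fewer free parameters; the statistic is
# compared with the chi-square critical value for 4 degrees of freedom.
CHI2_CRIT_DF4_P05 = 9.488
stat = lrt_statistic(candidates["HKY85"][0], candidates["GTR"][0])
gtr_preferred_by_lrt = stat > CHI2_CRIT_DF4_P05

print(best_aic, best_bic, stat, gtr_preferred_by_lrt)
```

With these made-up numbers the criteria disagree: AIC and the likelihood ratio test favor the richer model, while BIC, which penalizes parameters more heavily for long alignments, favors the simpler one.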

Conclusions: The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementing the metaGA together with additional stochastic heuristics in a single software package will allow rigorous optimization of each heuristic as well as a meaningful comparison of performance among these algorithms. MetaPIGA v2.0 gives the phylogeneticist access to high customization, and offers the non-specialist an ergonomic interface and functionalities that assist sound inference of large phylogenetic trees using nucleotide sequences. MetaPIGA v2.0 and its extensive user manual are freely available to academics at http://www.metapiga.org.

Figures

Figure 1
The MetaPIGA-2.0 heuristic setting window. The user can set the parameters of the chosen heuristic: here, for the metaGA, the consensus type, the selection scheme, the operator behavior, the number of cores/processors (here, 2 cores) onto which the populations are distributed, the number of populations and the number of individuals per population, the tolerance parameter (frequency with which internal branches are affected by mutational operators even if that branch is present in all trees across all populations), and the frequency of hybridization among populations. See text for details.
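
The tolerance parameter described in this caption can be sketched as follows. This is only an illustration under stated assumptions, not MetaPIGA's implementation: each tree is reduced to a set of internal branches (splits), each split is written as the group of taxa on the side that excludes a fixed reference taxon, and the names `consensus_splits` and `is_mutable` are hypothetical.

```python
import random

def consensus_splits(trees):
    """Return the splits (internal branches) present in every tree."""
    return set.intersection(*(set(tree) for tree in trees))

def is_mutable(split, consensus, tolerance, rng):
    """Decide whether a mutational operator may affect this branch.

    Branches that are not shared by all populations are always open to
    mutation. Branches shared by all populations are affected only with
    probability `tolerance`.
    """
    if split not in consensus:
        return True
    return rng.random() < tolerance

# Hypothetical best trees from three populations on taxa A..E; each
# split lists the taxa on the side that excludes reference taxon E.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AB"), frozenset("ABC")},
]
consensus = consensus_splits(trees)

rng = random.Random(0)
for split in sorted(trees[0], key=len):
    label = "".join(sorted(split))
    print(label, is_mutable(split, consensus, tolerance=0.05, rng=rng))
```

A tolerance of 0 would freeze every branch on which all populations agree, while a tolerance of 1 would ignore the agreement entirely; intermediate values let the search occasionally revisit such branches.
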
Figure 2
The MetaPIGA-2.0 model setting window. The user can choose a substitution model and set the corresponding parameters: here, the GTR model (and estimated starting values of the rate matrix) with rate heterogeneity (discrete Gamma model) and no proportion of invariable sites has been selected automatically after performing a likelihood ratio test (lower left buttons). The user can also choose how and when intra-step optimization of target parameters (here, branch lengths, rate matrix parameters, and the alpha shape parameter of the Gamma distribution) will be performed (here, at the end of the search, using a genetic algorithm). Note that, as the metaGA is a stochastic heuristic, most of the parameter optimization occurs inter-step, i.e., across generations under the effect of operators (see Figure 3).
Figure 3
The MetaPIGA-2.0 operators setting window. The user selects the operators (affecting topology, branch lengths, and model parameters), their frequencies (unless they are ordered or randomly selected; upper right radio-button panel) and whether their frequencies are dynamically adapted (here, every 100 generations but never set below 4%) depending on their relative efficiencies in improving the best-tree likelihood. See text for details.
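
One simple way to adapt operator frequencies with a lower bound, as this caption describes, is sketched below. The update rule is an assumption made for illustration (the caption does not state the rule MetaPIGA uses): each operator keeps a guaranteed floor of 4%, and the remaining probability mass is shared in proportion to the likelihood gain credited to each operator since the last update. The operator labels and gain values are hypothetical.

```python
def adapt_frequencies(gains, floor=0.04):
    """Recompute operator frequencies from non-negative likelihood gains.

    Every operator receives at least `floor`; the remaining mass
    (1 - n * floor) is split proportionally to each operator's gain.
    """
    n = len(gains)
    if n * floor > 1.0:
        raise ValueError("floor is too high for this many operators")
    total = sum(gains.values())
    if total <= 0.0:
        # No operator improved the best tree: fall back to equal shares.
        return {op: 1.0 / n for op in gains}
    spare = 1.0 - n * floor
    return {op: floor + spare * gain / total for op, gain in gains.items()}

# Hypothetical gains accumulated over the last 100 generations.
gains = {"op_a": 6.0, "op_b": 3.0, "op_c": 1.0, "op_d": 0.0}
frequencies = adapt_frequencies(gains)

for op, freq in frequencies.items():
    print(f"{op}: {freq:.3f}")
```

In a search loop this function would be called once per adaptation interval (every 100 generations in the caption's example), after which the gain counters would be reset.
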
Figure 4
The MetaPIGA-2.0 run window. Here, a metaGA search with multiple replicates has been chosen. Hence, the run window shows, for 3 successive replicates (lower left panel), the current best-tree likelihood progression in each population, as well as (right panel) the current topology, metaGA branch support values, and average branch lengths of the consensus among the best trees (one for each population) from all replicates.
Figure 5
The MetaPIGA-2.0 Tree Viewer. The tree selected in the list of available trees (left panel) is shown in the right panel with its likelihood and model parameters. Trees can be viewed, rerooted, and printed. Likelihood can be recomputed after changing the substitution model or optimizing model parameters. Various tools allow modifying the list of trees.
