Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2013 Oct;20(10):738-54.
doi: 10.1089/cmb.2013.0073. Epub 2013 Sep 14.

Reconciliation revisited: handling multiple optima when reconciling with duplication, transfer, and loss

Affiliations

Reconciliation revisited: handling multiple optima when reconciling with duplication, transfer, and loss

Mukul S Bansal et al. J Comput Biol. 2013 Oct.

Abstract

Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While duplication-loss (DL) reconciliation leads to a unique maximum-parsimony solution, duplication-transfer-loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. This problem is further exacerbated by the fact that different event cost assignments yield different sets of optimal reconciliations. Here, we present an effective, efficient, and scalable method for dealing with these fundamental problems in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We show that even gene trees with only a few dozen genes often have millions of optimal reconciliations and present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn(2)) time per sample, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mappings and event assignments and to investigate the impact of varying event costs. We apply our method to a biological dataset of approximately 4700 gene trees from 100 taxa and observe that 93% of event assignments and 73% of mappings remain consistent across different multiple optima. Our analysis represents the first systematic investigation of the space of optimal DTL reconciliations and has many important implications for the study of gene family evolution.

PubMed Disclaimer

Figures

FIG. 1.
FIG. 1.
Multiple optimal reconciliations. Parts (a) and (b) show two different reconciliations for the gene tree and species tree depicted in the figure. Both of the reconciliations are optimal for event costs PΔ = 1, PΘ = 3, and Ploss = 1. The reconciliation in part (a) invokes one duplication, one transfer, and two losses, while the reconciliation in part (b) invokes two transfers.
FIG. 2.
FIG. 2.
Number of optimal reconciliations for gene trees in the biological dataset. The pie chart in part (a) shows the distribution of the number of optimal reconciliations for the different gene trees in the biological dataset. The dot plot in part (b) plots the size (in the number of internal nodes) and the number of optimal reconciliations for each gene tree. Due to arithmetic overflow concerns, results are only shown for the 4699 (out of 4735) gene trees that had fewer than 1016 optima.
FIG. 3.
FIG. 3.
Mappings from transfer events. Assuming that transfers, duplications, and losses have an equal cost, any optimal reconciliation of G and S must have a cost of 2. The figure depicts two such optimal reconciliations, one in which the transfer node maps to species A and one in which it maps to the species represented by lca(A,B). These two mappings are shown by the dashed blue lines. All other nodes have identical mappings in the two reconciliations. Both reconciliations have one loss and one transfer, and the reconciliation in which the transfer node maps to lca(A,B) shows that transfer node mappings are not constrained in the manner of speciation or duplication nodes as shown in Theorem 4.1.
FIG. 4.
FIG. 4.
Stability of mappings and event assignments. The plot in part (a) shows the fraction of internal nodes from the 4699 gene trees that have the same mapping or the same event assignment across at least a certain fraction of the 500 samples. The plot in part (b) plots the fraction of the 4699 gene trees that have at least a certain fraction of their nodes with a consistent mapping or a consistent event assignment across all 500 samples.
FIG. 5.
FIG. 5.
Consistency of mappings and event assignments by number of optimal reconciliations. (a) Fraction of gene tree nodes that have a consistent mapping across all 500 samples. (b) Fraction of gene tree nodes that have a consistent event assignment across all 500 samples.

References

    1. Bansal M.S. Burleigh J.G. Eulenstein O. Wehe A. Heuristics for the gene-duplication problem: A Θ(n) speed-up for the local search. RECOMB. 2007:238–252.
    1. Bansal M.S. Alm E.J. Kellis M. Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss. Bioinformatics. 2012;28:283–291. - PMC - PubMed
    1. Bonizzoni P. Vedova G.D. Dondi R. Reconciling a gene tree to a species tree under the duplication cost model. Theor. Comput. Sci. 2005;347:36–53.
    1. Burleigh J.G. Bansal M.S. Eulenstein O., et al. Genome-scale phylogenetics: Inferring the plant tree of life from 18,896 gene trees. Syst. Biol. 2011;60:117–125. - PMC - PubMed
    1. Charleston M. Jungles: A new solution to the host-parasite phylogeny reconciliation problem. Mathematical Biosciences. 1998;149:191–223. - PubMed

Publication types

LinkOut - more resources