Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2011 Feb 15;12 Suppl 1(Suppl 1):S4.
doi: 10.1186/1471-2105-12-S1-S4.

On the PATHGROUPS approach to rapid small phylogeny

Affiliations

On the PATHGROUPS approach to rapid small phylogeny

Chunfang Zheng et al. BMC Bioinformatics. .

Abstract

We present a data structure enabling rapid heuristic solution to the ancestral genome reconstruction problem for given phylogenies under genomic rearrangement metrics. The efficiency of the greedy algorithm is due to fast updating of the structure during run time and a simple priority scheme for choosing the next step. Since accuracy deteriorates for sets of highly divergent genomes, we investigate strategies for improving accuracy and expanding the range of data sets where accurate reconstructions can be expected. This includes a more refined priority system, and a two-step look-ahead, as well as iterative local improvements based on a the median version of the problem, incorporating simulated annealing. We apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Representation of the median (left) and more general small phylogeny (right) problems. Grey squares indicate given genomes, red squares those to be reconstructed. Each line connecting two genomes represents a breakpoint graph and a distance.
Figure 2
Figure 2
Priorities of all pathgroups of form [(x, a), (x, b), (x, c)] for inserting red edges, for each ancestral vertex in the median problem. Includes sketch of three paths in “x” pathgroup plus other paths involved in calculating priority. For example, completing the pathgroup [(x, y), (x, y), (x, z)] by adding the red edge xy always produces two cycles, but can set up a pathgroup with 3 potential cycles (priority 2), 2 potential cycles (priority 3) or 1 potential cycles (priority 4).
Figure 3
Figure 3
Effect of refined priority and two-step look-ahead on error. τ = total branch length in tree with known simulated ancestors, n = number of genes per genome, d = total branch length in tree with reconstructed ancestors. Averages of 10 runs.
Figure 4
Figure 4
Iterative improvement of solution to small phylogeny as a function of rearrangement rate. The median reconstruction is based on the two-step look-ahead. Each data point represents the average of 10 runs.
Figure 5
Figure 5
(left)Increase in run time due to refined priority and look-ahead. Based on 15-trees. (right) Dependence of average time for first five iterations on number of leaves in phylogeny and rearrangement rate on branches.
Figure 6
Figure 6
Phylogeny of seven yeasts [14], including pre-WGD Saccharaomyces ancestor, from gene sequences (left) and gene order (right).

Similar articles

Cited by

References

    1. Sankoff D, Blanchette M. In: Computing and Combinatorics (COCOON). 3rd Annual Conference, LNCS. Jiang T, Lee DT, editor. Vol. 1276. 1997. The median problem for breakpoints in comparative genomics; pp. 251–263. full_text.
    1. Zheng C. Pathgroups, a dynamic data structure for genome reconstruction problems. Bioinformatics. 2010;26:1587–1594. doi: 10.1093/bioinformatics/btq255. - DOI - PubMed
    1. Sankoff D, Blanchette M. Multiple genome rearrangement and breakpoint phylogeny. J Comput Biol. 1998;5:555–570. doi: 10.1089/cmb.1998.5.555. - DOI - PubMed
    1. El-Mabrouk N, Sankoff D. The reconstruction of doubled genomes. SIAM J Comput. 2003;32:754–92. doi: 10.1137/S0097539700377177. - DOI
    1. Warren R, Sankoff D. Genome aliquoting with double cut and join. BMC Bioinformatics. 2009;10(Suppl 1):S2. doi: 10.1186/1471-2105-10-S1-S2. - DOI - PMC - PubMed

Publication types

LinkOut - more resources