Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 1998 Fall;5(3):555-70.
doi: 10.1089/cmb.1998.5.555.

Multiple genome rearrangement and breakpoint phylogeny

Affiliations

Multiple genome rearrangement and breakpoint phylogeny

D Sankoff et al. J Comput Biol. 1998 Fall.

Abstract

Multiple alignment of macromolecular sequences generalizes from N = 2 to N > or = 3 the comparison of N sequences which have diverged through the local processes of insertion, deletion and substitution. Gene-order sequences diverge through non-local genome rearrangement processes such as inversion (or reversal) and transposition. In this paper we show which formulations of multiple alignment have counterparts in multiple rearrangement. Based on difficulties inherent in rearrangement edit-distance calculation and interpretation, we argue for the simpler "breakpoint analysis." Consensus-based multiple rearrangement of N > or = 3 orders can be solved exactly through reduction to instances of the Travelling Salesman Problem (TSP). We propose a branch-and-bound solution to TSP particularly suited to these instances. Simulations show how non-uniqueness of the solution is attenuated with increasing numbers of data genomes. Tree-based multiple alignment can be achieved to a great degree of accuracy by decomposing the tree into a number of overlapping 3-stars centered on the non-terminal nodes, and solving the consensus-based problem iteratively for these nodes until convergence. Accuracy improves with very careful initializations at the non-terminal nodes. The degree of non-uniqueness of solutions depends on the position of the node in the tree in terms of path length to the terminal vertices.

PubMed Disclaimer

Publication types

LinkOut - more resources