Comparative Study
Proc Natl Acad Sci U S A. 2006 Jan 3;103(1):129-34.
doi: 10.1073/pnas.0509737102. Epub 2005 Dec 22.

Mapping of orthologous genes in the context of biological pathways: An application of integer programming

Fenglou Mao et al. Proc Natl Acad Sci U S A. 2006.

Abstract

Mapping biological pathways across microbial genomes is an important technique in functional studies of biological systems. Existing methods rely mainly on sequence-based orthologous gene mapping, which often gives suboptimal results because sequence similarity alone does not carry enough information to identify orthology relationships accurately. Here we present an algorithm for pathway mapping across microbial genomes that takes into account both sequence similarity and genomic structure information such as operons and regulons. One basic premise of our approach is that a microbial pathway can generally be decomposed into a few operons or regulons. We formulate the pathway-mapping problem as mapping genes across genomes so as to maximize their sequence similarity, under the constraint that the mapped genes be grouped into a few operons, preferably coregulated in the target genome. We have developed an integer-programming algorithm for solving this constrained optimization problem and implemented it as a software program, p-map. We have tested p-map on a number of known homologous pathways. We conclude that using genomic structure information as constraints can greatly improve pathway-mapping accuracy over methods that use sequence-similarity information alone.
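
The trade-off described in the abstract, high sequence similarity versus grouping the mapped genes into few operons, can be illustrated with a toy example. The sketch below is not p-map and does not use an integer-programming solver; it enumerates all assignments by brute force, which is only feasible for tiny inputs. The gene names, scores, operon assignments, the one-to-one mapping rule, and the `operon_penalty` weight are all hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical toy data: similarity scores of each template gene against
# its candidate genes in the target genome (higher is better).
hits = {
    "tA": {"g1": 90.0, "g7": 95.0},
    "tB": {"g2": 80.0, "g8": 78.0},
    "tC": {"g3": 70.0, "g9": 72.0},
}

# Hypothetical operon membership of each candidate target gene.
operon_of = {
    "g1": "op1", "g2": "op1", "g3": "op1",
    "g7": "op2", "g8": "op3", "g9": "op4",
}


def map_pathway(hits, operon_of, operon_penalty):
    """Return (score, mapping) maximizing total similarity minus a
    penalty for every distinct operon the mapped genes fall into."""
    templates = sorted(hits)
    best_score, best_mapping = float("-inf"), None
    # Enumerate every way of picking one candidate per template gene.
    for choice in product(*(sorted(hits[t]) for t in templates)):
        if len(set(choice)) < len(choice):
            continue  # assumed rule: one target gene per template gene
        similarity = sum(hits[t][g] for t, g in zip(templates, choice))
        operons_used = {operon_of[g] for g in choice}
        score = similarity - operon_penalty * len(operons_used)
        if score > best_score:
            best_score, best_mapping = score, dict(zip(templates, choice))
    return best_score, best_mapping


# Similarity alone picks the best hit per gene, scattered over three operons.
print(map_pathway(hits, operon_of, operon_penalty=0.0))
# With an operon penalty, the slightly weaker hits in a single operon win.
print(map_pathway(hits, operon_of, operon_penalty=10.0))
```

With the penalty set to zero the mapping follows the top hit for each gene; with a penalty of 10 per operon the three genes of `op1` are preferred even though each has a slightly lower score. A soft penalty is only one way to encode the "few operons" requirement; a hard upper bound on the number of operons would be another.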


Figures

Fig. 1.
IP formulation for the pathway-mapping problem. Circles represent genes in the template pathway; rectangles represent candidate genes in the target genome, obtained through a BLAST search with a specific E-value cutoff; cylinders represent operons; and cubes represent regulons. A line between a template gene and a target candidate gene represents a BLAST hit. A line between a target candidate gene and an operon indicates that the gene belongs to that operon, and a line between an operon and a regulon indicates that the operon belongs to that regulon.
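
The layered structure in the caption (template gene, candidate gene, operon, regulon) can be held in a small container like the one sketched below. This is an illustrative data structure only, not the representation used by p-map; the class name, method names, and the default cutoff of `1e-5` are assumptions, since the caption does not state the cutoff value.

```python
from dataclasses import dataclass, field


@dataclass
class MappingGraph:
    """Layered graph: template gene -> candidate gene -> operon -> regulon."""
    blast_hits: dict = field(default_factory=dict)  # template -> {candidate: E-value}
    operon_of: dict = field(default_factory=dict)   # candidate gene -> operon
    regulon_of: dict = field(default_factory=dict)  # operon -> regulon

    def add_hit(self, template, candidate, evalue, cutoff=1e-5):
        """Record a hit edge only if it passes the (placeholder) E-value cutoff."""
        if evalue <= cutoff:
            self.blast_hits.setdefault(template, {})[candidate] = evalue
            return True
        return False

    def trace(self, template):
        """List (candidate, operon, regulon) paths reachable from a template gene."""
        paths = []
        for cand in sorted(self.blast_hits.get(template, {})):
            operon = self.operon_of.get(cand)      # None if the operon is unknown
            regulon = self.regulon_of.get(operon)  # None if the regulon is unknown
            paths.append((cand, operon, regulon))
        return paths


# Hypothetical usage with made-up gene, operon, and regulon names.
graph = MappingGraph()
graph.add_hit("tA", "g1", 1e-30)  # kept: passes the cutoff
graph.add_hit("tA", "g5", 0.5)    # dropped: fails the cutoff
graph.operon_of["g1"] = "op1"
graph.regulon_of["op1"] = "reg1"
print(graph.trace("tA"))
```

Each edge type in the figure corresponds to one dictionary, so following a template gene through to its regulon is a chain of three lookups.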
