Comparative Study
Nucleic Acids Res. 2000 Oct 15;28(20):4021-8.
doi: 10.1093/nar/28.20.4021.

A heuristic graph comparison algorithm and its application to detect functionally related enzyme clusters

H Ogata et al. Nucleic Acids Res.

Abstract

The availability of computerized knowledge on biochemical pathways in the KEGG database opens new opportunities for developing computational methods to characterize and understand higher level functions of complete genomes. Our approach is based on the concept of graphs; for example, the genome is a graph with genes as nodes and the pathway is another graph with gene products as nodes. We have developed a simple method for graph comparison to identify local similarities, termed correlated clusters, between two graphs, which allows gaps and mismatches of nodes and edges and is especially suitable for detecting biological features. The method was applied to a comparison of the complete genomes of 10 microorganisms and the KEGG metabolic pathways, which revealed, not surprisingly, a tendency for formation of correlated clusters called FRECs (functionally related enzyme clusters). However, this tendency varied considerably depending on the organism. The relative number of enzymes in FRECs was close to 50% for Bacillus subtilis and Escherichia coli, but was <10% for Synechocystis and Saccharomyces cerevisiae. The FRECs collection is reorganized into a collection of ortholog group tables in KEGG, which represents conserved pathway motifs with the information about gene clusters in all the completely sequenced genomes.

Figures

Figure 1
A schematic representation of the graph comparison algorithm to detect correlated clusters or local similarities in two graphs, given a list of correspondences between vertices (nodes) from the two graphs. Initially, each pair of corresponding vertices is a separate cluster. Then similar clusters (shaded) are merged progressively by single linkage with a given measure of similarity.
Figure 2
An example of E. coli FRECs. Seven enzymes catalyzing successive reaction steps in the peptidoglycan biosynthesis pathway are located close together along the E. coli chromosome. Open arrows with dotted lines indicate the correspondences between the enzymes and their genes. While the figure shows a part of the genome that was detected as a FREC, a larger gene cluster associated with membrane structure and cell division proteins is found at this chromosomal location. It consists of 14 genes: the seven enzyme genes and ftsW shown here, two upstream genes (ftsL, ftsI) and four downstream genes (ftsQ, ftsA, ftsZ, lpxC).
Figure 3
The size distribution of E. coli FRECs. The number of FRECs is plotted against the number of enzyme genes in a FREC for the cases where a FREC is identical to a known or predicted operon (filled bar), a FREC partially overlaps with an operon sharing at least two enzyme genes (shaded bar) and a FREC shares just one gene or does not correspond at all to an operon (open bar).
Figure 4
The number of enzyme genes in FRECs (filled bar) and the total number of enzyme genes (open bar), together with the ratio of the two, in 10 organisms (see Table 1 for abbreviations).
Figure 5
The ortholog group table for peptidoglycan biosynthesis. Abbreviations (see also Table 1): Rpr, Rickettsia prowazekii; Mtu, Mycobacterium tuberculosis; Ctr, Chlamydia trachomatis; Cpn, Chlamydia pneumoniae; Bbu, Borrelia burgdorferi; Tpa, Treponema pallidum; Dra, Deinococcus radiodurans; Aae, Aquifex aeolicus; Tma, Thermotoga maritima.
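
The merging procedure described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not state the similarity measure, so the sketch assumes two clusters are similar when some pair in one and some pair in the other lie within a fixed number of edges of each other in both graphs. The names `bfs_distances`, `correlated_clusters`, `max_hops1` and `max_hops2` are hypothetical, as is the toy data.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source, limit):
    """Hop distances from `source` to every node reachable within `limit` edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == limit:
            continue  # do not expand beyond the allowed gap
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def correlated_clusters(adj1, adj2, pairs, max_hops1=1, max_hops2=1):
    """Single-linkage merging of vertex correspondences (assumed similarity rule).

    adj1, adj2: undirected adjacency dicts for the two graphs.
    pairs: list of (vertex_in_graph1, vertex_in_graph2) correspondences.
    Returns the clusters that contain more than one correspondence.
    """
    parent = list(range(len(pairs)))  # each pair starts as its own cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    near1 = {u: bfs_distances(adj1, u, max_hops1) for u, _ in pairs}
    near2 = {v: bfs_distances(adj2, v, max_hops2) for _, v in pairs}

    for i, j in combinations(range(len(pairs)), 2):
        (u1, v1), (u2, v2) = pairs[i], pairs[j]
        # Assumed rule: link two pairs when they are close in BOTH graphs.
        if u2 in near1[u1] and v2 in near2[v1]:
            parent[find(i)] = find(j)

    clusters = {}
    for i, pair in enumerate(pairs):
        clusters.setdefault(find(i), []).append(pair)
    return [members for members in clusters.values() if len(members) > 1]


# Toy data (hypothetical): a genome as a chain of five genes, and a pathway
# with three consecutive enzymes plus one unconnected enzyme.
genome = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"],
          "g4": ["g3", "g5"], "g5": ["g4"]}
pathway = {"e1": ["e2"], "e2": ["e1", "e3"], "e3": ["e2"], "e9": []}
gene_enzyme_pairs = [("g1", "e1"), ("g2", "e2"), ("g3", "e3"), ("g5", "e9")]

clusters = correlated_clusters(genome, pathway, gene_enzyme_pairs)
```

Raising `max_hops1` or `max_hops2` in this sketch tolerates intervening vertices that have no counterpart in the other graph, which is one simple way to model the gaps mentioned in the abstract. Because linkage is single, a cluster can grow along a chain of pairwise-close correspondences even when its two ends are far apart.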
