Comparative Study
Nucleic Acids Res. 2004 Apr 19;32(7):2147-57.
doi: 10.1093/nar/gkh510. Print 2004.

Operon prediction by comparative genomics: an application to the Synechococcus sp. WH8102 genome

X Chen et al. Nucleic Acids Res. 2004.

Abstract

We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered a candidate operon if both their gene sequences and functions are conserved across several phylogenetically related genomes. In addition, various supporting data for operons are collected through the application of public domain computer programs and used in our prediction method. These include the prediction of conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require much experimental data (such as gene expression data and pathway data) as input. This feature makes it applicable to many newly sequenced genomes that do not have extensive experimental information. To validate our predictions, we tested the method on Escherichia coli K12, in which operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.
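The core idea in the abstract, that consecutive genes form a candidate operon when their arrangement is conserved in a related genome, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a single comparison genome, a precomputed ortholog mapping, and a deliberately simple rule (same strand, orthologs adjacent in the other genome). The function name `candidate_operons`, the gene identifiers, and the toy data are all hypothetical, and the supporting evidence the abstract mentions (conserved functions, promoter motifs, terminators) is left out.

```python
from typing import Dict, List, Optional, Tuple

Gene = Tuple[str, str]  # (gene_id, strand), listed in chromosomal order


def candidate_operons(
    reference: List[Gene],
    other_order: Dict[str, int],
    orthologs: Dict[str, str],
    min_size: int = 2,
) -> List[List[str]]:
    """Group consecutive same-strand reference genes whose orthologs are
    also neighbours in the comparison genome (an assumed, simplified rule)."""

    def conserved_pair(a: Gene, b: Gene) -> bool:
        (id_a, strand_a), (id_b, strand_b) = a, b
        if strand_a != strand_b:
            return False
        orth_a, orth_b = orthologs.get(id_a), orthologs.get(id_b)
        if orth_a is None or orth_b is None:
            return False
        pos_a, pos_b = other_order.get(orth_a), other_order.get(orth_b)
        if pos_a is None or pos_b is None:
            return False
        # Adjacent in the comparison genome, in either orientation.
        return abs(pos_a - pos_b) == 1

    runs: List[List[str]] = []
    current: List[str] = []
    prev: Optional[Gene] = None
    for gene in reference:
        if prev is not None and conserved_pair(prev, gene):
            current.append(gene[0])
        else:
            if len(current) >= min_size:
                runs.append(current)
            current = [gene[0]]
        prev = gene
    if len(current) >= min_size:
        runs.append(current)
    return runs


# Toy data (hypothetical identifiers, not taken from any real genome).
reference = [("a1", "+"), ("a2", "+"), ("a3", "+"), ("a4", "-"), ("a5", "-")]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b7", "a4": "b4", "a5": "b5"}
other_order = {"b1": 0, "b2": 1, "b4": 3, "b5": 4, "b7": 6}

result = candidate_operons(reference, other_order, orthologs)
print(result)  # [['a1', 'a2'], ['a4', 'a5']]
```

In this toy case, `a3` is excluded because its ortholog is not adjacent to that of `a2`, and the strand change between `a3` and `a4` starts a new run. A fuller treatment would combine evidence from several genomes and score each candidate, as the abstract describes.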

Figures

Figure 1. An outline of the comparative genomics method for operon prediction.
Figure 2. A simple illustration of a three-stage gene-matching graph. Each oval represents a genome, and a link between two genomes represents a pair of matched genes.
Figure 3. Cumulative distributions of the likelihood scores of predicted and verified operons.
Figure 4. The three-stage gene-matching graph for the three microbial genomes. Each link between two genomes represents a pair of matched genes.
Figure 5. The 446 predicted operons in WH8102 and their distributions in MED4 and MIT9313. Each link represents a candidate operon conserved between two genomes, and different operons are shown using different colors.
Figure 6. Distributions of the putative operon sizes and intergenic distances.
Figure 7. Distribution of likelihood scores of the putative operons in WH8102.
