Nucleic Acids Res. 2005 May 2;33(8):2521-30.
doi: 10.1093/nar/gki545. Print 2005.

Nebulon: a system for the inference of functional relationships of gene products from the rearrangement of predicted operons

Sarath Chandra Janga et al. Nucleic Acids Res.

Abstract

Since operons are unstable across prokaryotes, it has been suggested that they re-combine in a conservative manner. Thus, genes belonging to a given operon in one genome might re-associate in other genomes, revealing functional relationships among gene products. We developed a system, Nebulon, to build networks of functional relationships of gene products based on their organization into operons in any available genome. The operon predictions are based on inter-genic distances. Our system can use different kinds of thresholds to accept a functional relationship, related either to the prediction of operons or to the number of non-redundant genomes that support the associations. We also work by shells, meaning that we decide on the number of linking iterations to allow for the complementation of related gene sets. The method shows high reliability when benchmarked against knowledge bases of functional interactions. We also illustrate the use of Nebulon in finding new members of regulons and of other functional groups of genes. Operon rearrangements produce thousands of high-quality new interactions per prokaryotic genome, and thousands of confirmations per genome of other predictions, making this approach another important tool for the inference of functional interactions from genomic context.
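The linking procedure described in the abstract can be illustrated with a minimal sketch. It is not the Nebulon implementation: it assumes operon predictions are already available as lists of gene-family labels per genome, that the genomes supplied are already non-redundant, and that a "shell" is one round of following accepted links outward from a seed gene. The names `build_links`, `expand_shells`, `min_genomes` and the toy labels `g1` to `g6` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_links(operons_by_genome, min_genomes=1):
    """Link two genes when they share a predicted operon in >= min_genomes genomes.

    operons_by_genome maps a genome name to a list of predicted operons,
    each operon being a list of gene-family labels (assumed input format).
    """
    support = defaultdict(set)  # gene pair -> genomes supporting the pair
    for genome, operons in operons_by_genome.items():
        for operon in operons:
            for a, b in combinations(sorted(set(operon)), 2):
                support[(a, b)].add(genome)

    network = defaultdict(set)
    for (a, b), genomes in support.items():
        if len(genomes) >= min_genomes:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def expand_shells(network, seed, shells=1):
    """Return the genes reached from seed within the given number of linking rounds."""
    seen = {seed}
    frontier = {seed}
    for _ in range(shells):
        reached = set()
        for gene in frontier:
            reached |= network.get(gene, set())
        frontier = reached - seen  # only genes not visited in earlier shells
        if not frontier:
            break
        seen |= frontier
    return seen - {seed}


# Toy operon predictions (hypothetical labels, not real data).
toy = {
    "genomeA": [["g1", "g2", "g3"], ["g5", "g6"]],
    "genomeB": [["g2", "g3", "g4"]],
    "genomeC": [["g4", "g5"], ["g2", "g3"]],
}
net = build_links(toy, min_genomes=1)
print(sorted(expand_shells(net, "g1", shells=1)))
print(sorted(expand_shells(net, "g1", shells=2)))
```

In this toy case the first shell around `g1` contains the genes that share an operon with it in some genome, and the second shell adds `g4`, which is never found with `g1` but is found with its partners. Raising `min_genomes` plays the role of the threshold on the number of supporting genomes; a threshold on operon prediction confidence would be applied before the operons reach `build_links`.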

Figures

Figure 1
Finding links by operon rearrangement. Operon predictions are based on a well-established method that relies solely on inter-genic distances (14,15), and not on conservation of gene order. This is the main difference from other available tools (–13). Though we also incorporate fusions, >99% of our links come from operon predictions alone.
Figure 2
(a) Distribution of KEGG links recovered in 1000 randomly shuffled networks, keeping the connectivity fixed, in E. coli K12. (b) Distribution of DIP links obtained in the same set of 1000 random networks.
Figure 3
Effect of increasing thresholds on the quality of predictions. We used the fraction of predicted links whose products work within the same KEGG metabolic pathway as a measure of quality. (a) Effect of increasing the LLH required to accept an operon prediction. (b) Effect of increasing the number of associations (the number of times the genes are found in the same operon). The measure is far from perfect, but it does give a sense of what happens as thresholds increase. The apparently slow growth in quality with increasing LLH is due to quality already being high at the 0.0 threshold. In E. coli K12, operon predictions have a positive predictive value (true positives divided by the sum of true positives and false positives) of 0.86 at an LLH of 0.0 and of 0.93 at an LLH of 1.0.
Figure 4
Fraction of internal versus external links found in the E. coli K12 network. (a) KEGG and DIP datasets. (b) Each pathway in KEGG. Pathway identifiers: MAP00193: ATP synthesis; MAP00632: Benzoate degradation via CoA ligation; MAP00650: Butanoate metabolism; MAP00020: Citrate cycle (TCA cycle); MAP00061: Fatty acid biosynthesis (path 1); MAP00071: Fatty acid metabolism; MAP02040: Flagellar assembly; MAP00790: Folate biosynthesis; MAP00260: Glycine, serine and threonine metabolism; MAP00010: Glycolysis/Gluconeogenesis; MAP00630: Glyoxylate and dicarboxylate metabolism; MAP00340: Histidine metabolism; MAP00300: Lysine biosynthesis; MAP00910: Nitrogen metabolism; MAP00520: Nucleotide sugars metabolism; MAP00190: Oxidative phosphorylation; MAP00770: Pantothenate and CoA biosynthesis; MAP00040: Pentose and glucuronate interconversions; MAP00030: Pentose phosphate pathway; MAP00550: Peptidoglycan biosynthesis; MAP00400: Phenylalanine, tyrosine and tryptophan biosynthesis; MAP00195: Photosynthesis; MAP00860: Porphyrin and chlorophyll metabolism; MAP00640: Propanoate metabolism; MAP00230: Purine metabolism; MAP00240: Pyrimidine metabolism; MAP00720: Reductive carboxylate cycle (CO2 fixation); MAP00500: Starch and sucrose metabolism; MAP03070: Type III secretion system; MAP00130: Ubiquinone biosynthesis; MAP00220: Urea cycle and metabolism of amino groups; and MAP00290: Valine, leucine and isoleucine biosynthesis.
Figure 5
Links to the argR gene, coding for the ArgR transcription factor, in E. coli K12 using an LLH threshold of 0.4 and associations found in at least one genome.
Figure 6
Uber-operon recovery. (a) Links to tufA in Nebulon with an LLH threshold of 0.4: first sub-panel, minimum number of evidences set to 1; second sub-panel, minimum number of evidences set to 2. (b) Two shells of links to flgA in Nebulon, showing known and predicted associations.
Figure 7
Links among genes involved in nitrogen fixation and in nodulation of S. meliloti. Core genes are genes annotated as involved in these activities in S. meliloti (31), while non-core genes are other linked genes found by Nebulon.
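The randomization behind Figure 2 (shuffled networks with connectivity kept fixed) can be approximated with degree-preserving edge swaps. The sketch below is one common way to do this and is an assumption, not necessarily the procedure used by the authors. It expects an undirected edge list with no self-loops or duplicate edges; `shuffle_keep_degrees`, `null_distribution`, `swaps_per_edge` and the toy edges are hypothetical names and data.

```python
import random
from collections import Counter


def degrees(edges):
    """Number of links per node in an undirected edge list."""
    return Counter(node for edge in edges for node in edge)


def shuffle_keep_degrees(edges, n_swaps, rng):
    """Attempt n_swaps double-edge swaps; every node keeps its original degree."""
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick the orientation of the second edge at random
            c, d = d, c
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if len(new1) < 2 or len(new2) < 2:  # swap would create a self-loop
            continue
        if new1 == new2 or new1 in present or new2 in present:  # duplicate edge
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edge_list[i] = (a, d)
        edge_list[j] = (c, b)
    return edge_list


def null_distribution(edges, reference, n_networks=1000, swaps_per_edge=10, seed=0):
    """Count reference links recovered in each of n_networks shuffled networks.

    reference is a set of frozenset pairs, e.g. known links from a knowledge base.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_networks):
        shuffled = shuffle_keep_degrees(edges, swaps_per_edge * len(edges), rng)
        counts.append(sum(frozenset(e) in reference for e in shuffled))
    return counts


# Toy network and reference set (hypothetical, not real data).
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("a", "f")]
toy_reference = {frozenset(("a", "b")), frozenset(("c", "d"))}
counts = null_distribution(toy_edges, toy_reference, n_networks=100)
print(Counter(counts))
```

The number of reference links recovered by the real network can then be compared with this null distribution; a real count far above the shuffled counts suggests the recovery is not explained by connectivity alone.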

References

    1. Tatusov R.L., Koonin E.V., Lipman D.J. A genomic perspective on protein families. Science. 1997;278:631–637.
    2. Gaasterland T., Ragan M.A. Microbial genescapes: phyletic and functional patterns of ORF distribution among prokaryotes. Microb. Comp. Genomics. 1998;3:199–217.
    3. Pellegrini M., Marcotte E.M., Thompson M.J., Eisenberg D., Yeates T.O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA. 1999;96:4285–4288.
    4. Dandekar T., Snel B., Huynen M., Bork P. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci. 1998;23:324–328.
    5. Overbeek R., Fonstein M., D'Souza M., Pusch G.D., Maltsev N. The use of gene clusters to infer functional coupling. Proc. Natl Acad. Sci. USA. 1999;96:2896–2901.
