Nucleic Acids Res. 2005 Jul 20;33(13):4035-9.
doi: 10.1093/nar/gki711. Print 2005.

Genome annotation errors in pathway databases due to semantic ambiguity in partial EC numbers

M L Green et al. Nucleic Acids Res. 2005.

Abstract

We report on a new type of systematic annotation error in genome and pathway databases that results from the misinterpretation of partial Enzyme Commission (EC) numbers such as '1.1.1.-'. This error results in the assignment of genes annotated with a partial EC number to many or all biochemical reactions that are annotated with the same partial EC number. That inference is faulty because of the ambiguous nature of partial EC numbers. We have observed this type of error in multiple databases, including KEGG, VIMSS and IMG, all of which assign genes to KEGG pathways. The Escherichia coli subset of the KEGG database exhibits this error for 6.8% of its gene-reaction assignments. For example, KEGG contains 17 reactions that are annotated with EC 1.1.1.-. A group of three E. coli genes, b1580 [putative dehydrogenase, NAD(P)-binding, starvation-sensing protein], b3787 (UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase) and b0207 (2,5-diketo-D-gluconate reductase B), is assigned to 15 of those reactions, despite experimental evidence indicating different single functions for two of the three genes. Furthermore, the databases (DBs) are internally inconsistent in that the description of gene functions for genes with partial EC numbers is inconsistent with the activities implied by reactions to which the genes were assigned. We infer that these inconsistencies result from the processing used to match gene products to reactions within KEGG's metabolic pathways. These errors affect scientists who use these DBs as online encyclopedias and they affect bioinformaticists who use these DBs to train and validate newly developed algorithms.
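
The faulty inference described in the abstract can be illustrated with a minimal Python sketch. It is not the matching procedure used by KEGG or any other database, which the abstract does not specify. The gene names, reaction names, and EC annotations below are hypothetical toy data, and `naive_assign` and `cautious_assign` are illustrative helpers. The sketch assumes the error arises when a partial EC number is compared as if it were an ordinary identifier, so that every gene carrying '1.1.1.-' is linked to every reaction carrying '1.1.1.-'.

```python
# Toy data (hypothetical): EC annotations for genes and reactions.
GENE_EC = {
    "geneA": "1.1.1.-",  # partial: the specific activity is unknown
    "geneB": "1.1.1.-",  # partial: may differ from geneA's activity
    "geneC": "1.1.1.1",  # fully specified
}
REACTION_EC = {
    "rxn1": "1.1.1.-",
    "rxn2": "1.1.1.-",
    "rxn3": "1.1.1.-",
    "rxn4": "1.1.1.1",
}


def is_partial(ec):
    """Return True when any of the four EC fields is the placeholder '-'."""
    return "-" in ec.split(".")


def naive_assign(gene_ec, reaction_ec):
    """Faulty inference: link a gene to every reaction with an equal EC string."""
    return {
        gene: sorted(rxn for rxn, rec in reaction_ec.items() if rec == ec)
        for gene, ec in gene_ec.items()
    }


def cautious_assign(gene_ec, reaction_ec):
    """Link genes to reactions only through fully specified EC numbers."""
    return {
        gene: sorted(
            rxn
            for rxn, rec in reaction_ec.items()
            if rec == ec and not is_partial(ec)
        )
        for gene, ec in gene_ec.items()
    }


naive = naive_assign(GENE_EC, REACTION_EC)
cautious = cautious_assign(GENE_EC, REACTION_EC)

for gene in GENE_EC:
    print(gene, "naive:", naive[gene], "cautious:", cautious[gene])
```

In this toy data the naive matcher links geneA and geneB to all three reactions annotated with '1.1.1.-', although the partial number only says that each entity belongs somewhere in the 1.1.1 subclass. The cautious variant leaves such genes unassigned, which is one possible design choice and not necessarily the remedy the authors propose.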


