Manual curation is not sufficient for annotation of genomic databases
- PMID: 17646325
- PMCID: PMC2516305
- DOI: 10.1093/bioinformatics/btm229
Abstract
Motivation: Knowledge base construction has been an area of intense activity and great importance in the growth of computational biology. However, there is little or no history of work on the subject of evaluation of knowledge bases, either with respect to their contents or with respect to the processes by which they are constructed. This article proposes the application of a metric from software engineering known as the found/fixed graph to the problem of evaluating the processes by which genomic knowledge bases are built, as well as the completeness of their contents.
Results: Well-understood patterns of change in the found/fixed graph are found to occur in two large publicly available knowledge bases. These patterns suggest that the current manual curation processes will take far too long to complete the annotations of even just the most important model organisms, and that at their current rate of production, they will never be sufficient for completing the annotation of all currently available proteomes.
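As an illustration of the metric named above: in software engineering, a found/fixed graph is generally drawn as two cumulative curves over time, issues found and issues fixed, with the gap between them showing how much work is still open. The sketch below computes those series under stated assumptions. The function name `found_fixed_series` and the per-period counts are hypothetical, and how "found" and "fixed" map onto curation events in a genomic knowledge base is not specified in this abstract.

```python
from itertools import accumulate


def found_fixed_series(found_per_period, fixed_per_period):
    """Return cumulative found, cumulative fixed, and the open gap per period.

    Both inputs are sequences of non-negative counts, one entry per
    time period (for example, per database release).
    """
    if len(found_per_period) != len(fixed_per_period):
        raise ValueError("both series must cover the same periods")

    cum_found = list(accumulate(found_per_period))
    cum_fixed = list(accumulate(fixed_per_period))
    # The gap is the number of items found but not yet fixed.
    open_gap = [f - x for f, x in zip(cum_found, cum_fixed)]
    return cum_found, cum_fixed, open_gap


# Hypothetical counts, invented purely for illustration.
found = [10, 12, 9, 11]
fixed = [4, 5, 6, 5]

cum_found, cum_fixed, open_gap = found_fixed_series(found, fixed)
for period, (f, x, g) in enumerate(zip(cum_found, cum_fixed, open_gap), start=1):
    print(f"period {period}: found={f} fixed={x} open={g}")
```

With these invented numbers the gap widens every period, which is the kind of pattern that would suggest a process is not converging on completion. A gap that narrows toward zero would suggest the opposite.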
