Database (Oxford). 2015 Dec 26:2015:bav116.
doi: 10.1093/database/bav116. Print 2015.

Biocuration with insufficient resources and fixed timelines

Raul Rodriguez-Esteban. Database (Oxford).

Abstract

Biological curation, or biocuration, is often studied from the perspective of creating and maintaining databases that have the goal of mapping and tracking certain areas of biology. However, much biocuration is, in fact, dedicated to finite and time-limited projects in which insufficient resources demand trade-offs. This typically more ephemeral type of curation is nonetheless of importance in biomedical research. Here, I propose a framework to understand such restricted curation projects from the point of view of return on curation (ROC), value, efficiency and productivity. Moreover, I suggest general strategies to optimize these curation efforts, such as the 'multiple strategies' approach, as well as a metric called overhead that can be used in the context of managing curation resources.

Figures

Figure 1
Relationship between precision and overhead. Overhead varies inversely with precision, so it grows quickly as precision decreases.
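
The shape of this relationship can be illustrated with a small sketch. The abstract does not define overhead, so the definition used here is an assumption made only for illustration: overhead is taken as the number of false positives a curator must review per true positive, that is (1 - P) / P for precision P. The function name `overhead` is likewise hypothetical.

```python
def overhead(precision: float) -> float:
    """Return false positives reviewed per true positive.

    ASSUMPTION: overhead = FP / TP = (1 - P) / P. This definition is not
    stated in the abstract; it is chosen here only because it reproduces
    the inverse relationship described in the Figure 1 caption.
    """
    if not 0.0 < precision <= 1.0:
        raise ValueError("precision must be in the interval (0, 1]")
    return (1.0 - precision) / precision


if __name__ == "__main__":
    # Print overhead for a few precision values to show the growth.
    for p in (0.9, 0.5, 0.25, 0.1):
        print(f"precision={p:.2f}  overhead={overhead(p):.2f}")
```

Under this assumed definition, a precision of 0.5 costs one false positive per true positive, while a precision of 0.1 costs nine, which is the rapid growth the caption describes.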
Figure 2
(a) ‘Multiple strategies’ approach in the precision-recall space. Strategy A can take the role of the ‘high recall’ strategy, C that of the ‘high precision’ strategy and B that of the ‘compromise’ strategy. A new strategy D is inferior to the set of strategies A, B and C, because it falls into the area covered (AC) by these strategies. (b) Adjustable strategy. (c) ‘Multiple strategies’ approach involving an adjustable strategy (defined by the line) and a non-adjustable strategy (defined by the dot). (d) Adjustable strategy in the overhead-recall space.
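
The comparison in panel (a) can be sketched as a simple coverage check. This is a minimal illustration under stated assumptions, not the paper's procedure: the "area covered" is assumed to be the set of points whose precision and recall are both no higher than those of at least one existing strategy, and the `Strategy` type, the `is_covered` function and all coordinates are hypothetical values invented for the example.

```python
from typing import Iterable, NamedTuple


class Strategy(NamedTuple):
    """A curation strategy located in precision-recall space (hypothetical type)."""
    name: str
    precision: float
    recall: float


def is_covered(candidate: Strategy, existing: Iterable[Strategy]) -> bool:
    """Return True if some existing strategy matches or beats the candidate
    on both precision and recall.

    ASSUMPTION: this rectangle-based reading of the "area covered" is an
    interpretation of the caption, not a definition taken from the source.
    """
    return any(
        s.precision >= candidate.precision and s.recall >= candidate.recall
        for s in existing
    )


# Illustrative coordinates only; they are not taken from Figure 2.
A = Strategy("A (high recall)", precision=0.3, recall=0.9)
B = Strategy("B (compromise)", precision=0.6, recall=0.6)
C = Strategy("C (high precision)", precision=0.9, recall=0.3)
D = Strategy("D", precision=0.5, recall=0.5)

if __name__ == "__main__":
    print(D.name, "covered by A, B, C:", is_covered(D, [A, B, C]))
```

With these made-up coordinates, D is covered because B is at least as good on both axes, so adding D to the set would not extend what A, B and C already offer.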
