Nucleic Acids Res. 2013 Jul;41(Web Server issue):W518-22.
doi: 10.1093/nar/gkt441. Epub 2013 May 22.

PubTator: a web-based text mining tool for assisting biocuration


Chih-Hsuan Wei et al. Nucleic Acids Res. 2013 Jul.

Abstract

Manually curating knowledge from biomedical literature into structured databases is expensive and time-consuming, which makes it difficult to keep pace with the rapid growth of the literature. There is therefore a pressing need to assist biocuration with automated text mining tools. Here, we describe PubTator, a web-based system for assisting biocuration. PubTator differs from the few existing tools in two respects: it features a PubMed-like interface, which many biocurators find familiar, and it is equipped with multiple challenge-winning text mining algorithms to ensure the quality of its automatic results. In a formal evaluation with two external user groups, PubTator was shown to be capable of improving both the efficiency and the accuracy of manual curation. PubTator is publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/.


Figures

Figure 1.
The PubTator homepage with five different search options.
Figure 2.
The PubTator search results page. Automatically computed entities are highlighted in colours. Unlike PubMed, article abstracts can be displayed here without going to a different page.
Figure 3.
The PubTator annotation page. The two radio buttons (Curatable/Not Curatable) at the top of the page are designed for document triage. The text box and the table below it are used for entity annotation. The relationship table at the bottom of the page is for relationship annotation. In Mention View, each row corresponds to an entity mention. In Concept View (default), different mentions of the same concept (i.e. having the same identifier) are combined and displayed in the same row.
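
The difference between Mention View and Concept View described in the Figure 3 caption can be illustrated with a minimal sketch. This is not PubTator's implementation: the record layout, the function name `concept_view`, and the sample mentions and identifiers are all assumptions made for illustration. The only behaviour taken from the caption is that mentions sharing an identifier are combined into a single row.

```python
# Hypothetical mention records: (surface text, entity type, identifier).
# The values are illustrative only and are not taken from PubTator output.
mentions = [
    ("BRCA1", "Gene", "672"),
    ("breast cancer 1", "Gene", "672"),
    ("breast cancer", "Disease", "D001943"),
]


def concept_view(mention_rows):
    """Combine mentions that share an identifier into one row per concept.

    Mention View corresponds to the input list (one row per mention);
    the returned list corresponds to Concept View (one row per identifier).
    """
    rows = {}  # identifier -> combined row; dicts keep insertion order
    for text, entity_type, identifier in mention_rows:
        row = rows.setdefault(
            identifier,
            {"identifier": identifier, "type": entity_type, "mentions": []},
        )
        if text not in row["mentions"]:  # skip repeated surface forms
            row["mentions"].append(text)
    return list(rows.values())


for row in concept_view(mentions):
    print(row["identifier"], row["type"], "; ".join(row["mentions"]))
```

In this sketch the three mention rows collapse into two concept rows, because the first two mentions carry the same identifier.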
