Database (Oxford). 2019 Jan 1;2019:bay137.
doi: 10.1093/database/bay137.

PubTerm: a web tool for organizing, annotating and curating genes, diseases, molecules and other concepts from PubMed records

José Garcia-Pelaez et al. Database (Oxford).

Abstract

Background and objective: Analysis, annotation and curation of biomedical scientific literature is a recurrent task in biomedical research, database curation and clinics. Commonly, the reading is centered on concepts such as genes, diseases or molecules. Database curators may also need to annotate published abstracts related to a specific topic. However, few free and intuitive tools exist to assist users in this context. Therefore, we developed PubTerm, a web tool to organize, categorize, curate and annotate a large number of PubMed abstracts related to biological entities such as genes, diseases, chemicals, species, sequence variants and other related information.

Methods: A variety of interfaces were implemented to facilitate curation and annotation, including the organization of abstracts by terms, by the co-occurrence of terms or by specific phrases. Information includes statistics on the occurrence of terms. The abstracts, terms and other related information can be annotated and categorized using user-defined categories. The session information can be saved and restored, and the data can be exported to other formats.
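The organization by terms and by co-occurrence described above can be illustrated with a minimal Python sketch. This is not PubTerm's code: the record IDs, the placeholder term names and the functions `term_view` and `cooccurrence_view` are hypothetical, and the sketch assumes each abstract has already been reduced to a set of recognized terms.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy input: record ID -> terms already recognized in that abstract.
records = {
    "1001": {"GENE_A", "DISEASE_X"},
    "1002": {"GENE_A", "GENE_B", "DISEASE_X"},
    "1003": {"GENE_B"},
}


def term_view(records):
    """Group record IDs under each term they mention."""
    index = defaultdict(set)
    for record_id, terms in records.items():
        for term in terms:
            index[term].add(record_id)
    return dict(index)


def cooccurrence_view(records):
    """Count the records in which each unordered pair of terms appears together."""
    counts = defaultdict(int)
    for terms in records.values():
        # Sorting makes each pair canonical, so (A, B) and (B, A) share one key.
        for pair in combinations(sorted(terms), 2):
            counts[pair] += 1
    return dict(counts)


def term_statistics(records):
    """Number of records mentioning each term, most frequent first."""
    index = term_view(records)
    return sorted(((term, len(ids)) for term, ids in index.items()),
                  key=lambda item: (-item[1], item[0]))
```

Under these assumptions, `term_view(records)` gives the grouping a term-centered display would need, and `term_statistics(records)` gives a simple occurrence count per term.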

Results: The pipeline in PubTerm starts by specifying a PubMed query or list of PubMed identifiers. Then, the user can specify three lists of categories and specify what information will be highlighted in which colors. The user then utilizes the 'term view' to organize the abstracts by gene, disease, species or other information to facilitate the annotation and categorization of terms or abstracts. Other views also facilitate the exploration of abstracts and connections between terms. We have used PubTerm to quickly and efficiently curate collections of more than 400 abstracts that mention more than 350 genes to generate revised lists of susceptibility genes for diseases. An example is provided for pulmonary arterial hypertension.
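The two entry points of the pipeline, a PubMed query or a list of PubMed identifiers, might be prepared as in the following Python sketch. It is an illustration only: the abstract does not say how PubTerm retrieves records, so the use of the NCBI E-utilities `esearch` endpoint is an assumption, and the helper names `esearch_url` and `parse_pmids` are hypothetical. The sketch only builds the request URL and cleans the ID list; it makes no network call.

```python
import re
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; not stated in the abstract).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(query, max_records=100):
    """Build an esearch URL for a PubMed query, limited to max_records IDs."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_records,
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


def parse_pmids(text):
    """Extract numeric IDs from free text, dropping duplicates, keeping order."""
    pmids = []
    for token in re.findall(r"\d+", text):
        if token not in pmids:
            pmids.append(token)
    return pmids
```

For example, `esearch_url("pulmonary arterial hypertension", 50)` yields a URL requesting up to 50 matching IDs, and `parse_pmids` accepts IDs separated by commas, spaces or line breaks.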

Conclusions: PubTerm saves time in literature review by assisting with annotation, organization and knowledge acquisition.

Figures

Figure 1. Implementation of PubTerm. Each box represents a server for computational services or a user browser. Dashed lines represent requests/responses and arrowheads represent flows of data.
Figure 2. Summary of PubTerm. The top scheme shows the typical pipeline. Below, the forms, views and options are shown.
Figure 3. Input methods for PubTerm. (A) Using a PubMed query in (1), specifying the number of records (2) and loading them up (3). (B) Using a list of PubMed IDs in (1) then loading them in (2).
Figure 4. Annotation and categorization. The terms (left) can be annotated by categories (1) and notes (2). The text within abstracts (right) can be marked (3), but the abstract itself can also be categorized (4) and annotated (5).
Figure 5. Views of the abstracts and terms. (A) Record view. (B) Term view. (C) Co-occurrence view. (D) Sentence view.
