Database (Oxford). 2017 Jan 1;2017:bax088.
doi: 10.1093/database/bax088.

A semantic-based workflow for biomedical literature annotation

Pedro Sernadela et al. Database (Oxford).

Abstract

Computational annotation of textual information has taken on an important role in knowledge extraction from the biomedical literature, since most of the relevant information from scientific findings is still maintained in text format. In this endeavour, annotation tools can assist in the identification of biomedical concepts and their relationships, providing faster reading and curation processes, with reduced costs. However, the separate usage of distinct annotation systems results in highly heterogeneous data, as it is difficult to efficiently combine and exchange this valuable asset. Moreover, despite the existence of several annotation formats, there is no unified way to integrate miscellaneous annotation outcomes into a reusable, sharable and searchable structure. Taking up this challenge, we present a modular architecture for textual information integration using semantic web features and services. The solution described allows the migration of curation data into a common model, providing a suitable transition process in which multiple annotation data can be integrated and enriched, with the possibility of being shared, compared and reused across semantic knowledge bases.


Figures

Figure 1. Semantic-based architecture for scientific information integration.

Figure 2. Annotation model: sample extraction of the integration and representation of an annotation related to the ‘Alzheimer disease’.

Figure 3. Relation model: sample extraction of the integration and representation of a relatedTo annotation relationship.

Figure 4. Validation workflow overview. (1) Dataset is extracted from the NCBI database. (2) Neji and cTAKES API services were used for information extraction, generating diverse outputs and formats. Additional annotation services can be used. (3) Annotations are forwarded and integrated into a unified model and stored in an accessible knowledge base.

Figure 5. Knowledge base sample annotation model. The annotators involved share concept attributions (i.e. umls: C0027061), increasing the likelihood of being correctly identified.
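
As a rough illustration of the integration step in Figure 4 and the agreement effect in Figure 5, the following Python sketch merges the outputs of two annotators into one structure keyed by text span and concept identifier, then lists the concepts that more than one annotator agrees on. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Annotation` record, the `integrate` and `shared` functions, the character offsets, the mention text and the `ex:OtherConcept` identifier are all hypothetical, and the paper's actual model relies on semantic web features that are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical unified record for one annotation from one tool."""
    annotator: str  # tool that produced the annotation, e.g. "Neji"
    start: int      # character offset where the mention begins
    end: int        # character offset where the mention ends (exclusive)
    text: str       # surface text of the mention
    concept: str    # concept identifier, e.g. "umls:C0027061"


def integrate(*outputs):
    """Group annotations from several annotators by (start, end, concept)."""
    merged = defaultdict(set)
    for annotations in outputs:
        for ann in annotations:
            merged[(ann.start, ann.end, ann.concept)].add(ann.annotator)
    return merged


def shared(merged, min_annotators=2):
    """Keep only the entries attributed by at least min_annotators tools."""
    return {
        key: sorted(names)
        for key, names in merged.items()
        if len(names) >= min_annotators
    }


# Made-up sample data: offsets and mention text are placeholders.
neji = [
    Annotation("Neji", 10, 20, "<mention>", "umls:C0027061"),
    Annotation("Neji", 30, 35, "<other>", "ex:OtherConcept"),
]
ctakes = [
    Annotation("cTAKES", 10, 20, "<mention>", "umls:C0027061"),
]

for key, annotators in shared(integrate(neji, ctakes)).items():
    print(key, annotators)
```

Keying on both span and concept is a deliberately strict choice for the sketch; a real integration would likely need to tolerate overlapping spans and differing identifier schemes across annotators.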
