BMC Bioinformatics. 2012 May 23:13:108.
doi: 10.1186/1471-2105-13-108.

Extracting semantically enriched events from biomedical literature

Makoto Miwa et al. BMC Bioinformatics. 2012.

Abstract

Background: Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them.

Results: Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task.
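The macro-averaged F-scores reported above can be illustrated with a short sketch. It assumes the common definition (an F1 score is computed per class and the per-class scores are averaged with equal weight); the abstract does not spell out the exact variant used, and the function name and toy labels below are hypothetical.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: per-class F1, then an unweighted mean over classes.

    Assumption: this is the textbook definition, not necessarily the exact
    scoring script used to evaluate EventMine-MK.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p wrongly
            fn[g] += 1  # missed an instance of class g

    scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for one meta-knowledge dimension (invented for illustration).
gold = ["Positive", "Positive", "Negative", "Positive"]
pred = ["Positive", "Negative", "Negative", "Positive"]
score = macro_f1(gold, pred)
```

Because every class contributes equally, a rare value (for example, a negated event) affects the macro average as much as a frequent one, which is why this measure is often preferred for skewed label distributions.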

Conclusions: We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer-grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.
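The idea of searching events by their meta-knowledge can be sketched as a simple filter over extracted events. This is only an illustration: the `Event` class, the dimension names (`polarity`, `certainty`, `source`) and their values are hypothetical stand-ins, not the output format of EventMine-MK.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A toy event record; all field names are assumptions for illustration."""
    event_type: str
    trigger: str
    arguments: dict
    meta: dict = field(default_factory=dict)  # meta-knowledge dimension -> value


def filter_events(events, **required):
    """Keep events whose meta-knowledge matches every requested dimension value."""
    return [
        event for event in events
        if all(event.meta.get(dim) == value for dim, value in required.items())
    ]


# Invented example events, not taken from any corpus.
EVENTS = [
    Event("Binding", "binds", {"Theme": ["ProteinA", "ProteinB"]},
          {"polarity": "Positive", "certainty": "Certain", "source": "Current"}),
    Event("Regulation", "activate", {"Cause": "ProteinC", "Theme": "ProteinD"},
          {"polarity": "Negative", "certainty": "Certain", "source": "Current"}),
    Event("Localization", "secretion", {"Theme": "ProteinE"},
          {"polarity": "Positive", "certainty": "Speculated", "source": "Other"}),
]

negated = filter_events(EVENTS, polarity="Negative")
new_and_certain = filter_events(EVENTS, source="Current", certainty="Certain")
```

A curation workflow of the kind mentioned above might, for instance, restrict results to events marked as new knowledge and drop those that are speculated or negated.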

Figures

Figure 1
Relation (left) and event (right) representations. Three sentences are shown. Each sentence covers a different type of relation/event. Relation annotations for the three sentences are shown on the left-hand side, while event annotations are shown on the right-hand side. Examples a) and d) concern Localization, b) and e) concern Binding, and c) and f) concern Regulation.
Figure 2
Meta-knowledge annotation scheme. An asterisk (*) denotes the default value for each dimension.
Figure 3
EventMine event extraction pipeline. The diagram illustrates the pipeline model used by EventMine. The system takes as input texts in which proteins/genes have already been identified. Trigger/entity detection classifies appropriate words in each sentence as triggers or entities, argument detection finds semantic pair-wise relations among event participants and multi-argument event detection merges several relations into events. This sentence is taken from PMID: 9341193 [85].
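The last pipeline stage, in which several pair-wise relations are merged into events, can be pictured with a deliberately simplified sketch. It merely groups relations that share a trigger; this is an assumption made for illustration, and the function name, role labels and tokens are hypothetical rather than EventMine's actual method or data.

```python
from collections import defaultdict


def merge_relations(relations):
    """Group pair-wise (trigger, role, argument) relations into one event per trigger.

    Simplification: a real system decides which combinations of relations
    form valid events; this sketch merges everything that shares a trigger.
    """
    grouped = defaultdict(dict)
    for trigger, role, argument in relations:
        grouped[trigger].setdefault(role, []).append(argument)
    return [
        {"trigger": trigger, "arguments": dict(arguments)}
        for trigger, arguments in grouped.items()
    ]


# Invented pair-wise relations, as an argument detector might emit them.
relations = [
    ("binds", "Theme", "ProteinA"),
    ("binds", "Theme", "ProteinB"),
    ("expression", "Theme", "ProteinC"),
]
events = merge_relations(relations)
```

Here the two `Theme` relations of the trigger `binds` end up in a single two-argument event, while `expression` yields a separate one-argument event.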
Figure 4
Shortest dependency path example. This example illustrates the shortest dependency path between IEXC29S and transactivate on the dependency tree produced by the GDep beta2 parser [87] (SUB: Subject, OBJ: Object, AMOD: Modifier of Adjective or Adverbial, VMOD: Modifier of Verb, PRD: Predicative Complement). The shortest path is shown in bold.
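A shortest dependency path of this kind can be found with a breadth-first search over the dependency tree treated as an undirected graph. The sketch below is a generic illustration: the toy edges are invented (only the two end tokens and the label names come from the caption), and nothing here reproduces the GDep parser's actual output or EventMine's feature extraction.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the path as [token, label, token, ...] or None if unreachable.

    `edges` holds (head, dependent, label) triples; direction is ignored
    when searching, as is usual for shortest-path features.
    """
    graph = {}
    for head, dependent, label in edges:
        graph.setdefault(head, []).append((dependent, label))
        graph.setdefault(dependent, []).append((head, label))

    previous = {source: None}  # token -> (previous token, edge label)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour, label in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = (node, label)
                queue.append(neighbour)

    if target not in previous:
        return None

    # Walk back from the target, then reverse to get source-to-target order.
    path = [target]
    node = target
    while previous[node] is not None:
        parent, label = previous[node]
        path.extend([label, parent])
        node = parent
    path.reverse()
    return path


# Hypothetical toy tree; the intermediate tokens are not from the figure.
TOY_EDGES = [
    ("was", "IEXC29S", "SUB"),
    ("was", "unable", "PRD"),
    ("unable", "to", "AMOD"),
    ("to", "transactivate", "VMOD"),
    ("transactivate", "promoter", "OBJ"),
]
path = shortest_dependency_path(TOY_EDGES, "IEXC29S", "transactivate")
```

The tokens and labels along such a path are commonly turned into classifier features, since they capture the syntactic link between two words more compactly than the surface word sequence.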
Figure 5
EventMine-MK event extraction pipeline. The diagram illustrates the pipeline model used by EventMine-MK. The meta-knowledge assignment module is applied after event extraction. The functionality of the trigger/entity detector is different from the one shown in Figure 3, in that meta-knowledge clues are now additionally detected by this module. The extracted clues are used as features by the newly-added meta-knowledge extraction module.

References

    1. Ananiadou S, McNaught J (Eds). Text Mining for Biology And Biomedicine. Artech House Publishers, London, UK; 2005.
    2. Zweigenbaum P, Demner-Fushman D, Yu H, Cohen KB. Frontiers of biomedical text mining: current progress. Briefings Bioinf. 2007;8(5):358–375.
    3. Ananiadou S, Pyysalo S, Tsujii J, Kell DB. Event extraction for systems biology by text mining the literature. Trends Biotechnol. 2010;28(7):381–390.
    4. Airola A, Pyysalo S, Bjorne J, Pahikkala T, Ginter F, Salakoski T. All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning. BMC Bioinf. 2008;9(Suppl 11):S2.
    5. Miwa M, Sætre R, Miyao Y, Tsujii J. A rich feature vector for protein-protein interaction extraction from multiple corpora. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. 2009. pp. 121–130.
