Bioinformatics. 2017 Dec 1;33(23):3784-3792.

doi: 10.1093/bioinformatics/btx466.

Using uncertainty to link and rank evidence from biomedical literature for model curation

Chrysoula Zerva¹, Riza Batista-Navarro¹, Philip Day², Sophia Ananiadou¹

Affiliations

¹ National Centre for Text Mining, School of Computer Science.
² Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK.

PMID: 29036627
PMCID: PMC5860317
DOI: 10.1093/bioinformatics/btx466

Chrysoula Zerva et al. Bioinformatics. 2017.

Abstract

Motivation: In recent years, there has been great progress in the field of automated curation of biomedical networks and models, aided by text mining methods that provide evidence from the literature. Such methods must not only extract snippets of text that relate to model interactions, but also contextualize the evidence and provide additional confidence scores for the interaction in question. Although various approaches to calculating confidence scores have focused primarily on the quality of the extracted information, there has been little work exploring the textual uncertainty conveyed by the author. Although textual uncertainty is acknowledged in biomedical text mining as an attribute of text-mined interactions (events), it is significantly understudied as a means of providing a confidence measure for interactions in pathways or other biomedical models. In this work, we focus on improving the identification of textual uncertainty for events and explore how it can be used as an additional measure of confidence for biomedical models.

Results: We present a novel method for extracting uncertainty from the literature using a hybrid approach that combines rule induction and machine learning. Variations of this hybrid approach are then discussed, alongside their advantages and disadvantages. We use subjective logic theory to combine multiple uncertainty values extracted from different sources for the same interaction. Our approach achieves F-scores of 0.76 and 0.88 based on the BioNLP-ST and Genia-MK corpora, respectively, making considerable improvements over previously published work. Moreover, we evaluate our proposed system on pathways related to two different areas, namely leukemia and melanoma cancer research.

Availability and implementation: The leukemia pathway model used is available in Pathway Studio, while the Ras model is available via PathwayCommons. An online demonstration of the uncertainty extraction system is available for research purposes at http://argo.nactem.ac.uk/test. The related code is available at https://github.com/c-zrv/uncertainty_components.git. Details on the above are available in the Supplementary Material.

Contact: sophia.ananiadou@manchester.ac.uk.

Supplementary information: Supplementary data are available at Bioinformatics online.
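The Results paragraph above mentions using subjective logic to combine several uncertainty values for the same interaction. The abstract does not say which fusion operator is used, so the following is only a minimal sketch under stated assumptions: each evidence sentence is represented as a binomial opinion (belief, disbelief, uncertainty, base rate), and opinions are merged with the standard cumulative fusion operator from subjective logic. The class name `Opinion`, the helper `fuse_all`, the shared base rate and the example numbers are all hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion; belief + disbelief + uncertainty must equal 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # assumed prior, shared by all sources

    def __post_init__(self):
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"opinion components sum to {total}, expected 1")

    def expected(self) -> float:
        """Projected probability: belief plus the prior-weighted uncertainty."""
        return self.belief + self.base_rate * self.uncertainty


def cumulative_fusion(x: Opinion, y: Opinion) -> Opinion:
    """Cumulative fusion of two opinions that share the same base rate."""
    kappa = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if kappa == 0.0:
        # Both opinions are dogmatic (zero uncertainty): average them.
        return Opinion((x.belief + y.belief) / 2,
                       (x.disbelief + y.disbelief) / 2,
                       0.0, x.base_rate)
    belief = (x.belief * y.uncertainty + y.belief * x.uncertainty) / kappa
    disbelief = (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / kappa
    uncertainty = (x.uncertainty * y.uncertainty) / kappa
    return Opinion(belief, disbelief, uncertainty, x.base_rate)


def fuse_all(opinions):
    """Fold cumulative fusion over a non-empty sequence of opinions."""
    return reduce(cumulative_fusion, opinions)


# Hypothetical opinions from two sentences describing the same interaction.
evidence = [Opinion(0.6, 0.1, 0.3), Opinion(0.4, 0.2, 0.4)]
fused = fuse_all(evidence)
print(fused, fused.expected())
```

In this sketch, fusing two non-dogmatic opinions always lowers the remaining uncertainty below that of either source, which matches the intuition that independent evidence for the same interaction should increase confidence in it.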


Figures

**Fig. 1.**
Event structures according to the *BioNLP* schema. Event triggers are enclosed in double-lined (green) boxes, while named entities (NEs) are enclosed in single-lined (blue) ones. Arguments of events are represented by arrows above the words. We can observe that the *Regulation* event is a complex event, having the *Binding* event as its *Theme* argument.

**Fig. 2.**
Uncertainty cues considered in the experiments grouped according to category (Strong/Weak speculation, frequency, Admission of lack of knowledge, Weaseling). Word clouds were generated based on BioNLP-ST and GENIA-MK

**Fig. 3.**
Relation between the influence of uncertainty cues and syntactic dependencies. Dependencies are marked with arrows above the text, while the scope of the uncertainty cue *may* is marked with red square brackets. (Color version of this figure is available at *Bioinformatics* online.)

**Fig. 4.**
Distribution of scores for (un)certainty between annot. 1 (solid colored (blue) bars) and annot. 2 (vertically striped white bars). (Color version of this figure is available at *Bioinformatics* online.)

**Fig. 5.**
Performance in terms of precision, recall and F-score, depending on the selection of the mean average score as the upper limit of uncertainty (i.e. the value below which all scored events must be considered uncertain)
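The thresholding described in the Fig. 5 caption can be illustrated with a small sketch: every event whose score falls below a chosen upper limit is treated as uncertain, and precision, recall and F-score are computed against gold labels. The function name `prf_at_threshold`, the score scale and the toy data are hypothetical, chosen only to show the computation. They are not the paper's data or evaluation code.

```python
def prf_at_threshold(scores, gold_uncertain, threshold):
    """Precision, recall and F-score when scores below `threshold` count as uncertain."""
    predicted = [s < threshold for s in scores]
    pairs = list(zip(predicted, gold_uncertain))
    tp = sum(1 for p, g in pairs if p and g)        # correctly flagged uncertain
    fp = sum(1 for p, g in pairs if p and not g)    # flagged uncertain, gold certain
    fn = sum(1 for p, g in pairs if not p and g)    # missed uncertain events
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Toy annotator scores (assumed scale) and gold labels (True = uncertain).
scores = [1.2, 2.5, 3.1, 4.0, 4.8]
gold = [True, True, False, True, False]

# Sweep the upper limit of uncertainty, as in a threshold-versus-performance plot.
for limit in (2.0, 3.0, 4.5, 5.0):
    p, r, f = prf_at_threshold(scores, gold, limit)
    print(f"limit={limit:.1f}  precision={p:.2f}  recall={r:.2f}  F={f:.2f}")
```

Raising the limit flags more events as uncertain, which tends to raise recall at the cost of precision. A plot of this kind makes that trade-off visible.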


Cited by

Testing the reproducibility and robustness of the cancer biology literature by robot.
Roper K, Abdel-Rehim A, Hubbard S, Carpenter M, Rzhetsky A, Soldatova L, King RD. J R Soc Interface. 2022 Apr;19(189):20210821. doi: 10.1098/rsif.2021.0821. Epub 2022 Apr 6. PMID: 35382578. Free PMC article.
Automated assessment of biological database assertions using the scientific literature.
Bouadjenek MR, Zobel J, Verspoor K. BMC Bioinformatics. 2019 Apr 29;20(1):216. doi: 10.1186/s12859-019-2801-x. PMID: 31035936. Free PMC article.
Data-driven classification of the certainty of scholarly assertions.
Prieto M, Deus H, de Waard A, Schultes E, García-Jiménez B, Wilkinson MD. PeerJ. 2020 Apr 21;8:e8871. doi: 10.7717/peerj.8871. eCollection 2020. PMID: 32341891. Free PMC article.
Unsupervised inference of implicit biomedical events using context triggers.
Chung JW, Yang W, Park JC. BMC Bioinformatics. 2020 Jan 28;21(1):29. doi: 10.1186/s12859-020-3341-0. PMID: 31992184. Free PMC article.
FLUTE: Fast and reliable knowledge retrieval from biomedical literature.
Holtzapple E, Telmer CA, Miskov-Zivanov N. Database (Oxford). 2020 Jan 1;2020:baaa056. doi: 10.1093/database/baaa056. PMID: 32761077. Free PMC article.


