Using uncertainty to link and rank evidence from biomedical literature for model curation
- PMID: 29036627
- PMCID: PMC5860317
- DOI: 10.1093/bioinformatics/btx466
Abstract
Motivation: In recent years, there has been great progress in the automated curation of biomedical networks and models, aided by text mining methods that provide evidence from the literature. Such methods must not only extract snippets of text that relate to model interactions, but also contextualize the evidence and provide additional confidence scores for the interaction in question. Although various approaches to calculating confidence scores have focused primarily on the quality of the extracted information, there has been little work on exploring the textual uncertainty conveyed by the author. Despite textual uncertainty being acknowledged in biomedical text mining as an attribute of text-mined interactions (events), it is significantly understudied as a means of providing a confidence measure for interactions in pathways or other biomedical models. In this work, we focus on improving the identification of textual uncertainty for events and explore how it can be used as an additional measure of confidence for biomedical models.
Results: We present a novel method for extracting uncertainty from the literature using a hybrid approach that combines rule induction and machine learning. Variations of this hybrid approach are then discussed, alongside their advantages and disadvantages. We use subjective logic theory to combine multiple uncertainty values extracted from different sources for the same interaction. Our approach achieves F-scores of 0.76 and 0.88 on the BioNLP-ST and Genia-MK corpora, respectively, a considerable improvement over previously published work. Moreover, we evaluate our proposed system on pathways related to two different areas, namely leukemia and melanoma cancer research.
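As a rough illustration of how subjective logic can combine several uncertainty values for the same interaction, the sketch below applies the standard cumulative fusion operator to binomial opinions. The abstract does not state which fusion operator the authors use, so the choice of operator, the `Opinion` class, the `fuse` and `fuse_all` functions, the shared base rate of 0.5 and the example values are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion: belief + disbelief + uncertainty should sum to 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # assumed prior, shared by all sources

    def expected(self) -> float:
        # Projected probability: belief plus the prior's share of uncertainty.
        return self.belief + self.base_rate * self.uncertainty


def fuse(x: Opinion, y: Opinion) -> Opinion:
    """Cumulative fusion of two opinions about the same interaction.

    Simplification (assumption): both opinions share the same base rate,
    which is carried over unchanged.
    """
    if x.uncertainty == 0 and y.uncertainty == 0:
        # Two dogmatic opinions: fall back to an equally weighted average.
        return Opinion((x.belief + y.belief) / 2,
                       (x.disbelief + y.disbelief) / 2,
                       0.0, x.base_rate)
    k = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    return Opinion(
        (x.belief * y.uncertainty + y.belief * x.uncertainty) / k,
        (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / k,
        (x.uncertainty * y.uncertainty) / k,
        x.base_rate,
    )


def fuse_all(opinions: list[Opinion]) -> Opinion:
    """Fold cumulative fusion over every piece of evidence for one interaction."""
    return reduce(fuse, opinions)


if __name__ == "__main__":
    # Hypothetical evidence: three sentences mentioning the same interaction.
    evidence = [
        Opinion(0.6, 0.1, 0.3),  # fairly confident statement
        Opinion(0.2, 0.3, 0.5),  # hedged statement ("may", "suggests")
        Opinion(0.7, 0.0, 0.3),
    ]
    combined = fuse_all(evidence)
    print(combined, combined.expected())
```

Under this operator, each additional non-vacuous source lowers the fused uncertainty, so an interaction supported by many sentences ends up with a more committed opinion than any single sentence provides. The projected probability returned by `expected()` could then serve as one possible score for ranking interactions.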
Availability and implementation: The leukemia pathway model used is available in Pathway Studio, while the Ras model is available via PathwayCommons. An online demonstration of the uncertainty extraction system is available for research purposes at http://argo.nactem.ac.uk/test. The related code is available at https://github.com/c-zrv/uncertainty_components.git. Details on the above are available in the Supplementary Material.
Contact: sophia.ananiadou@manchester.ac.uk.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press.