Large-scale automated machine reading discovers new cancer-driving mechanisms
- PMID: 30256986
- PMCID: PMC6156821
- DOI: 10.1093/database/bay098
Abstract
PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated 'big mechanisms' with extracted 'big data' can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.
