Automatic extraction of biological information from scientific text: protein-protein interactions
- PMID: 10786287
Abstract
We describe the basic design of a system for automatic detection of protein-protein interactions extracted from scientific abstracts. By restricting the problem domain and imposing a number of strong assumptions which include pre-specified protein names and a limited set of verbs that represent actions, we show that it is possible to perform accurate information extraction. The performance of the system is evaluated with different cases of real-world interaction networks, including the Drosophila cell cycle control. The results obtained computationally are in good agreement with current biological knowledge and demonstrate the feasibility of developing a fully automated system able to describe networks of protein interactions with sufficient accuracy.
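The two assumptions named in the abstract, a pre-specified list of protein names and a limited set of action verbs, can be illustrated with a small rule-based sketch. This is not the authors' system. The protein list, the verb list, the function names, and the "protein, verb, protein in one sentence" rule are all assumptions chosen for illustration, and the abstract does not describe the actual matching rules.

```python
import re
from typing import List, Tuple

# Hypothetical inputs: neither list is taken from the paper.
PROTEINS = ["cdc2", "cyclin B", "string", "wee1"]
ACTION_VERBS = ["activates", "binds", "inhibits", "phosphorylates"]


def find_spans(sentence: str, terms: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for each whole-word, case-insensitive match."""
    spans = []
    for term in terms:
        # Lookarounds keep "string" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), term))
    return sorted(spans)


def extract_interactions(
    text: str,
    proteins: List[str] = PROTEINS,
    verbs: List[str] = ACTION_VERBS,
) -> List[Tuple[str, str, str]]:
    """Emit (protein, verb, protein) when a verb lies between two known names.

    Assumed rule: both names and the verb must occur in the same sentence,
    with the verb positioned after the first name and before the second.
    """
    triples = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        protein_spans = find_spans(sentence, proteins)
        verb_spans = find_spans(sentence, verbs)
        for i, (_, end1, name1) in enumerate(protein_spans):
            for start2, _, name2 in protein_spans[i + 1:]:
                if name1 == name2:
                    continue
                for verb_start, verb_end, verb in verb_spans:
                    if end1 <= verb_start and verb_end <= start2:
                        triples.append((name1, verb, name2))
    return triples
```

Called on the made-up text "Wee1 phosphorylates cdc2 in early embryos. Cyclin B binds cdc2. String is expressed in the embryo.", `extract_interactions` returns `[("wee1", "phosphorylates", "cdc2"), ("cyclin B", "binds", "cdc2")]`. The third sentence yields nothing because it names only one protein. A sketch this simple ignores negation, passive voice, and name synonyms, which is why restricting the domain matters for accuracy.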