Automated extraction of information on protein-protein interactions from the biological literature
- PMID: 11238071
- DOI: 10.1093/bioinformatics/17.2.155
Abstract
Motivation: To understand biological processes, we must clarify how proteins interact with each other. However, because information about protein-protein interactions still exists primarily in the scientific literature, it is not accessible in a computer-readable format. Efficient processing of large numbers of interactions therefore requires an intelligent information extraction method. Our aim is to develop an efficient method for extracting information on protein-protein interactions from the scientific literature.
Results: We present a method for extracting information on protein-protein interactions from the scientific literature. This method, which employs only a protein name dictionary, surface clues on word patterns and simple part-of-speech rules, achieved high recall and precision rates for yeast (recall = 86.8% and precision = 94.3%) and Escherichia coli (recall = 82.5% and precision = 93.5%). The extraction results suggest that our method should be applicable to any species for which a protein name dictionary is constructed.
Availability: The program is available on request from the authors.
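The abstract names three ingredients (a protein name dictionary, surface clues on word patterns, and simple part-of-speech rules) but gives no implementation details. The Python sketch below is only an illustration of the general idea of dictionary-plus-pattern extraction and of how recall and precision are computed; it is not the authors' program. The dictionary entries, the interaction keywords, and the function names `extract_interactions` and `precision_recall` are all hypothetical, and the part-of-speech rules are omitted entirely.

```python
import re

# Hypothetical toy dictionary; the paper's actual dictionary is not reproduced here.
PROTEIN_NAMES = {"Cdc28", "Cln2", "Sic1", "Clb5"}

# Hypothetical interaction keywords used as surface clues (illustrative only).
INTERACTION_WORDS = (
    "interacts with",
    "associates with",
    "binds to",
    "binds",
    "phosphorylates",
)


def _alternation(items):
    """Build a regex alternation, longest entries first so they win over prefixes."""
    return "|".join(sorted((re.escape(i) for i in items), key=len, reverse=True))


# Surface pattern: <protein> <interaction keyword> <protein>
PATTERN = re.compile(
    rf"\b({_alternation(PROTEIN_NAMES)})\b\s+"
    rf"({_alternation(INTERACTION_WORDS)})\s+"
    rf"\b({_alternation(PROTEIN_NAMES)})\b"
)


def extract_interactions(sentence):
    """Return (protein, keyword, protein) triples matched by the surface pattern."""
    return [m.groups() for m in PATTERN.finditer(sentence)]


def precision_recall(predicted, gold):
    """Precision = TP / predicted, recall = TP / gold, over sets of triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    found = extract_interactions("Cdc28 binds to Cln2 in vivo.")
    print(found)
    gold = [("Cdc28", "binds to", "Cln2"), ("Cdc28", "phosphorylates", "Sic1")]
    print(precision_recall(found, gold))
```

A sketch like this makes the trade-off behind the reported figures concrete: a strict surface pattern tends to keep precision high, while recall depends on how many protein names and phrasings the dictionary and keyword list cover.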