RelEx--relation extraction using dependency parse trees
- PMID: 17142812
- DOI: 10.1093/bioinformatics/btl616
Abstract
Motivation: The discovery of regulatory pathways, signal cascades, metabolic processes or disease models requires knowledge of individual relations, such as physical or regulatory interactions between genes and proteins. Most interactions mentioned in the free text of biomedical publications are not yet contained in structured databases.
Results: We developed RelEx, an approach for relation extraction from free text. It uses natural language preprocessing to produce dependency parse trees and then applies a small number of simple rules to these trees. We applied RelEx to a comprehensive set of one million MEDLINE abstracts dealing with gene and protein relations and extracted approximately 150,000 relations, with estimated precision and recall of 80% each.
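To make the idea of applying simple rules to a dependency parse tree concrete, the following is a minimal sketch and not the RelEx rule set itself. It assumes a hand-built toy parse, a hypothetical relation term list (`RELATION_TERMS`), placeholder entity names (`GeneA`, `GeneB`), and illustrative dependency labels (`nsubj`, `obj`); the single rule shown only links a tagged subject and a tagged object of a relation verb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # position of the token in the sentence
    text: str    # surface form
    head: int    # index of the governing token, -1 for the root
    deprel: str  # dependency label (illustrative, not RelEx's label set)


# Hypothetical relation term list; the real lists are distributed by the authors.
RELATION_TERMS = {"activates", "inhibits", "binds"}


def extract_relations(tokens, entities):
    """Return (subject, relation term, object) triples.

    Illustrative rule: a relation verb whose subject and object
    dependents are both tagged gene/protein names yields one triple.
    """
    relations = []
    for verb in tokens:
        if verb.text.lower() not in RELATION_TERMS:
            continue
        # Direct dependents of the relation verb in the parse tree.
        children = [t for t in tokens if t.head == verb.idx]
        subjects = [t for t in children
                    if t.deprel == "nsubj" and t.text in entities]
        objects = [t for t in children
                   if t.deprel == "obj" and t.text in entities]
        for subj in subjects:
            for obj in objects:
                relations.append((subj.text, verb.text, obj.text))
    return relations


# Toy parse of the placeholder sentence "GeneA activates GeneB".
sentence = [
    Token(0, "GeneA", 1, "nsubj"),
    Token(1, "activates", -1, "root"),
    Token(2, "GeneB", 1, "obj"),
]
print(extract_relations(sentence, {"GeneA", "GeneB"}))
```

In practice the parse would come from a dependency parser and the entity set from a gene and protein name tagger; both are replaced by fixed toy data here so the rule itself stays visible.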
Availability: The natural language preprocessing tools used are free for academic research. Test sets and relation term lists are available from our website (http://www.bio.ifi.lmu.de/publications/RelEx/).
Similar articles
- Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction. Bioinformatics. 2005 Apr 15;21(8):1653-8. doi: 10.1093/bioinformatics/bti165. Epub 2004 Nov 25. PMID: 15564295
- Extracting human protein interactions from MEDLINE using a full-sentence parser. Bioinformatics. 2004 Mar 22;20(5):604-11. doi: 10.1093/bioinformatics/btg452. Epub 2004 Jan 22. PMID: 15033866
- BioContrasts: extracting and exploiting protein-protein contrastive relations from biomedical literature. Bioinformatics. 2006 Mar 1;22(5):597-605. doi: 10.1093/bioinformatics/btk016. Epub 2005 Dec 20. PMID: 16368768
- Extracting interactions between proteins from the literature. J Biomed Inform. 2008 Apr;41(2):393-407. doi: 10.1016/j.jbi.2007.11.008. Epub 2007 Dec 15. PMID: 18207462 Review.
- Text mining and its potential applications in systems biology. Trends Biotechnol. 2006 Dec;24(12):571-9. doi: 10.1016/j.tibtech.2006.10.002. Epub 2006 Oct 12. PMID: 17045684 Review.
Cited by
- Learning to explain is a good biomedical few-shot learner. Bioinformatics. 2024 Oct 1;40(10):btae589. doi: 10.1093/bioinformatics/btae589. PMID: 39360976 Free PMC article.
- Detection and categorization of bacteria habitats using shallow linguistic analysis. BMC Bioinformatics. 2015;16 Suppl 10(Suppl 10):S5. doi: 10.1186/1471-2105-16-S10-S5. Epub 2015 Jul 13. PMID: 26201262 Free PMC article.
- An analysis of gene/protein associations at PubMed scale. J Biomed Semantics. 2011 Oct 6;2 Suppl 5(Suppl 5):S5. doi: 10.1186/2041-1480-2-S5-S5. PMID: 22166173 Free PMC article.
- BertSRC: transformer-based semantic relation classification. BMC Med Inform Decis Mak. 2022 Sep 6;22(1):234. doi: 10.1186/s12911-022-01977-5. PMID: 36068535 Free PMC article.
- PIPE: a protein-protein interaction passage extraction module for BioCreative challenge. Database (Oxford). 2016 Aug 14;2016:baw101. doi: 10.1093/database/baw101. Print 2016. PMID: 27524807 Free PMC article.