An environment for relation mining over richly annotated corpora: the case of GENIA
- PMID: 17134476
- PMCID: PMC1764447
- DOI: 10.1186/1471-2105-7-S3-S3
Abstract
Background: The biomedical domain is witnessing rapid growth in the volume of published scientific results, which makes it increasingly difficult to filter out the core information. There is a real need for support tools that 'digest' the published results and extract the most important information.
Results: We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns that express a large set of syntactic alternations, combined with semantic ontology information.
Conclusion: The experiments show that the approach described is capable of delivering high-precision results while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation.
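The idea of a single abstract pattern covering several syntactic alternations can be illustrated with a minimal sketch. This is not the system described in the abstract: the dependency labels (`subj`, `obj`, `subj_pass`, `by_agent`), the `TRIGGERS` and `ALTERNATIONS` tables, the entity classes, and the example sentences are all assumptions made for illustration. The sketch takes dependency triples as input, checks each trigger verb against an active and a passive frame, and keeps a match only when both arguments belong to an allowed semantic class.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    """One dependency edge: head word, relation label, dependent word."""
    head: str
    rel: str
    dep: str


# Hypothetical trigger verbs, mapped from surface form to lemma.
TRIGGERS = {"activates": "activate", "activated": "activate"}

# Hypothetical syntactic alternations: each frame names the dependency
# relation that carries the agent and the one that carries the target.
ALTERNATIONS = {
    "active": ("subj", "obj"),
    "passive": ("by_agent", "subj_pass"),
}

# Stand-in for the semantic (ontology) filter on argument types.
ALLOWED_CLASSES = {"protein"}


def extract_relations(deps, entity_class):
    """Return (agent, lemma, target) triples licensed by any alternation."""
    found = []
    for form, lemma in TRIGGERS.items():
        # Collect the arguments attached to this verb form.
        args = {d.rel: d.dep for d in deps if d.head == form}
        for agent_rel, target_rel in ALTERNATIONS.values():
            agent, target = args.get(agent_rel), args.get(target_rel)
            # Both arguments must exist and pass the semantic filter.
            if (entity_class.get(agent) in ALLOWED_CLASSES
                    and entity_class.get(target) in ALLOWED_CLASSES):
                found.append((agent, lemma, target))
    return found


# Illustrative annotations for "A activates B" and "B is activated by A".
entity_class = {"IL-2": "protein", "NF-kappaB": "protein"}
active = [Dep("activates", "subj", "IL-2"),
          Dep("activates", "obj", "NF-kappaB")]
passive = [Dep("activated", "subj_pass", "NF-kappaB"),
           Dep("activated", "by_agent", "IL-2")]

print(extract_relations(active, entity_class))
print(extract_relations(passive, entity_class))
```

Both calls yield the same normalized triple, which is the point of abstracting over alternations: the rule is written once against argument roles rather than once per surface word order, as a finite-state pattern over token sequences would require.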