An architecture for biological information extraction and representation
- PMID: 15608051
- DOI: 10.1093/bioinformatics/bti187
Abstract
Motivations: Technological advances in biomedical research are generating a plethora of heterogeneous data at a high rate. There is a critical need for extraction, integration and management tools for information discovery and synthesis from these heterogeneous data.
Results: In this paper, we present a general architecture, called ALFA, for information extraction and representation from diverse biological data. The ALFA architecture consists of: (i) a networked, hierarchical, hyper-graph object model for representing information from heterogeneous data sources in a standardized, structured format; and (ii) a suite of integrated, interactive software tools for information extraction and representation from diverse biological data sources. As part of our research efforts to explore this space, we have prototyped the ALFA object model and a set of interactive software tools for searching, filtering, and extracting information from scientific text. In particular, we describe BioFerret, a meta-search tool for searching and filtering relevant information from the web, and ALFA Text Viewer, an interactive tool for user-guided extraction, disambiguation, and representation of information from scientific text. We further demonstrate the potential of our tools in integrating the extracted information with experimental data and diagrammatic biological models via the common underlying ALFA representation.
Contact: aditya_vailaya@agilent.com.
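The abstract does not spell out the ALFA object model, so the following is only a minimal sketch of what a networked, hierarchical, hyper-graph representation could look like. Every name here (Concept, Relation, Network, the role labels, the "example" source tag) is a hypothetical chosen for illustration and is not taken from ALFA itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """Node: one biological entity, optionally holding a nested network."""
    id: str
    name: str
    kind: str                              # illustrative label, e.g. "protein"
    source: Optional[str] = None           # provenance of the fact (assumed field)
    children: Optional["Network"] = None   # hierarchy: a node may wrap a sub-network


@dataclass
class Relation:
    """Hyperedge: joins any number of concepts, each under a named role."""
    id: str
    kind: str
    participants: Dict[str, List[str]]     # role name -> list of concept ids
    source: Optional[str] = None


@dataclass
class Network:
    concepts: Dict[str, Concept] = field(default_factory=dict)
    relations: Dict[str, Relation] = field(default_factory=dict)

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.id] = concept

    def add_relation(self, relation: Relation) -> None:
        # Refuse hyperedges that point at concepts this network does not hold.
        for ids in relation.participants.values():
            for cid in ids:
                if cid not in self.concepts:
                    raise KeyError(f"unknown concept: {cid}")
        self.relations[relation.id] = relation

    def relations_of(self, concept_id: str) -> List[Relation]:
        # All hyperedges in which the concept plays any role.
        return [r for r in self.relations.values()
                if any(concept_id in ids for ids in r.participants.values())]


# Illustrative use: one three-way relation, as might be extracted from text.
net = Network()
net.add_concept(Concept("c1", "MDM2", "protein"))
net.add_concept(Concept("c2", "TP53", "protein"))
net.add_concept(Concept("c3", "ubiquitin", "protein"))
net.add_relation(Relation(
    "r1", "ubiquitination",
    {"enzyme": ["c1"], "substrate": ["c2"], "modifier": ["c3"]},
    source="example",
))

# Hierarchy: a complex is itself a node that wraps a smaller network.
inner = Network()
inner.add_concept(Concept("c1", "MDM2", "protein"))
inner.add_concept(Concept("c2", "TP53", "protein"))
net.add_concept(Concept("c4", "MDM2-TP53 complex", "complex", children=inner))
```

A hyperedge with named roles lets a single relation carry more than two participants, which an ordinary pairwise graph edge cannot express directly. Text-derived facts, experimental data and diagram elements could in principle all be attached to the same Concept and Relation records, which is one plausible reading of the "common underlying representation" the abstract mentions.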
Similar articles
- Distributed modules for text annotation and IE applied to the biomedical domain. Int J Med Inform. 2006 Jun;75(6):496-500. doi: 10.1016/j.ijmedinf.2005.06.011. Epub 2005 Aug 8. PMID: 16085453
- Protein annotation by EBIMed. Nat Biotechnol. 2006 Aug;24(8):902-3. doi: 10.1038/nbt0806-902. PMID: 16900125 No abstract available.
- GeneInfoMiner--a web server for exploring biomedical literature using batch sequence ID. Bioinformatics. 2005 Aug 15;21(16):3452-3. doi: 10.1093/bioinformatics/bti559. Epub 2005 Jun 30. PMID: 15994195
- Status of text-mining techniques applied to biomedical text. Drug Discov Today. 2006 Apr;11(7-8):315-25. doi: 10.1016/j.drudis.2006.02.011. PMID: 16580973 Review.
- Text mining and ontologies in biomedicine: making sense of raw text. Brief Bioinform. 2005 Sep;6(3):239-51. doi: 10.1093/bib/6.3.239. PMID: 16212772 Review.
Cited by
- Integrated Approaches to Drug Discovery for Oxidative Stress-Related Retinal Diseases. Oxid Med Cell Longev. 2016;2016:2370252. doi: 10.1155/2016/2370252. Epub 2016 Dec 7. PMID: 28053689 Free PMC article. Review.
- Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366-82. doi: 10.1038/nprot.2007.324. PMID: 17947979 Free PMC article.
- SDH mutations, as potential predictor of chemotherapy prognosis in small cell lung cancer patients. Discov Oncol. 2023 Jun 5;14(1):89. doi: 10.1007/s12672-023-00685-4. PMID: 37273084 Free PMC article.
- The BioIntelligence Framework: a new computational platform for biomedical knowledge computing. J Am Med Inform Assoc. 2013 Jan 1;20(1):128-33. doi: 10.1136/amiajnl-2011-000646. Epub 2012 Aug 2. PMID: 22859646 Free PMC article.
- A travel guide to Cytoscape plugins. Nat Methods. 2012 Nov;9(11):1069-76. doi: 10.1038/nmeth.2212. Epub 2012 Nov 6. PMID: 23132118 Free PMC article.