Techniques for optimization of queries on integrated biological resources
- PMID: 15297988
- DOI: 10.1142/s0219720004000648
Abstract
Today, scientific data are inevitably digitized, stored in a wide variety of formats, and accessible over the Internet. Scientific discovery increasingly involves accessing multiple heterogeneous data sources, integrating the results of complex queries, and applying further analysis and visualization applications in order to collect datasets of interest. Building a scientific integration platform to support these critical tasks requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, and data that are locally materialized in warehouses or generated by software. The inefficiency of existing approaches can significantly hamper this process, causing lengthy delays in accessing critical resources or the failure of the system to report any results. Some queries take so long to answer that their results are returned via email, making their integration with other results a tedious task. This paper presents several issues that need to be addressed to provide seamless and efficient integration of biomolecular data. Identified challenges include: capturing and representing the various domain-specific computational capabilities supported by a source, including sequence or text search engines and traditional query processing; developing a methodology to acquire and represent semantic knowledge and metadata about source contents, overlap in source contents, and access costs; and developing cost- and semantics-based decision support tools to select sources and capabilities and to generate efficient query evaluation plans.
Similar articles
- Biological data integration: wrapping data and tools. IEEE Trans Inf Technol Biomed. 2002 Jun;6(2):123-8. doi: 10.1109/titb.2002.1006299. PMID: 12075666
- Scaling the walls of discovery: using semantic metadata for integrative problem solving. Brief Bioinform. 2009 Mar;10(2):164-76. doi: 10.1093/bib/bbp007. PMID: 19304872
- AlzPharm: integration of neurodegeneration data using RDF. BMC Bioinformatics. 2007 May 9;8 Suppl 3(Suppl 3):S4. doi: 10.1186/1471-2105-8-S3-S4. PMID: 17493287. Free PMC article.
- Automation of in-silico data analysis processes through workflow management systems. Brief Bioinform. 2008 Jan;9(1):57-68. doi: 10.1093/bib/bbm056. Epub 2007 Dec 2. PMID: 18056132. Review.
- LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics. BMC Bioinformatics. 2007 May 9;8 Suppl 3(Suppl 3):S5. doi: 10.1186/1471-2105-8-S3-S5. PMID: 17493288. Free PMC article. Review.