Challenges in integrating biological data sources
- PMID: 8634908
- DOI: 10.1089/cmb.1995.2.557
Abstract
Scientific data of importance to biologists reside in a number of different data sources, such as GenBank, GSDB, SWISS-PROT, EMBL, and OMIM, among many others. Some of these data sources are conventional databases implemented using database management systems (DBMSs) and others are structured files maintained in a number of different formats (e.g., ASN.1 and ACE). In addition, software packages such as sequence analysis packages (e.g., BLAST and FASTA) produce data and can therefore be viewed as data sources. To counter the increasing dispersion and heterogeneity of data, different approaches to integrating these data sources are appearing throughout the bioinformatics community. This paper surveys the technical challenges to integration, classifies the approaches, and critiques the available tools and methodologies.