BIOZON: a hub of heterogeneous biological data
- PMID: 16381854
- PMCID: PMC1347515
- DOI: 10.1093/nar/gkj153
Abstract
Biological entities are strongly related and mutually dependent. There is therefore a growing need to corroborate and integrate data from different resources and aspects of biological systems in order to analyze them effectively. Biozon is a unified biological database that integrates heterogeneous data types such as proteins, structures, domain families, protein-protein interactions and cellular pathways, and establishes the relationships between them. All data are integrated onto a single graph schema centered on the non-redundant set of biological objects shared by the sources. This integration yields a highly connected graph structure that provides a more complete picture of the known context of a given object than can be obtained from any one source. Currently, Biozon integrates roughly 2 million protein sequences, 42 million DNA or RNA sequences, 32,000 protein structures, 150,000 interactions and more from sources such as GenBank, UniProt, Protein Data Bank (PDB) and BIND. Biozon augments source data with locally derived data such as 5 billion pairwise protein alignments and 8 million structural alignments. The user may form complex cross-type queries on the graph structure, add similarity relations to form fuzzy queries, and rank the results based on an analysis of the edge structure similar to Google PageRank. Biozon is available online at Biozon.org.
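To make the graph model in the abstract more concrete, the following is a minimal sketch of a typed graph with one cross-type query and a link-based ranking. It is an illustration under stated assumptions, not Biozon's schema or API: the node identifiers, relation names and function names are all hypothetical, and the ranking is a plain power-iteration PageRank, which the abstract mentions only as an analogy for Biozon's edge-structure analysis.

```python
# Hypothetical toy data: (source node, relation, target node).
# Node ids are "type:name"; none of these are real Biozon identifiers.
EDGES = [
    ("protein:P1", "has_structure", "structure:S1"),
    ("protein:P1", "member_of", "family:F1"),
    ("protein:P2", "member_of", "family:F1"),
    ("protein:P1", "interacts_with", "protein:P2"),
    ("protein:P2", "participates_in", "pathway:W1"),
]


def build_graph(edges):
    """Return an adjacency map: node -> list of (relation, neighbor)."""
    graph = {}
    for src, rel, dst in edges:
        graph.setdefault(src, []).append((rel, dst))
        graph.setdefault(dst, [])  # make sure sink nodes are present too
    return graph


def neighbors(graph, node, relation):
    """Neighbors of `node` reached through edges labeled `relation`."""
    return [dst for rel, dst in graph[node] if rel == relation]


def structures_linked_to_pathway(graph, pathway):
    """Cross-type query: structures of proteins that interact with
    a protein participating in `pathway`."""
    hits = set()
    for node in graph:
        if not node.startswith("protein:"):
            continue
        for partner in neighbors(graph, node, "interacts_with"):
            if pathway in neighbors(graph, partner, "participates_in"):
                hits.update(neighbors(graph, node, "has_structure"))
    return sorted(hits)


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over the directed edges."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = [dst for _, dst in graph[node]]
            if targets:
                share = damping * rank[node] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


graph = build_graph(EDGES)
structures = structures_linked_to_pathway(graph, "pathway:W1")
ranks = pagerank(graph)
```

In this toy graph the query walks three relation types (interaction, pathway membership, structure) in a single pass, which is the kind of cross-type question that a single-source database cannot answer. The ranking step scores nodes by how much of the graph points at them, so the family shared by both proteins scores higher than a protein with no incoming edges.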