BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology
- PMID: 29688366
- PMCID: PMC5836265
- DOI: 10.1093/database/bay011
Abstract
The biotechnology revolution is generating a plethora of omics data at an exponentially growing pace. Biological data mining therefore demands automatic, high-quality curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data that aims to promote and accelerate the reuse of public data. We applied the same preprocessing pipeline within each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce datasets that are ready for downstream analysis, and automatically annotated them with Disease Ontology terms. We also flag datasets that share common samples and automatically identify control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets and ∼260 000 samples spanning ∼500 diseases, and can readily be used in large-scale experiments and meta-analyses. All datasets are publicly available for querying and downloading via the BioDataome web application. We demonstrate BioDataome's utility by presenting exploratory data analysis examples. We have also developed the BioDataome R package, available at https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/.
