Database (Oxford). 2013 Apr 2:2013:bat013.
doi: 10.1093/database/bat013. Print 2013.

curatedOvarianData: clinically annotated data for the ovarian cancer transcriptome

Benjamin Frederick Ganzfried et al. Database (Oxford).

Abstract

This article introduces a manually curated data collection for gene expression meta-analysis of patients with ovarian cancer and software for reproducible preparation of similar databases. This resource provides uniformly prepared microarray data for 2970 patients from 23 studies with curated and documented clinical metadata. It allows users to efficiently identify studies and patient subgroups of interest for analysis and to perform meta-analysis immediately without the challenges posed by harmonizing heterogeneous microarray technologies, study designs, expression data processing methods and clinical data formats. We confirm that the recently proposed biomarker CXCL12 is associated with patient survival, independently of stage and optimal surgical debulking, which was possible only through meta-analysis owing to insufficient sample sizes of the individual studies. The database is implemented as the curatedOvarianData Bioconductor package for the R statistical computing language, providing a comprehensive and flexible resource for clinically oriented investigation of the ovarian cancer transcriptome. The package and pipeline for producing it are available from http://bcb.dfci.harvard.edu/ovariancancer.

Figures

Figure 1
Flowchart of the data collection and curation pipeline. The software implementing this pipeline reproduces all steps from downloading of data to final packaging, requiring manual intervention only for identifying studies, curation of clinical metadata and documentation of the package.
Figure 2
Available clinical annotation. This heatmap visualizes for each curated clinical characteristic (rows) the availability in each data set (columns). Red indicates that the corresponding characteristic is available for at least one sample in the data set. See Table 2 for descriptions of these characteristics.
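The availability rule in this caption (a characteristic counts as available in a data set if at least one sample has a value for it) can be sketched in a few lines. This is a minimal Python illustration, not the package's own code: the data set names, characteristic names and sample records below are hypothetical placeholders, and missing values are assumed to be encoded as `None`.

```python
def availability_matrix(datasets, characteristics):
    """Return {characteristic: {dataset_name: bool}}.

    A cell is True when at least one sample in the data set has a
    non-missing value for that characteristic (the "red" cells in
    an availability heatmap of this kind).
    """
    return {
        ch: {
            name: any(sample.get(ch) is not None for sample in samples)
            for name, samples in datasets.items()
        }
        for ch in characteristics
    }


# Hypothetical toy input: two data sets, each a list of per-sample records.
toy_datasets = {
    "study_A": [{"stage": "late", "grade": None}, {"stage": None, "grade": None}],
    "study_B": [{"stage": None, "grade": 3}],
}
toy_matrix = availability_matrix(toy_datasets, ["stage", "grade"])
for characteristic, row in toy_matrix.items():
    print(characteristic, row)
```

Rows of the result correspond to characteristics and columns to data sets, matching the orientation described in the caption.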
Figure 3
The database confirms CXCL12 as prognostic of overall survival in patients with ovarian cancer. Forest plot of the expression of the chemokine CXCL12 as a univariate predictor of overall survival, using all 14 data sets with applicable expression and survival information. HR indicates the factor by which overall risk of death increases with a one standard deviation increase in CXCL12 expression. A summary HR significantly larger than 1 indicates that patients with high CXCL12 levels had poor outcome and confirms in several lines of code the previously reported association between CXCL12 abundance and patient survival (9). Consideration of important clinicopathological features such as stage, grade, histology and residual disease (optimal surgical debulking) is also straightforward; examples are provided in the package vignette.
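To make the idea of a summary HR concrete, here is a minimal Python sketch of one common way to pool per-study hazard ratios: inverse-variance fixed-effect pooling on the log scale. The caption does not state which pooling model the authors used, so the fixed-effect choice is an assumption, and the function names and input numbers are hypothetical illustrations rather than values from the article.

```python
import math


def pool_fixed_effect(log_hrs, std_errs):
    """Inverse-variance fixed-effect pooling of per-study log hazard ratios.

    Returns the pooled log HR and its standard error.
    """
    if len(log_hrs) != len(std_errs) or not log_hrs:
        raise ValueError("need one standard error per log HR, and at least one study")
    if any(se <= 0 for se in std_errs):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in std_errs]  # more precise studies weigh more
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    return pooled, math.sqrt(1.0 / total)


def summary_hr(log_hrs, std_errs, z=1.96):
    """Pooled HR with an approximate 95% confidence interval."""
    pooled, se = pool_fixed_effect(log_hrs, std_errs)
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
    }


# Hypothetical per-study estimates (log HR per one standard deviation
# increase in expression) and their standard errors; not data from the paper.
example = summary_hr([0.10, 0.25, 0.05], [0.10, 0.20, 0.10])
print(example)
```

Under this reading, a pooled interval that lies entirely above 1 corresponds to the caption's "summary HR significantly larger than 1". A random-effects model would widen the interval when the studies disagree with one another.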

References

    1. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–210.
    2. Parkinson H, Sarkans U, Kolesnikov N, et al. ArrayExpress update—an archive of microarray and high-throughput sequencing-based functional genomics experiments. Nucleic Acids Res. 2011;39:D1002–D1004.
    3. McDermott U, Downing JR, Stratton MR. Genomics and the continuum of cancer care. N. Engl. J. Med. 2011;364:340–350.
    4. Taminau J, Steenhoff D, Coletta A, et al. inSilicoDb: an R/Bioconductor package for accessing human Affymetrix expert-curated datasets from GEO. Bioinformatics. 2011;27:3204–3205.
    5. Carey VJ, Gentry J, Sarkar R, et al. SGDI: system for genomic data integration. Pac. Symp. Biocomput. 2008:141–152.