Annu Int Conf IEEE Eng Med Biol Soc. 2008:2008:5660-3.

doi: 10.1109/IEMBS.2008.4650498.

Combining multiple microarray studies using bootstrap meta-analysis

Andrea B Barrett¹, John H Phan, May D Wang

PMID: 19164001
PMCID: PMC5003036
DOI: 10.1109/IEMBS.2008.4650498

Andrea B Barrett et al. Annu Int Conf IEEE Eng Med Biol Soc. 2008.

Affiliation

¹ Department of Biomedical Engineering at the Georgia Institute of Technology, Atlanta, 30318 USA. andreabarrett@gatech.edu

Abstract

Microarray technology has enabled us to simultaneously measure the expression of thousands of genes. Using this high-throughput data collection, we can examine subtle genetic changes between biological samples and build predictive models for clinical applications. Although microarrays have dramatically increased the rate of data collection, sample size is still a major issue in feature selection. Previous work shows that combining microarray datasets improves gene selection when using z-scores and fold change. We propose a wrapper-based gene selection technique that combines bootstrap-estimated classification errors for individual genes across multiple datasets. The bootstrap is an unbiased estimator of classification error and has been shown to be effective for small-sample data. Coupled with data combination across multiple datasets, we show that this meta-analytic approach improves gene selection.
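The idea of combining per-gene bootstrap classification errors across datasets can be illustrated with a minimal Python sketch. The abstract does not specify the classifier, the normalization, or the combination rule, so everything here is an assumption made for illustration: a one-gene nearest-centroid classifier, an out-of-bag bootstrap error estimate, within-dataset standardization of errors across genes, and a sample-size-weighted mean as the combination step. The function names `bootstrap_gene_errors`, `combine_errors`, and `make_dataset` are hypothetical, and the data are synthetic.

```python
import numpy as np

def bootstrap_gene_errors(X, y, n_boot=200, seed=0):
    """Out-of-bag bootstrap error of a one-gene nearest-centroid classifier.

    X: (n_samples, n_genes) expression matrix; y: binary labels in {0, 1}.
    Returns a length-n_genes array of mean out-of-bag error rates.
    The classifier choice is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    err_sum = np.zeros(X.shape[1])
    used = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)   # samples left out of the resample
        yb = y[idx]
        if oob.size == 0 or yb.min() == yb.max():
            continue                            # need held-out samples and both classes
        Xb = X[idx]
        m0 = Xb[yb == 0].mean(axis=0)           # per-gene class centroids
        m1 = Xb[yb == 1].mean(axis=0)
        # Assign each held-out sample to the nearer centroid, gene by gene.
        pred = np.abs(X[oob] - m1) < np.abs(X[oob] - m0)
        err_sum += (pred != (y[oob] == 1)[:, None]).mean(axis=0)
        used += 1
    if used == 0:
        raise ValueError("no usable bootstrap replicates")
    return err_sum / used

def combine_errors(errors, sizes):
    """Standardize each dataset's errors across genes, then take a
    sample-size-weighted mean (lower score = more discriminative gene).
    This combination rule is an illustrative assumption."""
    z = [(e - e.mean()) / e.std() for e in errors]
    return np.average(np.vstack(z), axis=0, weights=np.asarray(sizes, dtype=float))

rng = np.random.default_rng(1)

def make_dataset(n_per_class, n_genes=50):
    """Synthetic data (not from the paper): only gene 0 differs between classes."""
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(size=(y.size, n_genes))
    X[y == 1, 0] += 4.0
    return X, y

datasets = [make_dataset(10), make_dataset(8)]
errors = [bootstrap_gene_errors(X, y, seed=i) for i, (X, y) in enumerate(datasets)]
combined = combine_errors(errors, sizes=[len(y) for _, y in datasets])
best_gene = int(np.argmin(combined))  # index of the top-ranked gene
```

Standardizing within each dataset before averaging keeps a dataset with systematically higher error rates from dominating the combined ranking; other normalizations, such as comparison against a permutation-based null, would fit the same structure.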

Figures

**Figure 1**
Distributions of normalized classification errors for the renal cancer datasets. Solid lines represent individual or combined distributions while dashed lines represent null distributions. Both the Schuetz (top left) and Jones (top right) datasets deviate significantly from the null distributions, indicating a large number of differentially expressed genes. This deviation is reflected in the combined data (bottom).

**Figure 2**
Distributions of normalized classification errors for the prostate cancer datasets (Singh, top left; Chandran, top right; combined, bottom). Compared to the renal cancer datasets, these datasets do not deviate significantly from the null distribution.

**Figure 3**
ROC curves for detecting validated reference genes for renal cancer. Red lines are combined data and dashed lines are individual datasets. For all datasets, fold change (top right) tends to detect reference genes more efficiently than the t-test (top left) and the bootstrap (bottom). Combining data using fold change and bootstrap slightly improves detection efficiency, which corresponds to an increase in the area under the ROC curve.

**Figure 4**
ROC curves for detecting validated reference genes for prostate cancer. Red lines are combined data and dashed lines are individual datasets. Combined data does not improve efficiency of reference gene detection when using the t-test (top left) or fold change (top right) methods. The bootstrap method (bottom) slightly increases detection performance of the reference genes for combined data.

**Figure 5**
BSA (AUCs) of individual and combined renal cancer datasets for detecting reference genes. The red bars are AUCs of the combined data. T, FC, and BS correspond to t-test, fold change, and bootstrap, respectively. S, J, and C correspond to Schuetz, Jones, and combined data, respectively. For the bootstrap (right bars), the relevance of the ranking for the combined data is at least as good as, if not better than, that of both individual datasets.

**Figure 6**
BSA (AUCs) of individual and combined prostate cancer datasets for detecting reference genes. The red bars are AUCs of the combined data. T, FC, and BS correspond to t-test, fold change, and bootstrap, respectively. S, Ch, and C correspond to Singh, Chandran, and combined data, respectively. The bootstrap combination method (right bars) outperforms both the t-test and fold change methods.

**Figure 7**
Venn diagrams showing the overlap of genes selected by t-test, fold change, and bootstrap error (p<0.01). The left panel shows results for data combination from the renal cancer group, and the right panel shows results from the prostate cancer group.

**Figure 8**
Venn diagrams showing the overlap of genes selected from the individual datasets and from the combined meta-analysis for the t-test, fold change, and bootstrap.
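The ROC and AUC comparisons in the captions above score how well a gene ranking recovers a set of validated reference genes. A minimal sketch of that evaluation follows, assuming the AUC is computed in its Mann-Whitney form; the function name `reference_gene_auc` and the toy scores are hypothetical and not taken from the paper.

```python
import numpy as np

def reference_gene_auc(scores, is_reference):
    """AUC for recovering reference genes when genes are ranked by `scores`
    (higher score = stronger evidence of differential expression)."""
    scores = np.asarray(scores, dtype=float)
    is_reference = np.asarray(is_reference, dtype=bool)
    pos, neg = scores[is_reference], scores[~is_reference]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one reference and one non-reference gene")
    # Probability that a random reference gene outscores a random other gene,
    # counting ties as one half (the Mann-Whitney form of the AUC).
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Toy example: four genes, two of which are reference genes.
toy_auc = reference_gene_auc([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
```

For a bootstrap classification error, where lower values indicate a better gene, the negated error can be passed as `scores` so that the ranking direction matches.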

Cited by

Integrated Left Ventricular Global Transcriptome and Proteome Profiling in Human End-Stage Dilated Cardiomyopathy.
Colak D, Alaiya AA, Kaya N, Muiya NP, AlHarazi O, Shinwari Z, Andres E, Dzimiri N. PLoS One. 2016 Oct 6;11(10):e0162669. doi: 10.1371/journal.pone.0162669. eCollection 2016. PMID: 27711126.
Left ventricular global transcriptional profiling in human end-stage dilated cardiomyopathy.
Colak D, Kaya N, Al-Zahrani J, Al Bakheet A, Muiya P, Andres E, Quackenbush J, Dzimiri N. Genomics. 2009 Jul;94(1):20-31. doi: 10.1016/j.ygeno.2009.03.003. Epub 2009 Mar 28. PMID: 19332114.
