Merging microarray data from separate breast cancer studies provides a robust prognostic test
- PMID: 18304324
- PMCID: PMC2409450
- DOI: 10.1186/1471-2105-9-125
Abstract
Background: There is an urgent need for new prognostic markers of breast cancer metastases to ensure that newly diagnosed patients receive appropriate therapy. Recent studies have demonstrated the potential value of gene expression signatures in assessing the risk of developing distant metastases. However, due to the small sample sizes of individual studies, the overlap among signatures is almost zero and their predictive power is often limited. Integrating microarray data from multiple studies in order to increase sample size is therefore a promising approach to the development of more robust prognostic tests.
Results: In this study, by using a highly stable data aggregation procedure based on expression comparisons, we have integrated three independent microarray gene expression data sets for breast cancer and identified a structured prognostic signature consisting of 112 genes organized into 80 pair-wise expression comparisons. A classical likelihood ratio test based on these comparisons, essentially weighted voting, achieves 88.6% sensitivity and 54.6% specificity in an independent external test set of 154 samples. The test is highly informative in assessing the risk of developing distant metastases within five years (hazard ratio 9.3 with 95% CI 2.9-29.9).
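The likelihood ratio test over pair-wise expression comparisons can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each comparison is a binary feature (gene i expressed below gene j), that the comparisons are treated as conditionally independent given the outcome class, and that class-conditional probabilities are estimated with add-one smoothing. The function names, the toy profiles, and the zero threshold are all hypothetical.

```python
import math

def pair_features(profile, pairs):
    """Binary feature per pair: 1 if gene i is expressed below gene j."""
    return [1 if profile[i] < profile[j] else 0 for i, j in pairs]

def fit(samples, labels, pairs):
    """Estimate P(feature = 1 | class) for each pair with add-one smoothing.

    labels: 1 = poor outcome, 0 = good outcome (hypothetical coding).
    Returns a list of (p, q): p for class 1, q for class 0.
    """
    feats = [pair_features(s, pairs) for s in samples]
    params = []
    for k in range(len(pairs)):
        ones = {0: 0, 1: 0}
        totals = {0: 0, 1: 0}
        for f, y in zip(feats, labels):
            ones[y] += f[k]
            totals[y] += 1
        p = (ones[1] + 1) / (totals[1] + 2)
        q = (ones[0] + 1) / (totals[0] + 2)
        params.append((p, q))
    return params

def log_likelihood_ratio(profile, pairs, params):
    """Sum of per-pair log-ratios: each comparison casts a weighted vote."""
    llr = 0.0
    for x, (p, q) in zip(pair_features(profile, pairs), params):
        llr += math.log(p / q) if x else math.log((1 - p) / (1 - q))
    return llr

def predict(profile, pairs, params, threshold=0.0):
    """Call class 1 when the summed evidence exceeds the threshold."""
    return 1 if log_likelihood_ratio(profile, pairs, params) > threshold else 0

# Toy data: four genes, two comparisons (assumed values, not from the study).
pairs = [(0, 1), (2, 3)]
train = [[1, 5, 2, 9], [2, 6, 1, 7], [0, 4, 8, 3],
         [5, 1, 9, 2], [6, 2, 7, 1], [4, 0, 3, 8]]
labels = [1, 1, 1, 0, 0, 0]
params = fit(train, labels, pairs)
```

Raising or lowering `threshold` trades sensitivity against specificity, which is one way an operating point such as the one reported above could be chosen.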
Conclusion: Rank-based features provide a stable way to integrate patient data from separate microarray studies due to invariance to data normalization, and such features can be combined into a useful predictor of distant metastases in breast cancer within a statistical modeling framework which begins to capture gene-gene interactions. Upon further confirmation on large-scale independent data, such prognostic signatures and tests could provide a powerful tool to guide adjuvant systemic treatment that could greatly reduce the cost of breast cancer treatment, both in terms of toxic side effects and health care expenditures.
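The invariance claim can be checked on a toy example. The sketch below assumes that normalization acts on each sample as a strictly increasing transform of its expression values (log transform, affine rescaling, per-sample z-scoring); transforms applied gene by gene across samples are outside this assumption and may change within-sample ordering. The profile values and pair indices are made up for illustration.

```python
import math

def comparison_features(profile, pairs):
    """1 if gene i is expressed below gene j within the same sample, else 0."""
    return [1 if profile[i] < profile[j] else 0 for i, j in pairs]

# Hypothetical raw intensities for four genes in one sample.
raw = [120.0, 35.0, 980.0, 410.0]
pairs = [(0, 1), (1, 2), (3, 2), (0, 3)]

# Three strictly increasing per-sample transforms.
log_scaled = [math.log2(v) for v in raw]
rescaled = [0.5 * v + 10.0 for v in raw]
mean = sum(raw) / len(raw)
sd = math.sqrt(sum((v - mean) ** 2 for v in raw) / len(raw))
z_scored = [(v - mean) / sd for v in raw]

base = comparison_features(raw, pairs)
unchanged = all(
    comparison_features(t, pairs) == base
    for t in (log_scaled, rescaled, z_scored)
)
print(unchanged)
```

Because only the within-sample ordering of each gene pair enters the features, samples from studies processed with different per-sample scalings can be pooled without first reconciling their intensity scales.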
Similar articles
- Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation. BMC Bioinformatics. 2008 Jan 28;9:63. doi: 10.1186/1471-2105-9-63. PMID: 18226260. Free PMC article.
- Large-scale integration of cancer microarray data identifies a robust common cancer signature. BMC Bioinformatics. 2007 Jul 30;8:275. doi: 10.1186/1471-2105-8-275. PMID: 17663766. Free PMC article.
- Improved breast cancer prognosis through the combination of clinical and genetic markers. Bioinformatics. 2007 Jan 1;23(1):30-7. doi: 10.1093/bioinformatics/btl543. Epub 2006 Nov 26. PMID: 17130137. Free PMC article.
- DNA microarrays. Nephron Physiol. 2005;99(3):p85-9. doi: 10.1159/000083764. Epub 2005 Feb 7. PMID: 15703470. Review.
- Using microarray analysis as a prognostic and predictive tool in oncology: focus on breast cancer and normal tissue toxicity. Semin Radiat Oncol. 2008 Apr;18(2):105-14. doi: 10.1016/j.semradonc.2007.10.007. PMID: 18314065. Review.
Cited by
- Logic Learning Machine creates explicit and stable rules stratifying neuroblastoma patients. BMC Bioinformatics. 2013;14 Suppl 7(Suppl 7):S12. doi: 10.1186/1471-2105-14-S7-S12. Epub 2013 Apr 22. PMID: 23815266. Free PMC article.
- Identification of novel epithelial ovarian cancer biomarkers by cross-laboratory microarray analysis. J Huazhong Univ Sci Technolog Med Sci. 2010 Jun;30(3):354-9. doi: 10.1007/s11596-010-0356-1. Epub 2010 Jun 17. PMID: 20556581.
- Relative expression analysis for molecular cancer diagnosis and prognosis. Technol Cancer Res Treat. 2010 Apr;9(2):149-59. doi: 10.1177/153303461000900204. PMID: 20218737. Free PMC article. Review.
- Effect of data combination on predictive modeling: a study using gene expression data. AMIA Annu Symp Proc. 2010 Nov 13;2010:567-71. PMID: 21347042. Free PMC article.
- Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees. BMC Bioinformatics. 2013 Mar 19;14:100. doi: 10.1186/1471-2105-14-100. PMID: 23506640. Free PMC article.