A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability
- PMID: 19941644
- PMCID: PMC2789744
- DOI: 10.1186/1471-2105-10-389
Abstract
Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear.
Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data produced by six state-of-the-art preprocessing methods, applied to a compendium of eight datasets comprising 1131 hybridizations from both one- and two-color array technologies. For a wide range of classifiers, we performed a joint study of performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical.
Conclusion: Feature variability can have a strong impact on breast cancer signature composition, as well as on the classification of individual patient samples. We therefore strongly recommend that feature variability be considered when analyzing data from microarray breast cancer expression profiling experiments.
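The stability and concordance ideas described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the classifiers or the noise model, so this sketch assumes a nearest-centroid classifier, independent Gaussian noise per feature, and a hypothetical matrix `SE` of per-feature standard errors standing in for the uncertainty information supplied by a preprocessing method. All function names are illustrative.

```python
import random

def nearest_centroid_fit(X, y):
    """Return the mean expression profile (centroid) of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=dist)

def perturb(x, se, rng):
    """Draw one noisy replicate of a profile (assumed Gaussian noise per feature)."""
    return [v + rng.gauss(0.0, s) for v, s in zip(x, se)]

def assignment_stability(centroids, X, SE, n_draws=200, seed=0):
    """Per sample: fraction of perturbed replicates that keep the unperturbed class call."""
    rng = random.Random(seed)
    result = []
    for x, se in zip(X, SE):
        base = nearest_centroid_predict(centroids, x)
        kept = sum(
            nearest_centroid_predict(centroids, perturb(x, se, rng)) == base
            for _ in range(n_draws)
        )
        result.append(kept / n_draws)
    return result

def discordance(calls_a, calls_b):
    """Fraction of samples that two classifiers assign to different classes."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call lists must have the same length")
    return sum(a != b for a, b in zip(calls_a, calls_b)) / len(calls_a)

# Toy data (invented for illustration): 4 samples, 2 features, 2 outcome classes.
X = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [5.0, 5.0]]
y = ["good", "good", "poor", "poor"]
SE = [[0.5, 0.5]] * 4  # hypothetical per-feature standard errors

centroids = nearest_centroid_fit(X, y)
stability = assignment_stability(centroids, X, SE)
```

A sample whose stability is well below 1 sits close enough to the decision boundary that measurement error alone can flip its class call. `discordance` can be applied to the calls of two classifiers trained on the same samples after different preprocessing schemes.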