Quantitative assessment of tissue biomarkers and construction of a model to predict outcome in breast cancer using multiple imputation
- PMID: 19352457
- PMCID: PMC2664700
- DOI: 10.4137/cin.s911
Abstract
Missing data pose one of the greatest challenges in the rigorous evaluation of biomarkers. The limited availability of specimens with complete clinical annotation and quality biomaterial often leads to underpowered studies. Tissue microarray studies, for example, may be further handicapped by the loss of data points because of unevaluable staining, core loss, or the lack of tumor in the histospot. This paper presents a novel approach to these common problems in the context of a tissue protein biomarker analysis in a cohort of patients with breast cancer. Our analysis develops techniques based on multiple imputation to address the missing value problem. We first select markers using a training cohort, identifying a small subset of protein expression levels that are most useful in predicting patient survival. The best model is obtained by including both protein markers (including COX6C, GATA3, NAT1, and ESR1) and lymph node status. The use of either lymph node status or the four protein expression levels provides similar improvements in goodness-of-fit, with both significantly better than a baseline clinical model. Using the same multiple imputation strategy, we then validate the results out-of-sample on a larger independent cohort. Our approach of integrating multiple imputation with each stage of the analysis serves as an example that may be replicated or adapted in future studies with missing values.
Keywords: biomarker; breast cancer; immunohistochemistry; multiple imputation; variable selection.
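The abstract describes carrying multiple imputation through each stage of the analysis but does not say how per-imputation results are combined. A common choice is Rubin's rules, which pool one coefficient across m imputed datasets. The sketch below assumes that approach; it is not taken from the paper. The function name `pool_rubin` and the numbers in the usage example are hypothetical and purely illustrative.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-imputation point estimates (e.g. a log hazard ratio)
    variances: per-imputation squared standard errors for that estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)       # average of the point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (m - 1)

    # Total variance: within-imputation variance plus the between-imputation
    # variance inflated for the finite number of imputations.
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Made-up values for illustration only; not results from the study.
    log_hr = [0.42, 0.38, 0.47, 0.40, 0.44]
    se_sq = [0.010, 0.012, 0.011, 0.009, 0.010]
    est, var = pool_rubin(log_hr, se_sq)
    print(f"pooled estimate = {est:.3f}, pooled SE = {var ** 0.5:.3f}")
```

The pooled standard error grows with disagreement between imputations, so uncertainty due to missing marker values is reflected in the final inference rather than hidden by a single filled-in dataset.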