Cancers (Basel). 2020 Feb 24;12(2):518.

doi: 10.3390/cancers12020518.

The Impact of Normalization Approaches to Automatically Detect Radiogenomic Phenotypes Characterizing Breast Cancer Receptors Status

Rossana Castaldo¹, Katia Pane¹, Emanuele Nicolai¹, Marco Salvatore¹, Monica Franzese¹

PMID: 32102334
PMCID: PMC7072389
DOI: 10.3390/cancers12020518

Rossana Castaldo et al. Cancers (Basel). 2020.

Affiliation

¹ IRCCS SDN, Via E. Gianturco, 113, 80143 Naples, Italy.

Abstract

In breast cancer studies, combining quantitative radiomic and genomic signatures can help identify and characterize radiogenomic phenotypes as a function of molecular receptor status. Biomedical image processing lacks standard methods for radiomic feature normalization, and neglecting feature normalization can strongly bias the overall analysis. This study evaluates the effect of several normalization techniques on the prediction of four clinical phenotypes from quantitative features: estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and triple-negative (TN) status. Radiomic features from The Cancer Imaging Archive (TCIA), extracted from 91 T1-weighted dynamic contrast-enhanced MRI scans of invasive breast cancers, were investigated in association with breast invasive carcinoma miRNA expression profiles from The Cancer Genome Atlas (TCGA). Three advanced machine learning techniques (Support Vector Machine, Random Forest, and Naïve Bayesian) were investigated to distinguish between molecular prognostic indicators; they achieved area under the ROC curve (AUC) values of 86%, 93%, 91%, and 91% for the prediction of ER+ versus ER-, PR+ versus PR-, HER2+ versus HER2-, and triple-negative versus other cases, respectively. In conclusion, radiomic features can discriminate the major breast cancer molecular subtypes and may yield a potential imaging biomarker for advancing precision medicine.
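The abstract does not give formulas for the normalization methods. As a minimal sketch, assuming common textbook definitions (min-max scaling, mean and standard deviation Z-score, median and MAD robust Z-score, log(1 + x), and division by the 75th percentile), several of the per-feature methods could look as follows. The function names and the toy `feature` values are hypothetical, and quantile and whitening normalization are omitted because they need the whole feature matrix rather than a single feature vector.

```python
import math
import statistics

# All functions assume a non-constant feature (otherwise a divisor is zero).

def scaling(values):
    """Min-max scaling to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center on the mean, divide by the sample standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def robust_z_score(values):
    """Center on the median, divide by the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [(v - med) / mad for v in values]

def log_transform(values):
    """Natural log of (1 + v); assumes non-negative inputs."""
    return [math.log1p(v) for v in values]

def upper_quartile(values):
    """Divide every value by the 75th percentile of the feature."""
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    return [v / q3 for v in values]

# Hypothetical radiomic feature measured on five lesions, one outlier.
feature = [2.0, 4.0, 6.0, 8.0, 100.0]

for name, fn in [("scaling", scaling), ("z-score", z_score),
                 ("robust z-score", robust_z_score),
                 ("log", log_transform), ("upper quartile", upper_quartile)]:
    print(name, [round(v, 3) for v in fn(feature)])
```

On this toy input the outlier squeezes the four ordinary values into the bottom of the min-max range, whereas the robust Z-score keeps them evenly spaced. This illustrates how the choice of normalization can change what a downstream classifier sees.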

Keywords: Molecular imaging; biomarker; breast cancer; diagnosis; machine learning; miRNA expression; radiogenomics; radiomic.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Radiomic analysis framework.

**Figure 2**
Correlation analysis on the whole dataset between non-normalized and normalized radiomic features. Radiomic features are shown on the x-axis; the y-axis shows the Spearman correlation coefficient for each comparison between a normalization method and the raw (i.e., non-normalized) radiomic features. All correlation p-values were below 0.05. NO: non-normalized features; Scaling normalization method; Z-score normalization method; ZscoreR: Robust Z-score normalization method; LOG transformation; Quantile and Upper Quartile normalization methods; WHT: Whitening normalization method.
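As an illustration of the kind of comparison this caption describes, the sketch below computes a Spearman coefficient between a raw feature and a transformed copy of it. It is a plain-Python approximation under stated assumptions (average ranks for ties, Pearson correlation on the ranks), not the authors' code, and the helper names and sample values are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

raw = [2.0, 4.0, 6.0, 8.0, 100.0]        # hypothetical raw feature
logged = [math.log1p(v) for v in raw]    # strictly increasing transform
print(spearman(raw, logged))
```

Any strictly increasing per-feature transform preserves ranks, so its Spearman coefficient against the raw feature is 1. Only transforms that reorder values within a feature, for example by mixing information across features, can pull the coefficient away from 1.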

**Figure 3**
Statistically significant radiomic features. Statistically significant radiomic features across normalization methods are identified with a circle. (A) Statistically significant radiomic features for receptor status ER across normalization methods. (B) Statistically significant radiomic features for receptor status PR across normalization methods. (C) Statistically significant radiomic features for receptor status HER2 across normalization methods. (D) Statistically significant radiomic features for receptor status TN across normalization methods. NO: non-normalized features; Scaling normalization method; Z-score normalization method; ZscoreR: Robust Z-score normalization method; LOG transformation; Quantile and Upper Quartile normalization method; WHT: Whitening normalization method.

**Figure 4**
Spearman correlation between miRNA expression and MRI radiomic features normalized by the Upper Quartile and Whitening methods, by molecular receptor status. (A) Correlation between ER negative breast cancer miRNA expression and MRI radiomic features normalized by the Upper Quartile (UQ) method. (B) Correlation between ER negative breast cancer miRNA expression and MRI radiomic features normalized by the Whitening (WHT) method. (C) Correlation between PR negative breast cancer miRNA expression and MRI radiomic features normalized by the Upper Quartile (UQ) method. (D) Correlation between PR negative breast cancer miRNA expression and MRI radiomic features normalized by the Whitening (WHT) method. (E) Correlation between HER2 positive breast cancer miRNA expression and MRI radiomic features normalized by the Whitening (WHT) method. (F) Correlation between TN breast cancer miRNA expression and MRI radiomic features normalized by the Upper Quartile (UQ) method. (G) Correlation between TN breast cancer miRNA expression and MRI radiomic features normalized by the Whitening (WHT) method.

**Figure 5**
Classifier performance on the testing dataset in identifying ER receptor status via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG: log transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, ER+ vs ER–. (B) Random Forest (RF) performance on the testing dataset, ER+ vs ER–. (C) Naïve Bayesian (NB) performance on the testing dataset, ER+ vs ER–.

**Figure 6**
Classifiers’ performance in detecting ER receptor status. (A) Comparison table among classifiers: ER+ vs ER–. (B) ROC curves for the best classifiers to automatically detect ER receptor status. a) Best classifier for the Support Vector Machine method. The Scaling, Z-score, Robust Z-score, and Upper Quartile normalization methods achieved the best performance. Because they achieved the same performance, one ROC curve with non-normalized features is reported. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the whitening method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the scaling method.
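For readers unfamiliar with the metric behind these ROC curves, AUC can be computed as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that rank-based formulation on hypothetical labels and scores. It is not the authors' evaluation code, and `roc_auc` is an illustrative name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. ER+), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical test-set labels and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
print(round(roc_auc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size. A sort-based version would be preferable for large datasets.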

**Figure 7**
Classifier performance on the testing dataset in identifying PR receptor status via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG: log transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, PR+ vs PR–. (B) Random Forest (RF) performance on the testing dataset, PR+ vs PR–. (C) Naïve Bayesian (NB) performance on the testing dataset, PR+ vs PR–.

**Figure 8**
Classifiers’ performance in detecting PR receptor status. (A) Comparison table among classifiers: PR+ vs PR–. (B) ROC curves for the best classifiers to automatically detect PR receptor status. a) Best classifier for the Support Vector Machine method. The normalization method that achieved the best performance is the quantile normalization method. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the quantile method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the quantile method.

**Figure 9**
Classifiers’ performance in detecting HER2 receptor status. (A) Comparison table among classifiers: HER2+ vs HER2–. (B) ROC curves for the best classifiers to automatically detect HER2 receptor status. Radiomic features normalized by the whitening method were considered for the classification task. a) Best classifier for the Support Vector Machine method. b) Best classifier for the Random Forest method. c) Best classifier for the Naïve Bayesian method.

**Figure 10**
Classifier performance on the testing dataset in identifying TN cases via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG: log transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, TN vs Others. (B) Random Forest (RF) performance on the testing dataset, TN vs Others. (C) Naïve Bayesian (NB) performance on the testing dataset, TN vs Others.

**Figure 11**
Classifiers’ performance in detecting TN receptor status. (A) Comparison table among classifiers: TN vs Others. (B) ROC curves for the best classifiers to automatically detect TN receptor status. a) Best classifier for the Support Vector Machine method. The normalization method that achieved the best performance is the quantile normalization method. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the quantile method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the quantile method.
