Cancers (Basel). 2020 Feb 24;12(2):518.
doi: 10.3390/cancers12020518.

The Impact of Normalization Approaches to Automatically Detect Radiogenomic Phenotypes Characterizing Breast Cancer Receptors Status

Rossana Castaldo et al. Cancers (Basel).

Abstract

In breast cancer studies, combining quantitative radiomic and genomic signatures can help identify and characterize radiogenomic phenotypes as a function of molecular receptor status. Biomedical image processing lacks standards for radiomic feature normalization, and neglecting feature normalization can strongly bias the overall analysis. This study evaluates the effect of several normalization techniques on predicting four clinical phenotypes, namely estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and triple-negative (TN) status, from quantitative features. The Cancer Imaging Archive (TCIA) radiomic features from 91 T1-weighted dynamic contrast-enhanced (DCE) MRI scans of invasive breast cancers were investigated in association with breast invasive carcinoma miRNA expression profiling from The Cancer Genome Atlas (TCGA). Three advanced machine learning techniques (Support Vector Machine, Random Forest, and Naïve Bayesian) were investigated to distinguish between molecular prognostic indicators and achieved area under the ROC curve (AUC) values of 86%, 93%, 91%, and 91% for the prediction of ER+ versus ER-, PR+ versus PR-, HER2+ versus HER2-, and triple-negative status, respectively. In conclusion, radiomic features can discriminate major breast cancer molecular subtypes and may yield a potential imaging biomarker for advancing precision medicine.
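To make the normalization step concrete, here is a minimal sketch of four of the per-feature normalizations named in this record (scaling, Z-score, robust Z-score, and upper quartile). The abstract does not give the exact formulas, so the definitions below are common textbook forms and are assumptions; the function names are hypothetical and the paper's implementation may differ.

```python
import statistics


def min_max_scale(values):
    """Scaling: map values linearly onto [0, 1] (assumed definition)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Z-score: subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def robust_z_score(values):
    """Robust Z-score: subtract the median and divide by the median absolute
    deviation (MAD). Some variants multiply the MAD by 1.4826; omitted here."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [(v - med) / mad for v in values]


def upper_quartile_scale(values):
    """Upper quartile: divide every value by the 75th percentile (assumed definition)."""
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    return [v / q3 for v in values]


# One hypothetical radiomic feature measured on five lesions, with one outlier.
feature = [2.0, 4.0, 6.0, 8.0, 100.0]
for name, fn in [
    ("scaling", min_max_scale),
    ("z-score", z_score),
    ("robust z-score", robust_z_score),
    ("upper quartile", upper_quartile_scale),
]:
    print(name, [round(v, 3) for v in fn(feature)])
```

On the toy feature, the outlier compresses the other four values toward 0 under scaling, while the robust Z-score keeps them evenly spaced, which illustrates why the choice of normalization can change downstream results.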

Keywords: Molecular imaging; biomarker; breast cancer; diagnosis; machine learning; miRNA expression; radiogenomics; radiomic.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Radiomic analysis framework.
Figure 2
Correlation analysis on the whole dataset between non-normalized and normalized radiomic features. Radiomic features are reported on the x-axis; Spearman correlation coefficients are reported on the y-axis for each comparison between a normalization method and the raw (i.e., non-normalized) features. All correlation p-values were less than 0.05. NO: non-normalized features; Scaling normalization method; Z-score normalization method; ZscoreR: Robust Z-score normalization method; LOG transformation; Quantile and Upper Quartile normalization method; WHT: Whitening normalization method.
Figure 3
Statistically significant radiomic features. Statistically significant radiomic features across normalization methods are identified with a circle. (A) Statistically significant radiomic features for receptor status ER across normalization methods. (B) Statistically significant radiomic features for receptor status PR across normalization methods. (C) Statistically significant radiomic features for receptor status HER2 across normalization methods. (D) Statistically significant radiomic features for receptor status TN across normalization methods. NO: non-normalized features; Scaling normalization method; Z-score normalization method; ZscoreR: Robust Z-score normalization method; LOG transformation; Quantile and Upper Quartile normalization method; WHT: Whitening normalization method.
Figure 4
Spearman correlation between miRNA expression and MRI radiomic features normalized by Upper Quartile and Whitening methods for molecular receptor status. (A) Correlation between ER negative breast cancer miRNA expression and MRI radiomic features normalized by Upper Quartile (UQ) method. (B) Correlation between ER negative breast cancer miRNA expression and MRI radiomic features normalized by Whitening (WHT) method. (C) Correlation between PR negative breast cancer miRNA expression and MRI radiomic features normalized by Upper Quartile (UQ) method. (D) Correlation between PR negative breast cancer miRNA expression and MRI radiomic features normalized by Whitening (WHT) method. (E) Correlation between HER2 positive breast cancer miRNA expression and MRI radiomic features normalized by Whitening (WHT) method. (F) Correlation between TN negative breast cancer miRNA expression and MRI radiomic features normalized by Upper Quartile (UQ) method. (G) Correlation between TN negative breast cancer miRNA expression and MRI radiomic features normalized by Whitening (WHT) method.
Figure 5
Classifier performance on the testing dataset in identifying ER receptor status via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, ER+ vs ER–. (B) Random Forest (RF) performance on the testing dataset, ER+ vs ER–. (C) Naïve Bayesian (NB) performance on the testing dataset, ER+ vs ER–.
Figure 6
Classifiers’ performance to detect ER receptor status. (A) Comparison table among classifiers: ER+ vs ER–. (B) ROC curves for the best classifiers to automatically detect ER receptor status. a) Best classifier for the Support Vector Machine method. The normalization methods that achieved the best performance are the Scaling, Z-score, Robust Z-score, and Upper Quartile normalization methods. They achieved the same performance; therefore, one ROC curve with non-normalized features is reported. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the whitening method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the scaling method.
Figure 7
Classifier performance on the testing dataset in identifying PR receptor status via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, PR+ vs PR–. (B) Random Forest (RF) performance on the testing dataset, PR+ vs PR–. (C) Naïve Bayesian (NB) performance on the testing dataset, PR+ vs PR–.
Figure 8
Classifiers’ performance to detect PR receptor status. (A) Comparison table among classifiers: PR+ vs PR–. (B) ROC curves for the best classifiers to automatically detect PR receptor status. a) Best classifier for the Support Vector Machine method. The normalization method that achieved the best performance is the quantile normalization method. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the quantile method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the quantile method.
Figure 9
Classifiers’ performance to detect HER2 receptor status. (A) Comparison table among classifiers: HER2+ vs HER2–. (B) ROC curves for the best classifiers to automatically detect HER2 receptor status. Radiomic features normalized by the whitening method were considered for the classification task. a) Best classifier for the Support Vector Machine method. b) Best classifier for the Random Forest method. c) Best classifier for the Naïve Bayesian method.
Figure 10
Classifier performance on the testing dataset in identifying TN cases via radiomic features across normalization methods (NO: non-normalized features; SCL: Scaling normalization method; ZSC: Z-score normalization method; RZSC: Robust Z-score normalization method; LOG transformation; UPQRT: Upper Quartile normalization method; QNT: Quantile normalization method; WHT: Whitening normalization method). (A) Support Vector Machine (SVM) performance on the testing dataset, TN vs Others. (B) Random Forest (RF) performance on the testing dataset, TN vs Others. (C) Naïve Bayesian (NB) performance on the testing dataset, TN vs Others.
Figure 11
Classifiers’ performance to detect TN receptor status. (A) Comparison table among classifiers: TN vs Others. (B) ROC curves for the best classifiers to automatically detect TN receptor status. a) Best classifier for the Support Vector Machine method. The normalization method that achieved the best performance is the quantile normalization method. b) Best classifier for the Random Forest method. The normalization method that achieved the best performance is the quantile method. c) Best classifier for the Naïve Bayesian method. The normalization method that achieved the best performance is the quantile method.
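
The figure captions above rely on two rank-based quantities: the Spearman correlation coefficient and the area under the ROC curve (AUC). The sketch below shows how both can be computed from average ranks using only the Python standard library. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical and the toy inputs are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity.

    labels are 0/1 (1 = positive class); scores are classifier outputs
    where higher means more likely positive.
    """
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data: a raw feature versus its log transform, and four scored cases.
raw = [1.0, 10.0, 100.0, 1000.0]
logged = [math.log(v) for v in raw]
print(round(spearman(raw, logged), 6))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because Spearman correlation depends only on ranks, a strictly increasing transform applied to a single feature vector (such as the log transform in the toy data) leaves the coefficient at 1. Normalizations that mix information across features or samples, such as whitening, do not have this property.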
