Sci Rep. 2023 Nov 16;13(1):20014.

doi: 10.1038/s41598-023-46239-0.

Enhancing histopathological image classification of invasive ductal carcinoma using hybrid harmonization techniques

Nassib Abdallah¹ ², Jean-Marie Marion³, Clovis Tauber⁴, Thomas Carlier⁵, Mathieu Hatt⁶, Pierre Chauvet³

Affiliations

¹ LaTIM, INSERM, Université de Bretagne-Occidentale, Brest, France. nassib.abdallah@univ-angers.fr.
² LARIS, Université d'Angers, Angers, France. nassib.abdallah@univ-angers.fr.
³ Catholic University of the West, Angers, France.
⁴ Imaging & Brain, Université de Tours, Tours, France.
⁵ University Hospital of Nantes, Nantes, France.
⁶ LaTIM, INSERM, Université de Bretagne-Occidentale, Brest, France.

PMID: 37973797
PMCID: PMC10654662
DOI: 10.1038/s41598-023-46239-0

Nassib Abdallah et al. Sci Rep. 2023.

Abstract

This study aims to develop a robust pipeline for classifying invasive ductal carcinomas and benign tumors in histopathological images, addressing variability within and between centers. We specifically tackle the challenge of detecting atypical data and variability between common clusters within the same database. Our feature engineering-based pipeline comprises a feature extraction step, followed by multiple harmonization techniques to rectify intra- and inter-center batch effects resulting from image acquisition variability and diverse patient clinical characteristics. These harmonization steps facilitate the construction of more robust and efficient models. We assess the proposed pipeline's performance on two public breast cancer databases, BreaKHis and IDCDB, using recall, precision, and accuracy metrics. Our pipeline outperforms recent models, achieving 90-95% accuracy in classifying benign and malignant tumors. We demonstrate the advantage of harmonization for classifying patches from different databases. Our top model scored 94.7% for IDCDB and 95.2% for BreaKHis, surpassing existing feature engineering-based models (92.1% for IDCDB and 87.7% for BreaKHis) and attaining performance comparable to deep learning models. The proposed feature engineering-based pipeline effectively classifies malignant and benign tumors while addressing variability within and between centers through the incorporation of various harmonization techniques. Our findings reveal that harmonizing variabilities between patches from different batches directly impacts the learning and testing performance of classification models. This pipeline has the potential to enhance breast cancer diagnosis and treatment and may be applicable to other diseases.
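
For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, and recall are computed for a benign/malignant classifier. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not taken from the paper, and the numbers do not reproduce its results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels (1 = malignant, 0 = benign), invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall is usually the metric to watch in this setting, since a missed malignant patch (a false negative) tends to be costlier than a false alarm.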

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
The architecture of our intra-base harmonization module, consisting of 6 steps. The input is a database; the first step extracts features, and the second normalizes the different groups of features. The data are then split into learning and testing sets, and the learning samples are processed to reduce intra-base variability.
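
The first steps of this caption (group-wise normalization, then a learning/testing split) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the caption does not say which normalization is used, so a pooled z-score per feature group is assumed here, and the names `normalize_by_group`, `split_learning_testing`, the group labels, and the toy rows are all hypothetical. The later harmonization step applied to the learning samples is not shown.

```python
import random
import statistics


def normalize_by_group(rows, groups):
    """Z-score each feature group using one mean/std pooled over the group's columns.

    `groups` maps a group name to the list of column indices it covers.
    """
    out = [list(r) for r in rows]
    for cols in groups.values():
        values = [r[c] for r in rows for c in cols]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values) or 1.0  # avoid division by zero
        for r in out:
            for c in cols:
                r[c] = (r[c] - mean) / std
    return out


def split_learning_testing(rows, test_fraction=0.25, seed=0):
    """Shuffle row indices with a fixed seed and hold out a testing subset."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx = set(idx[:n_test])
    learning = [rows[i] for i in range(len(rows)) if i not in test_idx]
    testing = [rows[i] for i in range(len(rows)) if i in test_idx]
    return learning, testing


# Toy feature matrix: columns 0-1 form one group, column 2 another (assumed).
rows = [[1.0, 2.0, 10.0], [3.0, 4.0, 30.0], [5.0, 6.0, 20.0], [7.0, 8.0, 40.0]]
groups = {"color": [0, 1], "moments": [2]}
normalized = normalize_by_group(rows, groups)
learning, testing = split_learning_testing(normalized)
```

Normalizing each group separately keeps feature families with very different scales from dominating one another in later steps.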

**Figure 2**
Our complete pipeline: the first step applies the intra-database harmonization module to each database. The second step applies the inter-database harmonization module to the data from different sources (here, the two databases). The last step trains the classifier.

**Figure 3**
The projection of the samples onto the principal factorial plane, both before and after harmonization, elucidates the impact of our methodology on the projected scatterplot. As illustrated, patches with either IDC or non-IDC subtypes can exist as outliers within the entire dataset and need to be aligned closer to the reference scatterplot, which comprises the majority of samples.

**Figure 4**
Flow diagram for outlier detection: the first step applies the outlier detection methods. Based on the results, the second step classifies the samples as atypical or normal. The third step trains a logistic regression model to classify IDC/non-IDC patches on the atypical-free datasets. The last step selects the best model based on the MSE criterion (the classification performance of the logistic regression model).
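
The four-step flow in this caption can be illustrated with a small, self-contained sketch. Everything specific here is an assumption: the record does not name the outlier detection methods, so a z-score rule and an IQR rule stand in for them, a "none" baseline is added for comparison, the data are a single invented feature, and the MSE is computed on the atypical-free samples themselves rather than on a held-out set.

```python
import math
import statistics


def zscore_outliers(xs, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean, std = statistics.fmean(xs), statistics.pstdev(xs) or 1.0
    return [abs(x - mean) / std > threshold for x in xs]


def iqr_outliers(xs, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [x < lo or x > hi for x in xs]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=1000):
    """One-feature logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys))
        gb = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys))
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error between predicted probabilities and labels."""
    return statistics.fmean((sigmoid(w * x + b) - y) ** 2 for x, y in zip(xs, ys))


def select_detector(xs, ys, detectors):
    """Steps 1-4: detect, drop atypical samples, fit, keep the lowest-MSE option."""
    results = {}
    for name, detect in detectors.items():
        kept = [(x, y) for x, y, flag in zip(xs, ys, detect(xs)) if not flag]
        kx, ky = zip(*kept)
        results[name] = mse(kx, ky, *fit_logistic(kx, ky))
    return min(results, key=results.get), results


# Toy data: one feature, label 1 = IDC; the last sample is deliberately atypical.
xs = [0.1, 0.3, 0.2, 0.4, 1.1, 1.3, 1.2, 1.4, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1, 0]
detectors = {
    "zscore": zscore_outliers,
    "iqr": iqr_outliers,
    "none": lambda values: [False] * len(values),
}
best, results = select_detector(xs, ys, detectors)
```

On this toy data both stand-in detectors flag only the last sample, so they tie; with real features the methods would generally disagree, which is what makes the selection step meaningful.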

**Figure 5**
On the left, we present examples of central patches, which constitute the majority within the entire histopathological slide. On the right, we showcase examples of border patches. These two distinct types of patches are invariably present in histopathological studies, as they result from the segmentation of a whole slide.

**Figure 6**
Representation of patches grouped by class: the patches on the right contain no malignant tumor whereas those on the left contain malignant tumor.

**Figure 7**
Results from the SHAP model on the IDCDB classification dataset, highlighting the most influential features for classification. Feature 72: Feat_Red_47; Feature 122: Feat_Green_47; Feature 73: Feat_Red_48; Feature 123: Feat_Green_48; Feature 11: longest_strike_above_mean; Feature 6: autocorrelation; Feature 184: Feat_Moments_Red_9; Feature 215: Feat_Moments_Green_9; Feature 186: Feat_Moments_Red_11; Feature 217: Feat_Moments_Green_11; Feature 14: mean_change; Feature 8: maximum; Feature 10: kurtosis; Feature 193: Feat_Moments_Red_18; Feature 224: Feat_Moments_Green_18; Feature 173: Feat_Blue_48; Feature 172: Feat_Blue_47; Feature 17: ratio_value_number_to_time_series_length; Feature 158: Feat_Blue_33; Feature 98: Feat_Green_23.

Cited by

Bilateral Infiltrating Ductal Carcinoma With Adrenal Metastasis: A Rare Case Report.
Puvvada P, Nirhale DS, Gaudani RH, Mane P. Cureus. 2024 Jul 29;16(7):e65635. doi: 10.7759/cureus.65635. PMID: 39205706.
PND-Net: plant nutrition deficiency and disease classification using graph convolutional network.
Bera A, Bhattacharjee D, Krejcar O. Sci Rep. 2024 Jul 5;14(1):15537. doi: 10.1038/s41598-024-66543-7. PMID: 38969738.
