J Pathol Inform. 2020 Jul 24;11:19.
doi: 10.4103/jpi.jpi_10_20. eCollection 2020.

Deep Learning to Estimate Human Epidermal Growth Factor Receptor 2 Status from Hematoxylin and Eosin-Stained Breast Tissue Images

Deepak Anand et al. J Pathol Inform. 2020.

Abstract

Context: Several therapeutically important mutations in cancers are economically detected using immunohistochemistry (IHC), which highlights the overexpression of specific antigens associated with the mutation. However, IHC panels can be imprecise and relatively expensive in low-income settings. On the other hand, although hematoxylin and eosin (H&E) staining, used to visualize the general tissue morphology, is routine and low-cost, it does not highlight any specific antigen or mutation.

Aims: Using the human epidermal growth factor receptor 2 (HER2) mutation in breast cancer as an example, we strengthen the case for cost-effective detection and screening of overexpression of HER2 protein in H&E-stained tissue.

Settings and design: We use computational methods that reliably detect subtle morphological changes associated with the over-expression of mutation-specific proteins directly from H&E images.

Subjects and methods: We trained a classification pipeline to determine the HER2 overexpression status of H&E-stained whole slide images. Our training dataset comprised 26 cases (11 HER2+ and 15 HER2-) from a single hospital. We tested the classification pipeline on 26 held-out cases (8 HER2+ and 18 HER2-) from the same hospital and on 45 independent cases (23 HER2+ and 22 HER2-) from the TCGA-BRCA cohort. The pipeline was composed of a stain separation module and three deep neural network modules in tandem for robustness and interpretability.

Statistical analysis used: We evaluate our trained model using the area under the receiver operating characteristic curve (AUC).
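As an illustration of the metric named above, the following is a minimal sketch of how an AUC can be computed from slide-level scores. It uses the standard equivalence between AUC and the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count as half). The function name `roc_auc` and the toy labels and scores are assumptions made for this example; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = HER2+ in this sketch).
    scores: iterable of classifier outputs, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties contribute half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive and 3 negative cases.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few dozen slides such as those described here; a rank-based implementation would be preferable for large datasets.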

Results: Our pipeline achieved an AUC of 0.82 (confidence interval [CI]: 0.65-0.98) on held-out cases and an AUC of 0.76 (CI: 0.61-0.89) on the independent dataset from TCGA. We also demonstrate the region-level correspondence of HER2 overexpression between a patient's IHC and H&E serial sections.
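The width of these intervals follows largely from the small test sets. The abstract does not state how the confidence intervals were computed, so the sketch below uses one well-known approximation, the Hanley and McNeil (1982) standard error for an AUC, purely to show how cohort size drives the interval. It is an assumption for illustration and is not expected to reproduce the reported bounds exactly. The function names `hanley_mcneil_se` and `normal_ci` are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)                  # two positives vs. one negative
    q2 = 2.0 * auc * auc / (1.0 + auc)      # one positive vs. two negatives
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def normal_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation interval, clipped to the valid AUC range [0, 1]."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Cohort sizes and AUC values taken from the abstract.
held_out_ci = normal_ci(0.82, n_pos=8, n_neg=18)
tcga_ci = normal_ci(0.76, n_pos=23, n_neg=22)
print("held-out:", held_out_ci)
print("TCGA-BRCA:", tcga_ci)
```

Under this approximation, the held-out cohort, with only 8 positive cases, yields a wider interval than the larger TCGA-BRCA cohort, which matches the pattern in the reported results.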

Conclusions: Our work strengthens the case for automatically quantifying the overexpression of mutation-specific proteins in H&E-stained digital pathology, and it highlights the importance of multi-stage machine learning pipelines for added robustness and interpretability.

Keywords: Breast cancer; convolutional neural networks; histopathology; human epidermal growth factor receptor 2; immunohistochemistry; mutation detection; nucleus detection.

Conflict of interest statement

There are no conflicts of interest.

Figures

Figure 1. Examples of HER2neu immunohistochemistry staining that shows patches from slides with different HER2 score varying with the staining intensity. HER2: Human epidermal growth factor receptor 2
Figure 2. Examples of patches without tumor from the Warwick training set
Figure 3. Block diagram of the proposed method
Figure 4. A sample annotation of H&E image (right) using the serial immunohistochemistry image (left) included in the training dataset
Figure 5. Sample visual results showing spatial correspondence with immunohistochemistry: Δ: HER2+, Δ: HER2– and ×: Noncancerous. (a and b) HER2+ image and its corresponding H&E marked images. (c and d) HER2– images and its corresponding H&E marked images. HER2: Human epidermal growth factor receptor 2
Figure 6. Receiver operating characteristic curve for held-out patients in the Warwick dataset for HER2+ versus HER2– task. HER2: Human epidermal growth factor receptor 2
Figure 7. Area under the curve-receiver operating characteristic curve for independent testing dataset TCGA-BRCA
Figure 8. Positive predictive value and negative predictive value curves for independent testing on the TCGA-BRCA dataset
Figure 9. Area under the curve-receiver operating characteristic curve for testing on human epidermal growth factor receptor 2 2+ cases from TCGA-BRCA cohort
