J Pathol Clin Res. 2024 Nov;10(6):e70004.
doi: 10.1002/2056-4538.70004.

Deep learning-based analysis of EGFR mutation prevalence in lung adenocarcinoma H&E whole slide images

Jun Hyeong Park et al. J Pathol Clin Res. 2024 Nov.

Abstract

EGFR mutations are a major prognostic factor in lung adenocarcinoma. However, current detection methods require sufficient samples and are costly. Deep learning is promising for mutation prediction in histopathological image analysis but has limitations in that it does not sufficiently reflect tumor heterogeneity and lacks interpretability. In this study, we developed a deep learning model to predict the presence of EGFR mutations by analyzing histopathological patterns in whole slide images (WSIs). We also introduced the EGFR mutation prevalence (EMP) score, which quantifies EGFR prevalence in WSIs based on patch-level predictions, and evaluated its interpretability and utility. Our model estimates the probability of EGFR prevalence in each patch by partitioning the WSI based on multiple-instance learning and predicts the presence of EGFR mutations at the slide level. We utilized a patch-masking scheduler training strategy to enable the model to learn various histopathological patterns of EGFR. This study included 868 WSI samples from lung adenocarcinoma patients collected from three medical institutions: Hallym University Medical Center, Inha University Hospital, and Chungnam National University Hospital. For the test dataset, 197 WSIs were collected from Ajou University Medical Center to evaluate the presence of EGFR mutations. Our model demonstrated prediction performance with an area under the receiver operating characteristic curve of 0.7680 (0.7607-0.7720) and an area under the precision-recall curve of 0.8391 (0.8326-0.8430). The EMP score showed Spearman correlation coefficients of 0.4705 (p = 0.0087) for p.L858R and 0.5918 (p = 0.0037) for exon 19 deletions in 64 samples subjected to next-generation sequencing analysis. Additionally, high EMP scores were associated with papillary and acinar patterns (p = 0.0038 and p = 0.0255, respectively), whereas low EMP scores were associated with solid patterns (p = 0.0001). 
These results validate the reliability of our model and suggest that it can provide crucial information for rapid screening and treatment plans.
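
The idea of an EMP score built from patch-level predictions, and its rank correlation with sequencing results, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes the score is the fraction of patches whose predicted EGFR probability exceeds a threshold. The function names `emp_score` and `spearman_rho`, the 0.5 threshold, and the toy numbers are all hypothetical and are not taken from the study.

```python
import numpy as np


def emp_score(patch_probs, threshold=0.5):
    """Fraction of patches with predicted EGFR probability above `threshold`.

    Assumption: this definition is illustrative only; the paper's actual
    EMP formula is not stated in the abstract.
    """
    probs = np.asarray(patch_probs, dtype=float)
    if probs.size == 0:
        raise ValueError("a slide needs at least one patch")
    return float(np.mean(probs > threshold))


def spearman_rho(x, y):
    """Spearman rank correlation, assuming no tied values in x or y."""
    # argsort of argsort turns values into 0-based ranks when there are no ties.
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    # Pearson correlation of the ranks equals Spearman's rho.
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Toy data (hypothetical): patch probabilities for three slides and
# a made-up sequencing-derived value for each slide.
slides = [
    [0.9, 0.2, 0.7, 0.4],
    [0.1, 0.3, 0.2, 0.6],
    [0.8, 0.9, 0.7, 0.6],
]
scores = [emp_score(p) for p in slides]
toy_vaf = [0.30, 0.10, 0.55]
rho = spearman_rho(scores, toy_vaf)
```

With tied values, a tie-aware routine such as `scipy.stats.spearmanr` would be the safer choice; the rank trick above is kept only to make the sketch dependency-light.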

Keywords: EGFR; deep learning in histopathology; multiple‐instance learning; whole‐slide image analysis.

Figures

Figure 1
Preprocessing of WSIs. Schematic diagram of the WSI preprocessing pipeline, which consists of three steps: extraction of patch images from the original WSI, RandStainNA, and feature extraction. The ViT‐B/14 DINO model was used for feature extraction.
Figure 2
MIL architecture. Structure of dual‐stream multiple‐instance learning (DSMIL) model used for predicting EGFR mutations in this study. DSMIL learns features of the patch with the highest probability of EGFR mutation and aggregates slide‐level features based on the attention score. During the training phase, the teacher model performs masking based on the predicted score.
Figure 3
Attention‐based masking. Process of masking the top N% of patches with high EGFR mutation rate. By masking these patches, the model can focus on learning various patterns, including the easy‐to‐distinguish features of EGFR mutation.
Figure 4
Training and inference process. Training and inference using MIL and the patch‐masking strategy. A teacher model is used during training so that the model learns varied patterns; it is not involved in inference.
Figure 5
EGFR prevalence heatmaps (EMP scores) and variant allele frequencies (VAFs). Heatmaps showing the probability of EGFR mutation predicted by the AI model for each WSI. They show that higher VAFs correspond to larger areas predicted to be EGFR mutation positive.
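
The attention-based masking step described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, assuming that each patch has a feature vector and a teacher-predicted score, and that the top N% of patches by score are dropped before the remaining patches are passed on for training. The function name `mask_top_patches`, the rounding rule (ceiling), and the toy arrays are assumptions for illustration; the paper's scheduler and exact masking rule are not specified in the captions.

```python
import numpy as np


def mask_top_patches(features, teacher_scores, top_percent):
    """Drop the `top_percent`% of patches with the highest teacher scores.

    Returns the kept feature rows and a boolean mask (True = kept).
    Assumption: rounding up to a whole number of patches is our choice.
    """
    features = np.asarray(features)
    scores = np.asarray(teacher_scores, dtype=float)
    if features.shape[0] != scores.shape[0]:
        raise ValueError("one score per patch is required")
    n = scores.shape[0]
    k = int(np.ceil(n * top_percent / 100.0))
    keep = np.ones(n, dtype=bool)
    if k > 0:
        # Indices of the k highest-scoring patches.
        top_idx = np.argsort(scores)[::-1][:k]
        keep[top_idx] = False
    return features[keep], keep


# Toy example (hypothetical values): 5 patches with 2-dimensional features.
toy_features = np.arange(10).reshape(5, 2)
toy_scores = [0.1, 0.9, 0.3, 0.8, 0.2]
kept, keep_mask = mask_top_patches(toy_features, toy_scores, top_percent=40)
```

Here the two highest-scoring patches (indices 1 and 3) are masked, so the downstream model would see only the three remaining patches.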
