Nat Med. 2025 Sep;31(9):3002-3010.
doi: 10.1038/s41591-025-03780-x. Epub 2025 Jul 9.

Real-world deployment of a fine-tuned pathology foundation model for lung cancer biomarker detection

Gabriele Campanella et al. Nat Med. 2025 Sep.

Abstract

Artificial intelligence models using digital histopathology slides stained with hematoxylin and eosin offer promising, tissue-preserving diagnostic tools for patients with cancer. Despite their advantages, their clinical utility in real-world settings remains unproven. Assessing EGFR mutations in lung adenocarcinoma demands rapid, accurate and cost-effective tests that preserve tissue for genomic sequencing. PCR-based assays provide rapid results but with reduced accuracy compared with next-generation sequencing and require additional tissue. Computational biomarkers leveraging modern foundation models can address these limitations. Here we assembled a large international clinical dataset of digital lung adenocarcinoma slides (N = 8,461) to develop a computational EGFR biomarker. Our model fine-tunes an open-source foundation model, improving task-specific performance with out-of-center generalization and clinical-grade accuracy on primary and metastatic specimens (mean area under the curve: internal 0.847, external 0.870). To evaluate real-world clinical translation, we conducted a prospective silent trial of the biomarker on primary samples, achieving an area under the curve of 0.890. The artificial-intelligence-assisted workflow reduced the number of rapid molecular tests needed by up to 43% while maintaining the current clinical standard performance. Our retrospective and prospective analyses demonstrate the real-world clinical utility of a computational pathology biomarker.

Conflict of interest statement

Competing interests: C.V. and T.J.F. report intellectual property rights and equity interest in Paige.AI, Inc. T.J.F. is employed by Eli Lilly. F.R.H. has acted as an adviser of Amgen, AstraZeneca, Bicara Therapeutics, BMS, Daiichi, G1 Therapeutics, Genentech/Roche, Genzyme/Sanofi, GSK, Merck, Merus Therapeutics, Nectin Therapeutics, NextCure, Novartis, OncoCyte, Oncohost and Regeneron. H.Y. has received consulting fees from AstraZeneca, Blueprint Medicines, Daiichi Sankyo, Genentech/Roche, Janssen, Merus, Pfizer, Regeneron, Ribon Therapeutics and Turning Point Therapeutics. M.H. holds fiduciary roles with the International Society of Bone and Soft Tissue Pathology and the United States & Canadian Academy of Pathology (USCAP). M.A. holds a fiduciary role with the Association for Molecular Pathology, has provided professional services for Biocartis US, Inc. and PER Events, LLC, and has intellectual property rights associated with SOPHiA. A.E. declares research grant support from Kanvas Bioscience, GMT Bioscience, BMS, Merck and AstraZeneca, and consulting and honoraria from BMS, Merck, AstraZeneca and EMD SorenoGENETICS. All disclosed competing interests are outside of the submitted work. G.C., N.K., S.N., S.S., E.F., R.K., S.M., N.P., P.J.S., I.H., N.N., L.M.A., A.B., T.J., M.R.N., M.M.C., O.A., G.M.G. and J.H. have no competing interests.

Figures

Fig. 1. Clinical implementation of a computational EGFR biomarker.
In the standard clinical workflow for patients with LUAD, rapid tests for EGFR and other biomarkers are performed, reducing the tissue available for NGS and leading to up to one-quarter of the cases being unsuitable for NGS. By contrast, the clinical application of the proposed EGFR biomarker will allow a drastic reduction in the number of cases unsuitable for NGS. As soon as slides are digitized, the computational biomarker can be calculated and may be available to the pathologist before they review the case and sign it out. Based on the model’s outputs, the rapid test may be avoided, increasing the tissue available for NGS.
Fig. 2. EAGLE performance on the internal and external cohorts.
Receiver operating characteristic (ROC) curves and respective AUCs are presented. The ROC 95% confidence interval (shaded area) was calculated via bootstrapping with 1,000 iterations. a, Retrospective internal validation. b, Retrospective external validation. c, Retrospective pretrial cohort. d, Prospective silent trial cohort.
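The bootstrapped confidence interval mentioned in this caption can be illustrated with a minimal sketch. It assumes a percentile bootstrap over resampled cases, which the caption does not specify; the function names `auc` and `bootstrap_auc_ci`, the seed, and the toy data are hypothetical and not taken from the paper.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not confirmed
    by the source). Resamples cases with replacement n_iter times."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[min(n_iter - 1, int((1 - alpha / 2) * n_iter))]
    return lower, upper

# Toy data for illustration only (not from the study).
toy_labels = [0, 0, 1, 1, 0, 1, 1, 0]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5]
point_estimate = auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores)
```

The pairwise AUC is quadratic in the number of cases, which is fine for a sketch but would likely be replaced by a rank-based implementation on a cohort of thousands of slides.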
Fig. 3. Silent trial workflow diagram.
In blue, relevant components of the standard clinical workflow are shown along a timeline. ΔT indicates the time from molecular accession to the availability of a result. The silent trial components occurring in parallel with the clinical workflow are indicated in green.
Fig. 4. Pretrial tuning and silent trial results.
In the AI-assisted EGFR screening, samples with EAGLE scores below the NPV threshold or above the PPV threshold can be spared from the rapid test. a, A heatmap of the reduction of rapid tests with isolines corresponding to the historical Idylla performance when modulating the NPV and PPV thresholds. Avg., average. The 95% confidence intervals (CI) were estimated via bootstrapping with 1,000 iterations. b, A zoomed-in area from a focusing on the top right corner where NPV and PPV are maximized. Threshold points are chosen within the Idylla noninferiority region with increasing levels of rapid test reduction. c, Pretrial deployment along the line established in b with the selected thresholds (thresh.) as vertical lines. The historical Idylla performance is shown with solid horizontal lines, and the dashed horizontal lines represent the 95% confidence intervals estimated via bootstrapping with 1,000 iterations. From top to bottom, NPV, PPV and rapid test reduction associated with AI-assisted workflow are presented. Shaded areas represent the 95% confidence interval estimated via bootstrapping with 1,000 iterations. d, A similar analysis as in c but for the silent trial cohort. The thresholds were chosen on the pretrial cohort and not on the silent trial cohort.
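The two-threshold screening rule in this caption can be sketched as follows. This is an illustration under stated assumptions: the function name `triage`, the threshold values, and the toy data are hypothetical, and NPV and PPV are computed only over the samples the model calls. How the study combines model calls with rapid test results for the remaining samples is not described in this caption.

```python
def triage(scores, labels, low, high):
    """Apply a two-threshold rule to model scores.

    Samples scoring below `low` are called negative and samples scoring
    above `high` are called positive; both groups are spared the rapid
    test. Samples in between would still be sent for rapid testing.
    """
    called_neg = [y for s, y in zip(scores, labels) if s < low]
    called_pos = [y for s, y in zip(scores, labels) if s > high]
    spared = len(called_neg) + len(called_pos)
    # NPV: fraction of model-negative calls that are truly negative.
    npv = called_neg.count(0) / len(called_neg) if called_neg else None
    # PPV: fraction of model-positive calls that are truly positive.
    ppv = called_pos.count(1) / len(called_pos) if called_pos else None
    return {"npv": npv, "ppv": ppv, "reduction": spared / len(scores)}

# Toy data and thresholds for illustration only (not from the study).
toy_scores = [0.05, 0.10, 0.50, 0.90, 0.95, 0.02]
toy_labels = [0, 0, 1, 1, 0, 1]
result = triage(toy_scores, toy_labels, low=0.2, high=0.8)
```

Moving `low` down or `high` up trades rapid test reduction for higher NPV or PPV, which is the trade-off the heatmap in panel a visualizes.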
Fig. 5. Model introspection using the attention scores from the aggregation function.
As inference is deterministic, images are generated in one shot; repeated inference generates an identical image. a–d, Each panel is an example from the silent trial: true positive (a), true negative (b), false positive (c) and false negative (d). In each image, the top left is the thumbnail of the H&E WSI. The top middle contains an overlay of the thumbnail with the full spectrum of the attention mask from a score of −4 to 4 (see legend). The attention is the level to which the model is attending to the region on the image for making the decision of positive or negative (that is, it does not indicate whether the model is interpreting the area as positive or negative, but only the weighting). The bottom left has an overlay of the regions of the slide that have an attention score >3 (that is, high attention). The label at the bottom of the panel is the number of pixels with high attention. The bottom middle has an inverted mask so that the non-high-attention regions are obscured. The red box in the panel indicates the region of the WSI that has the highest density of high-attention pixels. To the right is a high-resolution image of the portion of the slide highlighted by the red box in the prior panel.
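The high-attention mask and red-box selection described in this caption can be sketched on a small grid. This is a rough illustration, not the authors' implementation: the function names, the square window, and the brute-force search are assumptions. Only the >3 cutoff comes from the caption.

```python
def high_attention_mask(attn, threshold=3.0):
    """Binary mask marking cells whose attention score exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in attn]

def densest_window(mask, size):
    """Find the top-left corner of the size x size window containing the
    most high-attention cells (first such window in row-major order)."""
    rows, cols = len(mask), len(mask[0])
    best_count, best_pos = -1, None
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            count = sum(mask[r + dr][c + dc]
                        for dr in range(size) for dc in range(size))
            if count > best_count:
                best_count, best_pos = count, (r, c)
    return best_pos, best_count

# Toy attention map for illustration only (not from the study).
toy_attn = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 3.5, 0.0, 0.0],
    [0.0, 0.0, 3.2, 3.9],
    [0.0, 0.0, 3.1, 4.0],
]
toy_mask = high_attention_mask(toy_attn)
n_high = sum(map(sum, toy_mask))             # count of high-attention cells
box_pos, box_count = densest_window(toy_mask, size=2)
```

On a real whole-slide attention map, the brute-force window search would likely be replaced by a summed-area table or a convolution.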
