Agreement Across 10 Artificial Intelligence Models in Assessing Human Epidermal Growth Factor Receptor 2 (HER2) Expression in Breast Cancer Whole-Slide Images

Brittany McKelvey¹, Pedro A Torres-Saavedra², Jessica Li², Glenn Broeckx³, Frederik Deman³, Siraj Ali⁴, Hillary S Andrews¹, Salim Arslan⁵, Meir Azulay⁶, Santhosh Balasubramanian⁷, J C Barrett⁸, Peter Caie⁹, Ming Chen¹⁰, Daniel Cohen¹¹, Tathagata Dasgupta¹², Diana Fahrer⁹, George Green¹³, Mark Gustavson¹⁴, Sarah Hersey¹⁵, Ana Hidalgo-Sastre¹⁶, Shahanawaz Jiwani¹⁷, Elaine Joseph¹⁸, Wonkyung Jung⁴, Kimary Kulig¹⁹, Vladimir Kushnarev²⁰, Jochen K Lennerz²⁰, Xiaoxian Li²¹, Meredith Lodge⁹, Joan Mancuso²², Mike Montalto²³, Satabhisa Mukhopadhyay¹², Foivos Ntelemis⁵, Matthew Oberley¹⁰, Pahini Pandya⁵, Oscar Puig⁶, Edward T Richardson²⁴, Alexander Sarachakov²⁰, Mark Stewart²⁵, Lisa M McShane², Roberto Salgado²⁶, Jeff Allen¹

Affiliations

¹ Friends of Cancer Research, Washington, District of Columbia.
² Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, Maryland.
³ PA(2), Department of Pathology, Ziekenhuis aan de Stroom (ZAS), Antwerp, Belgium; Centre for Oncological Research (CORE), MIPPRO, Faculty of Medicine, Antwerp University, Antwerp, Belgium.
⁴ Lunit, Seoul, South Korea.
⁵ Panakeia Technologies, Cambridge, United Kingdom.
⁶ Nucleai, Chicago, Illinois.
⁷ PathAI, Boston, Massachusetts.
⁸ University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.
⁹ Indica Labs, Albuquerque, New Mexico.
¹⁰ Caris Life Sciences, Irving, Texas.
¹¹ GSK, Collegeville, Pennsylvania.
¹² 4D Path, Newton, Massachusetts.
¹³ GA Green Consulting LLC, Newton, New Jersey.
¹⁴ AstraZeneca, Gaithersburg, Maryland.
¹⁵ Bristol Myers Squibb, Lawrenceville, New Jersey.
¹⁶ AstraZeneca Computational Pathology GmbH, AstraZeneca, Munich, Germany.
¹⁷ Molecular Characterization Laboratory, Frederick National Laboratory/National Cancer Institute, Frederick, Maryland.
¹⁸ AstraZeneca, Waltham, Massachusetts.
¹⁹ Kulig Consulting, New York, New York.
²⁰ BostonGene, Waltham, Massachusetts.
²¹ Emory University, Atlanta, Georgia.
²² Patient Advocate, Philadelphia, Pennsylvania.
²³ Amgen, Thousand Oaks, California.
²⁴ Merck & Co Inc, Boston, Massachusetts.
²⁵ Friends of Cancer Research, Washington, District of Columbia. Electronic address: mstewart@focr.org.
²⁶ PA(2), Department of Pathology, Ziekenhuis aan de Stroom (ZAS), Antwerp, Belgium; Division of Research, Peter Mac Callum Cancer Centre, Melbourne, Australia.

PMID: 41489584
DOI: 10.1016/j.modpat.2025.100944

Free article

Brittany McKelvey et al. Mod Pathol. 2026 Feb.

Mod Pathol. 2026 Feb;39(2):100944.

doi: 10.1016/j.modpat.2025.100944. Epub 2026 Jan 5.

Abstract

Historically, eligibility for receiving human epidermal growth factor receptor 2 (HER2)-targeted therapies was limited to HER2-positive tumors (immunohistochemistry 3+ or in situ hybridization amplified), but recent advances in antibody-drug conjugates have expanded these criteria to include HER2-low and HER2-ultralow expression. This evolving therapeutic landscape underscores the need for precise and reproducible HER2 assessment. Digital and computational pathology tools may help address these needs, but their measurement variability must be evaluated to inform research and clinical use. We evaluated HER2 scoring variability across 10 independently developed computational pathology artificial intelligence models applied to 1124 whole-slide images from 733 patients with breast cancer. Analyses included American Society of Clinical Oncology-College of American Pathologists categorical scores (0, 1+, 2+, and 3+), H-scores, tumor cell staining percentages, and counts of total and stained invasive carcinoma cells. Agreement among models and 3 pathologists was assessed using pairwise overall percent agreement (OPA), Cohen kappa, and hierarchical clustering. Median model pairwise OPA for categorical HER2 scores was 65.1% (kappa, 0.51). Agreement was highest for HER2 3+ vs not 3+ (OPA, 97.3%; kappa, 0.86) and lowest for HER2-low cases, reflecting existing measurement challenges. For HER2 0 (negative) vs not 0 (positive) scoring, the average negative agreement was 65.3%, compared with the average positive agreement of 91.3%, suggesting more agreement in non-HER2 0 scores. H-score and cell count analyses indicated that scoring differences were more related to staining interpretation than tumor cell detection. Pathologists showed numerically higher concordance than models, but interobserver variability persisted. In exploratory analyses, sample type, staining artifacts, and heterogeneous HER2 expression appeared to be associated with discrepancies. 
Artificial intelligence-based HER2 scoring demonstrated high agreement in identifying HER2 3+ cases. Variability was most pronounced in borderline HER2 categories, particularly in HER2 low, underscoring the need for continued tool refinement for handling low-intensity staining. Standardized training data sets, validation frameworks, and regulatory alignment are important to improve reproducibility. Developing reference standards and benchmarking data sets is critical to evaluate performance, support regulatory decision-making, and ensure real-world applicability.
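
The agreement statistics named in the abstract (pairwise overall percent agreement, Cohen kappa, and average positive and negative agreement) can be illustrated with a minimal sketch. It assumes unweighted kappa and the commonly used definitions APA = 2a / (2a + b + c) and ANA = 2d / (2d + b + c), where a and d count concordant non-0 and concordant 0 calls and b + c counts discordant calls; the abstract does not state the exact formulas the authors used. The rater names and scores below are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations
from statistics import median


def overall_percent_agreement(a, b):
    """Fraction of slides on which two raters assign the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) if both raters use a single identical category.
    """
    n = len(a)
    p_o = overall_percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def average_pos_neg_agreement(a, b, negative="0"):
    """Average positive and negative agreement for `negative` vs all other scores."""
    both_pos = sum(x != negative and y != negative for x, y in zip(a, b))
    both_neg = sum(x == negative and y == negative for x, y in zip(a, b))
    discordant = len(a) - both_pos - both_neg
    apa = 2 * both_pos / (2 * both_pos + discordant)
    ana = 2 * both_neg / (2 * both_neg + discordant)
    return apa, ana


# Hypothetical categorical scores for 8 slides from 3 raters (illustrative only).
scores = {
    "model_A": ["0", "1+", "1+", "2+", "3+", "0", "2+", "3+"],
    "model_B": ["0", "1+", "2+", "2+", "3+", "1+", "2+", "3+"],
    "model_C": ["1+", "1+", "1+", "2+", "3+", "0", "1+", "3+"],
}

# One OPA value per unordered pair of raters, summarized by the median.
pairwise_opa = [
    overall_percent_agreement(scores[m1], scores[m2])
    for m1, m2 in combinations(scores, 2)
]
median_opa = median(pairwise_opa)

kappa_ab = cohen_kappa(scores["model_A"], scores["model_B"])
apa_ab, ana_ab = average_pos_neg_agreement(scores["model_A"], scores["model_B"])
```

Because ANA and APA are computed from the same discordant count but different concordant counts, a data set with few concordant HER2 0 calls can show a much lower ANA than APA even when overall agreement looks acceptable, which is the pattern the abstract reports for 0 vs not 0 scoring.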

Keywords: HER2 scoring; artificial intelligence; breast cancer; computational pathology; whole-slide imaging.
