[Preprint]. 2023 Mar 10:2023.03.08.23286975.
doi: 10.1101/2023.03.08.23286975.

Direct prediction of Homologous Recombination Deficiency from routine histology in ten different tumor types with attention-based Multiple Instance Learning: a development and validation study

Chiara Maria Lavinia Loeffler et al. medRxiv.

Abstract

Background: Homologous Recombination Deficiency (HRD) is a pan-cancer predictive biomarker that identifies patients who benefit from therapy with PARP inhibitors (PARPi). However, testing for HRD is highly complex. Here, we investigated whether Deep Learning can predict HRD status solely based on routine Hematoxylin & Eosin (H&E) histology images in ten cancer types.

Methods: We developed a fully automated deep learning pipeline with attention-weighted multiple instance learning (attMIL) to predict HRD status from histology images. A combined genomic scar HRD score, which integrated loss of heterozygosity (LOH), telomeric allelic imbalance (TAI) and large-scale state transitions (LST), was calculated from whole genome sequencing data for n=4,565 patients from two independent cohorts. The primary statistical endpoint was the Area Under the Receiver Operating Characteristic curve (AUROC) for the prediction of genomic scar HRD with a clinically used cutoff value.
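The scoring and evaluation steps described in the Methods can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: that the three scar components are integrated by simple summation, and that a patient is labeled HRD-high when the combined score reaches a cutoff. The cutoff value used below is a placeholder, since the abstract only refers to "a clinically used cutoff value". The names `GenomicScars`, `hrd_score`, `is_hrd_high` and `auroc` are hypothetical and not taken from the study's code.

```python
from dataclasses import dataclass

# ASSUMPTION: placeholder cutoff for illustration only; the abstract does not
# state which clinically used cutoff was applied.
HRD_CUTOFF = 42


@dataclass
class GenomicScars:
    loh: int  # loss of heterozygosity count
    tai: int  # telomeric allelic imbalance count
    lst: int  # large-scale state transitions count


def hrd_score(scars: GenomicScars) -> int:
    """Combined genomic scar score (ASSUMPTION: unweighted sum of components)."""
    return scars.loh + scars.tai + scars.lst


def is_hrd_high(scars: GenomicScars, cutoff: int = HRD_CUTOFF) -> bool:
    """Binarize the combined score into HRD-high (True) or HRD-low (False)."""
    return hrd_score(scars) >= cutoff


def auroc(labels: list[bool], scores: list[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not study data).
patients = [
    GenomicScars(20, 15, 12),
    GenomicScars(5, 3, 4),
    GenomicScars(18, 14, 10),
    GenomicScars(10, 10, 10),
]
ground_truth = [is_hrd_high(p) for p in patients]
model_scores = [0.9, 0.2, 0.4, 0.6]  # hypothetical model outputs
print(auroc(ground_truth, model_scores))
```

The pairwise formulation is quadratic in the number of patients, which is fine for a sketch; a rank-based computation would be preferable for cohorts of several thousand patients.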

Results: We found that HRD status is predictable in tumors of the endometrium, pancreas and lung, reaching cross-validated AUROCs of 0.79, 0.58 and 0.66, respectively. Predictions generalized well to an external cohort, with AUROCs of 0.93, 0.81 and 0.73, respectively. Additionally, an HRD classifier trained on breast cancer yielded an AUROC of 0.78 in internal validation and was able to predict HRD in endometrial, prostate and pancreatic cancer with AUROCs of 0.87, 0.84 and 0.67, respectively, indicating that a shared HRD-like phenotype exists across tumor entities.

Conclusion: In this study, we show that HRD is directly predictable from H&E slides using attMIL within and across ten different tumor types.
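For readers unfamiliar with attention-based multiple instance learning, the following Python sketch shows the general idea of attention pooling: each tile of a slide receives a learned weight, and the slide-level representation is the weighted average of the tile features. This is a generic formulation and not the authors' architecture, which the abstract does not describe. The parameter names `V`, `w`, `beta` and `bias`, and the functions `attention_pool` and `predict_hrd_high`, are hypothetical.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tiles, V, w):
    """Pool tile feature vectors into one slide vector.

    tiles: list of tile feature vectors, each of length d
    V:     hidden x d projection matrix (list of rows)
    w:     attention vector of length hidden
    Returns (slide_vector, attention_weights).
    """
    logits = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        logits.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    attn = softmax(logits)  # one weight per tile, summing to 1
    d = len(tiles[0])
    slide = [sum(a * h[j] for a, h in zip(attn, tiles)) for j in range(d)]
    return slide, attn


def predict_hrd_high(slide, beta, bias):
    """Logistic head mapping the slide vector to an HRD-high probability."""
    z = sum(b * x for b, x in zip(beta, slide)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy example with three 2-dimensional tile features (invented values).
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
slide, attn = attention_pool(tiles, V, w)
prob = predict_hrd_high(slide, beta=[0.5, -0.5], bias=0.0)
print(attn, slide, prob)
```

In a trained model the parameters would be learned jointly from slide-level labels, and the per-tile attention weights are what an attention heatmap visualizes.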

Keywords: DNA repair mechanism; Deep Learning; Homologous Recombination Deficiency; artificial intelligence; molecular pathology; pan cancer study.

Conflict of interest statement

Competing Interests: JNK reports consulting services for Owkin, France, Panakeia, UK and DoMore Diagnostics, Norway and has received honoraria for lectures by MSD, Eisai and Fresenius. JSRF reports a leadership (board of directors) role at Grupo Oncoclinicas, stock or other ownership interests at Repare Therapeutics and Paige.AI, and a consulting or advisory role at Genentech/Roche, Invicro, Ventana Medical Systems, Volition RX, Paige.AI, Goldman Sachs, Bain Capital, Novartis, Repare Therapeutics, Lilly, Saga Diagnostics, Swarm and Personalis. No other potential conflicts of interest are reported by any of the authors.

Figures

Figure 1. Experimental Design and Study overview.
(A) Overview of the different Homologous Recombination Deficiency (HRD) scores, their content and assessment methods. (B) Workflow of our Deep Learning (DL) pipeline. A total of n=9517 Whole Slide Images (WSI) were processed and used for training with an attention-based Multiple Instance Learning (attMIL) approach. The statistical endpoint was the Area Under the Receiver Operating Characteristic curve (AUROC). (C) Study design for the three main experiments conducted (internal 5-fold cross-validation, tumor-wise external validation and cross-cancer external validation) and cohort overview for patients and tumor types included from The Cancer Genome Atlas (TCGA, n=4113 patients) and Clinical Proteomic Tumor Analysis Consortium (CPTAC, n=474 patients). Abbreviations: BRCA=breast cancer; CRC=colorectal cancer; GBM=glioblastoma; LIHC=liver cancer; LUAD=lung adenocarcinoma; LUSC/LSCC=lung squamous cell carcinoma; OV=ovarian cancer; PAAD/PDA=pancreatic adenocarcinoma; PRAD=prostate adenocarcinoma; UCEC=endometrial cancer; HRR=Homologous recombination repair. (This figure was partly generated using Servier Medical Art, provided by Servier, licensed under a Creative Commons Attribution 3.0 unported license.)
Figure 2. Comparison of the Area Under the Receiver Operating Characteristic curve (AUROC) for the internal and tumor-wise external validation experiment models.
Boxplots displaying the distribution of AUROCs for (A) the internal 5-fold cross-validation experiment on The Cancer Genome Atlas (TCGA) and the tumor-wise external validation on the Clinical Proteomic Tumor Analysis Consortium (CPTAC); (B) the cross-cancer external validation experiment, in which the model trained on the TCGA breast cancer cohort (TCGA-BRCA) was applied to the TCGA and CPTAC cohorts. The horizontal line indicates the median, whereas each box represents the interquartile range (IQR) between the first and third quartiles. The whiskers extend from the box to the minimum and maximum values within 1.5 times the IQR. Abbreviations: BRCA=breast cancer; CRC=colorectal cancer; GBM=glioblastoma; LIHC=liver cancer; LUAD=lung adenocarcinoma; LUSC/LSCC=lung squamous cell carcinoma; OV=ovarian cancer; PAAD/PDA=pancreatic adenocarcinoma; PRAD=prostate adenocarcinoma; UCEC=endometrial cancer.
Figure 3. Molecular Characterization of The Cancer Genome Atlas breast cancer (TCGA-BRCA) cohort.
(A) Distribution of breast cancer subtypes for the Homologous Recombination Deficiency high (HRD-H) and low (HRD-L) ground truth subgroups. (B) Distribution of the breast cancer subtypes for the HRD-H and HRD-L Deep Learning (DL) predicted subgroups. (C) Alteration frequency for several genes in the HRD-H and HRD-L ground truth subgroups. (D) Alteration frequency for several genes in the HRD-H and HRD-L subgroups predicted in the within-cohort internal validation. (E) Grouped boxplots comparing the HRD-H prediction scores with the mutational status (mutated=MUT; wildtype=WT) for the somatic and germline alterations of the BRCA1/2 genes. The central line represents the median value, while the box spans the interquartile range (IQR) between the first and third quartiles and the whiskers extend to the lowest and highest values within 1.5 times the IQR. The y-axis represents the DL HRD-H prediction values. An independent t-test was performed to calculate the p-values: ns: p <= 1.00e+00; *: 1.00e−02 < p <= 5.00e−02; **: 1.00e−03 < p <= 1.00e−02; ***: 1.00e−04 < p <= 1.00e−03.
Figure 4. Visualization of predicted Homologous Recombination Deficiency high (HRD-H) tumor samples.
(A) Whole slide image (WSI) of an HRD-H predicted patient (ID: C3L-00358-21) from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) endometrial cancer (UCEC) cohort with magnification. (B) Attention heatmap for the same patient with magnification. (C) Classification heatmap for the same patient with magnification. (D) Top predicted tiles for the top three HRD-H patients in The Cancer Genome Atlas (TCGA) breast cancer (BRCA) cohort. (E) Top predicted tiles for three HRD-H patients in the CPTAC-UCEC cohort.
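The boxplot convention described in the captions of Figures 2 and 3 (median line, box spanning the first to third quartile, whiskers reaching the most extreme values within 1.5 times the IQR) can be made concrete with a short Python sketch. The quartile interpolation method is an assumption, since plotting libraries differ in this detail, and the AUROC values below are invented for illustration. The function name `boxplot_stats` is hypothetical.

```python
import statistics


def boxplot_stats(values: list[float]) -> dict:
    """Summary statistics for a boxplot with 1.5 x IQR whiskers.

    ASSUMPTION: quartiles use linear interpolation over the observed data
    (the 'inclusive' method of statistics.quantiles).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # lowest value within the lower fence
        "whisker_high": max(inside),  # highest value within the upper fence
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Invented per-fold AUROC values, not study results.
fold_aurocs = [0.50, 0.60, 0.62, 0.65, 0.70, 0.72, 0.95]
stats = boxplot_stats(fold_aurocs)
print(stats)
```

Values beyond the fences are reported separately as outliers rather than stretching the whiskers, which is why a whisker can end well short of the true minimum or maximum.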
