Radiol Artif Intell. 2024 Jul;6(4):e230431. doi: 10.1148/ryai.230431.

Deep Learning for Breast Cancer Risk Prediction: Application to a Large Representative UK Screening Cohort

Sam Ellis et al. Radiol Artif Intell. 2024 Jul.

Abstract

Purpose: To develop an artificial intelligence (AI) deep learning tool capable of predicting future breast cancer risk from a current negative screening mammographic examination and to evaluate the model on data from the UK National Health Service Breast Screening Program.

Materials and Methods: The OPTIMAM Mammography Imaging Database contains screening data, including mammograms and information on interval cancers, for more than 300 000 female patients who attended screening at three different sites in the United Kingdom from 2012 onward. Cancer-free screening examinations from women aged 50-70 years were identified and classified as risk-positive or risk-negative based on the occurrence of cancer within 3 years of the original examination. Examinations with confirmed cancer and images containing implants were excluded. From the resulting 5264 risk-positive and 191 488 risk-negative examinations, training (n = 89 285), validation (n = 2106), and test (n = 39 351) datasets were produced for model development and evaluation. The AI model was trained to predict future cancer occurrence from screening mammograms and patient age. Performance was evaluated on the test dataset using the area under the receiver operating characteristic curve (AUC) and compared across subpopulations to assess potential biases. Interpretability of the model was explored, including with saliency maps.

Results: On the hold-out test set, the AI model achieved an overall AUC of 0.70 (95% CI: 0.69, 0.72). There was no evidence of a difference in performance across the three sites, between patient ethnicities, or across age groups. Visualization of saliency maps and sample images provided insights into the mammographic features associated with AI-predicted cancer risk.

Conclusion: The developed AI tool showed good performance on a multisite, United Kingdom-specific dataset.
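The 3-year follow-up rule that defines risk-positive and risk-negative examinations can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and date fields are assumptions:

```python
from datetime import date

def risk_label(exam_date, cancer_date, followup_years=3):
    """Label a cancer-free screening examination as risk-positive if a
    cancer occurred within `followup_years` of the examination, else
    risk-negative. `cancer_date` is None when no cancer was recorded."""
    if cancer_date is None:
        return "risk-negative"
    delta_days = (cancer_date - exam_date).days
    within = 0 < delta_days <= followup_years * 365
    return "risk-positive" if within else "risk-negative"

print(risk_label(date(2015, 1, 1), date(2016, 6, 1)))  # risk-positive
print(risk_label(date(2015, 1, 1), None))              # risk-negative
print(risk_label(date(2015, 1, 1), date(2019, 6, 1)))  # risk-negative
```

Examinations followed by a cancer diagnosis beyond the 3-year window are treated the same as those with no recorded cancer, which matches the binary classification described in the abstract.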
Supplemental material is available for this article. ©RSNA, 2024.

Keywords: Artificial Intelligence; Breast Cancer; Deep Learning; Risk Prediction; Screening.


Conflict of interest statement

Disclosures of conflicts of interest: S.E., S.G., M.T., M.D.H.B., K.C.Y., and L.M.W.: The Royal Surrey NHS Foundation Trust has a research contract with Google Health to expand the collection of de-identified data in the OPTIMAM image database; these data are then available for sharing to other parties upon application (duration July 2022-July 2024). N.S.C. and P.H.: No relevant relationships.

Figures

Figure 1:
Patient-level data selection flowchart. Reported numbers (n) refer to women. From the OPTIMAM database, suitable examinations were identified as those that did not show malignant findings and contained sufficient follow-up information to be classified as risk-positive or risk-negative. After images with implants were removed, the remaining women were then split in a stratified manner into training, validation, and test datasets for model training and evaluation. For computational reasons, numbers of women in the training and validation risk-negative groups were reduced via random sampling.
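The selection process in Figure 1 (a patient-level split, stratified by risk label, followed by random down-sampling of risk-negative women in the training and validation sets only) could be sketched like this. The split fractions, down-sampling rate, and seed are illustrative assumptions, not the study's values:

```python
import random

def split_patients(positives, negatives, frac_train=0.7, frac_val=0.02,
                   neg_keep=0.5, seed=0):
    """Split patient IDs into train/val/test, stratified by risk label,
    then randomly down-sample risk-negative patients in train and val."""
    rng = random.Random(seed)
    splits = {}
    for label, ids in (("pos", list(positives)), ("neg", list(negatives))):
        rng.shuffle(ids)
        n_train = int(frac_train * len(ids))
        n_val = int(frac_val * len(ids))
        splits[label] = {
            "train": ids[:n_train],
            "val": ids[n_train:n_train + n_val],
            "test": ids[n_train + n_val:],
        }
    # Down-sample negatives in train/val only; the test set keeps its
    # natural class balance so evaluation remains representative.
    for part in ("train", "val"):
        keep = int(neg_keep * len(splits["neg"][part]))
        splits["neg"][part] = splits["neg"][part][:keep]
    return splits

splits = split_patients(range(100), range(1000, 3000))
```

Splitting at the patient level (rather than the examination level) prevents images from the same woman appearing in both training and test sets, which would otherwise inflate measured performance.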
Figure 2:
Receiver operating characteristic (ROC) performance of artificial intelligence risk model on the hold-out test dataset. The predictive performance of patient age alone is also shown for comparison. AUC = area under the ROC curve, FPR = false-positive rate, TPR = true-positive rate.
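The comparison in Figure 2 (AI model score versus patient age alone) and the bootstrap-style 95% CI reported in the abstract can be reproduced in outline with scikit-learn. The labels and scores below are synthetic stand-ins for the real model outputs, and the bootstrap procedure is a common approach rather than necessarily the authors' exact method:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000
y = rng.binomial(1, 0.1, n)                    # 1 = cancer within 3 years
age = rng.uniform(50, 70, n) + 2.0 * y         # age: weakly predictive
model = 0.5 * (age - 50) / 20 + 1.5 * y + rng.normal(0, 1, n)  # stronger

def bootstrap_auc(y_true, score, n_boot=1000, seed=1):
    """Point AUC plus a percentile bootstrap 95% CI."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(y_true))
    aucs = []
    for _ in range(n_boot):
        b = rng.choice(idx, size=len(idx), replace=True)
        if y_true[b].min() == y_true[b].max():  # resample must have both classes
            continue
        aucs.append(roc_auc_score(y_true[b], score[b]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return roc_auc_score(y_true, score), lo, hi

auc_model, lo_m, hi_m = bootstrap_auc(y, model)
auc_age, lo_a, hi_a = bootstrap_auc(y, age)
```

With the synthetic signal above, the model score separates the classes better than age alone, mirroring the qualitative pattern in Figure 2.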
Figure 3:
Receiver operating characteristic (ROC) performance of artificial intelligence risk model with respect to (A) screening site, (B) patient age (in years) at screening, (C) patient ethnic background, and (D) mammography unit model. AUC = area under the ROC curve, FPR = false-positive rate, TPR = true-positive rate.
Figure 4:
Receiver operating characteristic (ROC) performance of the presented artificial intelligence risk model with respect to subsequent cancer type. In this case, the ROC curve was calculated using the same negative cases for all cancer types, whereas for Figure 3, the negative cases were also split by the demographic factor under consideration. AUC = area under the ROC curve, FPR = false-positive rate, TPR = true-positive rate.
Figure 5:
Example images from mammography examinations classified by the developed artificial intelligence tool as risk-positive and risk-negative, including correct and incorrect classifications.
Figure 6:
Saliency map demonstration for two example images. The normal baseline craniocaudal (A) and mediolateral oblique (D) mammograms are shown alongside the same images with saliency map overlays (B, E), and the future mammograms containing screen-detected cancers (C, F, with lesions shown in red boxes). White dashed lines highlight the zoomed regions.
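Saliency maps like those in Figure 6 attribute the predicted risk score to image regions. One simple, model-agnostic way to approximate such a map is occlusion: zero out each patch of the image and record how much the score changes. The sketch below uses NumPy on a toy image and a toy scoring function; the authors' actual attribution method may differ:

```python
import numpy as np

def occlusion_saliency(image, score_fn, patch=4):
    """For each patch, record |change in score| when that patch is zeroed
    out; larger values mark regions the score depends on."""
    base = score_fn(image)
    h, w = image.shape
    sal = np.zeros_like(image, dtype=float)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            sal[i:i + patch, j:j + patch] = abs(base - score_fn(occluded))
    return sal

# Toy "risk model": score is the mean intensity of the central region,
# so only the bright central square should light up in the saliency map.
img = np.zeros((16, 16))
img[6:10, 6:10] = 1.0
sal = occlusion_saliency(img, lambda x: x[4:12, 4:12].mean())
```

Occluding a corner patch leaves the score unchanged (zero saliency), while occluding patches overlapping the bright central square changes it, so the map localizes the region driving the score.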

