JAMA Netw Open. 2021 Aug 2;4(8):e2119100.

doi: 10.1001/jamanetworkopen.2021.19100.

A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images

Mateusz Buda¹, Ashirbani Saha¹, Ruth Walsh¹, Sujata Ghate¹, Nianyi Li¹, Albert Swiecicki¹, Joseph Y Lo¹ ², Maciej A Mazurowski¹ ² ³ ⁴

Affiliations

¹ Department of Radiology, Duke University Medical Center, Durham, North Carolina.
² Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina.
³ Department of Computer Science, Duke University, Durham, North Carolina.
⁴ Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina.

PMID: 34398205
PMCID: PMC8369362
DOI: 10.1001/jamanetworkopen.2021.19100

Mateusz Buda et al. JAMA Netw Open. 2021.

Abstract

Importance: Breast cancer screening is among the most common radiological tasks, with more than 39 million examinations performed each year. While it has been among the most studied medical imaging applications of artificial intelligence, the development and evaluation of algorithms are hindered by the lack of well-annotated, large-scale publicly available data sets.

Objectives: To curate, annotate, and make publicly available a large-scale data set of digital breast tomosynthesis (DBT) images to facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening; to develop a baseline deep learning model for breast cancer detection; and to test this model using the data set to serve as a baseline for future research.

Design, setting, and participants: In this diagnostic study, 16 802 DBT examinations with at least 1 reconstruction view available, performed between August 26, 2014, and January 29, 2018, were obtained from Duke Health System and analyzed. From the initial cohort, examinations were divided into 4 groups and split into training and test sets for the development and evaluation of a deep learning model. Images with foreign objects or spot compression views were excluded. Data analysis was conducted from January 2018 to October 2020.

Exposures: Screening DBT.

Main outcomes and measures: The detection algorithm was evaluated with a breast-based free-response receiver operating characteristic curve and with sensitivity at 2 false positives per volume.
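To make the outcome measure concrete, the following is a minimal sketch of how a free-response receiver operating characteristic curve and the sensitivity at a fixed false-positive rate might be computed. It is not the authors' evaluation code. The `Detection` tuple, the function names `froc_points` and `sensitivity_at`, and the toy data are all hypothetical, and the matching of predicted boxes to ground truth is assumed to have been done beforehand. For a breast-based variant, each target id would identify a breast with a lesion rather than an individual annotation.

```python
from typing import List, Optional, Tuple

# Hypothetical record: (confidence score, id of the target it hits, or None for a false positive).
Detection = Tuple[float, Optional[str]]


def froc_points(
    detections: List[Detection], n_targets: int, n_volumes: int
) -> List[Tuple[float, float]]:
    """Return (mean false positives per volume, sensitivity) at each score threshold."""
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    points = []
    found = set()          # distinct targets detected so far
    false_positives = 0    # detections that hit no target
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Consume every detection tied at this score before recording a point.
        while i < len(ordered) and ordered[i][0] == threshold:
            target = ordered[i][1]
            if target is None:
                false_positives += 1
            else:
                found.add(target)
            i += 1
        points.append((false_positives / n_volumes, len(found) / n_targets))
    return points


def sensitivity_at(points: List[Tuple[float, float]], max_fp_per_volume: float) -> float:
    """Highest sensitivity reachable without exceeding the false-positive budget."""
    eligible = [sens for fp, sens in points if fp <= max_fp_per_volume]
    return max(eligible, default=0.0)


# Toy data (invented for illustration): one volume, four targets, six detections.
toy = [(0.9, "a"), (0.8, None), (0.7, "b"), (0.6, None), (0.5, None), (0.4, "c")]
toy_points = froc_points(toy, n_targets=4, n_volumes=1)
toy_sensitivity = sensitivity_at(toy_points, max_fp_per_volume=2.0)
```

In this toy case the third false positive arrives before target "c" is found, so only targets "a" and "b" count toward the sensitivity at 2 false positives per volume.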

Results: The curated data set contained 22 032 reconstructed DBT volumes that belonged to 5610 studies from 5060 patients with a mean (SD) age of 55 (11) years and 5059 (100.0%) women. This included 4 groups of studies: (1) 5129 (91.4%) normal studies; (2) 280 (5.0%) actionable studies, for which additional imaging was needed but no biopsy was performed; (3) 112 (2.0%) benign biopsied studies; and (4) 89 (1.6%) studies with cancer. Our data set included masses and architectural distortions that were annotated by 2 experienced radiologists. Our deep learning model reached a breast-based sensitivity of 65% (39 of 60; 95% CI, 56%-74%) at 2 false positives per DBT volume on a test set of 460 examinations from 418 patients.
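The reported group percentages and the headline sensitivity can be checked with simple arithmetic. The short sketch below uses only the counts stated in the Results; the variable names are illustrative, and the confidence interval is not reproduced because the abstract does not say how it was computed.

```python
# Study counts per group, as reported in the Results.
group_counts = {
    "normal": 5129,
    "actionable": 280,
    "benign biopsied": 112,
    "cancer": 89,
}

total_studies = sum(group_counts.values())

# Share of each group as a percentage of all studies, rounded to 1 decimal place.
group_shares = {
    name: round(100 * count / total_studies, 1)
    for name, count in group_counts.items()
}

# Breast-based sensitivity on the test set: 39 detected of 60.
test_sensitivity = 39 / 60

for name, share in group_shares.items():
    print(f"{name}: {group_counts[name]} of {total_studies} ({share}%)")
print(f"sensitivity at 2 FPs per volume: {test_sensitivity:.0%}")
```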

Conclusions and relevance: The large, diverse, and curated data set presented in this study could facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening by providing data for training as well as a common set of cases for model validation. The performance of the model developed in this study showed that the task remains challenging; its performance could serve as a baseline for future model development.

Conflict of interest statement

Conflict of Interest Disclosures: Dr Walsh reported receiving personal fees from Therapixel outside the submitted work. Dr Ghate reported receiving personal fees from Therapixel during the conduct of the study and personal fees from Siemens outside the submitted work. Dr Mazurowski reported serving as an advisor to Gradient Health outside the submitted work. No other disclosures were reported.

Figures

**Figure 1. Patient Flowchart**
AD indicates architectural distortion; BI-RADS, Breast Imaging-Reporting and Data System; DBT, digital breast tomosynthesis; LCC, left craniocaudal; LMLO, left mediolateral oblique; RCC, right craniocaudal; RMLO, right mediolateral oblique.

**Figure 2. Free-Response Receiver Operating Characteristic Curve Showing Performance on the Test Set of a Model Trained Using Focal Loss**
DBT indicates digital breast tomosynthesis; FP, false positive.

**Figure 3. Breast-Based Free-Response Receiver Operating Characteristic Curve for the Test Set**
DBT indicates digital breast tomosynthesis; FP, false positive.

Comment in

doi: 10.1001/jamanetworkopen.2021.19345

Grants and funding

R01 EB021360/EB/NIBIB NIH HHS/United States
