Robust Estimation of Breast Cancer Incidence Risk in Presence of Incomplete or Inaccurate Information

Asian Pac J Cancer Prev. 2020 Aug 1;21(8):2307-2313. doi: 10.31557/APJCP.2020.21.8.2307.

Siva Teja Kakileti¹ ², Geetha Manjunath¹, Andre Dekker², Leonard Wee²

Affiliations

¹ Niramai Health Analytix Pvt Ltd., Koramangala, Bangalore, Karnataka, India.
² Department of Radiation Oncology (MAASTRO Clinic), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, The Netherlands.

PMID: 32856859
PMCID: PMC7771951
DOI: 10.31557/APJCP.2020.21.8.2307

Siva Teja Kakileti et al. Asian Pac J Cancer Prev. 2020.

Abstract

Purpose: To evaluate the robustness of multiple machine learning classifiers for breast cancer risk estimation in the presence of incomplete or inaccurate information.

Data and methods: Open data for this study were obtained from the BCSC Data Resource (http://breastscreening.cancer.gov/). We conducted two ablation-type experiments to compare the robustness of different classifiers: in one, we randomly switched known information to missing with probability p_m; in the other, we randomly corrupted existing information with probability p_c. We considered three prominent machine-learning classifiers, namely logistic regression (LR), random forests (RF), and a custom neural network (NN) architecture, and compared the degradation of their discrimination performance as the probability of missing or inaccurate data increased.
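The two ablation experiments can be illustrated with a minimal sketch. The abstract does not say how missing values are encoded or how a corrupted value is chosen, so the `MISSING` sentinel, the per-feature `levels` table, and the rule "replace with a different valid level drawn uniformly" are assumptions made for illustration only, not the authors' implementation.

```python
import random

MISSING = None  # hypothetical sentinel for an unknown value (assumption)


def mask_missing(records, p_m, rng):
    """Return a copy of records in which each known value is
    independently switched to MISSING with probability p_m."""
    return [
        [MISSING if (v is not MISSING and rng.random() < p_m) else v for v in row]
        for row in records
    ]


def corrupt(records, p_c, levels, rng):
    """Return a copy of records in which each known value is, with
    probability p_c, replaced by a different valid level of the same
    feature. levels[j] lists the valid levels of feature j (assumption)."""
    out = []
    for row in records:
        new_row = []
        for j, v in enumerate(row):
            if v is not MISSING and rng.random() < p_c:
                alternatives = [x for x in levels[j] if x != v]
                if alternatives:  # leave single-level features unchanged
                    v = rng.choice(alternatives)
            new_row.append(v)
        out.append(new_row)
    return out


# Toy categorical records with three features; the last value is already unknown.
rng = random.Random(0)
data = [[0, 1, 2], [1, 0, MISSING]]
levels = [[0, 1], [0, 1], [0, 1, 2]]
masked = mask_missing(data, 0.3, rng)
corrupted = corrupt(data, 0.3, levels, rng)
```

Sweeping `p_m` or `p_c` from 0 to 1 and re-scoring a fixed, already trained classifier on the perturbed copies would give one degradation curve per classifier.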

Results: LR, RF, and the custom NN achieved an area under the curve (AUC) of 0.645, 0.643, and 0.649, respectively, on a test set of 500,000 observations. When we varied the probabilities p_m and p_c from 0 to 1, the NN achieved a higher AUC than RF and LR as long as less than half the data were missing or inaccurate (that is, for p_m < 0.5 and p_c < 0.5). For missing (p_m) or corruption (p_c) probabilities above 0.5, however, LR performed similarly to the custom NN. RF performed more poorly overall when the data contained additional missing or incorrect entries.
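To make the reported metric concrete, the sketch below computes AUC as a rank statistic, assuming that AUC here means the area under the ROC curve. Under that reading, an AUC of 0.645 means that a randomly chosen case receives a higher risk score than a randomly chosen non-case about 64.5% of the time. The function name `auc` and the toy labels and scores are illustrative, not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve computed as the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 (positive, negative) pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

The pairwise loop is quadratic in the number of examples, so it is suited only to small illustrations; a sort-based computation would be needed for a test set of the size reported here.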

Conclusion: Our experiments show that, when input information is missing or inaccurate, the proposed custom NN provides reliable risk estimates on medical datasets such as BCSC. These results are particularly important in health care applications, where not every attribute of an individual participant may be available.

Keywords: Artificial Neural Networks; Breast cancer risk; Machine Learning; inaccurate data; missing values.

Figures

**Figure 1**
Variation of mean AUC with missing probability, p_m, for all three classifiers on (a) the validation set and (b) the test set

**Figure 2**
Variation of mean AUC with corruption probability, p_c, for all three classifiers on (a) the validation set and (b) the test set

**Figure 3**
Illustration of Two Hypothetical Examples (Female A and Female B, as discussed in the text). For each example, the predicted risk of breast cancer within 1 year of screening is superimposed over the observed proportion of breast cancer diagnoses in the BCSC population as a function of age bracket

Cited by

Comparison of Classification Success Rates of Different Machine Learning Algorithms in the Diagnosis of Breast Cancer.
Ozcan I, Aydin H, Cetinkaya A. Asian Pac J Cancer Prev. 2022 Oct 1;23(10):3287-3297. doi: 10.31557/APJCP.2022.23.10.3287. PMID: 36308351.
Study the Effect of the Risk Factors in the Estimation of the Breast Cancer Risk Score Using Machine Learning.
Khozama S, Mayya AM. Asian Pac J Cancer Prev. 2021 Nov 1;22(11):3543-3551. doi: 10.31557/APJCP.2021.22.11.3543. PMID: 34837911.

Grants and funding

HHSN261201100031C/CA/NCI NIH HHS/United States
