Stress Testing Pathology Models with Generated Artifacts
- PMID: 35070483
- PMCID: PMC8721870
- DOI: 10.4103/jpi.jpi_6_21
Abstract
Background: Machine learning models provide significant opportunities for improvement in health care, but their "black-box" nature poses many risks.
Methods: We built a custom Python module as part of a framework for generating artifacts that are tunable and describable, so that it can accommodate future testing needs. Using a variety of generated artifacts, we analyzed a previously published digital pathology classification model and an internally developed kidney tissue segmentation model and tested the effects of the artifacts on each. The simulated artifacts were bubbles, tissue folds, uneven illumination, marker lines, uneven sectioning, altered staining, and tissue tears.
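To make the idea of a tunable artifact concrete, here is a minimal sketch of one of the listed artifact types, uneven illumination, written with NumPy. It is not the authors' module: the function name `add_uneven_illumination` and its `strength` and `direction` parameters are assumptions chosen for illustration, and a linear brightness ramp is only one simple way such an artifact might be approximated.

```python
import numpy as np


def add_uneven_illumination(tile, strength=0.5, direction=0.0):
    """Darken an RGB tile along a linear ramp (hypothetical helper).

    tile      : H x W x 3 uint8 array
    strength  : 0.0 (no change) to 1.0 (far edge fully dark)
    direction : angle of the ramp in radians (0.0 = left to right)
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")

    h, w = tile.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]

    # Project each pixel position onto the ramp direction.
    proj = xs * np.cos(direction) + ys * np.sin(direction)

    # Normalise the projection to [0, 1]; a 1x1 tile has no ramp.
    span = proj.max() - proj.min()
    if span > 0:
        ramp = (proj - proj.min()) / span
    else:
        ramp = np.zeros((h, w), dtype=float)

    # Gain is 1.0 on the bright side and (1 - strength) on the dark side.
    gain = 1.0 - strength * ramp
    out = tile.astype(np.float32) * gain[..., None]
    return np.clip(out, 0, 255).astype(np.uint8)


# Usage: a uniform synthetic tile, darkened from left to right.
tile = np.full((8, 8, 3), 200, dtype=np.uint8)
shaded = add_uneven_illumination(tile, strength=0.5, direction=0.0)
```

Because the artifact is controlled by a small set of named parameters, the same values that generate it can also be recorded to describe it, which is one plausible reading of "tunable and describable".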
Results: We found some performance degradation on tiles with artifacts, particularly with altered stains but also with marker lines, tissue folds, and uneven sectioning. We also found that the response of deep learning models to artifacts could be nonlinear.
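One way to probe for a nonlinear response is to sweep an artifact's severity and record a performance metric at each level. The sketch below is purely illustrative and uses toy stand-ins: `darken`, `toy_model`, and `severity_sweep` are hypothetical names, the "model" is a simple intensity threshold rather than a deep network, and the tiles are synthetic. It does not reproduce the models or results of the study.

```python
import numpy as np


def darken(tile, severity):
    """Stand-in artifact: scale pixel intensities by (1 - severity)."""
    scaled = tile.astype(np.float32) * (1.0 - severity)
    return np.clip(scaled, 0, 255).astype(np.uint8)


def toy_model(tile):
    """Hypothetical classifier: predicts 1 when the tile mean exceeds 100."""
    return int(tile.mean() > 100)


def severity_sweep(model, artifact, tiles, labels, severities):
    """Return {severity: accuracy} for a model on artifact-altered tiles."""
    labels = np.array(labels)
    results = {}
    for s in severities:
        preds = np.array([model(artifact(t, s)) for t in tiles])
        results[s] = float(np.mean(preds == labels))
    return results


# Synthetic uniform tiles: two "dark" (label 0) and two "bright" (label 1).
tiles = [np.full((4, 4, 3), v, dtype=np.uint8) for v in (40, 60, 160, 220)]
labels = [0, 0, 1, 1]

curve = severity_sweep(toy_model, darken, tiles, labels, [0.0, 0.25, 0.5, 0.75])
for severity, accuracy in curve.items():
    print(f"severity={severity:.2f}  accuracy={accuracy:.2f}")
```

With these toy values, accuracy is unchanged at low severity and only drops once the darkening pushes bright tiles across the threshold, which is the kind of non-proportional behavior a severity sweep is meant to expose.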
Conclusions: Generated artifacts can provide a useful tool for testing and building trust in machine learning models by understanding where these models might fail.
Keywords: Artifact; digital pathology; failure mode; machine learning; neural network; robustness.
Copyright: © 2021 Journal of Pathology Informatics.
Conflict of interest statement
There are no conflicts of interest.