Subspecialty-Level Deep Gray Matter Differential Diagnoses with Deep Learning and Bayesian Networks on Clinical Brain MRI: A Pilot Study
- PMID: 33937838
- PMCID: PMC8082339
- DOI: 10.1148/ryai.2020190146
Abstract
Purpose: To develop and validate a system that could perform automated diagnosis of common and rare neurologic diseases involving deep gray matter on clinical brain MRI studies.
Materials and methods: In this retrospective study, multimodal brain MRI scans from 212 patients (mean age, 55 years ± 17 [standard deviation]; 113 women) with 35 neurologic diseases and normal brain MRI scans obtained between January 2008 and January 2018 were included (110 patients in the training set, 102 patients in the test set). MRI scans from 178 patients (mean age, 48 years ± 17; 106 women) were used to supplement training of the neural networks. Three-dimensional convolutional neural networks and atlas-based image processing were used for extraction of 11 imaging features. Expert-derived Bayesian networks incorporating domain knowledge were used for differential diagnosis generation. The performance of the artificial intelligence (AI) system was assessed by comparing diagnostic accuracy with that of radiologists of varying levels of specialization by using the generalized estimating equation with robust variance estimator for the top three differential diagnoses (T3DDx) and the correct top diagnosis (TDx), as well as with receiver operating characteristic analyses.
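The step from extracted imaging features to a ranked differential can be illustrated with a minimal sketch. It assumes a naive Bayes simplification, in which features are treated as conditionally independent given the diagnosis; the study's expert-derived networks may be structured differently. The function name `rank_differential`, the labels `dx_a`, `dx_b`, `dx_c`, `feature_1`, `feature_2`, and all probabilities are hypothetical placeholders, not values from the study.

```python
from math import prod


def rank_differential(priors, likelihoods, findings):
    """Rank diagnoses by posterior probability, highest first.

    priors:      {diagnosis: P(diagnosis)}
    likelihoods: {diagnosis: {feature: P(feature present | diagnosis)}}
    findings:    {feature: True if present on the scan, else False}

    Assumption: features are conditionally independent given the diagnosis
    (naive Bayes). This is an illustrative simplification.
    """
    unnormalized = {}
    for dx, prior in priors.items():
        # Use P(feature | dx) for present features, 1 - P(feature | dx) otherwise.
        terms = [
            likelihoods[dx][feature] if present else 1.0 - likelihoods[dx][feature]
            for feature, present in findings.items()
        ]
        unnormalized[dx] = prior * prod(terms)

    total = sum(unnormalized.values())
    posteriors = {dx: value / total for dx, value in unnormalized.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs, for illustration only.
example_priors = {"dx_a": 0.6, "dx_b": 0.3, "dx_c": 0.1}
example_likelihoods = {
    "dx_a": {"feature_1": 0.2, "feature_2": 0.5},
    "dx_b": {"feature_1": 0.9, "feature_2": 0.5},
    "dx_c": {"feature_1": 0.8, "feature_2": 0.9},
}
example_findings = {"feature_1": True, "feature_2": True}

ranked = rank_differential(example_priors, example_likelihoods, example_findings)
for diagnosis, probability in ranked:
    print(f"{diagnosis}: {probability:.3f}")
```

In this toy example the most prevalent diagnosis does not rank first, because the observed features are far more likely under the alternatives. That is the kind of trade-off between prior prevalence and imaging evidence that a probabilistic differential is meant to capture.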
Results: In the held-out test set, the imaging pipeline detected 11 key features on brain MRI scans with 89% accuracy (sensitivity, 81%; specificity, 95%) relative to academic neuroradiologists. The Bayesian network, integrating imaging features with clinical information, had an accuracy of 85% for T3DDx and 64% for TDx, which was better than that of radiology residents (n = 4; 56% for T3DDx, 36% for TDx; P < .001 for both) and general radiologists (n = 2; 53% for T3DDx, 31% for TDx; P < .001 for both). The accuracy of the Bayesian network was better than that of neuroradiology fellows (n = 2) for T3DDx (72%; P = .003) but not for TDx (59%; P = .19) and was not different from that of academic neuroradiologists (n = 2; 84% T3DDx, 65% TDx; P > .09 for both).
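The two accuracy measures reported above can be expressed as a single top-k calculation: TDx corresponds to k = 1 and T3DDx to k = 3. The sketch below assumes each reader (or the AI system) supplies an ordered differential per patient; the function name `top_k_accuracy` and the case labels are hypothetical, and the study's statistical comparison (generalized estimating equations) is not reproduced here.

```python
def top_k_accuracy(ranked_differentials, true_diagnoses, k):
    """Fraction of cases whose true diagnosis appears in the first k entries.

    ranked_differentials: list of lists, each ordered most to least likely
    true_diagnoses:       list of reference diagnoses, one per case
    k:                    1 for top diagnosis (TDx), 3 for top three (T3DDx)
    """
    if len(ranked_differentials) != len(true_diagnoses):
        raise ValueError("Each case needs exactly one reference diagnosis.")
    if not true_diagnoses:
        raise ValueError("At least one case is required.")

    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_differentials, true_diagnoses)
    )
    return hits / len(true_diagnoses)


# Hypothetical differentials for four cases, for illustration only.
example_differentials = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "d", "a"],
    ["d", "c", "b"],
]
example_truths = ["a", "a", "b", "e"]

tdx = top_k_accuracy(example_differentials, example_truths, k=1)
t3ddx = top_k_accuracy(example_differentials, example_truths, k=3)
print(f"TDx accuracy: {tdx:.2f}, T3DDx accuracy: {t3ddx:.2f}")
```

By construction, T3DDx accuracy can never be lower than TDx accuracy for the same set of differentials, which matches the pattern in the reported results.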
Conclusion: A hybrid AI system was developed that simultaneously provides a quantitative assessment of disease burden, explainable intermediate imaging features, and a probabilistic differential diagnosis that performed at the level of academic neuroradiologists. This type of approach has the potential to improve clinical decision making for common and rare diseases.
Supplemental material is available for this article.
© RSNA, 2020.
© 2020 by the Radiological Society of North America, Inc.
Conflict of interest statement
Disclosures of Conflicts of Interest: J.D.R. disclosed no relevant relationships. A.M.R. disclosed no relevant relationships. L.X. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a paid consultant for Galileo. Other relationships: disclosed no relevant relationships. J.W. disclosed no relevant relationships. M.T.D. disclosed no relevant relationships. E.J.B. disclosed no relevant relationships. A.K. disclosed no relevant relationships. J.M.E. disclosed no relevant relationships. T.C. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: receives royalties from the Osler Institute; received travel expenses and honorarium for participation in a day-long program from RadPartners AI Summit. Other relationships: disclosed no relevant relationships. R.N.B. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is on the board at Galileo CDS; has stock/stock options in Galileo CDS. Other relationships: has patents issued to the University of Pennsylvania; has a patent licensed from the University of Pennsylvania to Galileo CDS. I.M.N. disclosed no relevant relationships. S.M. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a paid consultant for Northwest Biotherapeutics; institution has grants/grants pending from NovoCure, Galileo, Guerbet, and ACC. Other relationships: disclosed no relevant relationships. J.C.G. disclosed no relevant relationships.
Figures
![Comparison of radiologist performance to that of an artificial intelligence (AI) system. A, B, Jitter plots for the accuracy of the AI system for including, A, the correct top three differential diagnoses (T3DDx) and, B, correct top diagnosis (TDx) relative to the different groups of radiologists (radiology residents, general radiologists [General Rad], neuroradiology fellows [Neurorad fellows], and academic neuroradiologists [Academic Neurorads]). C, Nonparametric receiver operating characteristic (ROC) curves for the AI system (blue) compared with groups of radiologists based on their TDx, top two differential diagnoses, and T3DDx for each patient. D, E, Jitter plots for the accuracy of the AI system and radiologists for the, D, T3DDx and, E, exact correct TDx as a function of disease prevalence: common (black circle), moderately rare (gray square) and rare (white triangle). Solid lines denote the mean, and error bars represent standard error of measurement.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef05/8082339/06615380372b/ryai.2020190146.fig5.gif)
