Proc Natl Acad Sci U S A. 2019 Nov 5;116(45):22737-22745.

doi: 10.1073/pnas.1908021116. Epub 2019 Oct 21.

Expert-level detection of acute intracranial hemorrhage on head computed tomography using deep learning

Weicheng Kuo¹, Christian Häne¹, Pratik Mukherjee², Jitendra Malik³, Esther L Yuh⁴

Affiliations

¹ Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720.
² Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA 94107.
³ Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720; malik@eecs.berkeley.edu.
⁴ Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA 94107; esther.yuh@ucsf.edu.

PMID: 31636195
PMCID: PMC6842581
DOI: 10.1073/pnas.1908021116

Weicheng Kuo et al. Proc Natl Acad Sci U S A. 2019.

Abstract

Computed tomography (CT) of the head is used worldwide to diagnose neurologic emergencies. However, expertise is required to interpret these scans, and even highly trained experts may miss subtle life-threatening findings. For head CT, a unique challenge is to identify, with perfect or near-perfect sensitivity and very high specificity, often small subtle abnormalities on a multislice cross-sectional (three-dimensional [3D]) imaging modality that is characterized by poor soft tissue contrast, low signal-to-noise using current low radiation-dose protocols, and a high incidence of artifacts. We trained a fully convolutional neural network with 4,396 head CT scans performed at the University of California at San Francisco and affiliated hospitals and compared the algorithm's performance to that of 4 American Board of Radiology (ABR) certified radiologists on an independent test set of 200 randomly selected head CT scans. Our algorithm demonstrated the highest accuracy to date for this clinical application, with a receiver operating characteristic (ROC) area under the curve (AUC) of 0.991 ± 0.006 for identification of examinations positive for acute intracranial hemorrhage, and also exceeded the performance of 2 of 4 radiologists. We demonstrate an end-to-end network that performs joint classification and segmentation with examination-level classification comparable to experts, in addition to robust localization of abnormalities, including some that are missed by radiologists, both of which are critically important elements for this application.

Keywords: deep learning; head computed tomography; intracranial hemorrhage; radiology.

Conflict of interest statement

Competing interest statement: E.L.Y. and P.M. are named inventors on US Patent and Trademark Office No. 62/269,778, “Interpretation and Quantification of Emergency Features on Head Computed Tomography” filed by the Regents of the University of California. W.K., C.H., P.M., J.M., and E.L.Y. are named inventors on a provisional patent application titled “Expert-Level Detection of Acute Intracranial Hemorrhage on Head CT scans” filed by the University of California Regents.

Figures

**Fig. 1.**
Receiver operating characteristic (ROC) curve for the deep learning model predicting the presence of acute intracranial hemorrhage on 200 head CT examinations. The algorithm achieved an area under the curve (AUC) of 0.991 ± 0.006 referenced to the gold standard (consensus interpretation of 2 American Board of Radiology [ABR]-certified neuroradiologists with a Certificate of Added Qualification [CAQ] in neuroradiology). Algorithm performance exceeded that of 2 of 4 ABR-certified radiologists with attending-level experience ranging from 4 to 16 y. In addition, PatchFCN achieved 100% sensitivity at specificity levels approaching 90%, making it a suitable screening tool for radiologists based on an acceptably low proportion of false positives. The 2 numbers in each colored box are the x coordinate (1 − specificity) and y coordinate (sensitivity) for that radiologist’s performance. The salmon-colored circle marks the highest-specificity operating point with perfect sensitivity (sensitivity 1.00, specificity 0.87).
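
The operating point described in this caption can be found by sweeping a decision threshold over examination-level scores and keeping the highest-specificity point that still has sensitivity 1.00. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names and the toy scores are hypothetical, and an examination is assumed to be called positive when its score is at or above the threshold.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    Assumption: an examination is called positive when score >= threshold.
    labels use 1 for hemorrhage-positive and 0 for negative examinations.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative examination")
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / pos, tn / neg))
    return points


def best_point_at_full_sensitivity(scores, labels):
    """Highest-specificity operating point among those with sensitivity 1.0."""
    # The lowest threshold always reaches sensitivity 1.0, so this is never empty.
    candidates = [p for p in roc_points(scores, labels) if p[1] == 1.0]
    return max(candidates, key=lambda p: p[2])


# Toy data (hypothetical): 4 positive and 6 negative examinations.
scores = [0.95, 0.90, 0.80, 0.40, 0.55, 0.30, 0.20, 0.10, 0.05, 0.02]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
threshold, sensitivity, specificity = best_point_at_full_sensitivity(scores, labels)
```

In this toy example one negative examination scores above the weakest positive, so perfect sensitivity costs one false positive out of six negatives. The same sweep also yields the discrete operating points at lower sensitivities.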

**Fig. 2.**
Patch-based fully convolutional neural network (PatchFCN) segmentation of acute intracranial hemorrhage. (A–C) Subarachnoid hemorrhage (SAH) due to aneurysm rupture. (D–F) Acute intracerebral hemorrhage. (G–I) Traumatic SAH (missed by 1 of 4 radiologists). (J–L) Isodense subdural hematoma (SDH): acute SDH in the setting of coagulopathy versus subacute SDH at 2 to several days after injury. The arrows in J indicate the border between the SDH and adjacent brain. Because isodense subdural hematomas are not brighter than the adjacent brain parenchyma, radiologists identify these by recognizing the absence of sulci and gyri within the isodense collection. In J–L, the SDH is detected despite its isodensity to gray matter, showing that the deep learning algorithm does not rely solely on hyperdensity but also uses other features to identify hemorrhage. (A, D, G, and J) Original images. (B, E, H, and K) Original images with red shading of pixel-level probabilities >0.5 (on a scale of 0 to 1) for hemorrhage, as determined by the PatchFCN; pixels with probability <0.5 were unaltered from the original images. (C, F, I, and L) Neuroradiologist’s segmentation of hemorrhage using green outline.
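
The red shading described for panels B, E, H, and K amounts to thresholding a per-pixel probability map and tinting only the pixels above the threshold. The sketch below illustrates that idea with plain Python lists; the function name, the blending weight `alpha`, and the treatment of a probability of exactly 0.5 (left unaltered here) are assumptions, since the caption does not specify them.

```python
def shade_hemorrhage(gray, prob, threshold=0.5, alpha=0.5):
    """Tint pixels red where the hemorrhage probability exceeds `threshold`.

    gray: 2D list of intensities in 0..255.
    prob: 2D list of probabilities in 0..1, same shape as gray.
    Returns a 2D list of (r, g, b) tuples. Pixels at or below the
    threshold keep their original gray value in all three channels.
    """
    shaded = []
    for gray_row, prob_row in zip(gray, prob):
        row = []
        for g, p in zip(gray_row, prob_row):
            if p > threshold:
                red = round((1 - alpha) * g + alpha * 255)  # pull red toward 255
                other = round((1 - alpha) * g)              # dim green and blue
                row.append((red, other, other))
            else:
                row.append((g, g, g))
        shaded.append(row)
    return shaded


# Toy 2x2 image and probability map (hypothetical values).
gray = [[100, 100], [200, 50]]
prob = [[0.9, 0.2], [0.5, 0.7]]
overlay = shade_hemorrhage(gray, prob)
```

Blending rather than overwriting keeps the underlying anatomy visible beneath the shading, which matters when a reader wants to check the delineation against the original image.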

**Fig. 3.**
Five cases judged negative by at least 2 of 4 radiologists, but positive for acute hemorrhage by both the algorithm and the gold standard. (A–C) Small left temporal subarachnoid hemorrhage (SAH), (D–F) small right posterior frontal and parafalcine subdural hematomas (SDH), (G–I) small right frontal SDH, and (J–L) small right temporal epidural hematoma and left posterior temporal contusion were each called negative by 2 of 4 radiologists. (M–O) Case called negative by all 4 radiologists that contained a right parietal SDH identified by both the algorithm and the gold standard. (A, D, G, J, and M) Original images. (B, E, H, K, and N) Algorithmic delineation of hemorrhage with pixel-level probabilities >0.5 colored in red. (C, F, I, L, and O) Neuroradiologist segmentation of hemorrhage using a green outline. The boxed areas are magnified views of small areas of hemorrhage. Arrows indicate the borders of small or subtle hemorrhages.

**Fig. 4.**
Examples of algorithm “near misses.” As one moves along the algorithm’s ROC curve to sensitivities <1.00, the next discrete operating point occurs at (sensitivity 0.96, specificity 0.98). (A–C) Initial case “missed” by the algorithm as one moves along the algorithm’s ROC curve from sensitivity 1.00 to the next discrete operating point at sensitivity 0.96. Small areas of faint hyperdensity are present on a background of abnormally hypodense white matter. Original images (A), computer algorithm delineation of pixel-level probabilities >0.4 shown in red (B), and neuroradiologist segmentation of hemorrhage using a green outline (C). We hypothesize that the algorithm’s certainty for this case was borderline because this case resembles negative cases in the training data that demonstrated faint mineralization, resembling hemorrhage, within areas of remote brain infarction. (D–K) Four false-positive cases at (sensitivity 0.96, specificity 0.98). In F and G, the algorithm demarcates an area of true intracranial hemorrhage. However, it was designated as a chronic, rather than acute, subdural hematoma in the gold-standard consensus review. D, E, J, and K show tiny false-positive areas of hemorrhage delineated by the algorithm on only 1 or 2 images of these examinations. H and I show peripheral areas of false-positive hemorrhage due to streak artifact and nonlinear partial volume artifact (9) that are common at the level of the skull base.

**Fig. 5.**
Examples of multiclass segmentation by the algorithm and by an expert. (A–C) Small left holohemispheric subdural hematoma (SDH, green) and adjacent contusion (purple). (D–F) Small right frontal and posterior parafalcine SDH and anterior interhemispheric fissure SAH (red). (G–I) Small bilateral tentorial and left frontotemporal SDH (green) and subjacent contusions (purple) and SAH (red), in addition to shear injury in the left cerebral peduncle (purple). (J–L) Small parafalcine SDH (green) with surrounding SAH (red). (M–O) Several small right frontal areas of SDH (green) with subjacent contusion (purple) and SAH (red). (P–R) Small left tentorial and left anterior temporal SDH (green) and right cerebellopontine angle SAH (red). (A, D, G, J, M, and P) Original images. (B, E, H, K, N, and Q) Algorithmic delineation of hemorrhage with pixel-level probabilities >0.5 colored in red (SAH), green (SDH), and purple (contusion/shear injury). (C, F, I, L, O, and R) Neuroradiologist segmentation of hemorrhage.

**Fig. 6.**
System diagram. Given a head CT stack, we evaluated each frame using a sliding window at inference time. The results were aggregated by averaging at the pixel level (*Top Right* image) where the green shows the prediction and red shows the ground truth annotation. Each frame stacked with its top and bottom neighbors was evaluated by the DRN-38 backbone. Along the top pathway, we applied deconvolution to the top-level features to decode the pixelwise prediction. Along the bottom pathway, we applied 2 convolutions, followed by global average pooling to obtain patchwise classification (*Bottom Right* image). The stack-level score is given by the maximum patch-level score within the stack.
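
The inference-time aggregation in this caption (a sliding window over each frame, pixel-level averaging of overlapping predictions, and a stack-level score equal to the maximum patch-level score) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for the DRN-38 network, the patch size and stride are arbitrary, the extra border window in `window_starts` is an assumption, and the stacking of each frame with its top and bottom neighbors is omitted for brevity.

```python
def window_starts(size, patch, stride):
    """Start offsets of a sliding window that fully covers [0, size)."""
    if size < patch:
        raise ValueError("frame is smaller than the patch")
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # assumption: add a window flush with the border
    return starts


def infer_frame(frame, patch_model, patch=2, stride=1):
    """Slide a window over one frame and average overlapping pixel predictions.

    patch_model(pixels) must return (pixel_probs, patch_score), where
    pixel_probs has the same shape as pixels.
    Returns (averaged pixel map, list of patch-level scores).
    """
    h, w = len(frame), len(frame[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    patch_scores = []
    for top in window_starts(h, patch, stride):
        for left in window_starts(w, patch, stride):
            pixels = [row[left:left + patch] for row in frame[top:top + patch]]
            probs, score = patch_model(pixels)
            patch_scores.append(score)
            for i in range(patch):
                for j in range(patch):
                    total[top + i][left + j] += probs[i][j]
                    count[top + i][left + j] += 1
    averaged = [[total[i][j] / count[i][j] for j in range(w)] for i in range(h)]
    return averaged, patch_scores


def stack_score(stack, patch_model, patch=2, stride=1):
    """Stack-level score: the maximum patch-level score over all frames."""
    return max(
        score
        for frame in stack
        for score in infer_frame(frame, patch_model, patch, stride)[1]
    )


def toy_model(pixels):
    """Hypothetical stand-in for the network: brightness used as 'probability'."""
    probs = [[v / 255 for v in row] for row in pixels]
    return probs, max(max(row) for row in probs)


# Two toy 3x3 frames (hypothetical intensities in 0..255).
stack = [
    [[0, 0, 0], [0, 255, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 51]],
]
pixel_map, scores = infer_frame(stack[0], toy_model)
exam_score = stack_score(stack, toy_model)
```

Taking the maximum over patches means a single confident patch is enough to flag the whole examination, which suits a finding that may occupy only a small region of one image.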

References

1. Gulshan V., et al., Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).
2. Ehteshami Bejnordi B., et al.; The CAMELYON16 Consortium, Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017).
3. Esteva A., et al., Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
4. Chilamkurthy S., et al., Deep learning algorithms for detection of critical findings in head CT scans: A retrospective study. Lancet 392, 2388–2396 (2018).
5. Titano J. J., et al., Automated deep-neural-network surveillance of cranial images for acute neurologic events. Nat. Med. 24, 1337–1341 (2018).
