Otolaryngol Head Neck Surg. 2023 Oct;169(4):988-998.
doi: 10.1002/ohn.317. Epub 2023 Mar 8.

A Self-Configuring Deep Learning Network for Segmentation of Temporal Bone Anatomy in Cone-Beam CT Imaging

Andy S Ding et al. Otolaryngol Head Neck Surg. 2023 Oct.

Abstract

Objective: Preoperative planning for otologic or neurotologic procedures often requires manual segmentation of relevant structures, which can be tedious and time-consuming. Automated methods for segmenting multiple geometrically complex structures can not only streamline preoperative planning but also augment minimally invasive and/or robot-assisted procedures in this space. This study evaluates a state-of-the-art deep learning pipeline for semantic segmentation of temporal bone anatomy.

Study design: A descriptive study of a segmentation network.

Setting: Academic institution.

Methods: A total of 15 high-resolution cone-beam temporal bone computed tomography (CT) data sets were included in this study. All images were co-registered, with relevant anatomical structures (eg, ossicles, inner ear, facial nerve, chorda tympani, bony labyrinth) manually segmented. Predicted segmentations from no new U-Net (nnU-Net), an open-source 3-dimensional semantic segmentation neural network, were compared against ground-truth segmentations using modified Hausdorff distances (mHD) and Dice scores.
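The two metrics named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes each segmentation is given as a set of integer voxel coordinates, that voxels are isotropic with a hypothetical `spacing_mm`, and that mHD follows the common Dubuisson-Jain definition (the larger of the two mean directed nearest-neighbor distances), which the abstract does not spell out.

```python
import math


def dice_score(pred, truth):
    """Dice overlap between two collections of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def _mean_directed_distance(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def modified_hausdorff(pred, truth, spacing_mm=1.0):
    """Modified Hausdorff distance (assumed Dubuisson-Jain form), in mm.

    spacing_mm is a hypothetical isotropic voxel size used to convert
    voxel units to millimeters.
    """
    pred, truth = list(pred), list(truth)
    if not pred or not truth:
        raise ValueError("both segmentations must be non-empty")
    forward = _mean_directed_distance(pred, truth)
    backward = _mean_directed_distance(truth, pred)
    return max(forward, backward) * spacing_mm


# Toy example: two 2-voxel segments that share one voxel.
pred = {(0, 0, 0), (1, 0, 0)}
truth = {(1, 0, 0), (2, 0, 0)}
dice = dice_score(pred, truth)
mhd = modified_hausdorff(pred, truth, spacing_mm=0.1)
```

The brute-force nearest-neighbor search is quadratic in the number of voxels; for full-resolution volumes a distance transform or spatial index would be the usual substitute.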

Results: Fivefold cross-validation with nnU-Net yielded the following agreement between predicted and ground-truth labels: malleus (mHD: 0.044 ± 0.024 mm, Dice: 0.914 ± 0.035), incus (mHD: 0.051 ± 0.027 mm, Dice: 0.916 ± 0.034), stapes (mHD: 0.147 ± 0.113 mm, Dice: 0.560 ± 0.106), bony labyrinth (mHD: 0.038 ± 0.031 mm, Dice: 0.952 ± 0.017), and facial nerve (mHD: 0.139 ± 0.072 mm, Dice: 0.862 ± 0.039). Compared with atlas-based segmentation propagation, nnU-Net achieved significantly higher Dice scores for all structures (p < .05).
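With 15 data sets, fivefold cross-validation leaves 3 cases out for validation in each fold and trains on the remaining 12. The sketch below shows one way such a split could be produced; the function name, the seed, and the case IDs are hypothetical, and nnU-Net generates its own splits internally, so this only illustrates the arithmetic.

```python
import random


def five_fold_splits(case_ids, n_folds=5, seed=0):
    """Return a list of (train_ids, val_ids) pairs, one per fold."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for reproducibility
    # Deal the shuffled cases into n_folds groups, round-robin.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    splits = []
    for i in range(n_folds):
        val = folds[i]
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        splits.append((train, val))
    return splits


# Hypothetical case IDs 0..14 standing in for the 15 CT data sets.
splits = five_fold_splits(range(15))
for fold_index, (train, val) in enumerate(splits):
    print(f"fold {fold_index}: train on {len(train)}, validate on {len(val)}")
```

Every case appears in exactly one validation fold, so the per-structure means and standard deviations cover all 15 data sets.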

Conclusion: Using an open-source deep learning pipeline, we demonstrate consistently submillimeter accuracy for semantic CT segmentation of temporal bone anatomy compared to hand-segmented labels. This pipeline has the potential to greatly improve preoperative planning workflows for a variety of otologic and neurotologic procedures and augment existing image guidance and robot-assisted systems for the temporal bone.

Keywords: automated segmentation; deep learning; neural network; temporal bone.

Conflict of interest statement

Competing interests: Under a license agreement between Galen Robotics Inc and Johns Hopkins University, Russell H. Taylor and the University are entitled to royalty distributions on technology related to the technology described in the study discussed in this publication. Russell H. Taylor also is a paid consultant to and owns equity in Galen Robotics Inc. This arrangement has been reviewed and approved by Johns Hopkins University in accordance with its conflict-of-interest policies.

Figures

Figure 1.
Visual comparison between ground-truth (right) and predicted (left) segmentations from a representative data set. (A) Lateral view. (B) View of the facial recess. (C) View of the middle cranial fossa. Segment IDs are presented for clarity.
Figure 2.
No new U-Net (nnU-Net) and segmentation propagation (Seg Prop) predictions. (A) Dice scores with an accuracy threshold (dotted line) of 0.75. (B) Modified Hausdorff distances with an accuracy threshold (dotted line) of 1 mm. p values: *<.05; **<.01; ***<.001; ****<.0001.
Figure 3.
Accuracy comparisons for smaller/thinner (unshaded) versus larger/thicker (shaded) structures. Constrained linear regressions show the effects of modified Hausdorff distance on Dice score.
Figure 4.
Example of Dice score insensitivity to shape variation and islands. While example (A) produces an inferior vestibular nerve prediction with a distant island and example (B) does not, they maintain similar Dice scores but significantly different modified Hausdorff distances (mHDs).
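The effect described in Figure 4 can be reproduced on synthetic data. The sketch below is an illustration only, not the figure's data: it assumes a hypothetical 20-voxel straight structure, unit voxel spacing, and the Dubuisson-Jain form of mHD. Both predictions miss the same two voxels and add two false-positive voxels, but one places them in a distant island and the other places them next to the structure.

```python
import math


def dice_score(pred, truth):
    """Dice overlap between two sets of voxel coordinates."""
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def modified_hausdorff(pred, truth):
    """Larger of the two mean directed nearest-neighbor distances."""
    def mean_directed(a, b):
        return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)
    return max(mean_directed(pred, truth), mean_directed(truth, pred))


# Hypothetical ground truth: a thin, straight 20-voxel structure.
truth = {(x, 0, 0) for x in range(20)}
core = {(x, 0, 0) for x in range(18)}  # both predictions miss 2 voxels

# Prediction A adds a distant 2-voxel island.
pred_island = core | {(60, 0, 0), (61, 0, 0)}
# Prediction B adds 2 false-positive voxels adjacent to the structure.
pred_adjacent = core | {(0, 1, 0), (1, 1, 0)}

dice_a = dice_score(pred_island, truth)
dice_b = dice_score(pred_adjacent, truth)
mhd_a = modified_hausdorff(pred_island, truth)
mhd_b = modified_hausdorff(pred_adjacent, truth)
print(f"island:   Dice={dice_a:.2f}, mHD={mhd_a:.2f}")
print(f"adjacent: Dice={dice_b:.2f}, mHD={mhd_b:.2f}")
```

Dice counts only how many voxels overlap, so the two predictions score identically, while the distance-based metric penalizes the island in proportion to how far it lies from the true structure.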
