Sci Rep. 2024 Jan 30;14(1):2497.
doi: 10.1038/s41598-024-52929-0.

Diagnosing oral and maxillofacial diseases using deep learning


Junegyu Kang et al.

Abstract

The classification and localization of odontogenic lesions from panoramic radiographs are challenging tasks due to the positional biases and class imbalances of the lesions. To address these challenges, we propose DOLNet, a novel neural network that uses mutually influencing hierarchical attention across different image scales to jointly learn the global representation of the entire jaw and the local discrepancy between normal tissue and lesions. The proposed approach uses local attention to learn representations within a patch. From the patch-level representations, we generate inter-patch, i.e., global, attention maps to represent the positional prior of lesions in the whole image. Global attention enables the reciprocal calibration of patch-level representations by considering non-local information from other patches, thereby improving the generation of the whole-image-level representation. To address class imbalances, we propose an effective data augmentation technique that merges lesion crops with normal images, thereby synthesizing new abnormal cases for effective model training. Our approach outperforms recent studies, improving classification performance by up to 42.4% and 44.2% in recall and F1 score, respectively, and ensuring robust lesion localization with respect to lesion size variations and positional biases. Our approach further outperforms human expert clinicians in classification by 10.7% and 10.8% in recall and F1 score, respectively.
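
In code terms, the mechanism reads as a two-level loop: local attention builds token representations inside each patch, the patch representations are pooled and related to one another by global attention, and that global context is fed back to recalibrate every patch. The following PyTorch sketch illustrates this flow; the module layout, mean pooling, and additive calibration are assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of a mutually influencing hierarchical attention block.
# All module names, dimensions, and the calibration scheme are assumptions
# made for illustration; they are not the paper's exact implementation.
import torch
import torch.nn as nn

class HierarchicalAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        # Local attention: relates tokens *within* each patch.
        self.local_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Global attention: relates patch-level embeddings *across* patches.
        self.global_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, num_patches, tokens_per_patch, dim)
        b, p, t, d = x.shape
        # 1) Local attention inside each patch.
        local = x.reshape(b * p, t, d)
        local, _ = self.local_attn(local, local, local)
        local = local.reshape(b, p, t, d)
        # 2) Pool each patch to one embedding, then attend across patches
        #    to capture the positional prior of lesions in the whole jaw.
        patch_emb = local.mean(dim=2)                       # (b, p, d)
        global_emb, _ = self.global_attn(patch_emb, patch_emb, patch_emb)
        # 3) Reciprocal calibration: broadcast the non-local context
        #    back onto every token of its patch.
        return local + global_emb.unsqueeze(2)              # (b, p, t, d)
```

For instance, an input of shape (2, 16, 64, 256), i.e., 2 images, 16 patches, 64 tokens per patch, and 256-dimensional features, passes through and returns a tensor of the same shape, with each patch representation now informed by the other patches.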

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Overview of DOLNet: (a) the first stage, which extracts patch-level representations using the proposed mutually influencing hierarchical attention. The initial patch encodings are obtained from individual local attention and are aggregated to create a global attention map, shown as a dotted red box, which inversely calibrates the patch representations. (b) The second stage, corresponding to Eq. (1), with two branches for lesion classification and localization (see the sketch after the figure list). (c) Preprocessed input image and patch split (best viewed in color).
Figure 2
LesionMix, the proposed data augmentation: (a) an "AB" sample and a "Normal" sample are combined to generate a synthetic "AB" sample, and (b) comparison of the LesionMix result with those of Mixup (sources mixed in a 0.5:0.5 ratio) and CutMix. For each method, crops containing lesions are highlighted (a sketch of the operation follows the figure list).
Figure 3
Characteristics of the dataset used in this study: (a) the distribution of lesion sizes in the images, and (b) the spatial distribution of lesions across the images, shown as a normalized heat map aggregating all lesion segmentations in our dataset (best viewed in color).
Figure 4
Effects of lesion size and location on classification, comparing the proposed method with (a) existing models (Kwon et al. and Hu et al.) on the dataset D and (b) human clinicians on the dataset Dtiny, and on localization, comparing with (c) the same models. For the three plots whose horizontal axes are labeled "size", the numbers identify image groups (i.e., bins) by lesion size; bins with larger IDs contain images with larger lesions. For the plots whose horizontal axes are labeled "position", the numbers correspond to the four quadrants.
Figure 5
Qualitative results. Each row contains a test image with its target class and lesion boundary annotated in red (1st column), followed by the predictions of four methods shown as blue heatmaps indicating lesions: Kwon et al. (2nd), Hu et al. (3rd), the DOLNet backbone (4th), and the full DOLNet (last column) (best viewed in color).
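
As a companion to the Figure 1 caption, here is a hedged sketch of what the second stage's two branches could look like; the layer choices, feature shape, and class count are assumptions, and Eq. (1) from the paper is not reproduced here.

```python
# Hypothetical two-branch head for the second stage: one branch for
# lesion classification, one for localization. Layer choices and
# output shapes are illustrative assumptions only.
import torch
import torch.nn as nn

class DualBranchHead(nn.Module):
    def __init__(self, dim=256, num_classes=2):
        super().__init__()
        # Classification branch: pooled features -> lesion class logits.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, num_classes)
        )
        # Localization branch: per-location lesion evidence map.
        self.localizer = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, feat):
        # feat: (batch, dim, H, W) calibrated feature map from stage one.
        return self.classifier(feat), self.localizer(feat)
```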
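For the LesionMix operation of Figure 2: unlike Mixup, which blends two whole images (e.g., 0.5 * A + 0.5 * B), or CutMix, which pastes a box cut at a random location, a LesionMix-style operation transplants a crop known to contain a lesion onto a normal image, so the synthetic sample inherits the abnormal ("AB") label. A minimal sketch, assuming a known lesion bounding box; the function name and direct-paste scheme are hypothetical, not the paper's exact procedure.

```python
# Hypothetical sketch of a LesionMix-style augmentation: paste a crop
# containing a lesion from an abnormal radiograph onto a normal one,
# producing a new synthetic abnormal sample.
import numpy as np

def lesion_mix(normal_img, abnormal_img, lesion_box):
    """normal_img, abnormal_img: (H, W) arrays; lesion_box: (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = lesion_box
    mixed = normal_img.copy()
    # Transplant the lesion region; the synthetic image keeps the
    # abnormal label because it now contains a lesion.
    # (For contrast, Mixup would instead compute
    #  0.5 * normal_img + 0.5 * abnormal_img over the whole image.)
    mixed[y0:y1, x0:x1] = abnormal_img[y0:y1, x0:x1]
    return mixed
```

Because the pasted region is guaranteed to contain a lesion, every synthetic image is a valid abnormal training case, which is what lets the augmentation counter the class imbalance noted in the abstract.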
