J Transl Med. 2023 Oct 7;21(1):698.
doi: 10.1186/s12967-023-04572-y.

Real-time detection of laryngopharyngeal cancer using an artificial intelligence-assisted system with multimodal data

Yun Li et al. J Transl Med.

Abstract

Background: Laryngopharyngeal cancer (LPC) includes laryngeal and hypopharyngeal cancer, and early diagnosis can significantly improve patients' prognosis and quality of life. Pathological biopsy of suspicious tissue under laryngoscopic guidance is the gold standard for diagnosing LPC. However, this subjective examination depends largely on the skills and experience of laryngologists, which increases the risk of missed diagnoses and repeated unnecessary biopsies. We aimed to develop and validate a deep convolutional neural network-based Laryngopharyngeal Artificial Intelligence Diagnostic System (LPAIDS) that automatically identifies LPC in real time in both white-light imaging (WLI) and narrow-band imaging (NBI) laryngoscopy images, in order to improve the diagnostic accuracy of LPC by reducing diagnostic variation among non-expert laryngologists.

Methods: A total of 31,543 laryngoscopic images from 2382 patients were divided into training, verification, and test sets to develop, validate, and internally test LPAIDS. Another 25,063 images from five other hospitals were used for external testing. Overall, 551 videos were used to evaluate the real-time performance of the system, and 200 randomly selected videos were used to compare the diagnostic performance of LPAIDS with that of laryngologists. Two deep-learning models using either WLI (model W) or NBI (model N) images were constructed for comparison with LPAIDS.

Results: LPAIDS achieved higher diagnostic performance than models W and N, with accuracies of 0·956 and 0·949 in the internal image and video tests, respectively. The robustness and stability of LPAIDS were validated in the external sets, with area under the receiver operating characteristic curve (AUC) values of 0·965-0·987. In the laryngologist-machine competition, LPAIDS achieved an accuracy of 0·940, which was comparable to that of expert laryngologists and higher than that of the other laryngologists with varying qualifications.
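
The Results are reported as accuracy and area under the receiver operating characteristic curve. As a minimal sketch of how these two metrics are commonly computed from per-case model scores, the Python below may help; it is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the reference label.

    The 0.5 default threshold is an assumption, not a value from the study.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win, which equals the area under the empirical ROC curve.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented for illustration only (1 = cancer, 0 = non-cancer).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print(accuracy(toy_labels, toy_scores))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason studies of this kind tend to report both.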

Conclusions: LPAIDS detected LPC in real time with high accuracy and stability, showing great potential to improve the diagnostic accuracy of LPC by reducing diagnostic variation among non-expert laryngologists.

Keywords: Deep-learning models; Diagnostic; Head and neck tumour; Laryngoscopic; Multicentre; Real-time.

Conflict of interest statement

All the authors have no conflicts of interest to declare.

Figures

Fig. 1
Flowchart for development and evaluation of the LPAIDS for laryngopharyngeal cancer diagnosis. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System
Fig. 2
Workflow and architecture of LPAIDS. a Procedure for detecting LPC from laryngoscopy videos. The WLI and NBI laryngoscopy video frames were extracted from laryngoscopy videos. After screening and annotation by highly experienced laryngoscopists, the images were fed into the model to localize the area with possible tumours; the diagnoses were based on the shape and size of the tumour area. Three pre-trained convolutional neural network models (model W, model N, and LPAIDS, based on U-Net) were developed to obtain the feature vectors from the WLI, NBI, and all images, respectively. b The detailed neural network architecture of LPAIDS based on U-Net. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System; LPC: laryngopharyngeal cancer; NBI: narrow-band imaging; WLI: white-light imaging
Fig. 3
Performance of LPAIDS for identifying laryngopharyngeal cancer in the internal image and video datasets. a ROC curves of LPAIDS using all images in the internal image test set. b ROC curves of LPAIDS and model W using WLI images in the internal image test set. c ROC curves of LPAIDS and model N using NBI images in the internal image test set. d ROC curves of LPAIDS using all videos in the internal video test set. e ROC curves of LPAIDS and model W using WLI videos in the internal video test set. f ROC curves of LPAIDS and model N using NBI videos in the internal video test set. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System; ROC: receiver operating characteristic; NBI: narrow-band imaging; WLI: white-light imaging
Fig. 4
Representative attention maps obtained by LPAIDS for identifying laryngopharyngeal cancer. The attention map is shown as a heatmap superimposed on the original image, where warmer colors indicate higher saliency. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System
Fig. 5
The accuracy of LPAIDS in segmenting laryngopharyngeal cancer regions. a The distribution of IOU for the internal image test set. b Representative prediction results corresponding to various levels of segmentation performance of LPAIDS. The green line marks the region labeled by the laryngoscopists, and the red line marks the region computed automatically by LPAIDS. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System; IOU: Intersection-Over-Union
Fig. 6
ROC curves illustrating the performance of LPAIDS for identifying laryngopharyngeal cancer in multicentre imaging datasets. FAHSU: First Affiliated Hospital of Shenzhen University; FAHSYSU: First Affiliated Hospital of Sun Yat-sen University; LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System; NHSMU: Nanfang Hospital of Southern Medical University; ROC: receiver operating characteristic; SAHSYSU: Sixth Affiliated Hospital of Sun Yat-sen University; SYMSYSU: Sun Yat-sen Memorial Hospital of Sun Yat-sen University; TAHSYSU: Third Affiliated Hospital of Sun Yat-sen University
Fig. 7
Diagnostic performance of LPAIDS and laryngologists for identifying laryngopharyngeal cancer in 200 videos. a Receiver operating characteristic curves of LPAIDS and of expert, senior, resident, and trainee laryngologists for comparison of diagnostic performance. b Confusion matrices obtained by LPAIDS and ten laryngologists with varying degrees of expertise. Expert: a professor with > 20 years of experience in endoscopic procedures. Senior: attending doctors with more than five years of experience who had completed clinical and specific endoscopic training. Residents: residents with more than three years of endoscopic experience. Trainee: interns, one year of endoscopic experience. LPAIDS: Laryngopharyngeal Artificial Intelligence Diagnostic System
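
The Fig. 5 caption reports segmentation quality as Intersection-Over-Union (IOU) between the laryngoscopists' annotation and the LPAIDS output. As a minimal sketch of that metric, the Python below computes IOU for two binary masks. It assumes the outlined regions have already been filled into pixel masks of equal size; the function name, the mask representation, and the toy masks are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def intersection_over_union(reference: Mask, predicted: Mask) -> float:
    """IOU of two binary masks with identical shape (1 = lesion, 0 = background)."""
    if len(reference) != len(predicted):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for ref_row, pred_row in zip(reference, predicted):
        if len(ref_row) != len(pred_row):
            raise ValueError("masks must have the same number of columns")
        for r, p in zip(ref_row, pred_row):
            intersection += 1 if (r and p) else 0
            union += 1 if (r or p) else 0
    # Two empty masks agree perfectly; define IOU as 1.0 to avoid dividing by zero.
    return 1.0 if union == 0 else intersection / union


# Toy masks invented for illustration: the prediction is shifted one pixel right.
toy_reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
toy_predicted = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

print(intersection_over_union(toy_reference, toy_predicted))
```

An IOU of 1 means the predicted and annotated regions coincide exactly, and 0 means they do not overlap at all, so the distribution in panel a can be read as a per-image overlap score.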
