Real-time detection of laryngopharyngeal cancer using an artificial intelligence-assisted system with multimodal data
- PMID: 37805551
- PMCID: PMC10559609
- DOI: 10.1186/s12967-023-04572-y
Abstract
Background: Laryngopharyngeal cancer (LPC) includes laryngeal and hypopharyngeal cancer, whose early diagnosis can significantly improve the prognosis and quality of life of patients. Pathological biopsy of suspicious cancerous tissue under the guidance of laryngoscopy is the gold standard for diagnosing LPC. However, this subjective examination largely depends on the skills and experience of laryngologists, which increases the possibility of missed diagnoses and repeated unnecessary biopsies. We aimed to develop and validate a deep convolutional neural network-based Laryngopharyngeal Artificial Intelligence Diagnostic System (LPAIDS) for automatic real-time identification of LPC in both white-light imaging (WLI) and narrow-band imaging (NBI) laryngoscopy images, with the goal of improving the diagnostic accuracy of LPC by reducing diagnostic variation among non-expert laryngologists.
Methods: All 31,543 laryngoscopic images from 2382 patients were categorised into training, verification, and test sets to develop, validate, and internally test LPAIDS. Another 25,063 images from five other hospitals were used for external testing. Overall, 551 videos were used to evaluate the real-time performance of the system, and 200 randomly selected videos were used to compare the diagnostic performance of LPAIDS with that of laryngologists. Two deep-learning models using either WLI (model W) or NBI (model N) images were constructed for comparison with LPAIDS.
Results: LPAIDS had higher diagnostic performance than models W and N, with accuracies of 0·956 and 0·949 in the internal image and video tests, respectively. The robustness and stability of LPAIDS were validated in the external sets, with area under the receiver operating characteristic curve values of 0·965-0·987. In the laryngologist-machine competition, LPAIDS achieved an accuracy of 0·940, which was comparable to that of expert laryngologists and higher than that of other laryngologists with varying qualifications.
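The two metrics reported above, accuracy and area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. This is not the authors' evaluation code: the labels, scores, function names, and the 0.5 decision threshold are all hypothetical, chosen only to show how each metric is computed from per-image model outputs.

```python
# Minimal sketch of accuracy and AUC on toy data (all values hypothetical).

def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth (1 = cancer, 0 = non-cancer) and model scores.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.92, 0.81, 0.40, 0.10, 0.35, 0.55, 0.77, 0.05]

print("accuracy:", accuracy(labels, scores))
print("AUC:", auc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason both are commonly reported together.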
Conclusions: LPAIDS provided high accuracy and stability in detecting LPC in real time, showing great potential to improve the diagnostic accuracy of LPC by reducing diagnostic variation among non-expert laryngologists.
Keywords: Deep-learning models; Diagnostic; Head and neck tumour; Laryngoscopic; Multicentre; Real-time.
© 2023. BioMed Central Ltd., part of Springer Nature.
Conflict of interest statement
The authors have no conflicts of interest to declare.
Similar articles
- Multi-Instance Learning for Vocal Fold Leukoplakia Diagnosis Using White Light and Narrow-Band Imaging: A Multicenter Study. Laryngoscope. 2024 Oct;134(10):4321-4328. doi: 10.1002/lary.31537. Epub 2024 May 27. PMID: 38801129
- Self-Attention Mechanisms-Based Laryngoscopy Image Classification Technique for Laryngeal Cancer Detection. Head Neck. 2025 Mar;47(3):944-955. doi: 10.1002/hed.27999. Epub 2024 Nov 11. PMID: 39526389
- Convolutional neural network based anatomical site identification for laryngoscopy quality control: A multicenter study. Am J Otolaryngol. 2023 Mar-Apr;44(2):103695. doi: 10.1016/j.amjoto.2022.103695. Epub 2022 Nov 24. PMID: 36473265
- Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: a multicentre, case-control, diagnostic study. Lancet Oncol. 2019 Dec;20(12):1645-1654. doi: 10.1016/S1470-2045(19)30637-0. Epub 2019 Oct 4. PMID: 31591062 Clinical Trial.
- Convolutional neural network-based artificial intelligence for the diagnosis of early esophageal cancer based on endoscopic images: A meta-analysis. Saudi J Gastroenterol. 2022 Sep-Oct;28(5):332-340. doi: 10.4103/sjg.sjg_178_22. PMID: 35848703 Free PMC article. Review.
Cited by
- Comparative Evaluation of High-Speed Videoendoscopy and Laryngovideostroboscopy for Functional Laryngeal Assessment in Clinical Practice. J Clin Med. 2025 Mar 4;14(5):1723. doi: 10.3390/jcm14051723. PMID: 40095862 Free PMC article.