Tongue Contour Tracking and Segmentation in Lingual Ultrasound for Speech Recognition: A Review
- PMID: 36428870
- PMCID: PMC9689563
- DOI: 10.3390/diagnostics12112811
Abstract
Lingual ultrasound imaging is essential in linguistic research and speech recognition. It has been widely used to provide visual feedback that enhances language learning for non-native speakers, to study and remediate speech-related disorders, and to support articulation research and analysis, swallowing studies, 3D tongue modelling, and silent speech interfaces. This article provides a comparative review, based on quantitative and qualitative criteria, of the two main streams of tongue contour segmentation from ultrasound images. The first stream uses traditional computer vision and image processing algorithms; the second uses machine learning and deep learning algorithms. The results show that machine learning-based tongue tracking is superior to traditional techniques in terms of performance and generalization ability. Traditional techniques nevertheless remain useful for implementing interactive image segmentation to extract valuable features during training and postprocessing. We recommend a hybrid approach that combines machine learning and traditional techniques to implement a real-time tongue segmentation tool.
Keywords: computer vision; image segmentation; lingual ultrasound; machine learning; medical imaging analysis; tongue contour tracking.
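As a rough illustration of the hybrid idea recommended in the abstract, the sketch below pairs a learned probability map with a traditional postprocessing step. It is a minimal example under stated assumptions, not the method of any reviewed paper: the probability map is synthetic (a hypothetical stand-in for the output of a segmentation network), and the function names `extract_contour`, `smooth_contour`, and `synthetic_prob_map`, along with the threshold and window size, are illustrative choices.

```python
import numpy as np

def extract_contour(prob_map, min_prob=0.5):
    """For each image column, pick the row with the highest probability.
    Columns whose peak is below min_prob are marked NaN (no tongue visible)."""
    rows = np.argmax(prob_map, axis=0).astype(float)
    peaks = prob_map.max(axis=0)
    rows[peaks < min_prob] = np.nan
    return rows

def smooth_contour(rows, window=5):
    """Moving-average smoothing that averages only over valid (non-NaN) points.
    This plays the role of the traditional postprocessing step."""
    valid = ~np.isnan(rows)
    filled = np.where(valid, rows, 0.0)
    kernel = np.ones(window)
    num = np.convolve(filled, kernel, mode="same")
    den = np.convolve(valid.astype(float), kernel, mode="same")
    out = np.full_like(rows, np.nan)
    ok = valid & (den > 0)
    out[ok] = num[ok] / den[ok]
    return out

def synthetic_prob_map(height=64, width=96, seed=0):
    """Hypothetical stand-in for a network output: an arch-shaped ridge plus noise.
    Real data would come from a trained model applied to an ultrasound frame."""
    rng = np.random.default_rng(seed)
    cols = np.arange(width)
    true_rows = 40 - 20 * np.sin(np.pi * cols / (width - 1))
    r = np.arange(height)[:, None]
    prob = np.exp(-0.5 * ((r - true_rows[None, :]) / 2.0) ** 2)
    prob = np.clip(prob + 0.05 * rng.standard_normal(prob.shape), 0, 1)
    prob[:, :8] = 0.0  # simulate columns with no visible tongue surface
    return prob, true_rows

prob, true_rows = synthetic_prob_map()
contour = smooth_contour(extract_contour(prob))
mean_abs_error = np.nanmean(np.abs(contour - true_rows))
print(f"mean absolute row error on valid columns: {mean_abs_error:.2f}")
```

The per-column maximum assumes the tongue surface crosses each image column at most once, which is a simplification; a real-time tool would need a more robust contour model and a trained network in place of the synthetic map.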
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- Fully-automated tongue detection in ultrasound images. Comput Biol Med. 2019 Aug;111:103335. doi: 10.1016/j.compbiomed.2019.103335. Epub 2019 Jun 27. PMID: 31279163
- Multi-hypothesis tracking of the tongue surface in ultrasound video recordings of normal and impaired speech. Med Image Anal. 2018 Feb;44:98-114. doi: 10.1016/j.media.2017.12.003. Epub 2017 Dec 5. PMID: 29232649
- Encoder-decoder CNN models for automatic tracking of tongue contours in real-time ultrasound data. Methods. 2020 Jul 1;179:26-36. doi: 10.1016/j.ymeth.2020.05.011. Epub 2020 May 22. PMID: 32450205
- A systematic review of the application of machine learning techniques to ultrasound tongue imaging analysis. J Acoust Soc Am. 2024 Sep 1;156(3):1796-1819. doi: 10.1121/10.0028610. PMID: 39287468
- A review of thyroid gland segmentation and thyroid nodule segmentation methods for medical ultrasound images. Comput Methods Programs Biomed. 2020 Mar;185:105329. doi: 10.1016/j.cmpb.2020.105329. Epub 2020 Jan 9. PMID: 31955006 Review.
Cited by
- Super-Resolved Dynamic 3D Reconstruction of the Vocal Tract during Natural Speech. J Imaging. 2023 Oct 20;9(10):233. doi: 10.3390/jimaging9100233. PMID: 37888339 Free PMC article.
- Quantifying articulatory variations across phonological environments: An atlas-based approach using dynamic magnetic resonance imaging. J Acoust Soc Am. 2024 Dec 1;156(6):4000-4009. doi: 10.1121/10.0034639. PMID: 39670769
- Vision transformer architecture and applications in digital health: a tutorial and survey. Vis Comput Ind Biomed Art. 2023 Jul 10;6(1):14. doi: 10.1186/s42492-023-00140-9. PMID: 37428360 Free PMC article. Review.
- Speech disorders in patients with Tongue squamous cell carcinoma: A longitudinal observational study based on a questionnaire and acoustic analysis. BMC Oral Health. 2023 Apr 1;23(1):192. doi: 10.1186/s12903-023-02888-1. PMID: 37005608 Free PMC article.
- Ultrasound Imaging of Artificial Tongues During Compression and Shearing of Food Gels on a Biomimetic Testing Bench. J Texture Stud. 2025 Jun;56(3):e70030. doi: 10.1111/jtxs.70030. PMID: 40490851 Free PMC article.