Review

Diagnostics (Basel). 2022 Nov 15;12(11):2811.

doi: 10.3390/diagnostics12112811.

Tongue Contour Tracking and Segmentation in Lingual Ultrasound for Speech Recognition: A Review

Khalid Al-Hammuri¹, Fayez Gebali¹, Ilamparithi Thirumarai Chelvan¹, Awos Kanan²

Affiliations

¹ Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8W 2Y2, Canada.
² Department of Computer Engineering, Princess Sumaya University for Technology, Amman 11941, Jordan.

PMID: 36428870
PMCID: PMC9689563
DOI: 10.3390/diagnostics12112811

Khalid Al-Hammuri et al. Diagnostics (Basel). 2022.

Abstract

Lingual ultrasound imaging is essential in linguistic research and speech recognition. It has been widely used as visual feedback to enhance language learning for non-native speakers, and in the study and remediation of speech-related disorders, articulation research and analysis, swallowing studies, 3D tongue modelling, and silent speech interfaces. This article provides a comparative analysis and review, based on quantitative and qualitative criteria, of the two main streams of tongue contour segmentation from ultrasound images. The first stream uses traditional computer vision and image processing algorithms for tongue segmentation. The second stream uses machine learning and deep learning algorithms. The results show that tongue tracking with machine learning-based techniques is superior to traditional techniques in both performance and generalization ability. Traditional techniques nevertheless remain helpful for implementing interactive image segmentation to extract valuable features during training and postprocessing. We recommend a hybrid approach that combines machine learning and traditional techniques to implement a real-time tongue segmentation tool.

Keywords: computer vision; image segmentation; lingual ultrasound; machine learning; medical imaging analysis; tongue contour tracking.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Overview of ultrasound probe placement beneath the chin. The ultrasound wave, drawn as a black arc, is generated by the acoustic probe and propagates toward the tongue. The hyoid and mandible bones block part of the ultrasound wave; the blocked region is shown in black. The head and oral cavity picture was modified from the original case image, courtesy of Associate Professor Frank Gaillard, Radiopaedia.org, rID: 35836 [86].

**Figure 2**
Ultrasound image of the tongue showing the tongue tip and root in the sagittal plane. The ultrasound probe is at the bottom of the image, and the shadowing effect of the mandible and hyoid bone is visible. The copyright for this ultrasound picture belongs to the author of this article, Khalid Al-hammuri [5].

**Figure 3**
Ultrasound image acquisition system used in speech analysis. The system is also configured with a microphone and head-transducer stability system. The copyright for the ultrasound and head-transducer support system picture belongs to the author of this article, Khalid Al-hammuri [5].

**Figure 4**
Shape-based evaluation measure. Point (A) is on the dorsal part of the tongue, point (B) is on the tongue tip, and point (C) is the apex. Point (D) is the projection of point (C) onto the line (AB). The copyright for this ultrasound picture belongs to the author of this article, Khalid Al-hammuri [5].
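
The geometric construction in this caption can be sketched in a few lines of Python. The sketch below computes point (D) as the orthogonal projection of (C) onto the line through (A) and (B). The function names and the ratio |CD| / |AB| are hypothetical illustrations introduced here; the caption defines only the four points, so the ratio should not be read as the measure used in the article.

```python
from math import hypot


def project_onto_line(a, b, c):
    """Return D, the orthogonal projection of point c onto the line through a and b."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    abx, aby = bx - ax, by - ay
    denom = abx * abx + aby * aby
    if denom == 0:
        raise ValueError("A and B must be distinct points")
    # Scalar position of D along AB (0 at A, 1 at B).
    t = ((cx - ax) * abx + (cy - ay) * aby) / denom
    return (ax + t * abx, ay + t * aby)


def apex_height_ratio(a, b, c):
    """Hypothetical shape index (an assumption, not from the article): |CD| / |AB|."""
    d = project_onto_line(a, b, c)
    cd = hypot(c[0] - d[0], c[1] - d[1])
    ab = hypot(b[0] - a[0], b[1] - a[1])
    return cd / ab


if __name__ == "__main__":
    # Made-up pixel coordinates, for illustration only.
    A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 2.0)
    print(project_onto_line(A, B, C))
    print(apex_height_ratio(A, B, C))
```

Normalizing the apex height by the chord length is one way to make such an index independent of image scale, which is why it is used in this sketch.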

**Figure 5**
K-fold cross-validation process. (A) The K iterations of the cross-validation. (B) The training fold data and labels. (C) Evaluation of model performance on the validation fold data.
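
The splitting step of K-fold cross-validation can be sketched as follows. This is a minimal, dependency-free illustration of the general technique, not the authors' pipeline; the function name `k_fold_indices` is hypothetical, and the sketch assumes unshuffled, contiguous folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k iterations.

    Folds are contiguous and unshuffled (an assumption of this sketch).
    When n_samples is not divisible by k, the first folds get one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        # Every sample outside the validation fold is used for training.
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


if __name__ == "__main__":
    for i, (train, validation) in enumerate(k_fold_indices(10, 5)):
        print(f"iteration {i}: train={train} validation={validation}")
```

Each sample appears in exactly one validation fold, so averaging the per-fold scores uses every labelled image for evaluation once.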

**Figure 6**
The process of labelling ultrasound images and extracting the tongue contour using a deep belief neural network. Panels (A–D) are ordered from left to right. (A) Ultrasound image before processing. (B) Manually labelled ground truth data. (C) Extracted features from ultrasound images using a translational deep belief neural network. (D) Extracted tongue contour overlaid on the original ultrasound image [104].

**Figure 7**
Quality evaluation matrix. Usability, image quality, and shape consistency are scored on a 0–5 scale (0 is the lowest and 5 is the highest). The final quality score is shown on a percentile scale and a satisfaction rate from low to high.

**Figure 8**
Bar chart for the total qualitative score of tongue image segmentation categories. The Y-axis is the qualitative score probability, and the X-axis is the quality score category for each image segmentation technique.

Cited by

Super-Resolved Dynamic 3D Reconstruction of the Vocal Tract during Natural Speech.
Isaieva K, Odille F, Laprie Y, Drouot G, Felblinger J, Vuissoz PA. J Imaging. 2023 Oct 20;9(10):233. doi: 10.3390/jimaging9100233. PMID: 37888339. Free PMC article.
Quantifying articulatory variations across phonological environments: An atlas-based approach using dynamic magnetic resonance imaging.
Xing F, Zhuo J, Stone M, Liu X, Reese TG, Wedeen VJ, Prince JL, Woo J. J Acoust Soc Am. 2024 Dec 1;156(6):4000-4009. doi: 10.1121/10.0034639. PMID: 39670769.
Vision transformer architecture and applications in digital health: a tutorial and survey.
Al-Hammuri K, Gebali F, Kanan A, Chelvan IT. Vis Comput Ind Biomed Art. 2023 Jul 10;6(1):14. doi: 10.1186/s42492-023-00140-9. PMID: 37428360. Free PMC article. Review.
Speech disorders in patients with Tongue squamous cell carcinoma: A longitudinal observational study based on a questionnaire and acoustic analysis.
Guo K, Xiao Y, Deng W, Zhao G, Zhang J, Liang Y, Yang L, Liao G. BMC Oral Health. 2023 Apr 1;23(1):192. doi: 10.1186/s12903-023-02888-1. PMID: 37005608. Free PMC article.
Ultrasound Imaging of Artificial Tongues During Compression and Shearing of Food Gels on a Biomimetic Testing Bench.
Glumac M, Gennisson JL, Mathieu V. J Texture Stud. 2025 Jun;56(3):e70030. doi: 10.1111/jtxs.70030. PMID: 40490851. Free PMC article.

References

1. Palmatier R.W., Houston M.B., Hulland J. Review articles: Purpose, process, and structure. J. Acad. Mark. Sci. 2018;46:1–5. doi: 10.1007/s11747-017-0563-4.
2. Li M., Kambhamettu C., Stone M. Automatic contour tracking in ultrasound images. Clin. Linguist. Phon. 2005;19:545–554. doi: 10.1080/02699200500113616.
3. Tang L., Bressmann T., Hamarneh G. Tongue contour tracking in dynamic ultrasound via higher-order MRFs and efficient fusion moves. Med. Image Anal. 2012;16:1503–1520. doi: 10.1016/j.media.2012.07.001.
4. Laporte C., Ménard L. Multi-hypothesis tracking of the tongue surface in ultrasound video recordings of normal and impaired speech. Med. Image Anal. 2018;44:98–114. doi: 10.1016/j.media.2017.12.003.
5. Al-hammuri K. Computer Vision-Based Tracking and Feature Extraction for Lingual Ultrasound. Ph.D. Thesis. University of Victoria; Victoria, BC, Canada: 2019.
