Vibration. 2022 Dec;5(4):692-710.
doi: 10.3390/vibration5040041. Epub 2022 Oct 13.

Prediction of Voice Fundamental Frequency and Intensity from Surface Electromyographic Signals of the Face and Neck

Jennifer M Vojtech et al. Vibration. 2022 Dec.

Abstract

Silent speech interfaces (SSIs) enable speech recognition and synthesis in the absence of an acoustic signal. Yet, the archetypal SSI fails to convey the expressive attributes of prosody such as pitch and loudness, leading to lexical ambiguities. The aim of this study was to determine the efficacy of using surface electromyography (sEMG) as an approach for predicting continuous acoustic estimates of prosody. Ten participants performed a series of vocal tasks including sustained vowels, phrases, and monologues while acoustic data were recorded simultaneously with sEMG activity from muscles of the face and neck. A battery of time-, frequency-, and cepstral-domain features extracted from the sEMG signals was used to train deep regression neural networks to predict fundamental frequency and intensity contours from the acoustic signals. We achieved an average accuracy of 0.01 ST and precision of 0.56 ST for the estimation of fundamental frequency, and an average accuracy of 0.21 dB SPL and precision of 3.25 dB SPL for the estimation of intensity. This work highlights the importance of using sEMG as an alternative means of detecting prosody and shows promise for improving SSIs in future development.
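The abstract reports accuracy and precision without defining them. A minimal sketch follows, assuming one common convention: accuracy is the mean of the signed prediction error (bias) and precision is the sample standard deviation of that error. This convention, the function name, and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def accuracy_and_precision(observed, predicted):
    """Return (mean signed error, sample standard deviation of the error).

    Assumed convention, not confirmed by the abstract:
    accuracy = mean(predicted - observed), precision = std(predicted - observed).
    Both inputs are equal-length sequences in the same unit (e.g. ST or dB SPL).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two samples are needed for a standard deviation")

    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    mean_error = sum(errors) / n
    # Sample variance (n - 1 in the denominator).
    variance = sum((e - mean_error) ** 2 for e in errors) / (n - 1)
    return mean_error, math.sqrt(variance)


# Hypothetical contour values in semitones, for illustration only.
observed_st = [0.0, 0.0, 0.0, 0.0]
predicted_st = [1.0, -1.0, 1.0, -1.0]
accuracy_st, precision_st = accuracy_and_precision(observed_st, predicted_st)
```

Under this convention a near-zero accuracy with a larger precision, as in the reported 0.01 ST and 0.56 ST, would mean the predictions show little systematic bias while individual frames still scatter around the observed contour.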

Keywords: EMG; fundamental frequency; intensity; loudness; pitch; speech; voice.

Conflict of interest statement

Conflicts of Interest: J.M.V., C.L.M., L.R., G.D.L. and J.C.K. are employed by Delsys, Inc., a commercial company that manufactures and markets sensor and software technologies for human movement, and Altec, Inc., an R&D company that performs research to reimagine human potential.

Figures

Figure 1. Configuration of sEMG sensors (pink) on the neck (left; sensors 1–4) and face (right; sensors 5–8).
Figure 2. Structure of the single-speaker deep regression neural networks used to estimate (a) fundamental frequency (fo) and (b) intensity from sEMG signals.
Figure 3. Structure of the multi-speaker deep regression neural networks used to estimate (a) fo and (b) intensity from sEMG signals.
Figure 4. Example data for one participant from the phrase “Easy for you to say”. The normalized microphone signal is shown (a), with observed (navy lines) and predicted (pink lines) contours for (b) fo and (c) intensity. Contours for fo have been converted from semitones to Hertz (Hz) for visualization purposes.
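
The Figure 4 caption mentions converting fundamental frequency contours from semitones to Hertz. A minimal sketch of that conversion follows, assuming the standard relation of 12 semitones per octave. The reference frequency `ref_hz` and its default of 100 Hz are hypothetical, since this record does not state which reference the authors used.

```python
import math


def semitones_to_hz(semitones, ref_hz=100.0):
    """Convert semitones relative to ref_hz into Hertz.

    ref_hz is an assumed reference; each 12 semitones doubles the frequency.
    """
    return ref_hz * 2.0 ** (semitones / 12.0)


def hz_to_semitones(freq_hz, ref_hz=100.0):
    """Convert a frequency in Hertz into semitones relative to ref_hz."""
    if freq_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(freq_hz / ref_hz)


# Hypothetical contour in semitones relative to the assumed 100 Hz reference.
contour_st = [0.0, 2.0, 4.0, 12.0]
contour_hz = [semitones_to_hz(st) for st in contour_st]
```

Because the semitone scale is logarithmic, an error of fixed size in ST corresponds to a larger error in Hz at higher pitches.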
