J Acoust Soc Am. 2016 Oct;140(4):2614.
doi: 10.1121/1.4964509.

Mechanics of human voice production and control

Zhaoyan Zhang. J Acoust Soc Am. 2016 Oct.

Abstract

As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.

Figures

FIG. 1.
(Color online) (a) Coronal view of the vocal folds and the airway; (b) histological structure of the vocal fold lamina propria in the coronal plane (image provided by Dr. Jennifer Long of UCLA); (c) superior view of the vocal folds, cartilaginous framework, and laryngeal muscles; (d) medial view of the cricoarytenoid joint formed between the arytenoid and cricoid cartilages; (e) posterolateral view of the cricothyroid joint formed by the thyroid and the cricoid cartilages. The arrows in (d) and (e) indicate direction of possible motions of the arytenoid and cricoid cartilages due to LCA and CT muscle activation, respectively.
FIG. 2.
Typical tensile stress-strain curve of the vocal fold along the anterior-posterior direction during loading and unloading at 1 Hz. The slope of the tangent line (dashed lines) to the stress-strain curve quantifies the tangent stiffness. The stress is typically higher during loading than unloading due to the viscous behavior of the vocal folds. The curve was obtained by averaging data over 30 cycles after a 10-cycle preconditioning.
FIG. 3.
Activation of the LCA/IA muscles completely closes the posterior glottis but leaves a small gap in the membranous glottis, whereas TA activation completely closes the anterior glottis but leaves a gap at the posterior glottis. From unpublished stroboscopic recordings from the in vivo canine larynx experiments in Choi et al. (1993).
FIG. 4.
(Color online) Typical glottal flow waveform and its time derivative (left) and their correspondence to the spectral slopes of the low-frequency and high-frequency portions of the voice source spectrum (right).
FIG. 5.
Two energy transfer mechanisms. Top row: the presence of a vertical phase difference leads to different medial surface shapes between glottal opening (dashed lines 5 and 6; upper left panel) and closing (solid lines 2 and 3) when the lower margin of the medial surface crosses the same locations, which leads to higher air pressure during glottal opening than closing and net energy transfer from airflow into vocal folds at the lower margin of the medial surface. Middle row: without a vertical phase difference, vocal fold vibration produces an alternatingly convergent-divergent but identical glottal channel geometry between glottal opening and closing (bottom left panel), thus zero energy transfer (middle row). Bottom row: without a vertical phase difference, air pressure asymmetry can be imposed by a negative damping mechanism.
FIG. 6.
Typical vocal fold eigenmodes exhibiting (a) a dominantly superior-inferior motion, (b) a medial-lateral in-phase motion, and (c) a medial-lateral out-of-phase motion along the medial surface.
FIG. 7.
A typical eigenmode synchronization pattern. The evolution of the first three eigenmodes is shown as a function of the subglottal pressure. As the subglottal pressure increases, the frequencies (top) of the second and third vocal fold eigenmodes gradually approach each other and, at a threshold subglottal pressure, synchronize to the same frequency. At the same time, the growth rate (bottom) of the second mode becomes positive, indicating the coupled airflow-vocal fold system becomes linearly unstable and phonation starts.
FIG. 8.
(Color online) The closed quotient CQ and vertical phase difference VPD as a function of the medial surface thickness, the AP stiffness (G_ap), and the resting glottal angle (α). Reprinted with permission of ASA from Zhang (2016a).
FIG. 9.
A three-dimensional map of normal (N), breathy (B), and rough (R) phonation in the parameter space of the prephonatory glottal area (A_g0), subglottal pressure (P_s), and vocal fold stiffness (k). Reprinted with permission of Springer from Isshiki (1989).
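
The FIG. 2 caption describes the tangent stiffness as the slope of the tangent line to the stress-strain curve. The sketch below illustrates one way such a slope could be estimated from sampled data using finite differences. It is not the procedure used in the paper: the function name `tangent_stiffness`, the exponential stress-strain law, and the constants `A_KPA` and `B` are hypothetical choices made only to produce a nonlinear curve of the general kind the caption describes.

```python
import math


def tangent_stiffness(strain, stress):
    """Estimate d(stress)/d(strain) at every sample.

    Uses a central difference at interior points and a one-sided
    difference at the two end points.
    """
    n = len(strain)
    if n != len(stress) or n < 2:
        raise ValueError("strain and stress need equal length >= 2")
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slopes.append((stress[hi] - stress[lo]) / (strain[hi] - strain[lo]))
    return slopes


# Hypothetical loading curve (assumption, not data from the paper):
# stress = A * (exp(B * strain) - 1), which stiffens as strain grows.
A_KPA = 1.0   # assumed scale factor, in kPa
B = 10.0      # assumed dimensionless nonlinearity constant

strain = [i * 0.01 for i in range(31)]                  # 0 to 30% strain
stress = [A_KPA * (math.exp(B * e) - 1.0) for e in strain]

stiffness = tangent_stiffness(strain, stress)

# For this assumed law the exact slope is A * B * exp(B * strain),
# so the estimate can be checked against it at mid-range.
exact_mid = A_KPA * B * math.exp(B * strain[15])
relative_error = abs(stiffness[15] - exact_mid) / exact_mid
```

With real measurements, the loading and unloading branches would be passed to the function separately, since the caption notes that stress is typically higher during loading than unloading.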

