J Acoust Soc Am. 2008 Jun;123(6):4466-81.
doi: 10.1121/1.2902168.

A magnetic resonance imaging-based articulatory and acoustic study of "retroflex" and "bunched" American English /r/

Xinhui Zhou et al. J Acoust Soc Am. 2008 Jun.

Abstract

Speakers of rhotic dialects of North American English show a range of different tongue configurations for /r/. These variants produce acoustic profiles that are indistinguishable for the first three formants [Delattre, P., and Freeman, D. C., (1968). "A dialect study of American English r's by x-ray motion picture," Linguistics 44, 28-69; Westbury, J. R. et al. (1998), "Differences among speakers in lingual articulation for American English /r/," Speech Commun. 26, 203-206]. It is puzzling why this should be so, given the very different vocal tract configurations involved. In this paper, two subjects whose productions of "retroflex" /r/ and "bunched" /r/ show similar patterns of F1-F3 but very different spacing between F4 and F5 are contrasted. Using finite element analysis and area functions based on magnetic resonance images of the vocal tract for sustained productions, the results of computer vocal tract models are compared to actual speech recordings. In particular, formant-cavity affiliations are explored using formant sensitivity functions and vocal tract simple-tube models. The difference in F4/F5 patterns between the subjects is confirmed for several additional subjects with retroflex and bunched vocal tract configurations. The results suggest that the F4/F5 differences between the variants can be largely explained by differences in whether the long cavity behind the palatal constriction acts as a half- or a quarter-wavelength resonator.
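The distinction between half- and quarter-wavelength resonators in the final sentence can be illustrated with the textbook formulas for a uniform, lossless tube. The sketch below is not the paper's FEM or area-function model: the function names, the 10 cm cavity length, and the 35000 cm/s speed of sound are assumptions chosen only for illustration. It shows that a cavity of the same length places its resonances at different frequencies depending on its boundary conditions.

```python
# Minimal sketch: resonances of an idealized uniform tube.
# Assumed value, not taken from the paper: speed of sound in warm, humid air.
SPEED_OF_SOUND_CM_S = 35000.0


def quarter_wavelength_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of a tube closed at one end and open at the other.

    f_n = (2n - 1) * c / (4 * L), for n = 1, 2, ...
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


def half_wavelength_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Nonzero resonances (Hz) of a tube with the same boundary condition
    at both ends (both effectively closed, or both open).

    f_n = n * c / (2 * L), for n = 1, 2, ...
    """
    return [n * c / (2.0 * length_cm) for n in range(1, count + 1)]


if __name__ == "__main__":
    back_cavity_cm = 10.0  # hypothetical cavity length, for illustration only
    print("quarter-wavelength:", quarter_wavelength_resonances(back_cavity_cm))
    print("half-wavelength:   ", half_wavelength_resonances(back_cavity_cm))
```

For the assumed 10 cm tube, the quarter-wavelength resonances fall at 875, 2625, and 4375 Hz, while the half-wavelength resonances fall at 1750, 3500, and 5250 Hz. Real vocal tract cavities are neither uniform nor lossless, so these numbers indicate only how the boundary conditions shift the resonance pattern.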

Figures

Figure 1
(Color online) Top panel: Midsagittal MR images of two tongue configurations for American English ∕r∕. Middle panel: Spectrograms for nonsense word “warav.” Lower panel: Spectra of sustained ∕r∕ utterance. The left side is for S1 and the right side is for S2.
Figure 2
FEM mesh of the reconstructed 3D vocal tract. (a) The retroflex tongue shape. (b) The bunched tongue shape.
Figure 3
(Color online) Pressure isosurface plots of wave propagation inside the vocal tracts of the retroflex ∕r∕ (S1 on the left side) and the bunched ∕r∕ (S2 on the right side) at different frequencies. (Pressure isosurfaces are color coded: red indicates high amplitude and blue indicates low amplitude.) (a) 400 Hz, (b) 1000 Hz, (c) 1500 Hz, (d) 3500 Hz, (e) 5400 Hz, and (f) 6000 Hz.
Figure 4
(Color online) Top panel: Grid lines for area function extraction inside the vocal tract. Lower panel: Area function based on the grid lines. (In each panel, the left side is for S1 and the right side is for S2.)
Figure 5
(Color online) For S1 and S2: (a) Spectrum of sustained ∕r∕ utterance in MRI session, (b) spectrum of sustained ∕r∕ utterance in the sound booth acoustic data, (c) the acoustic response based on 3D FEM, and (d) the acoustic response based on the area function.
Figure 6
(Color online) Acoustic sensitivity functions of F1–F5 for the retroflex ∕r∕ of S1 and the bunched ∕r∕ of S2.
Figure 7
(Color online) Acoustic response of S1’s retroflex ∕r∕ area function with front and back cavities separately modeled. (The left side is the area function and the right side is the corresponding acoustic response). (a) Area function of the whole vocal tract and its corresponding acoustic response. (b) Area function of the front cavity and its corresponding acoustic response. (c) Area function of the back cavity and its corresponding acoustic response.
Figure 8
(Color online) Acoustic response of S2’s bunched ∕r∕ area function with front and back cavities separately modeled. (The left side is the area function and the right side is the corresponding acoustic response). (a) The dividing point between the front cavity and the back cavity at about 12 cm. (b) The dividing point between the front cavity and the back cavity at about 15 cm.
Figure 9
(Color online) F2/F3 cavity affiliation switching with the change of the front cavity volume by varying its length (based on the area function data of S1).
Figure 10
(Color online) Simple-tube models overlaid on FEM-derived area functions (top panel) and corresponding acoustic responses (bottom panel). (a) Four element simple-tube model of the retroflex ∕r∕ of S1. (b) Seven element simple-tube model of the retroflex ∕r∕ of S1. (c) Three element simple-tube model of the bunched ∕r∕ of S2. (d) Eight element simple-tube model of the bunched ∕r∕ of S2.
Figure 11
(Color online) Spectra of sustained ∕r∕ utterances from six speakers (three retroflex ∕r∕ and three bunched ∕r∕). (a) Retroflex ∕r∕ (left: S1; middle: S3; right: S4.) (b) Bunched ∕r∕ (left: S2; middle: S5; right: S6).
Figure 12
Spectrograms for nonsense word “warav” from six speakers (three retroflex ∕r∕ and three bunched ∕r∕; only portions of spectrograms are shown in the figure with ∕r∕ in the middle). (a) Retroflex ∕r∕ (left: S1; middle: S3; right: S4). (b) Bunched ∕r∕ (left: S2; middle: S5; right: S6).
