J Acoust Soc Am. 2014 Feb;135(2):EL115-21.
doi: 10.1121/1.4862880.

Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging

Jangwon Kim et al. J Acoust Soc Am. 2014 Feb.

Abstract

This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time magnetic resonance imaging (rtMRI). The approach is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets that were acquired at different times but with the same stimuli. The aligned corpus offers the advantages of high temporal resolution (from EMA) and a complete mid-sagittal view (from rtMRI). The co-registration also yields optimum placement of EMA sensors as articulatory landmarks on the magnetic resonance images, thus providing richer spatio-temporal information about articulatory dynamics.
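
The abstract does not name the alignment algorithm. Purely as an illustration of how two recordings of the same stimulus, made at different times, can be placed on a common time axis, the sketch below uses dynamic time warping (DTW). DTW is a generic technique assumed here and is not confirmed as the authors' method. The function name `dtw_path` and the toy one-dimensional features are hypothetical; in practice each frame might be an MFCC vector, optionally concatenated with articulatory features.

```python
import numpy as np


def dtw_path(a, b):
    """Align two feature sequences a (n, d) and b (m, d) with classic DTW.

    Returns the warping path as a list of (index_in_a, index_in_b) pairs
    and the accumulated cost of that path.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise Euclidean distance between every frame of a and every frame of b.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1],  # advance in both sequences
                cost[i - 1, j],      # advance in a only
                cost[i, j - 1],      # advance in b only
            )

    # Backtrack from the last frame pair to the first frame pair.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    path.reverse()
    return path, cost[n, m]


# Toy example: the second sequence holds its first value one frame longer.
seq_a = [[0.0], [1.0], [2.0]]
seq_b = [[0.0], [0.0], [1.0], [2.0]]
path, total = dtw_path(seq_a, seq_b)
print(path)
print(total)
```

Each pair in the returned path links a frame of one recording to a frame of the other, so a parameter measured in one modality can be resampled onto the time axis of the other.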

Figures

Figure 1
APD of each phoneme for MFCC-only alignment and MFCC + articulatory (Artic) alignment. The number on top of each bar is the percentage of change from MFCC-based alignment to MFCC + Artic-based alignment. Phone list is sorted by the percentage from low to high. “−” indicates that APD decreases by adding articulatory information.
Figure 2
(Color online) Clean speech waveform (top plot) for the word “harms” and the corresponding time series of velic (second plot), pharyngeal (third plot), and labial (bottom plot) opening. The velic and pharyngeal opening parameters extracted from rtMRI are synchronized by JAATA with the labial opening parameter extracted from EMA.
Figure 3
(Color online) Left: Six EMA sensors (circles) overlaid on the MRI image with estimated vocal tract boundaries (outer and inner lines in the vocal tract) and grid lines after co-registration. Right: Constriction degrees of the tongue tip (top plot) and tongue dorsum (bottom plot) extracted from up-sampled rtMRI data for the sentence “Publicity and notoriety go hand in hand.” The circle for each phone is placed on the trajectory of the critical articulator of the phone, indicating the frame index for the phone in the registered data.

