Front Hum Neurosci. 2013 Dec 16;7:865.
doi: 10.3389/fnhum.2013.00865. eCollection 2013.

Using auditory classification images for the identification of fine acoustic cues used in speech perception


Léo Varnet et al. Front Hum Neurosci.

Abstract

An essential step in understanding the processes underlying the general mechanism of perceptual categorization is to identify which portions of a physical stimulus modulate the behavior of our perceptual system. More specifically, in the context of speech comprehension, it remains a major open challenge to understand which information is used to categorize a speech stimulus as one phoneme or another: the auditory primitives relevant for the categorical perception of speech are still unknown. Here we propose to adapt to auditory experiments a method relying on a Generalized Linear Model (GLM) with smoothness priors, already used in the visual domain for the estimation of so-called classification images. This statistical model offers a rigorous framework for dealing with non-Gaussian noise, as is often the case in the auditory modality, and limits the amount of noise in the estimated template by enforcing smoother solutions. By applying this technique to a specific two-alternative forced-choice experiment between the stimuli "aba" and "ada" in noise with an adaptive SNR, we confirm that the second formantic transition is key for classifying phonemes into /b/ or /d/ in noise, and that its estimation by the auditory system is a relative measurement across spectral bands and in relation to the perceived height of the second formant in the preceding syllable. Through this example, we show how the GLM with smoothness priors approach can be applied to the identification of fine functional acoustic cues in speech perception. Finally, we discuss some assumptions of the model in the specific case of speech perception.

Keywords: GLM; acoustic cues; classification images; phoneme recognition; phonetics; speech perception.
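The estimation approach described in the abstract can be illustrated with a small numerical sketch: a Bernoulli GLM (logistic regression) fit to trial-by-trial noise spectrograms, with a quadratic smoothness penalty on the weight map. This is a simplified illustration under our own assumptions, not the authors' implementation: the function name, the array layout, the single penalty weight `lam` (the paper cross-validates two hyperparameters, λ1 and λ2, one per spectrogram axis), and the synthetic data are all ours.

```python
import numpy as np

def fit_classification_image(S, y, lam=0.1, lr=0.5, n_iter=800):
    """Sketch of a classification-image estimate via penalized logistic regression.

    S   : (n_trials, n_freq, n_time) noise spectrograms (hypothetical layout)
    y   : (n_trials,) binary responses (e.g., 0 = "aba", 1 = "ada")
    lam : weight of the quadratic smoothness penalty (single lambda here;
          the paper uses separate lambdas for the frequency and time axes)
    """
    n, F, T = S.shape
    X = S.reshape(n, F * T)
    beta = np.zeros(F * T)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # Bernoulli GLM, logit link
        grad = X.T @ (p - y) / n                # logistic log-loss gradient
        B = beta.reshape(F, T)
        # Gradient of the smoothness penalty: a discrete Laplacian that pulls
        # each weight toward its neighbors (wrap-around boundaries for
        # brevity; a real implementation would handle the edges explicitly).
        smooth = 4 * B - (np.roll(B, 1, 0) + np.roll(B, -1, 0)
                          + np.roll(B, 1, 1) + np.roll(B, -1, 1))
        beta -= lr * (grad + lam * smooth.ravel())
    return beta.reshape(F, T)

# Synthetic check: recover a smooth bump-shaped "cue" from noisy 2AFC-style data.
rng = np.random.default_rng(0)
F, T, n = 8, 8, 3000
f, t = np.meshgrid(np.arange(F), np.arange(T), indexing="ij")
template = np.exp(-((f - 4) ** 2 + (t - 4) ** 2) / 4.0)
S = rng.standard_normal((n, F, T))
p_true = 1.0 / (1.0 + np.exp(-(S.reshape(n, -1) @ template.ravel())))
y = (rng.random(n) < p_true).astype(float)
beta_hat = fit_classification_image(S, y)
```

The penalty gradient is a discrete Laplacian, so each weight is pulled toward the mean of its neighbors; this is what "enforcing smoother solutions" amounts to in practice.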


Figures

Figure 1
Spectrograms of target-signals t0 (/aba/) and t1 (/ada/) used for the vectorized spectrograms T0 and T1, on a logarithmic scale (dB). Blue boxes indicate the second formantic transition (F2).
Figure 2
(A) Evolution of SNR across trials (mean SNR by blocks of 1000 trials) for each participant, and overall mean SNR (red dotted line). (B) Psychometric function of each participant: detectability index d′ (defined as d′ = Φ⁻¹(PH) − Φ⁻¹(PFA)) as a function of signal contrast (values calculated on fewer than 20 observations are not included).
Figure 3
Prediction accuracy of the model (in terms of 10-fold cross-validation rate) as a function of regularization parameters λ1 (x-axis) and λ2 (y-axis) in logarithmic scale, for one participant (MH). The surrounding panels show classification images obtained with different pairs of regularization parameters (λ1, λ2) (n = 10000 trials for each estimate).
Figure 4
(A) Classification Image β̂ for each participant, estimated with optimal smoothness hyperparameters λ1 and λ2 (n = 10000 trials for each estimate). Weights are divided by their maximum absolute values. Boxes correspond to the position of the second formantic transition (F2) in the original stimuli spectrograms. (B) Same as (A), except that non-significant weights are shown in gray scale (p < 0.005, permutation test).
Figure 5
(A) Difference template w used by the template matcher (the difference between the spectrograms of the two targets). (B) Model parameters estimated for the template matcher, with optimal hyperparameters λ1 and λ2 (n = 10000 trials). Weights are divided by their maximum absolute values.
Figure 6
Correlation between coefficients of the Classification Images estimated on n trials and the “overall” Classification Image, for participant MH. Examples of Classification Images are shown at 3000, 6000, and 10,000 trials.
Figure 7
Classification Images β̂0 and β̂1, estimated on the trials where t0 (/aba/) or t1 (/ada/), respectively, was presented (n = 5000 trials for each estimate). Hyperparameter values are the same as for the “overall” Classification Images (Figure 4). Weights are divided by their maximum absolute values.
Figure 8
Classification Images β̂ for the lowest-SNR (minimum to median SNR) and highest-SNR (median to maximum SNR) conditions, estimated using the GLM approach with smoothness priors (n = 5000 trials for each estimate). Hyperparameter values are the same as for the “overall” Classification Images (Figure 4). Weights are divided by their maximum absolute values.
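The detectability index defined in the Figure 2 caption, d′ = Φ⁻¹(PH) − Φ⁻¹(PFA), can be evaluated directly with the standard-normal quantile function. A minimal sketch using Python's standard library (the helper name `dprime` is ours):

```python
from statistics import NormalDist

def dprime(p_hit, p_fa):
    """Detectability index d' = Phi^-1(P_H) - Phi^-1(P_FA),
    where Phi^-1 is the standard-normal quantile (inverse CDF)."""
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_fa)

# Hits one standard deviation above the criterion, false alarms one below:
dprime(0.84, 0.16)  # roughly 1.99
```

In practice, hit and false-alarm rates of exactly 0 or 1 must be adjusted (e.g., clipped) before applying the quantile function, since Φ⁻¹ is unbounded at those endpoints.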
