J Assoc Res Otolaryngol. 2014 Oct;15(5):823-37.
doi: 10.1007/s10162-014-0475-7. Epub 2014 Jul 8.

Computational model predictions of cues for concurrent vowel identification


Ananthakrishna Chintanpalli et al. J Assoc Res Otolaryngol. 2014 Oct.

Abstract

Although differences in fundamental frequencies (F0s) between vowels are beneficial for their segregation and identification, listeners can still segregate and identify simultaneous vowels that have identical F0s, suggesting that additional cues are contributing, including formant frequency differences. The current perception and computational modeling study was designed to assess the contribution of F0 and formant difference cues for concurrent vowel identification. Younger adults with normal hearing listened to concurrent vowels over a wide range of levels (25-85 dB SPL) for conditions in which F0 was the same or different between vowel pairs. Vowel identification scores were poorer at the lowest and highest levels for each F0 condition, and F0 benefit was reduced at the lowest level as compared to higher levels. To understand the neural correlates underlying level-dependent changes in vowel identification, a computational auditory-nerve model was used to estimate formant and F0 difference cues under the same listening conditions. Template contrast and average localized synchronized rate predicted level-dependent changes in the strength of phase locking to F0s and formants of concurrent vowels, respectively. At lower levels, poorer F0 benefit may be attributed to poorer phase locking to both F0s, which resulted from lower firing rates of auditory-nerve fibers. At higher levels, poorer identification scores may relate to poorer phase locking to the second formant, due to synchrony capture by lower formants. These findings suggest that concurrent vowel identification may be partly influenced by level-dependent changes in phase locking of auditory-nerve fibers to F0s and formants of both vowels.


Figures

FIG. 1
Envelope spectrum for each of the five vowels presented at 65 dB SPL, computed using linear predictive coding. The local maxima correspond to the formant frequencies of each vowel.
FIG. 2
Identification scores (rau) of both vowels for vowel pairs with the same F0 (blue triangles) or different F0 (red circles) as a function of vowel level. A Identification scores for all vowel pairs (25 pairs). B Identification scores for identical vowel pairs (5 pairs). C Identification scores for different vowel pairs (20 pairs). For each panel, error bars indicate ±1 SEM and asterisks indicate significant changes in scores (p < 0.05) with vowel level for both F0 conditions. Note that scores in panel A are not the average of the scores in panels B and C because the numbers of stimuli are different.
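The scores in this figure are reported in rationalized arcsine units (rau). As a minimal sketch, assuming the commonly used Studebaker (1985) form of the transform (the abstract does not say which variant was applied), a count of correct responses can be converted as follows. The function name `rau` and its arguments are illustrative, not taken from the paper.

```python
import math

def rau(correct: int, total: int) -> float:
    """Rationalized arcsine transform of `correct` out of `total` trials.

    Assumed form (Studebaker, 1985):
        theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
        rau   = (146 / pi) * theta - 23
    """
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0

if __name__ == "__main__":
    # Scores near 50% map almost linearly; the extremes are stretched.
    for x in (0, 25, 50, 75, 100):
        print(x, round(rau(x, 100), 1))
```

The transform is roughly linear in the middle of the range and expands the extremes, which makes scores near floor or ceiling better suited to comparisons across levels.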
FIG. 3
Block diagram describing the procedure used to quantify the strength of phase locking of auditory-nerve (AN) fibers to vowel formants and F0s using the auditory-nerve model. Step 1 shows peri-stimulus time histograms (PSTHs) predicted from the model. CFi denotes the characteristic frequency (CF) of the ith AN fiber. CFs were logarithmically spaced between 100 and 4000 Hz, with CF1 = 100 Hz and CF100 = 4000 Hz. Step 2 shows the average localized synchronized rate (ALSR) for quantifying phase locking to vowel formants (i.e., to F1 and F2 of each vowel in the pair). Step 3 shows the template contrast for quantifying phase locking to the F0 of each vowel in the pair. The auto-correlation function (ACF) is computed from the PSTH and then multiplied by a CF-dependent exponential function to estimate F0 information at each fiber. The weighted ACFs are summed across CFs to obtain the pooled ACF (i.e., an estimate of F0 information across a population of fibers). Template contrast is computed from the pooled ACF. Template contrast > 1 indicates good phase locking to F0. See text for additional details.
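Step 3 of this procedure can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the authors' implementation: the caption gives neither the exponential weighting nor the template-contrast formula, so the decay constant (`cycles / CF`) and the contrast definition (mean of the pooled ACF at lags that are multiples of 1/F0, divided by the mean over all non-zero lags) are assumptions, and all function and parameter names are hypothetical.

```python
import numpy as np

def log_spaced_cfs(n=100, lo=100.0, hi=4000.0):
    """n characteristic frequencies, logarithmically spaced from lo to hi (Hz)."""
    return np.geomspace(lo, hi, n)

def autocorr(psth, max_lag):
    """Unnormalized ACF of one PSTH for lags 0 .. max_lag-1 (in bins)."""
    x = np.asarray(psth, dtype=float)
    full = np.correlate(x, x, mode="full")
    zero_lag = len(x) - 1
    return full[zero_lag:zero_lag + max_lag]

def pooled_acf(psths, cfs, bin_s, max_lag, cycles=10.0):
    """Sum of exponentially weighted ACFs across fibers.

    ASSUMPTION: the weight decays as exp(-lag / tau) with tau = cycles / CF,
    so high-CF fibers contribute mainly at short lags.
    """
    lags_s = np.arange(max_lag) * bin_s
    pooled = np.zeros(max_lag)
    for psth, cf in zip(psths, cfs):
        tau = cycles / cf
        pooled += autocorr(psth, max_lag) * np.exp(-lags_s / tau)
    return pooled

def template_contrast(pooled, bin_s, f0):
    """ASSUMED definition: mean pooled ACF at multiples of the F0 period,
    relative to the mean over all non-zero lags."""
    period_bins = 1.0 / (f0 * bin_s)
    n_periods = int((len(pooled) - 1) / period_bins)
    idx = np.rint(np.arange(1, n_periods + 1) * period_bins).astype(int)
    return pooled[idx].mean() / pooled[1:].mean()

if __name__ == "__main__":
    bin_s = 1e-4                                   # 0.1-ms bins
    t = np.arange(1000) * bin_s                    # 100 ms of response
    psth = 1.0 + np.cos(2 * np.pi * 100.0 * t)     # toy response locked to 100 Hz
    cfs = [500.0, 1000.0, 2000.0]                  # three illustrative fibers
    pooled = pooled_acf([psth] * 3, cfs, bin_s, max_lag=300)
    print(template_contrast(pooled, bin_s, 100.0)) # template at the true F0
    print(template_contrast(pooled, bin_s, 126.0)) # template at a mismatched F0
```

In the toy example the template at the true F0 (100 Hz) yields a higher contrast than the mismatched 126-Hz template, which is the behavior the caption's "template contrast > 1" criterion relies on.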
FIG. 4
Synchronized rate from two simulated fibers for the vowel pair /ɑ, æ/ with the same F0. The CF of each fiber was close to F2 of one of the vowels. The first column (panels A–D) shows synchronized rates at CF = 1015 Hz (near F2 of /ɑ/), whereas the second column (panels E–H) shows synchronized rates at CF = 1426 Hz (near F2 of /æ/). Each row corresponds to a different vowel level. Both vowels have the same F1 (750 Hz). The arrow in each panel shows phase locking to the vowel formant. See text for additional details.
FIG. 5
Similar to Figure 4, except for the vowel pair /i, æ/ with the same F0 and at two different simulated fibers. A–D CF = 2235 Hz (near F2 of /i/). E–H CF = 1426 Hz (near F2 of /æ/). The location of F1 of /i/ is also shown by arrows in panels A–D. Note that there is no synchrony capture by F1 of /i/.
FIG. 6
Predicted phase locking of AN fibers to formants and identification scores for the vowel pair /ɑ, æ/ with the same F0. A Average localized synchronized rate (ALSR) for /ɑ, æ/ at 25, 50, 65, and 85 dB SPL. A peak (indicated by the arrow) occurs at the harmonic of 100 Hz nearest to each of the formant frequencies (F1, F2) of /ɑ/ and /æ/. B ALSR for F1 and F2 of /ɑ/ and /æ/ as a function of vowel level, obtained from A. The ALSR for F1 and F2 of the two vowels are shown by blue diamonds and red squares, respectively. The solid blue line shows F1 of /ɑ/ (or /æ/). The solid red line indicates F2 of /ɑ/, whereas the dotted red line indicates F2 of /æ/. C Identification scores of the vowel pair /ɑ, æ/ with the same F0 as a function of vowel level. Error bars indicate ±1 SEM, and asterisks indicate significant changes in scores (p < 0.05) with increasing vowel level.
FIG. 7
Similar to Figure 6, except for the /i, æ/ vowel pair. The harmonic shift at 2100 Hz for F2 of /i/ is shown by the arrow for 85 dB SPL in panel A. Note that the maximum ALSR value around F1 of /i/ occurs at 200 Hz across vowel levels.
FIG. 8
Predicted phase locking of AN fibers to formants and F0s and identification scores for the vowel pair /ɑ, æ/ with different F0. A Average localized synchronized rate (ALSR) for /ɑ, æ/ at 25, 50, 65, and 85 dB SPL. A peak (indicated by the arrow) occurs at the harmonic of 100 Hz nearest to each of the formant frequencies of /ɑ/ and at the harmonic of 126 Hz nearest to each of the formant frequencies of /æ/. B ALSR for F1 and F2 of /ɑ/ and /æ/ as a function of vowel level. Legends are the same as in Figure 6B. C Template contrast for 100 Hz of /ɑ/ (gray upward triangles) and 126 Hz of /æ/ (black downward triangles) as a function of vowel level. Template contrast > 1 indicates good phase locking to F0, whereas ≤ 1 indicates poor phase locking. D Identification scores of the vowel pair /ɑ, æ/ with different F0 (red circles) as a function of vowel level. Error bars indicate ±1 SEM, and asterisks indicate significant changes in scores with increasing vowel level (p < 0.05). Identification scores for same F0 (blue triangles) are re-plotted from Figure 6C.
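The "harmonic nearest to a formant" used to place the ALSR peaks can be illustrated with a short calculation. This is only a sketch: `nearest_harmonics` is a hypothetical helper, and the inputs are the values quoted in these captions (F1 = 750 Hz for /ɑ/ and /æ/, F0s of 100 and 126 Hz). It returns both neighbors when a formant falls exactly midway between two harmonics.

```python
def nearest_harmonics(formant_hz: int, f0_hz: int) -> list[int]:
    """Return the harmonic(s) of f0_hz closest to formant_hz (all in Hz).

    Two values are returned when the formant lies exactly halfway
    between adjacent harmonics.
    """
    n_low = max(1, formant_hz // f0_hz)            # harmonic number at or below the formant
    candidates = [n_low * f0_hz, (n_low + 1) * f0_hz]
    best = min(abs(c - formant_hz) for c in candidates)
    return [c for c in candidates if abs(c - formant_hz) == best]

if __name__ == "__main__":
    print(nearest_harmonics(750, 126))  # F1 = 750 Hz against the 126-Hz harmonic series
    print(nearest_harmonics(750, 100))  # F1 = 750 Hz against the 100-Hz harmonic series
```

With F0 = 126 Hz the sixth harmonic (756 Hz) is the closest to F1, whereas with F0 = 100 Hz the formant sits midway between 700 and 800 Hz, so which of the two harmonics carries the larger ALSR has to be read from the model output rather than from arithmetic alone.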
FIG. 9
Similar to Figure 8, except for the /i, æ/ vowel pair. Note that the maximum ALSR value around F1 of /i/ occurs either at 200 or 300 Hz at each vowel level. The arrow for F1 of /i/ is shown at 300 Hz for 85 dB SPL.


