J Assoc Res Otolaryngol. 2021 Jul;22(4):443-461.

doi: 10.1007/s10162-021-00790-7. Epub 2021 Apr 20.

An Alternative Explanation for Difficulties with Speech in Background Talkers: Abnormal Fusion of Vowels Across Fundamental Frequency and Ears

Lina A J Reiss¹, Michelle R Molis²,³

Affiliations

¹ Department of Otolaryngology, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, Portland, OR, 97239, USA. reiss@ohsu.edu.
² Department of Otolaryngology, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, Portland, OR, 97239, USA.
³ National Center for Rehabilitative Auditory Research, VA Portland Health Care System, 3710 SW US Veterans Road, Portland, OR, 97239, USA.

PMID: 33877470
PMCID: PMC8329143
DOI: 10.1007/s10162-021-00790-7

Lina A J Reiss et al. J Assoc Res Otolaryngol. 2021 Jul.

Erratum in

Correction to: An Alternative Explanation for Difficulties with Speech in Background Talkers: Abnormal Fusion of Vowels Across Fundamental Frequency and Ears.
Reiss LAJ, Molis MR. J Assoc Res Otolaryngol. 2021 Jul;22(4):463. doi: 10.1007/s10162-021-00802-6. PMID: 34272623. No abstract available.

Abstract

Normal-hearing (NH) listeners use frequency cues, such as fundamental frequency (voice pitch), to segregate sounds into discrete auditory streams. However, many hearing-impaired (HI) individuals have abnormally broad binaural pitch fusion, which leads to fusion and averaging of the original monaural pitches into the same stream instead of segregating the two streams (Oh and Reiss, 2017), and may similarly lead to fusion and averaging of speech streams across ears. In this study, using dichotic speech stimuli, we examined the relationship between speech fusion and vowel identification. Dichotic vowel perception was measured in NH and HI listeners while the across-ear fundamental frequency difference (ΔF₀) was varied. Synthetic vowels /i/, /u/, /a/, and /ae/ were generated with three fundamental frequencies (F₀) of 106.9, 151.2, and 201.8 Hz and presented dichotically through headphones. For HI listeners, stimuli were shaped according to NAL-NL2 prescriptive targets. The vowels presented to the two ears always differed, but listeners were not told that there were no single-vowel trials and could identify one vowel or two different vowels on each trial. When there was no F₀ difference between the ears, both NH and HI listeners were more likely to fuse the vowels and identify only one vowel. As ΔF₀ increased, NH listeners increased the percentage of two-vowel responses, but HI listeners were more likely to continue to fuse vowels even with large ΔF₀. Binaural tone fusion range was significantly correlated with vowel fusion rates in both NH and HI listeners. Confusion patterns with dichotic vowels differed from those seen with concurrent monaural vowels, suggesting different mechanisms behind the errors. Together, the findings suggest that broad fusion leads to spectral blending across ears, even when F₀ differs across ears, and may hinder the stream segregation and understanding of speech in the presence of competing talkers.
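The three F₀ values in the abstract can be related to the semitone ΔF₀ values that appear in the figure captions (0 and 11 semitones). The sketch below assumes the standard equal-tempered definition, ΔF₀ = 12·log₂(f_high / f_low); the abstract does not state how ΔF₀ was computed, so the function names and the pairing table are illustrative only. Under that assumption the possible pairings come out at approximately 0, 5, 6, and 11 semitones.

```python
import math
from itertools import combinations

# F0 values (Hz) as listed in the abstract.
F0_HZ = (106.9, 151.2, 201.8)


def semitones(f_low, f_high):
    """Interval between two frequencies in equal-tempered semitones.

    Assumption: 12 * log2(ratio) is the convention used; the abstract
    does not spell this out.
    """
    return 12.0 * math.log2(f_high / f_low)


def delta_f0_table(f0_values):
    """Map each (lower F0, higher F0) pairing to its difference in semitones."""
    table = {(f, f): 0.0 for f in f0_values}  # same F0 in both ears
    for lo, hi in combinations(sorted(f0_values), 2):
        table[(lo, hi)] = semitones(lo, hi)
    return table


for pair, st in delta_f0_table(F0_HZ).items():
    print(pair, round(st, 1))
```

The 106.9 Hz and 201.8 Hz pairing is the one that lands near 11 semitones, just under an octave.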

Keywords: binaural fusion; concurrent vowel; dichotic; hearing loss.

Figures

**Fig. 1**
Hearing thresholds for the subjects in this study. The left and right panels show the left and right ears, respectively. Normal-hearing group average audiograms are shown as thick gray lines and circles, with the gray shaded area indicating the minimum and maximum thresholds. Hearing-impaired individual audiograms are shown as thin lines with symbols, and the group average audiograms are shown as black lines. (Color online)

**Fig. 2**
Touchscreen response options. Subjects were instructed to select either one or two vowels depending on how many they heard (highlighted in a different color when selected), then select “Done.” Subjects also had the option to repeat the vowel presentation if needed

**Fig. 3**
Binaural fusion ranges of the subjects in the study. Fusion ranges are shown for reference tones of 2000 Hz for HI listeners (solid black circles) and NH listeners (solid gray circles), as well as 1000 Hz for HI listeners (open triangles). HI listeners show broader fusion ranges as well as greater variation in fusion ranges compared with NH listeners

**Fig. 4**
Relationship of percent single-vowel responses and vowel recognition scores to the fundamental frequency difference between dichotic vowels. a Percent single-vowel responses versus ΔF₀ for HI listeners. b Percent single-vowel responses versus ΔF₀ for NH listeners. c Percent correct recognition of both vowels versus ΔF₀ for HI listeners. d Percent correct recognition of both vowels versus ΔF₀ for NH listeners. e Percent correct recognition of at least one vowel versus ΔF₀ for HI listeners. f Percent correct recognition of at least one vowel versus ΔF₀ for NH listeners. Thin colored lines with markers show individual data, and thick black dashed lines show group average trends. Fusion decreases and performance increases with ΔF₀ for both groups, with NH listeners showing more benefit of ΔF₀. For legend for HI subjects, see Fig. 1

**Fig. 5**
Relationship of vowel recognition score to vowel fusion, indicated by percentage of single vowel responses. Percentage correct recognition of one vowel in the pair given that a single vowel was selected is plotted versus the percentage of single-vowel responses for HI listeners (black squares) and NH listeners (gray squares). The solid line indicates a linear fit of the NH and HI data pooled together; the correlation is significant. Note that the y-axis starts at 50 %, and the chance level for when one vowel is selected

**Fig. 6**
Relationship of vowel fusion to tone fusion range. The percentage of single vowel responses is significantly correlated with the fusion range measured using tones for all listeners pooled together and for NH listeners alone (gray circles), but not for HI listeners alone (black circles). Linear fits and R² values are shown for NH, HI, and pooled (all) data as dotted, dashed, and dash-dot lines, respectively

**Fig. 7**
Relationship of percent single-vowel responses and vowel recognition scores to the fundamental frequency difference between monaural concurrent vowels. Only data from the left ear is shown; similar results were observed for left and right ears. a Percent single-vowel responses versus ΔF₀ for HI listeners. b Percent single-vowel responses versus ΔF₀ for NH listeners. c Percent correct recognition of both vowels versus ΔF₀ for HI listeners. d Percent correct recognition of both vowels versus ΔF₀ for NH listeners. e Percent correct recognition of at least one vowel versus ΔF₀ for HI listeners. f Percent correct recognition of at least one vowel versus ΔF₀ for NH listeners. Thin colored lines with markers show individual data, and thick black dashed lines show group average trends. Fusion decreases and performance increases with ΔF₀ for both groups

**Fig. 8**
Group average dichotic vowel confusion matrices. Left and right panels show NH and HI groups, respectively. Top and bottom panels show confusions for ΔF₀ = 0 and ΔF₀ = 11 semitones, respectively. Each column indicates a specific vowel pair presented, while rows indicate specific response combinations, including single vowel responses. The numbers and darkness of shading in each cell indicate the number of times a response combination (specified by row location) was selected in response to a vowel pair (indicated by column location). For ΔF₀ = 0, single-vowel responses in the first 4 rows predominate and include selections of single vowel responses not in the original vowel pair, such as /ae/ for /a/ + /i/. For ΔF₀ = 11, more responses on the diagonal, indicating correct double vowel responses, are seen, especially for NH listeners

**Fig. 9**
Example individual dichotic vowel confusion matrices for ΔF₀ = 11 semitones, for NH and HI listeners with narrow, medium, and broad binaural tone fusion. The NH listener with narrow fusion (top left panel) has a high proportion of both vowels correct, indicated by responses on the diagonal. NH and HI listeners with medium binaural tone fusion (right panels) have confusion matrices with similar proportions of correct responses for both vowels (on the diagonal) and fused single-vowel responses (top 4 rows). One HI listener with broad binaural tone fusion mainly fused all vowel pairs into single vowels (top 4 rows), even with a ΔF₀ of 11 semitones

**Fig. 10**
Comparison of dichotic versus monaural concurrent vowel confusion matrices for NH62, a listener with narrow tone fusion, at ΔF₀ = 0 semitones. This listener had a high proportion of single-vowel responses (top 4 rows) in the dichotic condition (top) as well as in the monaural left ear and monaural right ear conditions (left and right, respectively). However, note that the specific vowel confusions differed between the dichotic condition and both monaural conditions (cells highlighted in thick red borders)

**Fig. 11**
Comparison of dichotic versus monaural concurrent vowel confusion matrices for NH78, a listener with somewhat narrow tone fusion, at ΔF₀ = 0 semitones. Plotted as in Fig. 10. Again, note that specific vowel confusions differed between the dichotic and both monaural conditions

**Fig. 12**
Comparison of dichotic versus monaural concurrent vowel confusion matrices for HI25, a listener with broad tone fusion, at ΔF₀ = 0 semitones. Again, note that specific vowel confusions differed between the dichotic (top) and the left ear monaural condition (bottom; the right ear was not tested in this subject)

**Fig. 13**
Simple model of how the perception of /u/ and /ae/ could arise from monaural and dichotic /a/ + /i/, respectively. a–d The filtered spectra of the original vowels /a/, /ae/, /i/, and /u/, with circles indicating the formant peaks selected automatically by estimation of the first two local maxima. e Monaural perception of /a/ + /i/ is modeled as the addition of the two original signals in the time domain, followed by the same spectral filtering as the original vowels. This spectrum predicts two formant peaks most similar to the peaks observed for /u/ (d). f Dichotic perception of /a/ + /i/ is modeled as the linear averaging of the spectra of /a/ (a) and /i/ (c), with wider smoothing windows. The resulting spectrum predicts two formant peaks most similar to the peaks observed for /ae/ (b)
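The dichotic branch of the model described for Fig. 13 (average two spectra, smooth, then take the first two local maxima as formant peaks) can be sketched as follows. This is a toy illustration and not the authors' implementation: the Gaussian-bump envelopes, the formant frequencies, the bump width, and the smoothing window are all placeholder assumptions, and the monaural branch (time-domain addition followed by filtering) is not reproduced here.

```python
import math


def gaussian_envelope(freqs, formants, width_hz=150.0):
    """Toy spectral envelope: a sum of Gaussian bumps at the given formants.

    Assumption: real vowel spectra are replaced by this stand-in.
    """
    return [
        sum(math.exp(-0.5 * ((f - fc) / width_hz) ** 2) for fc in formants)
        for f in freqs
    ]


def smooth(values, window):
    """Centered moving average; the window shrinks at the array edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def first_two_peaks(freqs, values):
    """Frequencies of the first two local maxima, scanning from low to high."""
    peaks = [
        freqs[i]
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]
    return peaks[:2]


def dichotic_blend(spec_left, spec_right, window=21):
    """Linear average of two spectra followed by smoothing."""
    averaged = [(a + b) / 2.0 for a, b in zip(spec_left, spec_right)]
    return smooth(averaged, window)


# Hypothetical formant values, chosen only to make the example run.
freqs = list(range(0, 4000, 10))
spec_a = gaussian_envelope(freqs, [700, 1200])  # placeholder for /a/
spec_i = gaussian_envelope(freqs, [300, 2300])  # placeholder for /i/
blended = dichotic_blend(spec_a, spec_i)
print(first_two_peaks(freqs, blended))
```

Which vowel the blended peaks most resemble depends entirely on the spectra and smoothing used, so the placeholder values here should not be read as reproducing the /ae/ prediction in the caption.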

Cited by

Level differences impact the fusion of concurrent vowels dissimilarly within versus across ears.
Fan L, Reiss LAJ, Molis MR. JASA Express Lett. 2022 Sep;2(9):094401. doi: 10.1121/10.0013996. Epub 2022 Sep 7. PMID: 36097604.
Fusion of dichotic consonants in normal-hearing and hearing-impaired listeners.
Sathe NC, Kain A, Reiss LAJ. J Acoust Soc Am. 2024 Jan 1;155(1):68-77. doi: 10.1121/10.0024245. PMID: 38174963.
Onset Asynchrony: Cue to Aid Dichotic Vowel Segregation in Listeners With Normal Hearing and Hearing Loss.
Eddolls MS, Molis MR, Reiss LAJ. J Speech Lang Hear Res. 2022 Jul 18;65(7):2709-2719. doi: 10.1044/2022_JSLHR-21-00411. Epub 2022 Jun 21. PMID: 35728021.
A one-man bilingual cocktail party: linguistic and non-linguistic effects on bilinguals' speech recognition in Mandarin and English.
Smith ED, Holt LL, Dick F. Cogn Res Princ Implic. 2024 Jun 5;9(1):35. doi: 10.1186/s41235-024-00562-w. PMID: 38834918.
Sequential auditory grouping reduces binaural pitch fusion in listeners with normal hearing, hearing aids, and cochlear implants.
Oh Y, Dean N, Gallun FJ, Reiss LAJ. J Acoust Soc Am. 2024 Nov 1;156(5):3217-3231. doi: 10.1121/10.0034366. PMID: 39535240.

Grants and funding

R01 DC013307/DC/NIDCD NIH HHS/United States
