Sci Rep. 2015 Jun 26:5:11628.
doi: 10.1038/srep11628.

Musical training, individual differences and the cocktail party problem

Jayaganesh Swaminathan et al. Sci Rep. 2015.

Abstract

Are musicians better able to understand speech in noise than non-musicians? Recent findings have produced contradictory results. Here we addressed this question by asking musicians and non-musicians to understand target sentences masked by other sentences presented from different spatial locations, the classical 'cocktail party problem' in speech science. We found that musicians obtained a substantial benefit in this situation, with thresholds ~6 dB better than non-musicians. Large individual differences in performance were noted particularly for the non-musically trained group. Furthermore, in different conditions we manipulated the spatial location and intelligibility of the masking sentences, thus changing the amount of 'informational masking' (IM) while keeping the amount of 'energetic masking' (EM) relatively constant. When the maskers were unintelligible and spatially separated from the target (low in IM), musicians and non-musicians performed comparably. These results suggest that the characteristics of speech maskers and the amount of IM can influence the magnitude of the differences found between musicians and non-musicians in multiple-talker "cocktail party" environments. Furthermore, considering the task in terms of the EM-IM distinction provides a conceptual framework for future behavioral and neuroscientific studies which explore the underlying sensory and cognitive mechanisms contributing to enhanced "speech-in-noise" perception by musicians.

Figures

Figure 1. A: Speaker locations relative to listener; B and C: Example target and masker waveforms and spectrograms for forward and reversed speech. Target: “Jane took two new toys”; forward masker 1: “Sue bought six red pens”; forward masker 2: “Lynn held nine cold bags”.
Figure 2. Musicians achieved substantially lower thresholds than non-musicians for hearing speech masked by interfering speech.
Panel A: Individual target-to-masker ratio at threshold (TMR) for musicians (red squares) and non-musicians (blue triangles) measured in colocated and separated configurations. The left side of the panel shows results with forward (FWD) maskers, while the right side shows results with reversed (REV) maskers. TMR was calculated as the level of the target at adaptive threshold minus the fixed masker level (55 dB SPL). Panel B: Group mean TMRs for conditions shown in panel A. Panel C: Mean spatial release from masking (SRM = colocated – separated thresholds) for forward and reversed masker configurations measured from musicians and non-musicians. Error bars are ±1 standard error of the mean. * Statistically significant group difference.
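The two quantities defined in this caption can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' analysis code: the function names are hypothetical, and the example target levels are made-up numbers chosen for readability, not data from the study. Only the 55 dB SPL masker level and the two formulas come from the caption.

```python
# Fixed masker level stated in the Figure 2 caption.
MASKER_LEVEL_DB_SPL = 55.0


def tmr_at_threshold(target_level_db_spl, masker_level_db_spl=MASKER_LEVEL_DB_SPL):
    """Target-to-masker ratio (dB): target level at adaptive threshold
    minus the fixed masker level."""
    return target_level_db_spl - masker_level_db_spl


def spatial_release_from_masking(colocated_tmr_db, separated_tmr_db):
    """SRM (dB): colocated threshold minus separated threshold.
    A positive value means spatial separation lowered (improved) the threshold."""
    return colocated_tmr_db - separated_tmr_db


# Hypothetical target levels at threshold (dB SPL), for illustration only.
colocated_tmr = tmr_at_threshold(58.0)
separated_tmr = tmr_at_threshold(43.0)
srm = spatial_release_from_masking(colocated_tmr, separated_tmr)

print(f"colocated TMR = {colocated_tmr} dB")
print(f"separated TMR = {separated_tmr} dB")
print(f"SRM = {srm} dB")
```

Because a lower TMR means the target was understood at a lower level relative to the maskers, a larger SRM indicates a larger benefit from spatially separating the maskers from the target.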
Figure 3. For separated target and maskers, thresholds were correlated across the two masker types (forward and reversed).
However, listeners achieved lower thresholds with reversed maskers (low IM) than forward maskers (high IM). Scatter plot shows thresholds for forward and reversed maskers in the 2 masker separated configurations. Solid line shows least-squares fit to the data points.
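A least-squares line and a correlation of the kind described in this caption could be computed as in the minimal sketch below. It assumes an ordinary least-squares fit and a Pearson correlation, which the caption does not specify in detail; the function name and the threshold values are hypothetical placeholders, not the study's data.

```python
from math import sqrt


def least_squares_line(xs, ys):
    """Return (slope, intercept, pearson_r) for an ordinary least-squares
    fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Made-up per-listener thresholds (TMR in dB), for illustration only.
forward_tmr = [-18.0, -12.0, -9.0, -4.0, 1.0]
reversed_tmr = [-25.0, -23.0, -21.0, -20.0, -17.0]

slope, intercept, r = least_squares_line(forward_tmr, reversed_tmr)
print(f"reversed = {slope:.2f} * forward + {intercept:.2f}, r = {r:.2f}")
```

A positive correlation in such a fit would mean that listeners with low thresholds for forward maskers also tend to have low thresholds for reversed maskers, which is the pattern the caption reports.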
Figure 4. Musicians achieved substantially lower thresholds than non-musicians with forward separated maskers but not with reversed separated maskers.
Plot shows group mean TMRs for musicians and non-musicians with spatialized maskers presented as forward (FWD) or reversed (REV) speech. Error bars are ±1 standard error of the mean. * Statistically significant group difference.
