Review. J Comp Psychol. 2008 Aug;122(3):235-51.
doi: 10.1037/0735-7036.122.3.235.

The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it?

Mark A Bee et al. J Comp Psychol. 2008 Aug.

Abstract

Animals often use acoustic signals to communicate in groups or social aggregations in which multiple individuals signal within a receiver's hearing range. Consequently, receivers face challenges related to acoustic interference and auditory masking that are not unlike the human cocktail party problem, which refers to the problem of perceiving speech in noisy social settings. Understanding the sensory solutions to the cocktail party problem has been a goal of research on human hearing and speech communication for several decades. Despite a general interest in acoustic signaling in groups, animal behaviorists have devoted comparatively less attention toward understanding how animals solve problems equivalent to the human cocktail party problem. After illustrating how humans and nonhuman animals experience and overcome similar perceptual challenges in cocktail-party-like social environments, this article reviews previous psychophysical and physiological studies of humans and nonhuman animals to describe how the cocktail party problem can be solved. This review also outlines several basic and applied benefits that could result from studies of the cocktail party problem in the context of animal acoustic communication.

Figures

Figure 1
Spectrograms (top traces) and oscillograms (bottom traces) of animal vocalizations. a. Human speech (“Up to half of all North American bird species nest or feed in wetlands.”) spoken by President George W. Bush during an Earth Day celebration at the Laudholm Farm in Wells, Maine, on April 22, 2004 (courtesy “The George W. Bush Public Domain Audio Archive” at http://thebots.net/GWBushSampleArchive.htm). b. “Phee” calls of the common marmoset, Callithrix jacchus (courtesy Rama Ratnam). c. Advertisement call of the gray treefrog, Hyla chrysoscelis (recorded by the first author). d. Song motif from a European starling, Sturnus vulgaris (courtesy Lang Elliot). e. Portion of an advertisement call of the plains leopard frog, Rana blairi (recorded by the first author). Note that in all cases, the vocalizations consist of sequences of sound elements (e.g., syllables and words [a], call notes [b,e], pulses [c], and song syllables [d]), many of which are composed of simultaneous spectral components (e.g., harmonics), thus illustrating the potential necessity for sequential and simultaneous integration, as illustrated in part a.
Figure 2
Schematic spectrograms illustrating experimental stimuli for investigating auditory streaming and simultaneous integration/segregation. a. An “ABA–ABA–...” tone sequence with a small difference in frequency (ΔF) between the A and B tones and a long tone repetition time (TRT); such a sequence would be perceived as a single, integrated stream of alternating tones with a galloping rhythm. b. An “ABA–ABA–...” tone sequence with a large ΔF and a short TRT; such a sequence would be perceived as two segregated streams, each with an isochronous rhythm. The dashed lines in a and b indicate the percept (one versus two streams, respectively). c. Three harmonic tone complexes showing a “normal” tone complex (left), a tone complex with a mistuned second harmonic (middle), and a tone complex with an asynchronous second harmonic that begins earlier than the other harmonics (right). In the latter two cases, the second harmonic would likely be segregated from the rest of the integrated tone complex.
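The timing structure of the “ABA–” stimulus in panels a and b can be sketched in a few lines of Python. This is only an illustrative sketch and not code from the article: the function names, the choice to express ΔF in semitones, and the reading of TRT as the onset-to-onset interval between successive tones are all assumptions. The sketch lays out tone onsets and frequencies only; it does not model whether a listener hears one stream or two.

```python
def aba_sequence(f_a_hz, delta_f_semitones, trt_s, n_triplets):
    """Return (onset_s, frequency_hz, label) for each tone of an ABA- sequence.

    Assumptions (not from the source): delta F is given in semitones and
    TRT is the onset-to-onset interval between successive tone slots.
    """
    f_b_hz = f_a_hz * 2.0 ** (delta_f_semitones / 12.0)
    pattern = [("A", f_a_hz), ("B", f_b_hz), ("A", f_a_hz)]
    tones = []
    for k in range(n_triplets):
        # Each triplet spans four slots: A, B, A, then a silent gap ("-").
        for slot, (label, freq) in enumerate(pattern):
            tones.append(((4 * k + slot) * trt_s, freq, label))
    return tones


def onset_intervals(tones, label):
    """Intervals between successive onsets of the tones carrying `label`."""
    onsets = [onset for onset, _, lab in tones if lab == label]
    return [round(b - a, 9) for a, b in zip(onsets, onsets[1:])]


if __name__ == "__main__":
    tones = aba_sequence(f_a_hz=1000.0, delta_f_semitones=12.0,
                         trt_s=0.1, n_triplets=3)
    # Taken separately, the A tones and the B tones are each evenly spaced,
    # which is the isochronous rhythm described for the segregated percept.
    print("A intervals:", onset_intervals(tones, "A"))
    print("B intervals:", onset_intervals(tones, "B"))
```

Taken together, the tones follow the uneven A, B, A, gap pattern that corresponds to the galloping rhythm of the integrated percept.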
Figure 3
Spectrograms illustrating the diversity of call timing interactions in five species of frogs in the African genus Kassina (from Grafe, 2005). a. Alternation. b. Entrainment with occasional overlap. c. Synchrony. d. Entrainment with alternating calls. In each panel, the calls of two different males are labeled as ‘A’ and ‘B’. Note the general similarity between the alternating calls in d and the artificial ABA– tone sequences depicted in Figure 2.
Figure 4
Schematic spectrograms illustrating experimental paradigms for investigating comodulation masking release (CMR) for detecting a short tone signal. a. The “band-widening” paradigm showing a modulated narrowband noise (left) and a modulated broadband noise (right). b. The “flanking band” paradigm showing the on-signal masker, the flanking band masker and either the “uncorrelated” condition (left) or the “comodulated” condition (right). In the schematic examples depicted here, the magnitude of CMR would be greater in the conditions illustrated in the right panel for both paradigms.
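The difference between the “uncorrelated” and “comodulated” conditions of the flanking band paradigm can be illustrated with a toy Python sketch. It is not taken from the article: the function names are hypothetical, and the envelopes are plain uniform random values rather than the envelopes of real band-limited noise. It shows only that the two bands share an amplitude envelope in the comodulated condition and have independent envelopes in the uncorrelated condition; it does not estimate the magnitude of CMR.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def flanking_band_envelopes(n_samples, comodulated, seed=0):
    """Return toy envelopes for the on-signal and flanking band maskers.

    Assumption: envelopes are uniform random values in [0, 1), a stand-in
    for the slowly fluctuating envelopes of real narrowband noise.
    """
    rng = random.Random(seed)
    on_signal = [rng.random() for _ in range(n_samples)]
    if comodulated:
        # Comodulated: the flanking band copies the on-signal envelope.
        flanking = list(on_signal)
    else:
        # Uncorrelated: the flanking band gets an independent envelope.
        flanking = [rng.random() for _ in range(n_samples)]
    return on_signal, flanking


if __name__ == "__main__":
    for condition in (True, False):
        on_band, flank = flanking_band_envelopes(2000, comodulated=condition)
        print("comodulated" if condition else "uncorrelated",
              round(pearson(on_band, flank), 3))
```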
