2019 Jul 25;14(7):e0219955.
doi: 10.1371/journal.pone.0219955. eCollection 2019.

Vocal imitation of percussion sounds: On the perceptual similarity between imitations and imitated sounds

Adib Mehrabi et al. PLoS One.

Erratum in

Abstract

Recent studies have demonstrated the effectiveness of the voice for communicating sonic ideas, and the accuracy with which it can be used to imitate acoustic instruments, synthesised sounds and environmental sounds. However, there has been little research on vocal imitation of percussion sounds, particularly concerning the perceptual similarity between imitations and the sounds being imitated. In the present study we address this by investigating how accurately musicians can vocally imitate percussion sounds, in terms of whether listeners consider the imitations 'more similar' to the imitated sounds than to other same-category sounds. In a vocal production task, 14 musicians imitated 30 drum sounds from five categories (cymbals, hats, kicks, snares, toms). Listeners were then asked to rate the similarity between the imitations and same-category drum sounds via a web-based listening test. We found that imitated sounds received the highest similarity ratings for 16 of the 30 sounds. The similarity between a given drum sound and its imitation was generally rated higher than for imitations of another same-category sound; however, for some drum categories (snares and toms) certain sounds were consistently considered most similar to the imitations, irrespective of the sound being imitated. Finally, we apply an existing auditory-image-based measure of perceptual similarity between same-category drum sounds to model the similarity ratings using linear mixed-effects regression. The results indicate that this measure is a good predictor of perceptual similarity between imitations and imitated sounds, compared with measures based on only temporal or only spectral features.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Graphical interface of a single test page used for the online listening test.
Listeners were asked to rate the similarity between the imitation (reference) and 6 test items (same-category drum sounds), on a continuous scale from ‘less similar’ to ‘more similar’.
Fig 2. Contingency tables of the highest rated sound for each imitated (i.e. target) sound, by drum category.
Cell values and shading indicate the proportion (0–1) of tests for a given imitated sound where the rated sound was considered most similar to the imitation. Asterisks in the diagonals indicate cases where the imitated sound was rated most similar to the imitation, significantly above chance (adjusted p < 0.05). (A) Cymbals, (B) Hats, (C) Kicks, (D) Snares, (E) Toms.
Fig 3. Comparison of similarity ratings between imitations and target vs. non-target sounds, by drum category.
Values are mean rating parameter estimates with 95% Wald confidence intervals.
Fig 4. Comparison of similarity ratings between imitations and target vs. non-target sounds, by imitator.
Values are mean ratings with 95% Wald confidence intervals.
Fig 5. Slope estimates for the LMER model fitted using the distance measure from method PHG.
A negative slope indicates a decrease in perceptual similarity with an increase in distance, i.e. sounds for which the method performs well. Values are mean estimates across all imitations for each drum sound, with 95% Wald confidence intervals.
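
The captions above refer to 95% Wald confidence intervals and to target sounds being rated most similar "significantly above chance" with adjusted p-values. As a rough illustration only, and not the authors' analysis code, the Python sketch below shows how a Wald interval is formed from an estimate and its standard error, and how a chance-level comparison could be set up. The one-sided exact binomial test, the chance level of 1/6 (taken from the six test items per page in Fig 1) and the Holm correction are all assumptions: the text here does not state which test or correction was used. All function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def wald_ci(estimate, std_error, level=0.95):
    """Wald interval: estimate +/- z * standard error (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


def binomial_upper_tail(k, n, p):
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


def holm_adjust(pvalues):
    """Holm step-down adjustment (assumed here; the source only says 'adjusted')."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical slope estimate and standard error, not values from the paper.
    low, high = wald_ci(estimate=-0.40, std_error=0.10)
    print(f"95% Wald CI: [{low:.3f}, {high:.3f}]")

    # Hypothetical counts: in how many of n tests the target sound was top-rated.
    chance = 1.0 / 6.0  # six test items per page (Fig 1)
    top_rated_counts = [9, 4, 12]
    n_tests = 20
    raw = [binomial_upper_tail(k, n_tests, chance) for k in top_rated_counts]
    for k, p_raw, p_adj in zip(top_rated_counts, raw, holm_adjust(raw)):
        print(f"k={k:2d}  p={p_raw:.4g}  p_adj={p_adj:.4g}")
```

In a mixed-effects setting the standard error passed to `wald_ci` would come from the fitted model, so this sketch only covers the final interval arithmetic.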
