Vocal Imitations of Non-Vocal Sounds

Guillaume Lemaitre et al. PLoS One. 2016 Dec 16;11(12):e0168167. doi: 10.1371/journal.pone.0168167. eCollection 2016.

Abstract

Imitative behaviors are widespread in humans, in particular whenever two people communicate and interact. Several tokens of spoken languages (onomatopoeias, ideophones, and phonesthemes) also display different degrees of iconicity between the sound of a word and what it refers to. Thus, it probably comes as no surprise that human speakers use many imitative vocalizations and gestures when they communicate about sounds, as sounds are notably difficult to describe. What is more surprising is that vocal imitations of non-vocal everyday sounds (e.g. the sound of a car passing by) are in practice very effective: listeners identify sounds better with vocal imitations than with verbal descriptions, despite the fact that vocal imitations are inaccurate reproductions of a sound created by a particular mechanical system (e.g. a car driving by) through a very different system (the vocal apparatus). The present study investigated the semantic representations evoked by vocal imitations of sounds by experimentally quantifying how well listeners could match sounds to category labels. The experiment used three different types of sounds: recordings of easily identifiable sounds (sounds of human actions and manufactured products), human vocal imitations, and computational "auditory sketches" (created by algorithmic computations). The results show that performance with the best vocal imitations was similar to that with the best auditory sketches for most categories of sounds, and even to that with the referent sounds themselves in some cases. More detailed analyses showed that the acoustic distance between a vocal imitation and a referent sound is not sufficient to account for such performance. Instead, the analyses suggested that rather than trying to reproduce the referent sound as accurately as vocally possible, vocal imitations focus on a few important features, which depend on each particular sound category. These results offer perspectives for understanding how human listeners store and access long-term sound representations, and set the stage for the development of human-computer interfaces based on vocalizations.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Method to create auditory sketches.
Fig 2. Structure of the identification experiment.
Fig 3. Discrimination sensitivity indices (d′) and accuracy (assuming no bias) for the four morphological profiles in the family of product sounds.
The left panels represent the data for the ten imitators (I32, I23, etc. are the codes of the imitators). The right panels zoom in on the best imitation for each morphological profile and compare it to the three auditory sketches and the referent sounds. Gray shadings represent the quality of the sketches (from light gray—Q1—to dark gray—Q3—and to black—referent sound). The right panels also represent the results of four t-tests comparing the best imitator to each of the three auditory sketches and the referent sounds. When the best imitation is not significantly different from an auditory sketch (with an alpha level of .05/4), it receives the same shading. Vertical bars represent the 95% confidence interval of the mean. *significantly different from chance level after Bonferroni correction (p < .05/4).
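The sensitivity index d′ reported in these panels is the standard signal-detection measure, the difference between the z-transformed hit and false-alarm rates. A minimal sketch of the computation is given below; the log-linear correction used here is a common convention and an assumption, not necessarily the paper's exact procedure.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count) keeps the
    z-transform finite when a raw rate would be exactly 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# e.g. 18 hits / 2 misses and 3 false alarms / 17 correct rejections
print(d_prime(18, 2, 3, 17))
```

A d′ of 0 corresponds to chance-level discrimination (equal hit and false-alarm rates); larger values indicate better discrimination of the target category.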
Fig 4. Discrimination sensitivity indices (d′) and accuracy (assuming no bias) for the four morphological profiles in the family of basic mechanical interactions.
See Fig 3 for details.
Fig 5. Indices of discrimination sensitivity (d′) as a function of the auditory distance between each sound and its corresponding referent sound.
Auditory distances are calculated by computing the cost of aligning the auditory spectrograms of the two sounds [76]. Circles represent the referent sounds (in this case, the distance is therefore zero) and the three sketches. Black stars represent the ten imitators. The dashed line represents the regression line between the auditory distances and the d′ values.
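The paper computes this alignment cost with the method of [76]. Purely as an illustration of the idea, a classic dynamic-time-warping cost between two spectrograms (time-by-frequency matrices) can be sketched as follows; the Euclidean frame distance and the symmetric step pattern are assumptions, not the authors' exact choices.

```python
import numpy as np

def dtw_cost(A, B):
    """Cost of optimally aligning two spectrograms A and B
    (rows = time frames, columns = frequency channels),
    computed by dynamic time warping."""
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # local cost: Euclidean distance between the two frames
            d = np.linalg.norm(A[i - 1] - B[j - 1])
            # extend the cheapest of the three allowed alignment moves
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Aligning a sound with itself yields a cost of zero, which is why the referent sounds sit at the origin of the distance axis in the figure.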
Fig 6. Indices of discrimination sensitivity (d′) as a function of the feature distance between each sound and its corresponding referent sound.
Feature distances are calculated by computing the Euclidean norm of the differences between the features defined by [80].
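In other words, each sound is summarized by a feature vector and the distance is the ordinary Euclidean norm of the difference between the two vectors. A minimal sketch, with hypothetical four-dimensional feature values (the actual features are those of [80]):

```python
import numpy as np

# Hypothetical feature vectors for a referent sound and an imitation
ref = np.array([0.8, 1.2, 0.3, 2.0])
imi = np.array([0.6, 1.5, 0.2, 1.7])

# Euclidean feature distance between the two sounds
dist = float(np.linalg.norm(ref - imi))
print(dist)
```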

References

    1. Porcello T. Speaking of sound: language and the professionalization of sound-recording engineers. Social Studies of Science. 2004;34(5):733–758. doi: 10.1177/0306312704047328
    2. Wright P. Linguistic description of auditory signals. Journal of Applied Psychology. 1971;55(3):244–250. doi: 10.1037/h0031025
    3. Houix O, Lemaitre G, Misdariis N, Susini P, Urdapilleta I. A lexical analysis of environmental sound categories. Journal of Experimental Psychology: Applied. 2012;18(1):52–80.
    4. Lemaitre G, Susini P, Rocchesso D, Lambourg C, Boussard P. Non-verbal imitations as a sketching tool for sound design. In: Aramaki M, Derrien O, Kronland-Martinet R, Ystad S, editors. Sound, Music, and Motion. Lecture Notes in Computer Science. Berlin, Heidelberg, Germany: Springer; 2014. p. 558–574.
    5. Lemaitre G, Dessein A, Susini P, Aura K. Vocal imitations and the identification of sound events. Ecological Psychology. 2011;23:267–307. doi: 10.1080/10407413.2011.617225
