PLoS One. 2018 Jun 26;13(6):e0199443.
doi: 10.1371/journal.pone.0199443. eCollection 2018.

Auditory traits of "own voice"

Marino Kimura et al. PLoS One.

Abstract

People perceive their recorded voice differently from their actively spoken voice. The uncanny valley theory proposes that as an object approaches humanlike characteristics, the sense of familiarity increases; eventually, however, a point is reached where the object becomes strangely similar and makes us feel uneasy. The discomfort people experience when they hear their recorded voice may correspond to the floor of the proposed uncanny valley. To overcome the eeriness of own-voice recordings, previous studies have suggested equalizing the recorded voice with various types of filters, such as step, bandpass, and low-pass, yet the effectiveness of these filters has not been evaluated. To address this, experiment 1 aimed to identify which type of voice recording was most representative of one's own voice. The voice recordings were presented in five conditions: unadjusted recorded voice, step-filtered voice, bandpass-filtered voice, low-pass-filtered voice, and a voice for which the participants freely adjusted the parameters. We found large individual differences in the most representative own-voice filter. To consider the role of the sense of agency, experiment 2 investigated whether lip-synching would influence the rating of own voice. The results suggested that lip-synching did not affect own-voice ratings. In experiment 3, based on the assumption that the voices used in the previous experiments formed a continuum from non-own voice to own voice, the existence of an uncanny valley was examined. Familiarity, eeriness, and the sense of own voice were rated. The results did not support the existence of an uncanny valley. Taken together, the experiments led us to the following conclusions: there is no general filter that can represent own voice for everyone, sense of agency has no effect on own-voice rating, and the uncanny valley does not exist for own voice specifically.
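As a rough illustration of one of the filter types named in the abstract, the Python sketch below implements a first-order (one-pole) low-pass filter. It is not the authors' filter: this page gives no filter parameters, so the function names, the 1000 Hz cutoff, and the 16 kHz sample rate are all assumptions chosen for the example.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Attenuate content above cutoff_hz with a first-order IIR low-pass.

    Illustrative only; the cutoff and design are assumptions, not values
    taken from the study.
    """
    # Smoothing coefficient derived from the RC time constant of the
    # equivalent analog low-pass filter.
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)

    out = []
    prev = samples[0] if samples else 0.0
    for x in samples:
        # Move the output a fraction alpha of the way toward the input.
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def sine(freq_hz, sample_rate_hz, n):
    """Generate n samples of a unit-amplitude sine tone."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


if __name__ == "__main__":
    rate = 16000      # hypothetical sample rate in Hz
    cutoff = 1000     # hypothetical cutoff in Hz
    for freq in (100, 6000):
        filtered = one_pole_lowpass(sine(freq, rate, 1600), cutoff, rate)
        # Skip the first half so the start-up transient is excluded.
        print(freq, "Hz tone, peak after filtering:", peak(filtered[800:]))
```

A tone well below the cutoff passes almost unchanged, while a tone well above it is strongly attenuated. A step or bandpass filter would need a different design.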


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Conceptual diagram of the uncanny valley in the voice field.
Adapted from “The Uncanny Valley,” by M. Mori, 1970. Conceptual diagram of the theoretical graph presented in the original uncanny valley theory. The x-axis corresponds to the similarity between robots and humans, and the y-axis corresponds to the familiarity of the robots. Recorded voice may represent the valley, and own voice the highest point after the valley. In the present study, sense of one’s self was used in place of similarity.
Fig 2. Experiment 1.
Schematic of the task. After the stimuli were presented, participants pressed a button to choose which of the stimuli sounded more like their own voice.
Fig 3. Experiment 1.
Individual results of pairwise comparison. The bar represents similarity to own voice: the rightmost position is the most own-voice-like rating and the leftmost is the least own-voice-like rating. The numbers on the top half of the bar represent the results of the second session, and those on the bottom half represent the results of the third session. The numbers denote the conditions: 1) Recorded voice, 2) Step-filtered voice, 3) Bandpass-filtered voice, 4) Low-pass-filtered voice, 5) Adjusted voice.
Fig 4. Experiment 1.
Consistency of own-voice rating across trials. The consistency of the most and the least own-voice-like ratings is presented. Blue represents the number of participants who rated both the most and the least own-voice-like voice consistently, orange represents the number who rated only the least own-voice-like voice consistently, yellow represents the number who rated only the most own-voice-like voice consistently, and gray represents the number who rated both the most and the least own-voice-like voice inconsistently.
Fig 5. Experiment 2.
Individual results of pairwise comparison. The bar represents similarity to own voice: the rightmost position is the most own-voice-like rating and the leftmost is the least own-voice-like rating. Two sessions without lip synchronization were conducted, and their results are shown as the numbers on the top bar. The numbers on the bottom bar show the results of the lip synchronization session. The numbers denote the conditions: 1) Recorded voice, 2) Step-filtered voice, 3) Bandpass-filtered voice, 4) Low-pass-filtered voice, 5) Adjusted voice.
Fig 6. Experiment 2.
Consistency of participants' own-voice ratings across days. Blue represents the number of participants who rated both the most and the least own-voice-like voice consistently, orange represents consistency for the least own-voice-like choice only, yellow represents consistency for the most own-voice-like choice only, and gray represents inconsistency for both the most and the least own-voice-like voice.
Fig 7. Experiment 3.
Schematic of the experimental task. After the stimulus was presented, the participant rated it on the presented feature from one to nine by moving a cursor.
Fig 8. Experiment 3.
Results of voice feature scoring. The x-axis represents sense of oneself; the y-axis represents familiarity in A and eeriness in B. Individual scores are plotted as green dots. The dotted line shows the Pearson’s correlation and the solid line shows the cubic equation.
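The Fig 8 caption pairs a Pearson correlation with a cubic equation. The Python sketch below shows one way such a comparison could be set up: a linear association measure alongside a least-squares cubic fit, which could reveal a valley-shaped dip that a straight line would miss. The function names are hypothetical, the ratings in the demo are made up for illustration, and the source does not state which software or fitting procedure the authors used.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(a, b)


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up ratings on a 1-9 scale, NOT data from the study.
    sense_of_self = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    familiarity = [2, 3, 3, 4, 5, 5, 6, 7, 8]
    cubic = fit_polynomial(sense_of_self, familiarity, 3)
    print("Pearson r:", pearson_r(sense_of_self, familiarity))
    print("Cubic R^2:", r_squared(sense_of_self, familiarity, cubic))
```

If a cubic fit explains little more variance than the straight line and shows no local dip, that is in line with the absence of a valley-shaped relationship.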

