PLoS One. 2022 Jun 29;17(6):e0267432.
doi: 10.1371/journal.pone.0267432. eCollection 2022.

Set the tone: Trustworthy and dominant novel voices classification using explicit judgement and machine learning techniques

Cyrielle Chappuis et al. PLoS One. 2022.

Abstract

Prior research has established that valence-trustworthiness and power-dominance are the two main dimensions of voice evaluation at zero-acquaintance. These impressions shape many of our interactions and high-impact decisions, so understanding this dynamic is crucial for many domains. Yet the relationship between the acoustical properties of novel voices and personality/attitudinal trait attributions remains poorly understood. The fundamental difficulty in understanding vocal impressions and the decisions they inform lies in the complex nature of the acoustical properties of voices. To disentangle this relationship, this study extends the line of research on the acoustical bases of vocal impressions in two ways. First (Study 1), it attempts to replicate previous findings on the bi-dimensional nature of first impressions, using personality judgements and establishing a correspondence between acoustics and voice-first-impression (VFI) dimensions relative to sex. Second (Study 2), it explores the non-linear relationships between acoustical parameters and VFI by means of machine learning models. In accordance with the literature, a bi-dimensional projection comprising valence-trustworthiness and power-dominance evaluations is found to explain 80% of the VFI. In Study 1, brighter (high center of gravity), smoother (low shimmer), and louder (high minimum intensity) voices reflected trustworthiness, while vocal roughness (harmonics-to-noise ratio), energy in the high frequencies (Energy3250), pitch (Quantile 1, Quantile 5) and a lower range of pitch values reflected dominance. In Study 2, above-chance classification of vocal profiles was achieved by both Support Vector Machine (77.78%) and Random Forest (Out-Of-Bag = 36.14) classifiers, generally confirming that machine learning algorithms could predict first impressions from voices. Hence, the results support a bi-dimensional structure of VFI, emphasize the usefulness of machine learning techniques in understanding vocal impressions, and shed light on the influence of sex on VFI formation.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Confusion matrices of p(true attitude | model prediction) for each SVM.
(A) Results of vocal classification for the different attitudinal labels, with 77.8% accuracy. From the bottom left corner to the top right are all the congruent classifications (HTHD: trustworthy/dominant, HTLD: trustworthy/non-dominant, LTHD: untrustworthy/dominant, LTLD: untrustworthy/non-dominant). True attitude represents the given class label, and predicted attitude is the class label assigned after supervised learning. (B) Results of vocal classification for valence (positive, negative or neutral). (C) Results of vocal classification for dominance (high, low or neutral); from the bottom left corner to the top right are all the correctly classified items (low dominance = 50%, neutral = 54.3%, high dominance = 73.3%).
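As a reading aid for this caption, the following minimal sketch shows one way a matrix of p(true attitude | model prediction) can be computed: raw counts are normalized within each predicted class, so every prediction column sums to 1. The function names and the toy labels are hypothetical and chosen for illustration; they are not the authors' code or data.

```python
from collections import Counter

# Class labels as named in the caption.
LABELS = ["HTHD", "HTLD", "LTHD", "LTLD"]

def confusion_given_prediction(true_labels, predicted_labels, labels=LABELS):
    """Return table[true][pred] = p(true | pred).

    Counts are divided by the number of items assigned to each predicted
    class, so each prediction column sums to 1 (or to 0 if never predicted).
    """
    counts = Counter(zip(true_labels, predicted_labels))
    pred_totals = Counter(predicted_labels)
    table = {}
    for t in labels:
        table[t] = {}
        for p in labels:
            total = pred_totals[p]
            table[t][p] = counts[(t, p)] / total if total else 0.0
    return table

def accuracy(true_labels, predicted_labels):
    """Fraction of items whose predicted label equals the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Made-up labels for illustration only; not data from the study.
y_true = ["HTHD", "HTHD", "HTLD", "HTLD", "LTHD", "LTHD", "LTLD", "LTLD"]
y_pred = ["HTHD", "HTLD", "HTLD", "HTLD", "LTHD", "LTHD", "LTLD", "HTHD"]

table = confusion_given_prediction(y_true, y_pred)
overall = accuracy(y_true, y_pred)
```

Under this normalization the diagonal entries are per-class precision rather than recall, so they need not match the overall accuracy figure.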
Fig 2. Receiver Operating Characteristic (ROC) curves for bi-dimensional profiles (A), valence (B) and dominance (C) classification for Polynomial SVMs.
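For readers unfamiliar with ROC curves, the sketch below shows how one curve can be traced for a single class by sweeping a decision threshold over classifier scores, and how the area under it can be estimated with the trapezoidal rule. Treating a multi-class problem as one class against the rest is an assumption made here for illustration; the caption does not say how the curves were constructed. The scores and labels are made up.

```python
def roc_points(scores, is_positive):
    """Return (false positive rate, true positive rate) points.

    Items are ranked by decreasing score; the threshold is lowered one
    distinct score at a time, and tied scores are consumed together.
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical scores for one class versus the rest; not study data.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
is_positive = [True, True, False, True, False, False]

points = roc_points(scores, is_positive)
area = auc(points)
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the positive class from the rest.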
Fig 3. Visualization of all decision trees in the RF, for energy levels in the 250–750 Hz band, relative to minimum intensity and HNR.
Each dot indicates a decision tree’s classification. HTLD and LTHD do not appear, indicating that the combination of the three features was not significantly associated with those classes.
