Ann N Y Acad Sci. 2021 May;1491(1):89-105. doi: 10.1111/nyas.14532. Epub 2020 Dec 18.

Gesture-speech physics in fluent speech and rhythmic upper limb movements



Wim Pouw et al. Ann N Y Acad Sci. 2021 May.

Abstract

It is commonly understood that hand gesture and speech coordination in humans is culturally and cognitively acquired, rather than having a biological basis. Recently, however, the biomechanical physical coupling of arm movements to speech vocalization has been studied in steady-state vocalization and monosyllabic utterances, where forces produced during gesturing are transferred onto the tensioned body, leading to changes in respiratory-related activity and thereby affecting vocalization F0 and intensity. In the current experiment (n = 37), we extend this previous line of work to show that gesture-speech physics also impacts fluent speech. Compared with nonmovement, participants producing fluent self-formulated speech while rhythmically moving their limbs show heightened F0 and amplitude envelope, and these effects are more pronounced for higher-impulse arm than for lower-impulse wrist movements. We replicate the finding that acoustic peaks arise especially at moments of peak impulse (i.e., the beat) of the movement, namely around its deceleration phases. Finally, higher deceleration rates of higher-mass arm movements were related to higher acoustic peaks. These results confirm that the physical impulses of gesture affect the speech system. We discuss the implications of gesture-speech physics for understanding the emergence of communicative gesture, both ontogenetically and phylogenetically.

Keywords: biomechanics; entrainment; hand gesture; speech acoustics; speech production.


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Graphical overview of movement conditions. Each participant performed all conditions (i.e., within-subjects). To keep movement tempo relatively constant, participants were shown a moving green bar indicating whether they moved too fast or too slow relative to a 20% target region around 1.33 Hz. Participants were instructed to emphasize the downbeat with an abrupt stop (i.e., the beat) at maximum extension. The human pose figures were obtained and modified from an open database (ref. 85).
Figure 2
Example movement, amplitude envelope, F0 time series, and time‐dependent movement frequency estimates. A sample of about 10 s is shown. With the participant's permission, the speech sample is available at https://osf.io/2qbc6/. The smoothed amplitude envelope in purple traces the waveform maxima. The F0 traces show the concomitant vocalizations in Hz, with an example of vocalization interval and vocalization duration (which were calculated for all vocalizations). The bottom panel shows the continuously estimated movement frequency in cyan, which hovers around 1.33 Hz. In all these panels, the co‐occurring movement is plotted in arbitrary units (a.u.) to show the temporal relation of movement phases and the amplitude envelope, F0, and the movement frequency estimate. In our analysis, we refer to the maximum extension and deceleration phases as relevant moments for speech modulations. In this example, a particularly dramatic acoustic excursion occurs during a moment of deceleration of the arm movement, possibly an example of gesture–speech physics.
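The smoothed amplitude envelope that traces the waveform maxima can be sketched as a Hilbert-transform envelope followed by low-pass smoothing. This is an illustrative assumption, not the paper's actual pipeline; the function name, sampling rate, and 12-Hz cutoff below are placeholders.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def smoothed_amplitude_envelope(waveform, sr, cutoff_hz=12.0):
    """Instantaneous amplitude via the Hilbert analytic signal,
    low-pass filtered to yield a smooth envelope tracing waveform maxima."""
    env = np.abs(hilbert(waveform))           # instantaneous amplitude
    b, a = butter(2, cutoff_hz / (sr / 2))    # 2nd-order low-pass filter
    return filtfilt(b, a, env)                # zero-phase smoothing

# Toy example: a 200-Hz tone with slow (2 Hz) amplitude modulation
sr = 8000
t = np.arange(0, 1.0, 1 / sr)
signal = (1 + 0.5 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * 200 * t)
env = smoothed_amplitude_envelope(signal, sr)
```

The envelope should hover between roughly 0.5 and 1.5 for this modulated tone, following the slow modulation rather than the 200-Hz carrier.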
Figure 3
Summaries of movement frequency, vocalization duration, and vocalization interval. Density distributions of movement frequency, vocalization interval, and vocalization duration are shown. There was no movement in the passive condition; for that condition, we display frequency information from the randomly paired movement time series used in the surrogate baseline pairing. The red vertical line indicates the target movement frequency (1.33 Hz).
Figure 4
Average F0 and amplitude envelope (ENV) per trial per condition. Violin and box plots are shown for average F0 (Hz) and amplitude envelope (z-scaled) per trial. Points are jittered to show per-trial observations.
Figure 5
Average observed vocalization acoustics relative to the moment of maximum extension. For the upper two panels, the average acoustic trajectory is shown around the moment of maximum extension (t = 0, black vertical dashed line). In the lower panel, we have plotted the z‐scaled average vertical displacement of the hand and the z‐scaled acceleration trace. The blue vertical dashed line marks the moment where the deceleration phase starts, which aligns with peaks in acoustics.
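Averaging acoustic trajectories around the moment of maximum extension is a standard event-locked epoching step. A minimal sketch, assuming evenly sampled traces and known event sample indices (the function name and window length are illustrative, not the paper's):

```python
import numpy as np

def epoch_around_events(trace, event_idx, half_win):
    """Stack fixed-length windows of `trace` centered on each event index
    (e.g., moments of maximum extension), then average across events to
    obtain the event-locked mean trajectory."""
    epochs = [trace[i - half_win: i + half_win]
              for i in event_idx
              if i - half_win >= 0 and i + half_win <= len(trace)]
    return np.mean(epochs, axis=0)

# Toy example: an "envelope" with bumps at known maximum-extension samples
trace = np.zeros(1000)
events = [200, 500, 800]
for e in events:
    trace[e - 10: e + 10] += 1.0     # acoustic bump around each event
avg = epoch_around_events(trace, events, half_win=50)
```

The averaged trajectory peaks at the window center (t = 0, the event), mirroring how the vertical dashed lines in the figure mark the alignment point.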
Figure 6
Fitted trajectories from the generalized additive model (GAM).
Figure 7
Relation between maximum deceleration and acoustic peak height. The x-axis shows the average maximum deceleration per trial (the absolutized negative acceleration, in cm/s²), where 0 indicates no deceleration and larger values indicate higher deceleration rates. Each point represents trial-averaged values. Deceleration rates are more extreme in the arm condition than in the wrist condition. The y-axis shows the average maximum observed amplitude envelope (lower panel) and F0 (upper panel) at those moments of deceleration. Higher decelerations co-occur with higher acoustic peaks for arm movements (but not, or less so, for wrist movements).
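The "absolutized negative acceleration" measure can be sketched by twice differentiating a sampled vertical displacement trace and taking the magnitude of the largest negative acceleration. The function name, sampling rate, and sinusoidal motion below are illustrative assumptions, not the paper's tracking pipeline:

```python
import numpy as np

def max_deceleration(vertical_pos_cm, sr_hz):
    """Peak deceleration (absolutized negative acceleration, cm/s^2)
    from a vertical displacement trace sampled at sr_hz."""
    vel = np.gradient(vertical_pos_cm) * sr_hz   # velocity, cm/s
    acc = np.gradient(vel) * sr_hz               # acceleration, cm/s^2
    neg = acc[acc < 0]
    return float(np.abs(neg.min())) if neg.size else 0.0

# Toy 1.33-Hz sinusoidal arm motion with 10-cm amplitude, sampled at 100 Hz
sr = 100
t = np.arange(0, 3, 1 / sr)
pos = 10 * np.sin(2 * np.pi * 1.33 * t)
peak_dec = max_deceleration(pos, sr)
```

For sinusoidal motion the analytic peak is A·(2πf)² ≈ 10 × (2π × 1.33)² ≈ 698 cm/s², so the numerical estimate should land close to that value.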
