2025 Aug 7;57(9):249.
doi: 10.3758/s13428-025-02747-7.

Aligning syntactic structure to the dynamics of verbal communication: A pipeline for annotating syntactic phrases onto speech acoustics

Cosimo Iaia et al. Behav Res Methods.

Abstract

To investigate how the human brain encodes the complex dynamics of natural languages, any viable and reproducible analysis pipeline must rely on either manual annotations or natural language processing (NLP) tools, which extract relevant physical (e.g., acoustic, gestural) and structure-building information from speech and language signals. However, annotating syntactic structure for a given natural language is arguably a harder task than annotating the onset and offset of speech units such as phonemes and syllables, as the latter can be identified by relying on the physically overt and temporally measurable properties of the signal, while syntactic units are generally covert and their chunking is model-driven. We describe and validate a pipeline that takes into account both physical and theoretical aspects of speech and language signals, and performs a theory-driven and explicit alignment between overt speech units and covert syntactic units.

Keywords: Alignment; Annotation; Language; Neural tracking; Syntax.

Conflict of interest statement

Declarations. Ethical Approval: Not applicable. Conflicts of Interest: The authors have no conflicts of interest/competing interests to disclose. Consent to Participate: Not applicable. Consent for Publication: Not applicable. Code availability: The code for aligning syntactic structures to word timestamps is available here: https://github.com/cosimo-iaia9305/align_syntax

Figures

Fig. 1
Word timestamps extraction. The acoustic signal is synthesized with gTTS (Durette, 2024) in Python, and the annotation is then performed manually in Praat (Boersma & Weenink, 2023). The temporal references of words are marked. Note that no two words overlap on the time axis, i.e., their onsets and/or offsets never intersect, but follow one another sequentially
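The sequential, non-overlapping property of the word intervals can be checked programmatically. The following is a minimal sketch, not the authors' code: the function name `find_overlaps`, the tuple layout, and the timing values are all hypothetical, and it assumes that two intervals sharing a boundary (one word's offset equal to the next word's onset) do not count as overlapping.

```python
from typing import List, Tuple

# (word, onset in seconds, offset in seconds). Hypothetical layout and values,
# not measurements taken from the paper.
Word = Tuple[str, float, float]


def find_overlaps(words: List[Word]) -> List[Tuple[str, str]]:
    """Return pairs of consecutive words whose time intervals intersect.

    Assumption: a shared boundary (offset == next onset) is not an overlap.
    """
    overlaps = []
    for (w1, _on1, off1), (w2, on2, _off2) in zip(words, words[1:]):
        if on2 < off1:  # the next word starts before the current one ends
            overlaps.append((w1, w2))
    return overlaps


clean = [("the", 0.00, 0.12), ("idea", 0.12, 0.48), ("of", 0.48, 0.60)]
faulty = [("the", 0.00, 0.15), ("idea", 0.12, 0.48)]

print(find_overlaps(clean))
print(find_overlaps(faulty))
```

An empty result means the annotation satisfies the constraint stated in the caption; any returned pair points at a boundary to correct in the annotation.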
Fig. 2
Visualization of a syntactic tree. Example of the syntactic tree for the sentence “The idea of one employee improved the quality of the product”. The labeling scheme is as follows: Sentence (S); Noun Phrase (NP); Verb Phrase (VP); Prepositional Phrase (PP). All other labels refer to the Part-of-Speech (PoS) of the words: Determiner (Det); Adposition (Adp); Noun (Noun); Verb (Verb)
Fig. 3
Alignment of word timestamps and syntactic structure. In this example, the higher-order phrases will share some of the onset/offset with lower-order phrases, as they are nested into each other. For example, the higher NP (the idea of one employee) has the same onset as the lower-left NP (the idea) and the same offset as the lower-right PP (of one employee) and NP (one employee). For this example, annotations at word level, as well as for the syntactic structure, were performed manually in Praat (Boersma & Weenink, 2023). The audio signal was synthesized using the gTTS (Durette, 2024) library in Python
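The nesting described in this caption follows from a simple rule: a phrase starts at the onset of its first word and ends at the offset of its last word. The sketch below illustrates that rule under stated assumptions and is not taken from the authors' repository: the tree is encoded as nested tuples, the bracketing of the subject noun phrase is inferred from the caption, and the timestamps and the name `phrase_spans` are hypothetical.

```python
from typing import Dict, Tuple

# Subject noun phrase as nested tuples: (label, child, ...); a leaf is a word.
# The bracketing is an assumption based on the figure captions.
TREE = ("NP",
        ("NP", ("Det", "the"), ("Noun", "idea")),
        ("PP", ("Adp", "of"),
               ("NP", ("Det", "one"), ("Noun", "employee"))))

# Hypothetical (onset, offset) per word, in seconds, in sentence order.
TIMESTAMPS = [(0.00, 0.12), (0.12, 0.48), (0.48, 0.60), (0.60, 0.85), (0.85, 1.40)]

# Key: (label, index of first word, index one past the last word).
Span = Tuple[str, int, int]


def phrase_spans(tree, timestamps) -> Dict[Span, Tuple[float, float]]:
    """Map every node to the onset of its first word and offset of its last."""
    spans: Dict[Span, Tuple[float, float]] = {}

    def visit(node, start: int) -> int:
        label, *children = node
        n_words = 0
        for child in children:
            if isinstance(child, str):
                n_words += 1  # a leaf contributes exactly one word
            else:
                n_words += visit(child, start + n_words)
        onset = timestamps[start][0]
        offset = timestamps[start + n_words - 1][1]
        spans[(label, start, start + n_words)] = (onset, offset)
        return n_words

    visit(tree, 0)
    return spans


spans = phrase_spans(TREE, TIMESTAMPS)
for (label, first, last), (onset, offset) in sorted(spans.items(), key=lambda kv: kv[0][1:]):
    print(f"{label:5s} words[{first}:{last}]  {onset:.2f}-{offset:.2f} s")
```

With these values, the higher NP shares its onset with the lower-left NP and its offset with the PP and the lower-right NP, as the caption describes. Keying nodes by label and word range is enough for this example but would not separate two stacked nodes with the same label and the same words.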
Fig. 4
General overview of the pipeline. Given word timestamps and the parsed syntactic structure, the pipeline provides explicit alignment by tagging every word with all syntactic node labels necessary to integrate that word into the sentence. In order to uniquely identify each syntactic node label, its relative position (i.e., treeposition) is added. The final output is a dataframe that contains the word timestamps aligned with the syntactic structure, such that every word is tagged with all relevant information
Fig. 5
Example of using the treepositions to navigate a tree. For every word in a sentence, one loops through the structure and extracts the relative treeposition for every node in which the word appears. The indices of each label are used as a unique identifier for each node in the structure
Fig. 6
Comparison of durations of syntactic phrases extracted manually and automatically. (a) Q-Q plot of the distributions of the durations, in seconds, of automatically and manually extracted phrases. (b) Empirical cumulative probability distributions of the automatic and manual annotations. (c) Equivalence prior and posterior for all phrases. (d) Equivalence Bayesian independent-samples t tests for all phrase labels. Where the number of items to compare did not match, we resampled the data
