Front Artif Intell. 2022 Jan 4;4:778060. doi: 10.3389/frai.2021.778060. eCollection 2021.

Remarks on Multimodality: Grammatical Interactions in the Parallel Architecture


Neil Cohn et al. Front Artif Intell.

Abstract

Language is typically embedded in multimodal communication, yet models of linguistic competence rarely incorporate this complexity. Meanwhile, speech, gestures, and pictures are each typically treated as indivisible components of multimodal messages. Here, we argue that multimodality should be characterized not by whole interacting behaviors, but by interactions of similar substructures that permeate across expressive behaviors. These structures comprise a unified architecture and align with Jackendoff's Parallel Architecture: a modality, meaning, and grammar. Because this tripartite architecture persists across modalities, interactions can manifest within each of these substructures. Interactions between modalities alone create correspondences in time (e.g., speech with gesture) or space (e.g., writing with pictures) of the sensory signals, while multimodal meaning-making balances how modalities carry "semantic weight" for the gist of the whole expression. Here we focus primarily on interactions between grammars, which contrast across two variables: symmetry, related to the complexity of the grammars, and allocation, related to the relative independence of interacting grammars. While independent allocation keeps grammars separate, substitutive allocation inserts expressions from one grammar into those of another. We show that substitution operates in interactions between all three natural modalities (vocal, bodily, graphic), and also in unimodal contexts within and between languages, as in codeswitching. Altogether, we argue that unimodal and multimodal expressions arise as emergent interactive states from a unified cognitive architecture, heralding a reconsideration of the "language faculty" itself.

Keywords: codeswitching; grammar; linguistic theory; multimodality; parallel architecture.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. T-shirts all using a pattern of Subject–PictureVerb–Object.

Figure 2. The Parallel Architecture including multiple modalities. Expressive forms arise through emergent states within the full architecture (A). These include single-unit expressions (B,D,F) and potentially full languages using recursive grammars (C,E,G).

Figure 3. Lexical entries for (A) the word "heart," (B) the heart shape, (C) a multimodal construction using the heart shape, and (D) the heart shape as a visual affix.

Figure 4. Multimodal interactions arising in the Parallel Architecture. Emergent states here describe (A) co-speech gesture, (B) text-emoji relationships, and (C) a visual sequence using a narrative grammar alongside grammatical text. Savage Chickens is © 2021 Doug Savage.

Figure 5. Grammatical structures in both (A) the syntactic structure of spoken languages and (B) the narrative structures of visual languages. Savage Chickens is © 2021 Doug Savage.

Figure 6. Independent grammatical interactions between (A) text and emoji and (B) speech and gesture.

Figure 7. Substitutive grammatical interactions between (A) a sentence of text and emoji, (B) a list of text and an emoji, and (C) speech and gesture.

Figure 8. Abstracted correspondence schema for substitutive allocation.

Figure 9. Substitutive grammatical interactions between a visual narrative sequence and (A) an onomatopoeia and (B) a sentence.


