Front Artif Intell. 2022 Jan 4;4:778060. doi: 10.3389/frai.2021.778060. eCollection 2021.

Remarks on Multimodality: Grammatical Interactions in the Parallel Architecture


Neil Cohn et al. Front Artif Intell.

Abstract

Language is typically embedded in multimodal communication, yet models of linguistic competence rarely incorporate this complexity. Meanwhile, speech, gestures, and pictures are each typically treated as indivisible components of multimodal messages. Here, we argue that multimodality should be characterized not by whole interacting behaviors, but by interactions of similar substructures that permeate across expressive behaviors. These structures comprise a unified architecture and align with Jackendoff's Parallel Architecture: a modality, meaning, and grammar. Because this tripartite architecture persists across modalities, interactions can manifest within each of these substructures. Interactions between modalities alone create correspondences in time (e.g., speech with gesture) or space (e.g., writing with pictures) of the sensory signals, while multimodal meaning-making balances how modalities carry "semantic weight" for the gist of the whole expression. Here we focus primarily on interactions between grammars, which contrast across two variables: symmetry, related to the complexity of the grammars, and allocation, related to the relative independence of interacting grammars. While independent allocation keeps grammars separate, substitutive allocation inserts expressions from one grammar into those of another. We show that substitution operates in interactions between all three natural modalities (vocal, bodily, graphic), and also in unimodal contexts within and between languages, as in codeswitching. Altogether, we argue that unimodal and multimodal expressions arise as emergent interactive states from a unified cognitive architecture, heralding a reconsideration of the "language faculty" itself.

Keywords: codeswitching; grammar; linguistic theory; multimodality; parallel architecture.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. T-shirts all using a pattern of Subject–PictureVerb–Object.

Figure 2. The Parallel Architecture including multiple modalities. Expressive forms arise through emergent states within the full architecture (A). These include single-unit expressions (B,D,F) and potentially full languages using recursive grammars (C,E,G).

Figure 3. Lexical entries for (A) the word "heart," (B) the heart shape, (C) a multimodal construction using the heart shape, and (D) the heart shape as a visual affix.

Figure 4. Multimodal interactions arising in the Parallel Architecture. Emergent states here describe (A) co-speech gesture, (B) text-emoji relationships, and (C) a visual sequence using a narrative grammar alongside grammatical text. Savage Chickens is © 2021 Doug Savage.

Figure 5. Grammatical structures in both (A) the syntactic structure of spoken languages and (B) the narrative structures of visual languages. Savage Chickens is © 2021 Doug Savage.

Figure 6. Independent grammatical interactions between (A) text and emoji and (B) speech and gesture.

Figure 7. Substitutive grammatical interactions between (A) a sentence of text and emoji, (B) a list of text and an emoji, and (C) speech and gesture.

Figure 8. Abstracted correspondence schema for substitutive allocation.

Figure 9. Substitutive grammatical interactions between a visual narrative sequence and (A) an onomatopoeia and (B) a sentence.


