Monkeys can identify pictures from words
- PMID: 39937708
- PMCID: PMC11819547
- DOI: 10.1371/journal.pone.0317183
Abstract
Humans learn and incorporate cross-modal associations between auditory and visual objects (e.g., between a spoken word and a picture) into language. However, whether nonhuman primates can learn cross-modal associations between words and pictures remains uncertain. We trained two rhesus macaques in a delayed cross-modal match-to-sample task to determine whether they could learn associations between sounds and pictures of different types. In each trial, the monkeys listened to a brief sound (e.g., a monkey vocalization or a human word) and retained information about the sound to match it with one of 2-4 pictures presented on a touchscreen after a 3-second delay. We found that the monkeys learned more than a dozen associations and performed them proficiently. In addition, to test their ability to generalize, we exposed them to sounds uttered by different individuals. We found that their hit rate remained high but was more variable, suggesting that they perceived the new sounds as equivalent, though not identical. We conclude that rhesus monkeys can learn cross-modal associations between objects of different types, retain information in working memory, and generalize the learned associations to new objects. These findings position rhesus monkeys as an ideal model for future research on the brain pathways of cross-modal associations between auditory and visual objects.
Copyright: © 2025 Cabrera-Ruiz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
No authors have competing interests.