Front Evol Neurosci. 2012 Aug 16:4:12.
doi: 10.3389/fnevo.2012.00012. eCollection 2012.

Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates

Christopher I Petkov et al. Front Evol Neurosci.

Abstract

Vocal learners such as humans and songbirds can learn to produce elaborate patterns of structurally organized vocalizations, whereas many other vertebrates, such as non-human primates and most other bird groups, either cannot or do so only to a very limited degree. To explain the similarities among humans and vocal-learning birds and the differences from other species, various theories have been proposed. One set comprises motor theories, which underscore the role of the motor system as an evolutionary substrate for vocal production learning. For instance, the motor theory of speech and song perception proposes enhanced auditory perceptual learning of speech in humans and song in birds, which suggests a considerable level of neurobiological specialization. Another, a motor theory of vocal learning origin, proposes that the brain pathways that control the learning and production of song and speech were derived from adjacent motor brain pathways. A second set comprises cognitive theories, which address the interface between cognition and the auditory-vocal domains to support language learning in humans. Here we critically review the behavioral and neurobiological evidence for parallels and differences between the so-called vocal learners and vocal non-learners in the context of motor and cognitive theories. In doing so, we note that, behaviorally, vocal-production learning abilities are more distributed than categorical, as are the auditory-learning abilities of animals. We propose testable hypotheses on the extent of the specializations and cross-species correspondences suggested by motor and cognitive theories. We believe that how spoken language evolved is likely to become clearer with concerted efforts in testing comparative data from many non-human animal species.

Keywords: avian; communication; evolution; humans; monkeys; neurobiology; speech; vertebrates.

Figures

Figure 1
Hypothetical distributions of two behavioral phenotypes: vocal learning and sensory (auditory) sequence learning. We hypothesize that the behavioral phenotypes of vocal learning and auditory learning are distributed along several categories. (A) Vocal learning complexity phenotype and (B) auditory sequence learning phenotype. The left axis (blue) illustrates the hypothetical distribution of species along the behavioral phenotype dimensions. The right axis (black step functions) illustrates different types of transitions along the hypothesized vocal-learning (A) or auditory-learning (B) complexity dimensions. See manuscript text for the basis for the relative position of the non-human animals illustrated in this figure, which in some cases is based on limited data. Also see Arriaga and Jarvis (in press) for an initial proposal of this idea. Whether the actual distributions are continuous functions (blue curves) will need to be tested against the alternatives that there are several categories with gradual transitions or step functions (black curves). Although auditory learning is a prerequisite for vocal learning and there can be a correlation between the two phenotypes (A–B), the two need not be interdependent. A theoretical Turing machine (Turing, 1968) is illustrated [G*], which can outperform humans on memory for digitized auditory input but is not a vocal learner.
Figure 2
Avian phylogenetic tree and the complex-vocal learning phenotype. Shown is an avian phylogenetic tree (based on Hackett et al., 2008). Identified in red text and * are three groups of complex-vocal learning birds. Below the figure are summarized three alternative hypotheses on the evolutionary mechanisms of complex-vocal learning in birds (see text, and Jarvis, 2004). The auditory sequence learning phenotype described in Figure 1B is not shown here, since some forms of auditory learning seem to be present in all birds. However, further comparative data are needed on the learning of the complexity of auditory sequences, which to our knowledge has been tested using Artificial Grammars only in songbirds (Gentner et al., ; van Heijningen et al., ; Abe and Watanabe, 2011).
Figure 3
Primate phylogenetic tree and complex-vocal learning vs. auditory sequence learning. Shown is a primate phylogenetic tree based on a combination of DNA sequence and fossil age data (Goodman et al., ; Page et al., 1999); for a recent review see Cartmill (2010). Humans (Homo) are the only primates classified as “vocal learners.” However, non-human primates might be better at auditory sequence learning than their limited vocal-production learning capabilities would suggest. In blue text and (#) we highlight species for which there is some evidence of Artificial Grammar Learning capabilities for at least adjacent relationships between the elements in a sequence (tamarins: Fitch and Hauser, 2004; macaques: Wilson et al., 2011). Presuming that the auditory capabilities of guenons and gibbons mentioned in the text (or the symbolic learning of signs by apes) mean that these animals are able to learn at least adjacent relationships in Artificial Grammars, we can tentatively mark these species also in blue #. Note, however, that for the species labeled in black text, future studies might show them to be capable of some limited-vocal learning or various levels of complexity in learning the structure of auditory sequences. Three not mutually exclusive hypotheses are illustrated for both complex-vocal learning and auditory sequence learning.
Figure 4
Vocalization subsystems in complex-vocal learners and in limited-vocal learners or vocal non-learners: Direct and indirect pathways. The different subsystems for vocalization and their interconnectivity are illustrated using different colors. (A) Schematic of a songbird brain showing some connectivity of the four major song nuclei (HVC, RA, AreaX, and LMAN). (B) Human brain schematic showing the different proposed vocal subsystems. The learned vocalization subsystem consists of a primary motor cortex pathway (blue arrow) and a cortico-striatal-thalamic loop for learning vocalizations (white). Also shown is the limbic vocal subsystem that is broadly conserved in primates for producing innate vocalizations (black), and the motoneurons that control laryngeal muscles (red). (C) Known connectivity of a brainstem vocal system (not all connections shown) showing absence of forebrain song nuclei in vocal non-learning birds. (D) Known connectivity of limited-vocal learning monkeys (based on data in squirrel monkeys and macaques) showing presence of forebrain regions for innate vocalization (ACC, OFC, and amygdala) and also of a ventral premotor area (Area 6vr) of currently poorly understood function that is indirectly connected to nucleus ambiguus (see text). The LMC in humans is directly connected with motoneurons in the nucleus ambiguus, which orchestrate the production of learned vocalizations (also see Figure 5B). Only the direct pathway through the mammalian basal ganglia (ASt, anterior striatum; GPi, globus pallidus, internal) is shown, as this is the one most similar to AreaX connectivity in songbirds. Modified figure based on (Jarvis, ; Jarvis et al., 2005).
Abbreviations: ACC, anterior cingulate cortex; Am, nucleus ambiguus; Amyg, amygdala; AT, anterior thalamus; Av, nucleus avalanche; DLM, dorsolateral nucleus of the medial thalamus; DM, dorsal medial nucleus of the midbrain; HVC, high vocal center; LMAN, lateral magnocellular nucleus of the anterior nidopallium; LMC, laryngeal motor cortex; OFC, orbito-frontal cortex; PAG, periaqueductal gray; RA, robust nucleus of the arcopallium; RF, reticular formation; vPFC, ventral prefrontal cortex; VLT, ventro-lateral division of thalamus; XIIts, bird twelfth nerve nucleus.
Figure 5
Human syntactic learning and vocal production sub-systems, with hypothesized monkey and bird evolutionary substrates. (A) Auditory perceptual learning system in humans (red and orange). Primary (pAC) and non-primary (npAC) auditory cortical regions are engaged in the auditory perceptual organization of sound. (B) The perceptual learning system interacts with a system for learned vocal production (blue, also see Figure 4B). (C) Hypothetical evolutionary “proto-syntactic” pathways that might be engaged in monkeys for the perceptual learning of different auditory sequence structures in Finite-State Artificial-Grammars (FSG), e.g., adjacent (red text) vs. non-adjacent (orange text) relationships (also see text). Note that the hypothetical ventral pathway is not expected to directly engage monkey Area 6vr (black) or the innate vocal production subsystem (black; see Figure 4D). More bilateral hemispheric engagement might be expected in non-human primates (see text), and/or the cortical-striatal-thalamic loop might also be engaged in certain forms of implicit sequence learning. (D) Songbird auditory (red region and red/orange arrows) and song motor (blue regions) pathways. The auditory pathway is proposed to interact with motor regions adjacent to song nuclei for syntactic-like processing and production of vocal or non-vocal behaviors. Abbreviations: AC, auditory cortex; EC, extreme capsule fasciculus; SLF, superior-longitudinal fasciculus; UF, uncinate fasciculus; CM, caudal mesopallium; DLPFC, dorso-lateral prefrontal cortex; FMC, face motor cortex; FOP, frontal operculum; L2/L3, fields L2 and L3; NIf, interfacial nucleus of the nidopallium; NCM, caudal medial nidopallium; SMA, supplementary motor area; vF4/vF5, macaque anatomical regions ventral F4/F5; 44, 45, Brodmann Areas; see Figure 4 for further abbreviations.

References

    1. Abe K., Watanabe D. (2011). Songbirds possess the spontaneous ability to discriminate syntactic rules. Nat. Neurosci. 14, 1067–1074. doi: 10.1038/nn.2869
    2. Allott R. (1992). The motor theory of language: origin and function, in Language Origin: A Multidisciplinary Approach, eds Wind J., Chiarelli B., Bichakjian B., Nocentini A. (Netherlands: Kluwer Academic Publishers), 105–119.
    3. Arbib M. A. (2005). From monkey-like action recognition to human language: an evolutionary framework for neurolinguistics. Behav. Brain Sci. 28, 105–124; discussion 125–167.
    4. Arbib M. A. (2010). Mirror system activity for action and language is embedded in the integration of dorsal and ventral pathways. Brain Lang. 112, 12–24. doi: 10.1016/j.bandl.2009.10.001
    5. Arnold K., Zuberbuhler K. (2006). Language evolution: semantic combinations in primate calls. Nature 441, 303. doi: 10.1038/441303a