Front Comput Neurosci. 2020 Nov 11;14:573554.
doi: 10.3389/fncom.2020.573554. eCollection 2020.

Hierarchical Sequencing and Feedforward and Feedback Control Mechanisms in Speech Production: A Preliminary Approach for Modeling Normal and Disordered Speech


Bernd J Kröger et al. Front Comput Neurosci. 2020.

Abstract

Our understanding of the neurofunctional mechanisms of speech production and their pathologies is still incomplete. In this paper, a comprehensive model of speech production based on the Neural Engineering Framework (NEF) is presented. This model is able to activate sensorimotor plans based on cognitive-functional processes (i.e., generation of the intention of an utterance, selection of words and syntactic frames, and generation of the phonological form and motor plan; the feedforward mechanism). Since the generation of the different states of an utterance is tied to different levels of the speech production hierarchy, we show that different forms of speech errors, as well as speech disorders, can arise at, or are linked to, different levels and different modules of the speech production model. In addition, the influence of the inner feedback mechanisms on normal as well as on disordered speech is examined in terms of the model. The model uses a small number of core concepts provided by the NEF, and we show that these are sufficient to create this neurobiologically detailed model of the complex process of speech production in a manner that is, we believe, clear, efficient, and understandable.
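As a concrete illustration of the NEF/SPA building blocks the abstract refers to, the following minimal sketch shows how two state buffers and one feedforward associative step can be wired up in Nengo. This is not the authors' published model: it assumes the `nengo` and `nengo_spa` Python packages, and all vocabulary items, buffer labels, and parameter values are illustrative.

```python
# Minimal NEF/SPA sketch (illustrative, not the authors' code): two state
# buffers linked by a thresholding associative memory, as one feedforward
# step in a production hierarchy.
import nengo
import nengo_spa as spa

D = 64  # dimensionality of the semantic pointers (illustrative choice)
vocab = spa.Vocabulary(D)
vocab.populate("APPLE; FRUITS; TRAIN")

with spa.Network() as model:
    # concept-level buffer on the perception side (cf. C_perc)
    c_perc = spa.State(vocab, label="C_perc")
    # lemma-level buffer on the production side (cf. L_prod)
    l_prod = spa.State(vocab, label="L_prod")

    # feedforward step: map the perceived concept to a lemma through a
    # thresholding associative memory (identity mapping for simplicity)
    am = spa.ThresholdingAssocMem(threshold=0.3, input_vocab=vocab,
                                  mapping=vocab.keys())
    c_perc >> am
    am >> l_prod

    # drive the perception-side buffer with a concept for 0.5 s
    stim = spa.Transcode(lambda t: "APPLE" if t < 0.5 else "0",
                         output_vocab=vocab)
    stim >> c_perc

with nengo.Simulator(model) as sim:
    sim.run(1.0)
```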

Keywords: aphasia; computer simulation; hierarchical sequencing; neural engineering framework (NEF); neurocomputational model; semantic pointer architecture (SPA); speech disorders; speech processing.

Figures

Figure 1
The entire neural model of speech processing. Blue, names of modules; red, hypothetical anatomical location of modules or parts of modules; black, production and perception modalities within the production and perception pathways; black, different state levels within the knowledge and skill repository (mental lexicon and mental syllabary); black, peripheral sub-systems and cognitive processes; black, external visual and auditory input; brown, names of state buffers (for a description of all state buffers, see text and Table A1 in the Appendix). Thick brown arrows indicate the main processing or feedback loop (see text); thin brown arrows indicate all other feedback loops; thin black arrows indicate the interaction of the control module with all other modules.
Figure 2
Decoded neural information in the neural state buffers of different model levels on the perception and production side, displayed as the activation strength of states (y-axis; the value one indicates an activation level of 100% for a specific neural state) over time (x-axis in seconds). The task defining the process scenario is word perception (here of the word “apple”), word comprehension, cognitive processing (here of the phrase “apple is a”), and word production of the result of this processing (here: the word “fruits”). State buffers and their activation patterns, from top: input control activity for the process scenario (in_con); generated actions of the control module (out_con); utility values for the selection of actions [see “action selection” processes in Stewart and Eliasmith (2014)]; visual input (arrow under V_perc); auditory input (arrow under A_perc); concept input for the cognitive processing module (C_cog_in); relation-type input for cognitive processing (rel_type); cognitive output from the cognitive processing module (C_cog_out). Remaining part of the figure: state buffer activations at the concept level (states named C_. and Cn_.), lemma level (L_.), and phonological form level (P_. and Pn_.) on the production and perception side (a small “n” indicates S-pointers that are part of an S-pointer network, see text; a small “w” indicates “word,” in contrast to sub-word units like syllables). At the bottom: state buffer activation patterns of the motor plan on the production side (see text). Activation patterns are decoded in the form of S-pointer amplitudes for each speech item at each model level. The standard S-pointer amplitude (i.e., the activation strength of the appropriate neural state) in these “similarity plots” (cf. Eliasmith, 2013) is one (unity length). It can be assumed that a neural representation is sufficiently activated in a state buffer if the amplitude of the associated S-pointer is higher than 0.7. The display of activation levels is limited to a value of two (see C_cog_in and C_cog_out) for all figures, because no new information is gained if the activation level of a state is higher than “full activation” (i.e., one). Activation levels higher than one sometimes occur due to a build-up of neural energy in a memory buffer when information is transferred to that memory (see figures below).
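Such “similarity plots” can in principle be produced by probing a state buffer and taking the dot product of the decoded vector with every vocabulary pointer. The sketch below is an assumption about tooling (`nengo`, `nengo_spa`, and `matplotlib`), not the authors' plotting code; the 0.7 line is the activation criterion named in the caption.

```python
# Sketch of a similarity plot: decode a state buffer and compare it with
# every vocabulary pointer (illustrative, not the authors' code).
import matplotlib.pyplot as plt
import nengo
import nengo_spa as spa

D = 64
vocab = spa.Vocabulary(D)
vocab.populate("APPLE; FRUITS")

with spa.Network() as model:
    buf = spa.State(vocab, label="C_cog_in")
    spa.sym.APPLE >> buf                         # constant input pointer
    probe = nengo.Probe(buf.output, synapse=0.02)

with nengo.Simulator(model) as sim:
    sim.run(0.5)

# S-pointer amplitudes over time; per the caption, amplitudes above 0.7
# are read as sufficiently activated states
amplitudes = spa.similarity(sim.data[probe], vocab)
plt.plot(sim.trange(), amplitudes)
plt.axhline(0.7, linestyle="--", color="gray")
plt.legend(list(vocab.keys()) + ["0.7 criterion"])
plt.xlabel("time (s)")
plt.ylabel("S-pointer amplitude")
plt.show()
```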
Figure 3
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For a description of the individual buffers, see Figure 2. The process scenario is that of a picture naming task (see text).
Figure 4
Decoded neural information in the neural state buffers at different levels of the model on the perception and production side. For the description of all state buffers, see Figure 2. The process scenario is that of a word or logatome repetition task (see text).
Figure 5
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “train”) with an acoustically interspersed distractor word (here: the phonologically and semantically dissimilar word “fox”). See text for further explanations.
Figure 6
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For a description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “train”) with an acoustically interspersed word (here: the phonologically and semantically similar word “trolley”). See text for further explanations.
Figure 7
Decoded neural information in the neural state buffers of different model levels on the perception and production side. For the description of the individual buffers see Figure 2. The process scenario up to 750 ms is a picture naming task (target word here: “snake”). After 750 ms, the test supervisor provides an additional phonological cue via the auditory channel [“(the target word begins with) /snE/”]. In this simulation example, this cue leads to the full activation of the correct word at the motor plan level.
Figure 8
Decoded neural information in the neural state buffers of different model levels on the perception and production side. For a description of the individual buffers, see Figure 2. The process scenario is the picture naming task (target word here: “duck”). After 750 ms, the test supervisor provides an additional semantic cue via the auditory channel [“(the target word belongs to the group of) birds”]. In this example simulation, this leads to the activation of the correct word.
Figure 9
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word “apple”) plus phonological cue (/Ep/) comparable to Figure 7. In contrast to the simulations shown in Figures 7, 8, a model version was chosen here that directly connects the perception to the production side at the phonological level of the model (shortcut P_perc -> P_prod).
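A hedged sketch of what this P_perc -> P_prod shortcut might look like in `nengo_spa` follows (buffer labels and vocabulary items are illustrative; this is not the authors' code): because both buffers share one phonological vocabulary, a direct connection suffices.

```python
# Sketch of the P_perc -> P_prod shortcut (illustrative assumption).
import nengo_spa as spa

D = 64
phon = spa.Vocabulary(D)          # one shared phonological vocabulary
phon.populate("SNAKE; APPLE")

with spa.Network() as model:
    p_perc = spa.State(phon, label="P_perc")
    p_prod = spa.State(phon, label="P_prod")
    # both buffers use the same S-pointer patterns, so the perception side
    # can drive the production side directly, with no associative memory
    p_perc >> p_prod
    # a weakened link (cf. conduction aphasia in Figure 10) could be
    # modeled by scaling this association, e.g.: 0.6 * p_perc >> p_prod
```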
Figure 10
Number of correctly implemented target words (maximum value is 18) as a function of the degree of neural dysfunction for the state buffer or associative memory perturbed by the speech disorder, for three different tasks [picture naming (i.e., production), word comprehension, and word repetition] and six different neuronal dysfunctions (Broca, Wernicke, transcortical motor, transcortical sensory, mixed, and conduction aphasia). In the case of conduction aphasia, the percentage of neuronal dysfunction corresponds directly to the percentage by which the neuronal connections between P_perc and P_prod are weakened: both buffers (P_prod and P_perc) use the same S-pointer activation patterns, so no associative memory has to be interposed and a direct neuronal association is possible. In the case of neuronal dysfunction of a buffer or associative memory, the percentage indicates the proportion of ablated neurons within that buffer or memory.
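The caption distinguishes weakened connections (conduction aphasia) from ablated neurons within a buffer or memory. One common Nengo idiom for the latter is sketched below, under the assumption that state buffers are built from ordinary `nengo` ensembles; the helper name `ablate` and all parameter values are hypothetical, since the paper does not publish its lesioning code.

```python
# Graded "ablation" sketch: silence a random fraction of an ensemble's
# neurons with a strong negative input current (hypothetical helper).
import numpy as np
import nengo

def ablate(ens, fraction, rng=np.random):
    """Silence `fraction` of the neurons in `ens` (call inside the model).

    `fraction` plays the role of the degree of neural dysfunction on the
    x-axis of Figure 10.
    """
    n_kill = int(ens.n_neurons * fraction)
    killed = rng.choice(ens.n_neurons, size=n_kill, replace=False)
    transform = np.zeros((ens.n_neurons, 1))
    transform[killed] = -10.0   # strong inhibitory current for ablated cells
    nengo.Connection(nengo.Node(1.0), ens.neurons, transform=transform)

with nengo.Network() as model:
    # stand-in for one ensemble inside a state buffer or associative memory
    buffer_part = nengo.Ensemble(n_neurons=200, dimensions=16)
    ablate(buffer_part, fraction=0.4)   # e.g., 40% neuronal dysfunction
```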
Figure 11
Decoded neural information in the neural state buffers on different model levels on the perception and production side. For a description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “peanut”) with an acoustically interspersed word (here: the phonologically and semantically similar word “pecan”). See text for further explanations of this rarely occurring case. To give the reader an impression of how trajectories are associated with S-pointer names, we have included here the labels automatically generated by the simulation software (these labels were deleted in Figures 2–9).
Figure 12
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario here is a picture naming task (target word here: “fly”) with a later-occurring semantic cue, i.e., the superordinate word “bluebottle.” See text for further explanations of this rarely occurring case.

References

    1. Cholin J. (2008). The mental syllabary in speech production: an integration of different approaches and domains. Aphasiology 22, 1127–1141. doi: 10.1080/02687030701820352
    2. Eliasmith C. (2013). How to Build a Brain: A Neural Architecture for Biological Cognition. New York, NY: Oxford University Press.
    3. Eliasmith C., Anderson C. (2004). Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. Cambridge, MA: MIT Press.
    4. Eliasmith C., Stewart T. C., Choo X., Bekolay T., DeWolf T., Tan Y. (2012). A large-scale model of the functioning brain. Science 338, 1202–1205. doi: 10.1126/science.1225266
    5. Glück C. W. (2011). WWT 6-10. Wortschatz- und Wortfindungstest für 6- bis 10-Jährige. München: Urban and Fischer.
