Front Comput Neurosci. 2020 Nov 11;14:573554.
doi: 10.3389/fncom.2020.573554. eCollection 2020.

Hierarchical Sequencing and Feedforward and Feedback Control Mechanisms in Speech Production: A Preliminary Approach for Modeling Normal and Disordered Speech


Bernd J Kröger et al. Front Comput Neurosci. 2020.

Abstract

Our understanding of the neurofunctional mechanisms of speech production and their pathologies is still incomplete. In this paper, a comprehensive model of speech production based on the Neural Engineering Framework (NEF) is presented. This model is able to activate sensorimotor plans based on cognitive-functional processes (i.e., generation of the intention of an utterance, selection of words and syntactic frames, and generation of the phonological form and motor plan; the feedforward mechanism). Since the generation of the different states of an utterance is tied to different levels of the speech production hierarchy, we show that different forms of speech errors, as well as speech disorders, can arise at, or are linked to, different levels and different modules of the speech production model. In addition, the influence of the inner feedback mechanisms on normal as well as on disordered speech is examined in terms of the model. The model uses a small number of core concepts provided by the NEF, and we show that these are sufficient to create this neurobiologically detailed model of the complex process of speech production in a manner that is, we believe, clear, efficient, and understandable.
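As a concrete illustration of the NEF/SPA building blocks the abstract refers to, the following minimal sketch shows how two state buffers and one feedforward associative step can be wired up in Nengo. This is not the authors' published model: it assumes the `nengo` and `nengo_spa` Python packages, and all vocabulary items, buffer labels, and parameter values are illustrative.

```python
# Minimal NEF/SPA sketch (illustrative, not the authors' code): two state
# buffers linked by a thresholding associative memory, as one feedforward
# step in a production hierarchy.
import nengo
import nengo_spa as spa

D = 64  # dimensionality of the semantic pointers (illustrative choice)
vocab = spa.Vocabulary(D)
vocab.populate("APPLE; FRUITS; TRAIN")

with spa.Network() as model:
    # concept-level buffer on the perception side (cf. C_perc)
    c_perc = spa.State(vocab, label="C_perc")
    # lemma-level buffer on the production side (cf. L_prod)
    l_prod = spa.State(vocab, label="L_prod")

    # feedforward step: map the perceived concept to a lemma through a
    # thresholding associative memory (identity mapping for simplicity)
    am = spa.ThresholdingAssocMem(threshold=0.3, input_vocab=vocab,
                                  mapping=vocab.keys())
    c_perc >> am
    am >> l_prod

    # drive the perception-side buffer with a concept for 0.5 s
    stim = spa.Transcode(lambda t: "APPLE" if t < 0.5 else "0",
                         output_vocab=vocab)
    stim >> c_perc

with nengo.Simulator(model) as sim:
    sim.run(1.0)
```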

Keywords: aphasia; computer simulation; hierarchical sequencing; neural engineering framework (NEF); neurocomputational model; semantic pointer architecture (SPA); speech disorders; speech processing.

Figures

Figure 1
The entire neural model of speech processing. Blue, names of modules; red, hypothetical anatomical location of modules or parts of modules; black, production and perception modalities within the production and perception pathways; black, different state levels within the knowledge and skill repository (mental lexicon and mental syllabary); black, peripheral sub-systems and cognitive processes; black, external visual and auditory input; brown, names of state buffers (for a description of all state buffers, see text and Table A1 in the Appendix). Thick brown arrows indicate the main processing or feedback loop (see text); thin brown arrows indicate all other feedback loops; thin black arrows indicate the interaction of the control module with all other modules.
Figure 2
Decoded neural information in the neural state buffers of different model levels on the perception and production side, displayed as the activation strength of states (y-axis; the value one indicates an activation level of 100% for a specific neural state) over time (x-axis in seconds). The task defining the process scenario is word perception (here of the word “apple”), word comprehension, cognitive processing (here of the phrase “apple is a”), and word production of the result of this processing (here: the word “fruits”). State buffers and their activation patterns, from top: input control activity for the process scenario (in_con); generated actions of the control module (out_con); utility values for the selection of actions [see “action selection” processes in Stewart and Eliasmith (2014)]; visual input (arrow under V_perc); auditory input (arrow under A_perc); concept input for the cognitive processing module (C_cog_in); relation-type input for cognitive processing (rel_type); cognitive output from the cognitive processing module (C_cog_out). Remaining part of the figure: state buffer activations at the concept level (states named C_. and Cn_.), lemma level (L_.), and phonological form level (P_. and Pn_.) on the production and perception side (a small “n” indicates S-pointers that are part of an S-pointer network, see text; a small “w” indicates “word,” in contrast to sub-word units like syllables). At the bottom: state buffer activation patterns of the motor plan on the production side (see text). Activation patterns are decoded in the form of S-pointer amplitudes for each speech item at each model level. The standard S-pointer amplitude (i.e., the activation strength of the appropriate neural state) in these “similarity plots” (cf. Eliasmith, 2013) is one (unity length). It can be assumed that a neural representation is sufficiently activated in a state buffer if the amplitude of the associated S-pointer is higher than 0.7. The display of activation levels is limited to a value of two (see C_cog_in and C_cog_out) for all figures, because no new information is gained if the activation level of a state is higher than “full activation” (i.e., one). Activation levels higher than one sometimes occur due to a build-up of neural energy in a memory buffer when information is transferred to that memory (see figures below).
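Such “similarity plots” can in principle be produced by probing a state buffer and taking the dot product of the decoded vector with every vocabulary pointer. The sketch below is an assumption about tooling (`nengo`, `nengo_spa`, and `matplotlib`), not the authors' plotting code; the 0.7 line is the activation criterion named in the caption.

```python
# Sketch of a similarity plot: decode a state buffer and compare it with
# every vocabulary pointer (illustrative, not the authors' code).
import matplotlib.pyplot as plt
import nengo
import nengo_spa as spa

D = 64
vocab = spa.Vocabulary(D)
vocab.populate("APPLE; FRUITS")

with spa.Network() as model:
    buf = spa.State(vocab, label="C_cog_in")
    spa.sym.APPLE >> buf                         # constant input pointer
    probe = nengo.Probe(buf.output, synapse=0.02)

with nengo.Simulator(model) as sim:
    sim.run(0.5)

# S-pointer amplitudes over time; per the caption, amplitudes above 0.7
# are read as sufficiently activated states
amplitudes = spa.similarity(sim.data[probe], vocab)
plt.plot(sim.trange(), amplitudes)
plt.axhline(0.7, linestyle="--", color="gray")
plt.legend(list(vocab.keys()) + ["0.7 criterion"])
plt.xlabel("time (s)")
plt.ylabel("S-pointer amplitude")
plt.show()
```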
Figure 3
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For a description of the individual buffers, see Figure 2. The process scenario is that of a picture naming task (see text).
Figure 4
Decoded neural information in the neural state buffers at different levels of the model on the perception and production side. For the description of all state buffers, see Figure 2. The process scenario is that of a word or logatome repetition task (see text).
Figure 5
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “train”) with an acoustically interspersed distractor word (here: the phonologically and semantically dissimilar word “fox”). See text for further explanations.
Figure 6
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For a description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “train”) with an acoustically interspersed word (here: the phonologically and semantically similar word “trolley”). See text for further explanations.
Figure 7
Decoded neural information in the neural state buffers of different model levels on the perception and production side. For the description of the individual buffers see Figure 2. The process scenario up to 750 ms is a picture naming task (target word here: “snake”). After 750 ms, the test supervisor provides an additional phonological cue via the auditory channel [“(the target word begins with) /snE/”]. In this simulation example, this cue leads to the full activation of the correct word at the motor plan level.
Figure 8
Decoded neural information in the neural state buffers of different model levels on the perception and production side. For a description of the individual buffers, see Figure 2. The process scenario is the picture naming task (target word here: “duck”). After 750 ms, the test supervisor provides an additional semantic cue via the auditory channel [“(the target word belongs to the group of) birds”]. In this example simulation, this leads to the activation of the correct word.
Figure 9
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word “apple”) plus phonological cue (/Ep/) comparable to Figure 7. In contrast to the simulations shown in Figures 7, 8, a model version was chosen here that directly connects the perception to the production side at the phonological level of the model (shortcut P_perc -> P_prod).
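A hedged sketch of what this P_perc -> P_prod shortcut might look like in `nengo_spa` follows (buffer labels and vocabulary items are illustrative; this is not the authors' code): because both buffers share one phonological vocabulary, a direct connection suffices.

```python
# Sketch of the P_perc -> P_prod shortcut (illustrative assumption).
import nengo_spa as spa

D = 64
phon = spa.Vocabulary(D)          # one shared phonological vocabulary
phon.populate("SNAKE; APPLE")

with spa.Network() as model:
    p_perc = spa.State(phon, label="P_perc")
    p_prod = spa.State(phon, label="P_prod")
    # both buffers use the same S-pointer patterns, so the perception side
    # can drive the production side directly, with no associative memory
    p_perc >> p_prod
    # a weakened link (cf. conduction aphasia in Figure 10) could be
    # modeled by scaling this association, e.g.: 0.6 * p_perc >> p_prod
```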
Figure 10
Number of correctly implemented target words (maximum value is 18) as a function of the degree of neural dysfunction for the state buffer or associative memory perturbed by the speech disorder, for three different tasks [picture naming (i.e., production), word comprehension, and word repetition] and six different neuronal dysfunctions (Broca, Wernicke, transcortical motor, transcortical sensory, mixed, and conduction aphasia). In the case of conduction aphasia, the percentage of neuronal dysfunction corresponds directly to the percentage by which the neuronal connections between P_perc and P_prod are weakened: both buffers (P_prod and P_perc) use the same S-pointer activation patterns, so no associative memory has to be interposed and a direct neuronal association is possible. In the case of neuronal dysfunction of a buffer or associative memory, the percentage indicates the proportion of ablated neurons within that buffer or memory.
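The caption distinguishes weakened connections (conduction aphasia) from ablated neurons within a buffer or memory. One common Nengo idiom for the latter is sketched below, under the assumption that state buffers are built from ordinary `nengo` ensembles; the helper name `ablate` and all parameter values are hypothetical, since the paper does not publish its lesioning code.

```python
# Graded "ablation" sketch: silence a random fraction of an ensemble's
# neurons with a strong negative input current (hypothetical helper).
import numpy as np
import nengo

def ablate(ens, fraction, rng=np.random):
    """Silence `fraction` of the neurons in `ens` (call inside the model).

    `fraction` plays the role of the degree of neural dysfunction on the
    x-axis of Figure 10.
    """
    n_kill = int(ens.n_neurons * fraction)
    killed = rng.choice(ens.n_neurons, size=n_kill, replace=False)
    transform = np.zeros((ens.n_neurons, 1))
    transform[killed] = -10.0   # strong inhibitory current for ablated cells
    nengo.Connection(nengo.Node(1.0), ens.neurons, transform=transform)

with nengo.Network() as model:
    # stand-in for one ensemble inside a state buffer or associative memory
    buffer_part = nengo.Ensemble(n_neurons=200, dimensions=16)
    ablate(buffer_part, fraction=0.4)   # e.g., 40% neuronal dysfunction
```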
Figure 11
Decoded neural information in the neural state buffers on different model levels on the perception and production side. For a description of the individual state buffers, see Figure 2. The process scenario is a picture naming task (target word here: “peanut”) with an acoustically interspersed word (here: the phonologically and semantically similar word “pecan”). See text for further explanations of this rarely occurring case. To give the reader an impression of how trajectories are associated with S-pointer names, we have included here the labels automatically generated by the simulation software (these labels were deleted in Figures 2–9).
Figure 12
Decoded neural information in the neural state buffers at different model levels on the perception and production side. For the description of the individual state buffers, see Figure 2. The process scenario here is a picture naming task (target word here: “fly”) with a later-occurring semantic cue, i.e., the superordinate word “bluebottle.” See text for further explanations of this rarely occurring case.

References

    1. Cholin J. (2008). The mental syllabary in speech production: an integration of different approaches and domains. Aphasiology 22, 1127–1141. doi: 10.1080/02687030701820352
    2. Eliasmith C. (2013). How to Build a Brain: A Neural Architecture for Biological Cognition. New York, NY: Oxford University Press.
    3. Eliasmith C., Anderson C. (2004). Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. Cambridge, MA: MIT Press.
    4. Eliasmith C., Stewart T. C., Choo X., Bekolay T., DeWolf T., Tan Y. (2012). A large-scale model of the functioning brain. Science 338, 1202–1205. doi: 10.1126/science.1225266
    5. Glück C. W. (2011). WWT 6-10. Wortschatz- und Wortfindungstest für 6- bis 10-Jährige. München: Urban and Fischer.
