Review

Front Neurorobot. 2020 Jun 25;14:36. doi: 10.3389/fnbot.2020.00036. eCollection 2020.

A Path Toward Explainable AI and Autonomous Adaptive Intelligence: Deep Learning, Adaptive Resonance, and Models of Perception, Emotion, and Action

Stephen Grossberg

Abstract

Biological neural network models whereby brains make minds help to understand autonomous adaptive intelligence. This article summarizes why the dynamics and emergent properties of such models for perception, cognition, emotion, and action are explainable, and thus amenable to being confidently implemented in large-scale applications. Key to their explainability is how these models combine fast activations, or short-term memory (STM) traces, and learned weights, or long-term memory (LTM) traces. Visual and auditory perceptual models have explainable conscious STM representations of visual surfaces and auditory streams in surface-shroud resonances and stream-shroud resonances, respectively. Deep Learning is often used to classify data. However, Deep Learning can experience catastrophic forgetting: At any stage of learning, an unpredictable part of its memory can collapse. Even if it makes some accurate classifications, they are not explainable and thus cannot be used with confidence. Deep Learning shares these problems with the back propagation algorithm, whose computational problems due to non-local weight transport during mismatch learning were described in the 1980s. Deep Learning became popular after very fast computers and huge online databases became available that enabled new applications despite these problems. Adaptive Resonance Theory, or ART, algorithms overcome the computational problems of back propagation and Deep Learning. ART is a self-organizing production system that incrementally learns, using arbitrary combinations of unsupervised and supervised learning and only locally computable quantities, to rapidly classify large non-stationary databases without experiencing catastrophic forgetting. ART classifications and predictions are explainable using the attended critical feature patterns in STM on which they build. 
The LTM adaptive weights of the fuzzy ARTMAP algorithm induce fuzzy IF-THEN rules that explain what feature combinations predict successful outcomes. ART has been successfully used in multiple large-scale real world applications, including remote sensing, medical database prediction, and social media data clustering. Also explainable are the MOTIVATOR model of reinforcement learning and cognitive-emotional interactions, and the VITE, DIRECT, DIVA, and SOVEREIGN models for reaching, speech production, spatial navigation, and autonomous adaptive intelligence. These biological models exemplify complementary computing, and use local laws for match learning and mismatch learning that avoid the problems of Deep Learning.
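The abstract's claim that fuzzy ARTMAP weights induce fuzzy IF-THEN rules can be made concrete in a few lines. This is an illustrative sketch, not the published algorithm: the weight vector, feature names, and class label below are hypothetical, and it assumes standard fuzzy ART complement coding, in which a category's weight vector w = (u, 1 − v) defines a box [u, v] in feature space.

```python
import numpy as np

def weights_to_rule(w, label, feature_names):
    """Read a complement-coded fuzzy ART weight vector as a fuzzy IF-THEN rule.

    With complement coding, a learned category's weights w = (u, 1 - v)
    define a hyper-box [u, v]; inputs falling inside the box (up to the
    vigilance tolerance) activate the category and predict its label.
    """
    m = len(w) // 2
    u = np.asarray(w[:m])           # lower corner of the category box
    v = 1.0 - np.asarray(w[m:])     # upper corner of the category box
    conds = [f"{u[i]:.2f} <= {feature_names[i]} <= {v[i]:.2f}" for i in range(m)]
    return "IF " + " AND ".join(conds) + f" THEN {label}"

# Hypothetical learned weights for a two-feature category
w = [0.2, 0.5, 0.6, 0.3]   # encodes the box u = (0.2, 0.5), v = (0.4, 0.7)
rule = weights_to_rule(w, "class A", ["size", "brightness"])
print(rule)
```

Reading weights off as interval rules like this is what makes each classification auditable: the rule states exactly which attended feature ranges predict the outcome.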

Keywords: Adaptive Resonance Theory; arm and speech movement; category learning; consciousness; deep learning; emotion; explainable AI; visual boundaries and surfaces.


Figures

Figure 1
Circuit diagram of the back propagation model. Each input activity ai in level F1 sends a sigmoid signal Si = f(ai) that is multiplied by learned weights wij on the way to level F2. These LTM-weighted signals are added together at F2 with a bias term θj to define xj. A sigmoid signal Sj = f(xj) then generates outputs from F2 that activate two pathways. One pathway inputs to a Differentiator. The other pathway gets multiplied by adaptive weights wjk on the way to level F3. At level F3, the weighted signals are added together with a bias term θk to define xk. A sigmoid signal Sk = f(xk) from F3 defines the Actual Output of the system. This Actual Output Sk is subtracted from a Target Output bk via a back-coupled error correction step. The difference bk – Sk is then multiplied by the term f′(xk) that is computed at the Differentiator from level F3. One function of the Differentiator step is to ensure that the activities and weights remain in a bounded range, because if xk grows too large, then f′(xk) approaches zero. The net effect of these operations is to compute the Error δk = f′(xk)(bk – Sk) that sends a top-down output signal to the level just below it. On the way, each δk is multiplied by the bottom-up learned weights wjk at F3. These weights reach the pathways that carry δk via the process of weight transport. Weight transport is clearly a non-local operation relative to the network connections that carry locally computed signals. All the δk are multiplied by the transported weights wjk and added. This sum is multiplied by another Differentiator term f′(xj) from level F2 to keep the resultant product δj bounded. δj is then back-coupled to adjust all the weights wij in pathways from level F1 to F2 [figure reprinted and text adapted with permission from Carpenter (1989)].
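The caption's forward and backward computations, including the non-local weight-transport step, can be written out directly. This is a minimal sketch in the caption's notation, assuming a tiny three-level network; the layer sizes, random weights, learning rate, and target vector are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

f = lambda x: 1.0 / (1.0 + np.exp(-x))   # sigmoid signal function
df = lambda x: f(x) * (1.0 - f(x))       # its derivative (the Differentiator)

# Hypothetical sizes for levels F1 -> F2 -> F3
n_i, n_j, n_k = 4, 3, 2
w_ij = rng.normal(size=(n_i, n_j)); theta_j = np.zeros(n_j)
w_jk = rng.normal(size=(n_j, n_k)); theta_k = np.zeros(n_k)

a = rng.random(n_i)           # input activities a_i at F1
b = np.array([1.0, 0.0])      # target output b_k
lr = 0.2                      # learning rate

# Forward pass, in the caption's notation
S_i = f(a)
x_j = S_i @ w_ij + theta_j; S_j = f(x_j)
x_k = S_j @ w_jk + theta_k; S_k = f(x_k)
err_before = float(np.sum((b - S_k) ** 2))

# Backward pass: Error delta_k = f'(x_k) (b_k - S_k)
delta_k = df(x_k) * (b - S_k)
# Weight transport: delta_j reuses the same bottom-up weights w_jk,
# a non-local operation relative to the network's connections
delta_j = df(x_j) * (delta_k @ w_jk.T)

# Back-coupled weight adjustments once the deltas are available
w_jk += lr * np.outer(S_j, delta_k)
w_ij += lr * np.outer(S_i, delta_j)

# One update step reduces the squared output error
S_k_new = f(f(f(a) @ w_ij + theta_j) @ w_jk + theta_k)
err_after = float(np.sum((b - S_k_new) ** 2))
```

The two lines computing `delta_j` and the `w_ij` update are where the locality problem lives: `w_jk` appears both in the bottom-up forward pass and in the top-down error pathway.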
Figure 2
The ART Matching Rule circuit enables bottom-up inputs to fire their target cells, top-down expectations to provide excitatory modulation of cells in their on-center while inhibiting cells in their off-surround, and a convergence of bottom-up and top-down signals to generate an attentional focus at matched cells while continuing to inhibit unmatched cells in the off-surround [adapted with permission from Grossberg (2017b)].
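For binary feature patterns, the matching behavior in the figure reduces to a simple combination rule. The sketch below is illustrative, with hypothetical patterns; real ART circuits implement this with shunting on-center off-surround dynamics, not Boolean operations.

```python
import numpy as np

def art_matching_rule(bottom_up, top_down=None):
    """Sketch of the ART Matching Rule for binary feature patterns.

    Bottom-up input alone can fire its target cells.  A top-down
    expectation alone is only modulatory: it primes cells but cannot
    fire them.  When both are active, only cells receiving bottom-up
    input AND top-down support stay active (the attentional focus);
    mismatched cells are inhibited by the off-surround.
    """
    bottom_up = np.asarray(bottom_up, dtype=bool)
    if top_down is None:
        return bottom_up                 # no expectation: input fires its cells
    top_down = np.asarray(top_down, dtype=bool)
    return bottom_up & top_down          # match = intersection of the two patterns

I = [1, 1, 0, 1]   # hypothetical bottom-up input
E = [1, 0, 0, 1]   # hypothetical top-down expectation
focus = art_matching_rule(I, E)
```

Note that `art_matching_rule(E * 0, E)` style priming yields no activity at all: a top-down expectation by itself cannot fire cells, which is the modulatory on-center of the figure.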
Figure 3
The ART hypothesis testing and learning cycle whereby bottom-up input patterns that are sufficiently mismatched by their top-down expectations can drive hypothesis testing and memory search leading to discovery of recognition categories that can match the bottom-up input pattern well enough to trigger resonance and learning. See the text for details [adapted with permission from Carpenter and Grossberg (1988)].
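The search cycle in the figure can be sketched algorithmically for binary patterns. This is a simplified reading of ART 1, not the published equations: the choice function, fast learning rule, and test patterns below are illustrative, and the vigilance parameter rho gates reset versus resonance.

```python
import numpy as np

def art_search_cycle(I, weights, rho):
    """One ART hypothesis-testing cycle (simplified, binary ART 1).

    Categories are tested in order of choice strength.  Each candidate
    reads out its top-down expectation; if the match ratio
    |I AND w| / |I| falls below the vigilance rho, the category is
    reset and the search moves on.  A good-enough match triggers
    resonance and prototype learning.  Returns the chosen category
    index and the list of categories reset along the way.
    """
    I = np.asarray(I, dtype=float)
    resets = []
    order = np.argsort([-np.minimum(I, w).sum() / (0.5 + w.sum()) for w in weights])
    for j in order:
        if np.minimum(I, weights[j]).sum() / I.sum() >= rho:
            weights[j] = np.minimum(I, weights[j])   # resonance and learning
            return j, resets
        resets.append(int(j))                        # mismatch: reset, keep searching
    weights.append(I.copy())                         # all reset: recruit a new category
    return len(weights) - 1, resets

# Hypothetical committed categories, then a high-vigilance search
weights = [np.array([1.0, 1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0, 1.0])]
j, resets = art_search_cycle([1, 1, 1, 0], weights, rho=0.9)
```

At rho = 0.9 both committed categories mismatch and are reset in turn, so an uncommitted node is recruited; at a lower rho the first category would resonate instead.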
Figure 4
When a good enough match occurs between a bottom-up input pattern and top-down expectation, a feature-category resonance is triggered that synchronizes, amplifies, and prolongs the STM activities of the cells that participate in the resonance, while also selecting an attentional focus and triggering learning in the LTM traces in the active bottom-up adaptive filter and top-down expectation pathways to encode the resonating attended data [adapted with permission from Grossberg (2017b)].
Figure 5
When the ART Matching Rule is eliminated by deleting an ART circuit's top-down expectations from the ART 1 model, the resulting competitive learning network experiences catastrophic forgetting even if it tries to learn any of arbitrarily many lists consisting of just four input vectors A, B, C, and D when they are presented repeatedly in the order ABCAD, assuming that the input vectors satisfy the constraints shown in the figure [adapted with permission from Carpenter and Grossberg (1987)].
Figure 6
These computer simulations illustrate how (A) unstable learning and (B) stable learning occur in response to a particular sequence of input vectors A, B, C, D when they are presented repeatedly in the order ABCAD to an ART 1 model. Unstable learning with catastrophic forgetting of the category that codes vector A occurs when no top-down expectations exist, as illustrated by its periodic recoding by categories 1 and 2 on each learning trial. See the text for details [adapted with permission from Carpenter and Grossberg (1987)].
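The same kind of instability is easy to reproduce outside ART 1. The sketch below is an analogous demonstration, not the paper's simulation: a single linear unit trained by the delta rule first learns a hypothetical input A, after which continued training on an overlapping input D erodes the learned response to A, with no top-down matching to protect the old memory.

```python
import numpy as np

# Hypothetical overlapping inputs: D shares A's first feature
A, D = np.array([1.0, 0.0]), np.array([1.0, 0.9])
w, lr = np.zeros(2), 0.1

def train(w, x, target, steps=200):
    """Delta-rule (gradient) training of one linear unit on one pattern."""
    for _ in range(steps):
        w = w + lr * (target - w @ x) * x    # local error-correcting update
    return w

w = train(w, A, 1.0)
resp_A_before = w @ A    # ~1.0: A has been learned
w = train(w, D, 0.0)     # later learning on the overlapping pattern D ...
resp_A_after = w @ A     # ... overwrites the shared weight and erodes A
```

Because A and D share a feature, minimizing D's error drags down the very weight that coded A: the stability-plasticity tradeoff that the ART Matching Rule in Figures 2-4 is designed to resolve.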
Figure 7
These computer simulations show how the alphabet A, B, C, … is learned by the ART 1 model when vigilance is chosen to equal (A) 0.5, or (B) 0.8. Note that more categories are learned in (B) and that their learned prototypes more closely represent the letters that they categorize. Thus, higher vigilance leads to the learning of more concrete categories. See the text for details [reprinted with permission from Carpenter and Grossberg (1987)].
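The effect of vigilance on category granularity can be illustrated with a simplified ART 1 run. The patterns and choice function below are hypothetical stand-ins, not the paper's alphabet simulation; the point is only that raising rho splits the data into more, tighter categories.

```python
import numpy as np

def run_art1(patterns, rho):
    """Cluster binary patterns with a simplified ART 1 network and
    return the number of categories learned at vigilance rho."""
    weights = []
    for I in map(np.asarray, patterns):
        order = np.argsort([-np.minimum(I, w).sum() / (0.5 + w.sum())
                            for w in weights])
        for j in order:
            if np.minimum(I, weights[j]).sum() / I.sum() >= rho:
                weights[j] = np.minimum(I, weights[j])  # resonance: refine prototype
                break
        else:
            weights.append(I.astype(float))             # every category reset: add one
    return len(weights)

# Hypothetical binary "letter" patterns
patterns = [[1, 1, 1, 0, 0],
            [1, 1, 0, 0, 0],
            [0, 0, 1, 1, 1],
            [0, 0, 0, 1, 1],
            [1, 1, 1, 1, 0]]
n_low, n_high = run_art1(patterns, 0.5), run_art1(patterns, 0.8)
```

At rho = 0.5 the last pattern is absorbed into an existing coarse category; at rho = 0.8 the same pattern fails the match test and recruits a category of its own.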
Figure 8
The fuzzy ARTMAP architecture can learn recognition categories in both ARTa and ARTb by unsupervised learning, as well as an associative map via the map field from ARTa to ARTb by supervised learning. See the text for details [adapted with permission from Carpenter et al. (1992)].
Figure 9
(A) A prediction from ARTa to ARTb can be made if the analog match between bottom-up and top-down patterns exceeds the current vigilance value. (B) If a mismatch occurs between the prediction at ARTb and the correct output pattern, then a match tracking signal can increase vigilance just enough to drive hypothesis testing and memory search for a better-matching category at ARTa. Match tracking hereby sacrifices the minimum amount of generalization necessary to correct the predictive error [adapted with permission from Carpenter and Grossberg (1992)].
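Match tracking can be sketched as a loop that raises vigilance just past the match ratio of each incorrectly predicting category. This is a loose simplification of fuzzy ARTMAP, not the published architecture: the patterns and labels are hypothetical, and a plain dictionary stands in for the map field.

```python
import numpy as np

def match_tracking(I, weights, category_to_label, correct_label,
                   rho_baseline, eps=1e-3):
    """Simplified ARTMAP match tracking on a binary input pattern.

    The chosen ARTa category predicts a label through the map field.
    On a predictive mismatch, vigilance is raised just above the
    offending category's match ratio, forcing the next search pass to
    find (or create) a finer category.  Raising rho only as far as
    needed sacrifices the minimum amount of generalization.
    """
    I = np.asarray(I, dtype=float)
    rho = rho_baseline
    while True:
        # categories whose match meets the current vigilance, with choice strength
        candidates = [(np.minimum(I, w).sum() / (0.5 + w.sum()), j)
                      for j, w in enumerate(weights)
                      if np.minimum(I, w).sum() / I.sum() >= rho]
        if not candidates:
            weights.append(I.copy())                    # recruit a new category
            category_to_label[len(weights) - 1] = correct_label
            return len(weights) - 1, rho
        j = max(candidates)[1]
        match = np.minimum(I, weights[j]).sum() / I.sum()
        if category_to_label.get(j) == correct_label:   # correct prediction: resonate
            weights[j] = np.minimum(I, weights[j])
            return j, rho
        rho = match + eps                               # match tracking: raise vigilance

# One hypothetical committed category that predicts the wrong label
weights = [np.array([1.0, 1.0, 0.0, 0.0])]
labels = {0: "X"}
j, rho = match_tracking([1, 1, 1, 0], weights, labels, "Y", rho_baseline=0.5)
```

Here the committed category matches at ratio 2/3 and wrongly predicts "X", so vigilance is tracked up past 2/3 and a new category is recruited for label "Y".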
Figure 10
Spatially abutting and collinear boundary contour (BC) and feature contour (FC) signals in a Filling-In-DOmain, or FIDO, can trigger depth-selective filling-in of the color carried by the FC signal in that FIDO. See the text for details [adapted with permission from Grossberg and Zajac (2017)].
Figure 11
(A) Object categories are activated by visual or gustatory inputs in anterior inferotemporal (ITA) and rhinal (RHIN) cortices, respectively. Value categories represent the value of anticipated outcomes on the basis of current hunger and satiety inputs in amygdala (AMYG) and lateral hypothalamus (LH). Object-value categories occur in the lateral orbitofrontal (ORB) cortex, for visual stimuli, and the medial orbitofrontal (MORB) cortex, for gustatory stimuli. They use the learned value of perceptual stimuli to choose the most valued stimulus in the current context. The reward expectation filter in the basal ganglia detects the omission or delivery of rewards using a circuit that spans ventral striatum (VS), ventral pallidum (VP), striosomal delay (SD) cells in the ventral striatum, the pedunculopontine nucleus (PPTN), and midbrain dopaminergic neurons of the substantia nigra pars compacta/ventral tegmental area (SNc/VTA). (B) Reciprocal excitatory signals from hypothalamic drive-taste cells to amygdala value category cells can drive the learning of a value category that selectively fires in response to a particular hypothalamic homeostatic activity pattern. See the text for details [adapted with permission from Dranias et al. (2008)].
Figure 12
The Vector Integration to Endpoint, or VITE, model of Bullock and Grossberg (1988) realizes the Three S's of arm movement control: Synergy, Synchrony, and Speed. See the text for details [adapted with permission from Bullock and Grossberg (1988)].
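Under the usual reading of the model, VITE's kinematic core computes a difference vector D = T − P between a target position T and present position P, gates it by a GO signal G(t), and integrates dP/dt = G·D. The sketch below is illustrative, with hypothetical gains and a two-component synergy: because the scalar G multiplies every component equally, the components stay synchronized while G's amplitude sets movement speed.

```python
import numpy as np

def vite_trajectory(present, target, go_amplitude=1.0, dt=0.01, steps=400):
    """Integrate the VITE kinematic law dP/dt = G(t) * (T - P).

    D = T - P is the difference vector, G(t) a ramping GO signal.
    Scaling go_amplitude changes movement speed but not direction,
    so all components of a synergy reach the target together.
    """
    P = np.asarray(present, dtype=float)
    T = np.asarray(target, dtype=float)
    path = [P.copy()]
    for step in range(steps):
        G = go_amplitude * (step * dt)   # slowly ramping GO signal
        D = T - P                        # difference vector in motor coordinates
        P = P + dt * G * D               # gated integration toward the target
        path.append(P.copy())
    return np.array(path)

# Hypothetical two-joint synergy moving from the origin to (1, 2)
traj = vite_trajectory([0.0, 0.0], [1.0, 2.0])
```

The ramping GO signal also reproduces the bell-shaped velocity profiles of the simulations in Figure 13: velocity is small early (G small), peaks mid-movement, and falls as D shrinks.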
Figure 13
(Top half) Neurophysiological data of vector cell responses in motor cortex. (Bottom half) VITE model simulations of a simple arm movement in which the model's difference vector D simulates the data as an emergent property of network interactions [data of Georgopoulos et al. (1982) and Bullock and Grossberg (1988) are reproduced with permission. Figure as a whole is reprinted with permission from Grossberg (2020)].
Figure 14
The DIRECT and DIVA models have homologous circuits to learn and control motor-equivalent reaching and speaking. Tool use and coarticulation are among the resulting useful motor-equivalent properties [reprinted with permission from Grossberg (2020)].

References

    1. Amari S. I. (1972). Characteristics of random nets of analog neuron-like elements. IEEE Trans. Syst. Man Cybern. 2, 643–657. 10.1109/TSMC.1972.4309193 - DOI
    2. Bellmann A., Meuli R., Clarke S. (2001). Two types of auditory neglect. Brain 124, 676–687. 10.1093/brain/124.4.676 - DOI - PubMed
    3. Bregman A. S. (1990). Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, MA: MIT Press.
    4. Brown J. W., Bullock D., Grossberg S. (1999). How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues. J. Neurosci. 19, 10502–10511. 10.1523/JNEUROSCI.19-23-10502.1999 - DOI - PMC - PubMed
    5. Brown J. W., Bullock D., Grossberg S. (2004). How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades. Neural Netw. 17, 471–510. 10.1016/j.neunet.2003.08.006 - DOI - PubMed