An in-silico framework for modeling optimal control of neural systems

Bodo Rueckauer et al. Front Neurosci. 2023 Mar 8;17:1141884. doi: 10.3389/fnins.2023.1141884. eCollection 2023.

Abstract

Introduction: Brain-machine interfaces have reached an unprecedented capacity to measure and drive activity in the brain, allowing restoration of impaired sensory, cognitive, or motor function. Classical control theory is pushed to its limits when designing control laws suitable for large-scale, complex neural systems. This work proposes a scalable, data-driven, unified approach to studying brain-machine-environment interaction using established tools from dynamical systems, optimal control theory, and deep learning.

Methods: To unify the methodology, we define the environment, neural system, and prosthesis in terms of differential equations with learnable parameters; in the discrete-time case, these effectively reduce to recurrent neural networks. Drawing on tools from optimal control, we describe three ways to train the system: direct optimization of an objective function, oracle-based learning, and reinforcement learning. These approaches are adapted to different assumptions about knowledge of the system equations, linearity, differentiability, and observability.
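
To make the discrete-time reduction concrete, here is a minimal sketch of the first training mode, direct optimization of an objective by backpropagation through time. The environment (a double-integrator "particle"), the RNN architecture, and all hyperparameters below are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch (not the authors' code): an RNN "neural system" in a
# discrete-time agent-environment loop, trained by direct optimization
# of a quadratic cost via backpropagation through time.
import torch

dt = 0.1
A = torch.eye(2) + dt * torch.tensor([[0.0, 1.0], [0.0, 0.0]])  # particle: position, velocity
B = dt * torch.tensor([[0.0], [1.0]])                           # force input

rnn = torch.nn.RNNCell(input_size=2, hidden_size=32)  # sensory -> association
readout = torch.nn.Linear(32, 1)                      # association -> motor
params = list(rnn.parameters()) + list(readout.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(2000):
    x = torch.randn(1, 2)          # random initial particle state
    h = torch.zeros(1, 32)         # RNN hidden state
    loss = torch.tensor(0.0)
    for t in range(50):            # unroll the agent-environment loop
        h = rnn(x, h)              # neural system observes the state
        u = readout(h)             # motor command
        x = x @ A.T + u @ B.T      # environment: x_{t+1} = A x_t + B u_t
        loss = loss + (x**2).sum() + 0.01 * (u**2).sum()  # LQR-style cost
    opt.zero_grad()
    loss.backward()                # backpropagation through time
    opt.step()
```

The same loop structure would accommodate the other two training modes: an oracle-based variant would regress the motor output onto a known optimal controller, and a reinforcement-learning variant would replace the differentiable cost with a reward signal.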

Results: We apply the proposed framework to train an in-silico neural system to perform tasks in a linear and a nonlinear environment, namely particle stabilization and pole balancing. After training, this model is perturbed to simulate impairment of sensory and motor function. We show how a prosthetic controller can be trained to restore the behavior of the neural system under increasing levels of perturbation.
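
As a rough illustration of the perturbation-and-restoration step, one might freeze the trained neural system, attenuate its sensory weights, and optimize only a prosthetic controller that injects a corrective signal into the hidden state. The perturbation model and prosthesis architecture below are assumptions for illustration:

```python
# Minimal sketch (assumptions throughout): freeze a perturbed RNN
# "neural system" and train only a prosthetic controller on the task cost.
import torch

dt, T = 0.1, 50
A = torch.eye(2) + dt * torch.tensor([[0.0, 1.0], [0.0, 0.0]])
B = dt * torch.tensor([[0.0], [1.0]])

rnn = torch.nn.RNNCell(2, 32)           # pretrained in practice; random here
readout = torch.nn.Linear(32, 1)
for p in list(rnn.parameters()) + list(readout.parameters()):
    p.requires_grad_(False)             # neural system is fixed
with torch.no_grad():
    rnn.weight_ih.mul_(0.2)             # simulate 80% sensory impairment

prosthesis = torch.nn.Linear(32, 32)    # records from and stimulates the RNN
opt = torch.optim.Adam(prosthesis.parameters(), lr=1e-3)

for step in range(2000):
    x, h = torch.randn(1, 2), torch.zeros(1, 32)
    loss = torch.tensor(0.0)
    for t in range(T):
        h = rnn(x, h) + prosthesis(h)   # corrective stimulation
        u = readout(h)
        x = x @ A.T + u @ B.T
        loss = loss + (x**2).sum() + 0.01 * (u**2).sum()
    opt.zero_grad()
    loss.backward()                     # gradients reach only the prosthesis
    opt.step()
```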

Discussion: We expect that the proposed framework will enable rapid and flexible synthesis of control algorithms for neural prostheses, reducing the need for in-vivo testing. We further highlight implications for sparse placement of prosthetic sensor and actuator components.

Keywords: control theory; dynamical systems; neural prosthesis; neurotechnology; reinforcement learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Classical agent-environment loop, where a neural system interacts with its environment by observing its states and applying feedback control. Here, the deterministic case is shown for simplicity; see Equation (1) for the stochastic form.
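
In generic form, such a deterministic loop couples the environment state, the observation, and the feedback control; this is an illustrative form only, and the paper's Equation (1) additionally includes a stochastic term not shown here:

```latex
% Generic deterministic agent-environment loop (illustrative).
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), u(t)\bigr) && \text{(environment dynamics)} \\
y(t)       &= h\bigl(x(t)\bigr)       && \text{(observation by the neural system)} \\
u(t)       &= \pi\bigl(y(t)\bigr)     && \text{(feedback control)}
\end{aligned}
```
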
Figure 2
Both the environment and the neural system are modeled as dynamical systems using ordinary or stochastic differential equations. Here, the neural system is represented by an RNN with a sensory input component, a motor output component, and an association area which provides the input-output mapping. The system can be trained using backpropagation through time (Werbos, 1990).
Figure 3
Framework for modeling a perturbed neural system interacting with its environment (left) and restoring its performance by training a secondary controller (right).
Figure 4
Learning a control policy in the neural system. The panels on the left illustrate the evolution of the environment in state space. The panels on the right show the training curves of the RNN neural system. The rows illustrate the training methods discussed in Section 2.5, applied to the particle stabilization and pendulum balancing problems. The last row also shows the final rewards. In all four cases, the RNN system succeeds in learning a control policy to solve the task.
Figure 5
Restoring particle stabilization performance by training a prosthetic controller directly on the LQR cost. The panels on the left show example trajectories at increasing perturbation strengths (from left to right). The rightmost column compares the loss of the system after training the prosthetic controller against the uncontrolled and unperturbed baselines. The three rows illustrate the case of an impaired sensory, association, or motor population.
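
For reference, the LQR cost mentioned here has the standard quadratic form; the particular weighting matrices Q and R used in the paper are not given in this abstract:

```latex
% Standard continuous-time LQR cost (illustrative form).
J = \int_{0}^{T} \Bigl( x(t)^{\top} Q\, x(t) + u(t)^{\top} R\, u(t) \Bigr)\, dt,
\qquad Q \succeq 0,\; R \succ 0
```
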
Figure 6
Restoring particle stabilization performance by training a prosthetic controller using RL. See caption of Figure 5 for a more detailed description.
Figure 7
Restoring performance of the neural system on the pole balancing task by training a prosthetic controller using RL. See caption of Figure 5 for a more detailed description.
Figure 8
Effect of reducing controllability and observability. The loss is shown when the prosthesis can record from and stimulate only a fraction of neurons in the neural system. The prosthesis restores the performance of the unperturbed neural system when about 10% of the neurons are accessible. This number approximately matches the lower bound estimated from the controllability and observability Gramians (indicated by arrows).
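
A Gramian-based bound of this kind can be estimated along the following lines for a linear(ized) system; the matrices below are illustrative stand-ins, not the paper's trained network:

```python
# Minimal sketch of the Gramian computation behind the Figure 8 bound,
# for an illustrative stable linear system (A, B, C).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n = 50                                   # number of neurons (illustrative)
A = rng.normal(size=(n, n)) / np.sqrt(n) - 2.0 * np.eye(n)  # stable dynamics
B = np.eye(n)[:, :5]                     # prosthesis stimulates 5 neurons
C = np.eye(n)[:10, :]                    # prosthesis records from 10 neurons

# Controllability Gramian Wc solves: A Wc + Wc A^T + B B^T = 0
Wc = solve_continuous_lyapunov(A, -B @ B.T)
# Observability Gramian Wo solves:  A^T Wo + Wo A + C^T C = 0
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)

# Near-zero eigenvalues flag state directions that are hard to steer
# (Wc) or to infer from recordings (Wo), limiting restoration at low
# sensor/actuator coverage.
print("min eig Wc:", np.linalg.eigvalsh(Wc).min())
print("min eig Wo:", np.linalg.eigvalsh(Wo).min())
```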
