PLoS One. 2014 Nov 12;9(11):e112265.
doi: 10.1371/journal.pone.0112265. eCollection 2014.

Adaptive robotic control driven by a versatile spiking cerebellar network

Claudia Casellato et al. PLoS One. 2014.

Erratum in

Abstract

The cerebellum is involved in a large number of different neural processes, especially in associative learning and in fine motor control. To develop a comprehensive theory of sensorimotor learning and control, it is crucial to determine the neural basis of coding and plasticity embedded into the cerebellar neural circuit and how they are translated into behavioral outcomes in learning paradigms. Learning has to be inferred from the interaction of an embodied system with its real environment, and the same cerebellar principles derived from cell physiology have to be able to drive a variety of tasks of different nature, calling for complex timing and movement patterns. We have coupled a realistic cerebellar spiking neural network (SNN) with a real robot and challenged it in multiple diverse sensorimotor tasks. Encoding and decoding strategies based on neuronal firing rates were applied. Adaptive motor control protocols with acquisition and extinction phases have been designed and tested, including an associative Pavlovian task (eye blinking classical conditioning), a vestibulo-ocular task and a perturbed arm reaching task operating in closed loop. The SNN processed mossy fiber inputs in real time as arbitrary contextual signals, irrespective of whether they conveyed a tone, a vestibular stimulus or the position of a limb. A bidirectional long-term plasticity rule implemented at parallel fiber-Purkinje cell synapses modulated the output activity in the deep cerebellar nuclei. In all tasks, the neurorobot learned to adjust timing and gain of the motor responses by tuning its output discharge. It succeeded in reproducing how human biological systems acquire, extinguish and express knowledge of a noisy and changing world. By varying stimulus and perturbation patterns, real-time control robustness and generalizability were validated. The implicit spiking dynamics of the cerebellar model fulfill timing, prediction and learning functions.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Cerebellar SNN.
The computational model applied for creating the cerebellar spiking neural network embedded into the controller of the robotic platform.
Figure 2. Real robot experiments: neurophysiology, robotic set-up and cerebellar controller.
EBCC-like, VOR and upper limb perturbed reaching: the left column shows the typical set-up used in neurophysiological studies, the middle column the corresponding set-up used in our robotic tasks, and the right column the cerebellar network with the task-specific input and output signals. (A), (B) and (C): the EBCC-like Pavlovian task is reproduced on the robotic platform as a collision-avoidance task. The CS onset is based on the distance between the moving robot end-effector and the fixed obstacle placed along the trajectory, detected by the optical tracker. The US is the collision event. The US is fed into the CF pathway and the CS into the MF pathway; the DCNs trigger the conditioned response (anticipated stop). (D), (E) and (F): the VOR is reproduced on the robotic platform by using the second joint of the robotic arm as the head (imposed rotation) and the third joint (determining the orientation of the second link, on which the green laser is placed) as the eye. The misalignment between the gaze direction (i.e. the second link orientation) and the environmental target to be looked at (held and, when required, moved by another robotic device) is computed through geometric equations from the optical tracker recording. The image slip is fed into the CF pathway and the vestibular stimulus about the head state into the MF pathway; the DCNs modulate the compensatory eye motion. (G), (H) and (I): the perturbed reaching is reproduced on the robotic platform by applying a viscous force field to the moving robotic arm by means of the other robotic device attached at its end-effector. The joint error is fed into the CF pathway and the desired plan into the MF pathway; the DCNs modulate the anticipatory corrective torque.
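The event logic of the collision-avoidance task in panels (A) to (C) can be sketched as follows. This is only an illustration: the function name `trial_events`, the sample-index time base, and the threshold values are assumptions, not taken from the paper.

```python
def trial_events(distances, cs_threshold, collision_distance=0.0):
    """Return (cs_onset, us_onset) as sample indices, or None if not reached.

    distances: end-effector-to-obstacle distance at each sample
               (hypothetical units; the paper reads this from an optical tracker).
    cs_threshold: distance at which the CS starts (assumed value).
    collision_distance: distance treated as a collision, i.e. the US (assumed).
    """
    cs_onset = next((i for i, d in enumerate(distances) if d <= cs_threshold), None)
    us_onset = next((i for i, d in enumerate(distances) if d <= collision_distance), None)
    return cs_onset, us_onset


# The end-effector approaches the obstacle one unit per sample.
events = trial_events([5, 4, 3, 2, 1, 0], cs_threshold=3)
```

In this sketch the gap between the two indices plays the role of the inter-stimulus interval; a conditioned response would have to stop the arm inside that gap.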
Figure 3. Cerebellar SNN tuning by EBCC simulations.
Exploration of LTD and LTP parameters in EBCC tests (400 trials of acquisition and 200 trials of extinction). A coarse exploration (first column) with 25 combinations (the centers of each pixel) and then a finer exploration (second column) with 100 combinations (i.e. pixels) of the parameter space were carried out. The impact on learning was quantified by the maximum of the DCN within-trial maxima (A and E), by the number of the first trial in which the CR threshold was exceeded (B and F), by the standard deviation of the DCN maxima during late acquisition (C and G), and by the DCN activity at the end of extinction (D and H). Yellow arrows indicate the optimal directions of these indexes. Green squares in the first column mark the coarse area selected for the finer exploration. Light blue squares in the second column mark the optimal area within which the LTD and LTP parameters were chosen. A red cross in a pixel means that no result was obtained for that combination of parameters, relative to the specific index.
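The coarse-then-fine exploration described in this caption can be sketched as a two-stage grid search. Everything specific here is an assumption for illustration: the 5 x 5 and 10 x 10 grid shapes (the caption gives only 25 and 100 combinations), the parameter ranges, and `toy_score`, which stands in for a full EBCC simulation scored by the four indexes.

```python
from itertools import product


def pixel_centers(lo, hi, n):
    """Centers of n equal-width pixels covering the interval [lo, hi]."""
    step = (hi - lo) / n
    return [lo + step * (i + 0.5) for i in range(n)]


def explore(score, ltd_range, ltp_range, n):
    """Score every (LTD, LTP) pair on an n x n grid and return the best pair."""
    pairs = product(pixel_centers(*ltd_range, n), pixel_centers(*ltp_range, n))
    return min(pairs, key=lambda pair: score(*pair))


def toy_score(ltd, ltp):
    # Hypothetical cost surface; lower is better. Not the paper's indexes.
    return (ltd - 0.3) ** 2 + (ltp - 0.7) ** 2


# Stage 1: coarse exploration, 25 combinations over arbitrary unit ranges.
coarse = explore(toy_score, (0.0, 1.0), (0.0, 1.0), 5)

# Stage 2: 100 combinations inside the coarse pixel that scored best.
half = 0.1  # half the width of one coarse pixel
fine = explore(
    toy_score,
    (coarse[0] - half, coarse[0] + half),
    (coarse[1] - half, coarse[1] + half),
    10,
)
```

The paper selects an area using several indexes at once, so a single scalar score is a simplification.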
Figure 4. Granular layer: spatio-temporal activity.
(A) Similarity indexes between pairs of instantaneous patterns in the GR layer during a 400-ms CS sent as 50 Hz random activity on the 20 MFs. The index values are shown in grey scale (black = 0, white = 1). The darker the matrix, the more decorrelated the activity patterns. (B) Raster plot of the 1500 GR cells during the CS.
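A pairwise similarity matrix of the kind shown in panel (A) can be sketched as below. The caption does not define the index, so the normalized dot product used here is an assumption, as are the toy binary patterns (one list per time bin, one entry per cell).

```python
import math


def similarity(a, b):
    """Normalized dot product of two activity patterns (assumed index)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a silent pattern is treated as dissimilar to everything
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def similarity_matrix(patterns):
    """Similarity between every pair of instantaneous patterns."""
    return [[similarity(p, q) for q in patterns] for p in patterns]


# Four toy time bins over four cells (1 = cell fired in that bin).
patterns = [[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0]]
matrix = similarity_matrix(patterns)
```

With non-negative activity the index stays between 0 and 1, matching the grey scale of the figure; off-diagonal values near 0 correspond to the dark, decorrelated regions.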
Figure 5. EBCC in simulations: SNN working.
The first test of delay-EBCC was implemented with ISI = 300 ms, a 400-ms CS and a 100-ms US; each repetition lasted 500 ms. The protocol consisted of 400 repetitions of acquisition (CS-US pairs) and 200 of extinction (CS alone). The left column shows raster plots of the network activity (excluding the 1500 GRs) for three trials in different phases of the learning process: early acquisition, late acquisition and late extinction. Aligned on the right is the activity of each cell population, computed as the mean of the active cells' instantaneous firing rates (spike counting within a sliding 25-ms time window, followed by 50-ms smoothing). In all trials, the CS-related MF firing rate was 39±8 Hz. The IOs fired at 10±3 Hz in trials where no response was generated ahead of the US onset (A), whereas the IO firing rate dropped to 5±2 Hz when a CR anticipated the US onset (B). (A) At the beginning of acquisition (1st repetition), PCs were spontaneously active, with an overall firing rate during the CS of 81±49 Hz, supplying tonic inhibition to the DCNs. (B) The activity pattern of Purkinje cells was altered during conditioning: the firing rate decreased in a time-dependent manner (68±58 Hz) because of a temporally specific LTD at PF-PC connections. Consequently, the DCN activity increased (8±6 Hz), exceeding the threshold before the US onset, and a CR was produced (black star) (380th repetition). (C) After some trials in which the network output still produced CRs even though only the CS was presented, complete extinction, driven by the LTP mechanism, restored the PC activity and hence re-inhibited the DCNs (580th repetition).
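The firing-rate estimate named in this caption (spike counting in a 25-ms moving window, then 50-ms smoothing) can be sketched for a single cell as follows. The 1-ms sampling step, the trailing window, and the centered moving average are assumptions; the paper additionally averages across the active cells of each population.

```python
def sliding_rate(spike_times_ms, duration_ms, window_ms=25, step_ms=1):
    """Spike count in a trailing window, converted to Hz, sampled every step_ms."""
    rates = []
    for t in range(0, duration_ms, step_ms):
        count = sum(1 for s in spike_times_ms if t - window_ms < s <= t)
        rates.append(count * 1000.0 / window_ms)
    return rates


def moving_average(values, width):
    """Centered moving average; samples near the edges use a shorter span."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# A regular 100 Hz spike train over one 500-ms repetition.
spikes = list(range(0, 500, 10))
rate = moving_average(sliding_rate(spikes, 500), 50)
```

Away from the edges of the trial, the smoothed estimate for this regular train settles close to 100 Hz, while the raw window count alternates between two and three spikes.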
Figure 6. EBCC in simulations: motor response generation.
3D plots of PC activity (A) and of DCN activity (B) over time and trials (one in every 10 repetitions, for clarity). The activity is computed as the firing rate in a sliding 25-ms time window, averaged across all cells of each population. (C) PC firing rate, averaged within each trial (during CS-related MF excitation). The whiskers represent the standard deviations. One in every 10 repetitions is shown for clarity. (D) Percentage of CRs along acquisition and extinction trials, computed as the percentage of CR occurrences within consecutive blocks of 10 trials each. Both the acquisition and the extinction phase were fitted with the best least-squares sigmoid curves, i.e. the ones minimizing the residual error (root mean square error: 6.091% for the acquisition phase, 5.758% for the extinction phase). The vertical dashed line marks the transition between the acquisition and extinction phases. (E) CR anticipation, computed as the mean within blocks of 100 trials each. The whiskers represent the standard deviations of the CR latency within each block. The vertical dashed line marks the transition between the acquisition and extinction phases.
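The block-wise CR percentage of panel (D) and the residual error of a sigmoid fit can be sketched as below. The three-parameter sigmoid and its parameter names are assumptions, since the caption does not give the functional form, and the least-squares search itself is left to whichever optimizer is preferred.

```python
import math


def cr_percentage(cr_flags, block=10):
    """Percentage of trials with a CR in consecutive blocks of `block` trials."""
    last_start = len(cr_flags) - block + 1
    return [100.0 * sum(cr_flags[i:i + block]) / block
            for i in range(0, last_start, block)]


def sigmoid(x, top, midpoint, slope):
    """Assumed curve shape: rises from 0 to `top` around `midpoint`."""
    return top / (1.0 + math.exp(-slope * (x - midpoint)))


def rmse(observed, predicted):
    """Root mean square error between data and a fitted curve."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy acquisition: no CRs, then CRs on alternate trials, then on every trial.
flags = [False] * 20 + [True, False] * 5 + [True] * 10
pct = cr_percentage(flags)
error = rmse(pct, [sigmoid(i, 100.0, 2.0, 3.0) for i in range(len(pct))])
```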
Figure 7. Plasticity.
(A) Histograms of the PF-PC weights at the end of three trials (1st, 380th and 580th, as in Fig. 5). All weights were initialized at 4 nS. (B) Close-up of PF-PC synaptic weights, showing one in every 100 (288 lines) for clarity, within one repetition (the 1st trial, from 0.0 to 0.5 s), to show the time-dependent changes of each weight. Weights that underwent net LTP during the test are shown in red; synapses that were depressed overall (LTD) are shown in grey.
Figure 8. EBCC in simulations: different ISIs.
One representative trial for each of the other two tests of EBCC, taken in late acquisition when a stable CR generation was achieved (380th trial). (A) ISI = 200 ms (B) ISI = 400 ms. The cerebellar SNN was able to associate different combinations of stimuli.
Figure 9. Associative Pavlovian task in real robot.
(A) and (C) Percentage of CRs along trials (400 acquisition trials and 200 extinction trials), computed as the percentage of CR occurrences within blocks of 10 trials each, for real robot tests with ISI1 and ISI2. For each ISI, the black curve is the average over the 12 tests, and the grey area is the standard deviation. In spite of the uncertainty and variability introduced by the direct interaction with a real environment, the cerebellar SNN was able to progressively learn a CR anticipating the US and to extinguish it. (B) and (D) Raster plot of the recorded network activity (excluding the 1500 GRs) in late acquisition, when stable CR generation had been achieved (380th repetition, from 1520 s to 1520.6 s). The learning process induced a temporally specific LTD at PF-PC connections, thus reducing the PC activity and consequently increasing the DCN activity, which exceeded the threshold just before the US onset. Hence, a CR was detected (black star).
Figure 10. VOR in real robot.
(A) and (C) Gaze angular error (RMS within each trial) during 400 repetitions with a fixed target and 200 repetitions with a moving target, which required a strong reduction of the previously learned VOR. For each HR (HR1 and HR2), the curve is the average over the 12 tests, and the area is the standard deviation. (B) and (D) Raster plot of the recorded network activity (excluding the 1500 GRs) in one repetition (380th repetition) after acquisition had been completed, characterized by a stable and effective compensatory eye movement. There was still an error (negative), but the IOs fired at a very low rate. The head rotation on the 2nd joint (vestibular stimulus) was decoded by the MFs through RBFs. The DCN activity was significant for the negative DCNs, which produced a negative torque applied on the third joint; this produced an eye turn with the same shape as the head turn but the opposite sign, and with an intensity peak of 36 mNm for HR1 and 27 mNm for HR2. Note: there was no MF or PC activity during the last second because that part of the repetition only served to bring the robot steadily back to its initial position, without any cerebellar activation.
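The caption states that the head rotation was mapped onto the MFs through RBFs. A minimal sketch of such a radial-basis-function mapping follows. The Gaussian shape, the normalized angle range, the tuning width, and the 50 Hz peak rate are all assumptions made for illustration; only the count of 20 MFs comes from the Fig. 4 caption.

```python
import math


def rbf_rates(value, centers, width, max_rate_hz):
    """Firing rate of each fiber: a Gaussian bump around its preferred value."""
    return [max_rate_hz * math.exp(-((value - c) ** 2) / (2.0 * width ** 2))
            for c in centers]


# 20 fibers with preferred values evenly spread over a normalized range [-1, 1].
centers = [-1.0 + 2.0 * i / 19 for i in range(20)]

# Rates for a head angle in the middle of the range (hypothetical parameters).
rates = rbf_rates(0.0, centers, width=0.2, max_rate_hz=50.0)
```

Fibers whose preferred value lies near the current head angle fire fastest, so the identity of the active fibers carries the stimulus value.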
Figure 11. Perturbed reaching in real robot.
(A) and (D) Joint angular error (RMS within each trial) during the 650 repetitions: 50 trials without force field, then 400 with a viscous force field (force field c1 and force field c2) and the last 200 again with no force field. The last phase clearly shows the after-effect phenomenon, which occurred when the perturbation was suddenly removed after complete acquisition. For each force field, the curve is the average over the 12 tests, and the area is the standard deviation. (B) and (E) For each force field, Cartesian trajectories on the xz plane in some representative trials: ideal trajectory in dashed green, 1st perturbed trial (51st) in thick red, last perturbed trial (450th) in thin red, 1st trial after the force field was removed (451st) in thick blue, last trial (650th) in thin blue. (C) and (F) Raster plot of the recorded network activity (excluding the 1500 GRs) in one repetition (430th repetition) after acquisition had been completed, characterized by a stable compensation of the constant external perturbation. There was still an error (negative and positive), but the IOs fired at a very low rate. The DCN activity was evident almost only on the negative DCNs, which produced a negative corrective torque on the third joint, able to counterbalance the viscous force fields.
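Two quantities named in this caption can be sketched in a few lines: a viscous (velocity-dependent) force and the per-trial RMS error. The isotropic form F = -c * v is an assumption; the caption only says the field is viscous and does not give its direction or the values of c1 and c2.

```python
import math


def viscous_force(velocity, c):
    """Force on each axis, proportional and opposite to velocity (assumed form)."""
    return [-c * v for v in velocity]


def rms(errors):
    """Root mean square of the error samples recorded within one trial."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical end-effector velocity on the x and z axes, with coefficient 0.5.
force = viscous_force([1.0, -2.0], 0.5)

# Hypothetical joint-angle errors sampled within one trial.
trial_error = rms([3.0, 4.0])
```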
