HFSP J. 2009 Oct;3(5):340-9. doi: 10.2976/1.3240502. Epub 2009 Oct 26.

Initialization and self-organized optimization of recurrent neural network connectivity

Joschka Boedecker et al. HFSP J. 2009 Oct.

Abstract

Reservoir computing (RC) is a recent paradigm in the field of recurrent neural networks. Networks in RC have a sparsely and randomly connected fixed hidden layer, and only output connections are trained. RC networks have recently received increased attention as a mathematical model for generic neural microcircuits, to investigate and explain computations in neocortical columns. Applied to specific tasks, however, their fixed random connectivity leads to significant variation in performance. Few problem-specific optimization procedures are known; such procedures would be important for engineering applications, but also for understanding how networks in biology are shaped to be optimally adapted to the requirements of their environment. We study a general network initialization method using permutation matrices and derive a new unsupervised learning rule based on intrinsic plasticity (IP). The IP-based learning uses only local information, and its aim is to improve network performance in a self-organized way. Using three different benchmarks, we show that networks with permutation matrices for the reservoir connectivity have much more persistent memory than networks initialized with the other methods, yet are also able to perform highly nonlinear mappings. We also show that IP based on sigmoid transfer functions is limited with respect to the output distributions that can be achieved.
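
As a rough illustration of the initialization idea (a minimal NumPy sketch, not the authors' code; the function names, the 10% density, and the exact scaling procedure are assumptions), a permutation matrix gives every reservoir unit exactly one incoming and one outgoing recurrent connection, and because all of its eigenvalues lie on the unit circle, setting the spectral radius amounts to a single multiplication. A conventional sparse random reservoir, by contrast, has to be rescaled by its numerically computed largest eigenvalue.

    import numpy as np

    def permutation_reservoir(n, spectral_radius=0.95, seed=None):
        # Permutation-matrix reservoir: each unit has exactly one incoming
        # and one outgoing recurrent connection. All eigenvalues of a
        # permutation matrix lie on the unit circle, so multiplying by the
        # desired spectral radius rescales the whole spectrum at once.
        rng = np.random.default_rng(seed)
        return spectral_radius * np.eye(n)[rng.permutation(n)]

    def sparse_random_reservoir(n, density=0.1, spectral_radius=0.95, seed=None):
        # Conventional ESN-style initialization: sparse random weights,
        # rescaled so that the largest absolute eigenvalue equals the
        # desired spectral radius.
        rng = np.random.default_rng(seed)
        W = np.where(rng.random((n, n)) < density,
                     rng.uniform(-1.0, 1.0, (n, n)), 0.0)
        return W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))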


Figures

Figure 1. Architecture of an echo state network.
In echo state networks, usually only the connections represented by the dashed lines are trained; all other connections are set up randomly and remain fixed. The recurrent layer is also called a reservoir, analogously to a liquid with fading memory properties. As an example, consider throwing a rock into a pond: the ripples caused by the rock persist for a certain amount of time, and thus information about the event can be extracted from the liquid as long as it has not returned to its single attractor state—the flat surface.
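
The trained dashed connections in Fig. 1 correspond to a linear readout on top of the fixed reservoir. A compact sketch of this setup is given below (an illustration under assumed details: the input scaling, the ridge parameter, and the omission of a washout period are choices made here, not taken from the paper).

    import numpy as np

    class ESN:
        # Minimal echo state network: W_in and W are fixed at initialization,
        # and only the readout W_out is trained (the dashed connections in Fig. 1).
        def __init__(self, n_in, n_res, spectral_radius=0.95, seed=None):
            rng = np.random.default_rng(seed)
            self.W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
            W = rng.uniform(-1.0, 1.0, (n_res, n_res))
            self.W = W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))
            self.W_out = None

        def _states(self, U):
            # Collect reservoir states x(t) = tanh(W x(t-1) + W_in u(t)).
            x = np.zeros(self.W.shape[0])
            X = []
            for u in U:
                x = np.tanh(self.W @ x + self.W_in @ u)
                X.append(x.copy())
            return np.asarray(X)

        def fit(self, U, Y, ridge=1e-6):
            # Ridge regression for the readout; no washout period, for brevity.
            X = self._states(U)
            self.W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                                         X.T @ Y).T
            return self

        def predict(self, U):
            return self._states(U) @ self.W_out.T
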
Figure 2. The MC_k curves for the uniform random input data.
The plot indicates how well the input signal can be reconstructed (MC_k) for increasing delay times k.
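
For orientation, the k-delay memory capacity commonly used in the echo state network literature (Jaeger's definition, restated here as a reminder rather than taken from this paper's notation) is the squared correlation coefficient between the input delayed by k steps and the output of a readout y_k trained to reconstruct it; the total memory capacity sums over all delays:

    MC_k = \frac{\operatorname{cov}^2\bigl(u(t-k),\, y_k(t)\bigr)}
                {\operatorname{var}\bigl(u(t)\bigr)\,\operatorname{var}\bigl(y_k(t)\bigr)},
    \qquad
    MC = \sum_{k=1}^{\infty} MC_k .
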
Figure 3. Plot of the reservoir matrix eigenvalues in the complex plane for a 100-node network (a) in the PMT condition and (b) in the RND condition.
Both matrices have been scaled to have a spectral radius of 0.95.
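
The two spectra can be reproduced with a short script (a sketch; the dense uniform matrix used here for the RND condition is an assumption, as the paper's random reservoirs may be sparse). For the permutation matrix, all scaled eigenvalues sit exactly on the circle of radius 0.95, whereas the eigenvalues of the random matrix fill a disk of roughly that radius.

    import numpy as np
    import matplotlib.pyplot as plt

    def scaled(W, rho=0.95):
        # Rescale so the largest absolute eigenvalue equals rho.
        return W * (rho / np.max(np.abs(np.linalg.eigvals(W))))

    rng = np.random.default_rng(0)
    n = 100
    W_pmt = 0.95 * np.eye(n)[rng.permutation(n)]    # PMT: unit spectral radius by construction
    W_rnd = scaled(rng.uniform(-1.0, 1.0, (n, n)))  # RND-like dense matrix (assumption)

    for W, label in ((W_pmt, "PMT"), (W_rnd, "RND")):
        ev = np.linalg.eigvals(W)
        plt.scatter(ev.real, ev.imag, s=8, label=label)
    plt.gca().set_aspect("equal")
    plt.xlabel("Re"); plt.ylabel("Im")
    plt.legend(); plt.show()
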
Figure 4. Uniform input on the interval [−1;1] and a tanh(⋅) transfer function lead to the output distribution in the histogram.
IP selects a slice of this distribution, as illustrated by the vertical lines. Adapting the gain and bias changes the width and position of the slice.
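
The "slice" picture can be reproduced numerically (a toy sketch; the gain/bias pairs below are arbitrary illustrations, not values learned by IP): changing the gain a stretches or narrows the output histogram of y = tanh(a*x + b), while the bias b shifts it.

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.random.default_rng(0).uniform(-1.0, 1.0, 100_000)  # uniform drive, as in the figure

    # Gain a and bias b are the quantities IP adapts; the values here are arbitrary.
    for a, b in [(1.0, 0.0), (0.5, 0.0), (1.0, 0.5)]:
        plt.hist(np.tanh(a * x + b), bins=100, density=True,
                 histtype="step", label=f"a={a}, b={b}")
    plt.legend(); plt.show()
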
Figure 5. Panels (a)–(c) show the effect of IP learning on a single feedforward neuron.
Panel (a) shows the result of IP learning with a single fermi neuron without self-recurrence and a learning rule for the exponential output distribution. IP successfully produces the desired output distribution. In panels (b) and (c), we see the effect of IP learning on a single tanh neuron without self-recurrence, trained with a learning rule for a Gaussian and a Laplace output distribution, respectively. In both cases, IP learning fails to achieve the desired result: the best it can do is to drive the neuron to a uniform output distribution, which has the smallest distance (for the given transfer function) to the desired distributions. Panel (d) shows the effect of IP_GAUSS for a single self-recurrent tanh unit. The achieved output distribution is significantly more Gaussian-shaped than without the self-recurrence. The effect is amplified in a network where the neurons receive additional inputs with similar distributions. All units were trained using 100 000 training steps and uniformly distributed input data on the interval [−1;1].
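
To see why the self-recurrent connection in panel (d) helps, one can compare the output histograms of a driven tanh unit with and without feedback (a toy sketch with fixed, arbitrary weights; the actual IP_GAUSS adaptation of gain and bias is not reproduced here). The recurrent term acts as an additional input stream, which, by the same argument the caption makes for additional network inputs, tends to make the pre-activation, and hence the output, more bell-shaped.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    u = rng.uniform(-1.0, 1.0, 100_000)

    # Fixed, arbitrary parameters; IP_GAUSS would adapt gain and bias online,
    # which is not reproduced here.
    a, b, w_rec = 1.0, 0.0, 0.5

    y_ff = np.tanh(a * u + b)      # feedforward unit, as in panels (b), (c)
    y_rec = np.zeros_like(u)       # self-recurrent unit, as in panel (d)
    for t in range(1, len(u)):
        y_rec[t] = np.tanh(a * u[t] + w_rec * y_rec[t - 1] + b)

    plt.hist(y_ff, bins=100, density=True, histtype="step", label="feedforward")
    plt.hist(y_rec, bins=100, density=True, histtype="step", label="self-recurrent")
    plt.legend(); plt.show()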

