Comparative Study

J Neurosci. 2005 May 25;25(21):5195-206. doi: 10.1523/JNEUROSCI.5319-04.2005.

Synergy, redundancy, and independence in population codes, revisited

Peter E Latham et al.

Abstract

Decoding the activity of a population of neurons is a fundamental problem in neuroscience. A key aspect of this problem is determining whether correlations in the activity, i.e., noise correlations, are important. If they are important, then the decoding problem is high dimensional: decoding algorithms must take the correlational structure in the activity into account. If they are not important, or if they play a minor role, then the decoding problem can be reduced to lower dimension and thus made more tractable. The issue of whether correlations are important has been a subject of heated debate. The debate centers around the validity of the measures used to address it. Here, we evaluate three of the most commonly used ones: synergy, ΔIshuffled, and ΔI. We show that synergy and ΔIshuffled are confounded measures: they can be zero when correlations are clearly important for decoding and positive when they are not. In contrast, ΔI is not confounded. It is zero only when correlations are not important for decoding and positive only when they are; that is, it is zero only when one can decode exactly as well using a decoder that ignores correlations as one can using a decoder that does not, and it is positive only when one cannot decode as well. Finally, we show that ΔI has an information theoretic interpretation; it is an upper bound on the information lost when correlations are ignored.
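All three measures can be computed directly for any small discrete distribution. The sketch below is our own code (the function name `info_measures` is hypothetical), using the standard definitions: ΔIshuffled = I − Ishuffled, where Ishuffled is the mutual information of the trial-shuffled (conditionally independent) responses, and ΔI is the average of log2[p(s|r)/p_ind(s|r)], with p_ind the model that ignores noise correlations.

```python
from collections import defaultdict
from math import log2

def info_measures(p_joint):
    """p_joint maps (s, r1, r2) -> probability (must sum to 1).

    Returns (I, I_shuffled, delta_I) in bits. I_shuffled is the mutual
    information of the trial-shuffled (conditionally independent) responses;
    delta_I is the average of log2[p(s|r) / p_ind(s|r)], the information
    cost of decoding with a model that ignores the noise correlations.
    """
    ps = defaultdict(float)
    pr1s = defaultdict(float)   # p(s, r1)
    pr2s = defaultdict(float)   # p(s, r2)
    for (s, r1, r2), q in p_joint.items():
        ps[s] += q
        pr1s[s, r1] += q
        pr2s[s, r2] += q

    # Conditionally independent model: p_ind(s, r1, r2) = p(s) p(r1|s) p(r2|s)
    p_ind = {}
    for (s1, r1), a in pr1s.items():
        for (s2, r2), b in pr2s.items():
            if s1 == s2:
                p_ind[s1, r1, r2] = a * b / ps[s1]

    def mutual_info(joint):
        pS = defaultdict(float)
        pR = defaultdict(float)
        for (s, r1, r2), q in joint.items():
            pS[s] += q
            pR[r1, r2] += q
        return sum(q * log2(q / (pS[s] * pR[r1, r2]))
                   for (s, r1, r2), q in joint.items() if q > 0)

    pR = defaultdict(float)
    pR_ind = defaultdict(float)
    for (s, r1, r2), q in p_joint.items():
        pR[r1, r2] += q
    for (s, r1, r2), q in p_ind.items():
        pR_ind[r1, r2] += q

    delta_I = sum(q * log2((q / pR[r1, r2]) /
                           (p_ind[s, r1, r2] / pR_ind[r1, r2]))
                  for (s, r1, r2), q in p_joint.items() if q > 0)
    return mutual_info(p_joint), mutual_info(p_ind), delta_I
```

For an XOR-like code, `{('s1',0,0): .25, ('s1',1,1): .25, ('s2',0,1): .25, ('s2',1,0): .25}`, this gives I = 1 bit, Ishuffled = 0, and ΔI = 1 bit: every bit the population carries is lost by a correlation-blind decoder.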


Figures

Figure 1.
Estimating conditional response distributions from data. a, Estimate of the correlated response distribution, p(r1, r2|s), for a single stimulus, s. The responses, r1 and r2, are taken to be spike counts in a 300 ms window. They range from 0 to 19, so there are 400 (= 20 × 20) bins. A total of 250 trials were used, which leads to a very noisy estimate. b, Estimate of the independent response distribution, p(r1|s)p(r2|s). The single neuron distributions, p(r1|s) and p(r2|s), can be estimated individually, using all 250 trials for each one, and the joint distribution can then be constructed by multiplying them together. This leads to a much smoother (and more accurate) estimate of the probability distribution.
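The counting argument in this caption is easy to reproduce. Below is a minimal simulation (our own sketch, not the paper's data: the Poisson ground truth and rate values are illustrative assumptions, and the two counts are made conditionally independent so that the product estimator has nothing to miss).

```python
import random
from math import exp, factorial

random.seed(1)
N_BINS, N_TRIALS = 20, 250

def poisson_pmf(lam):
    # Poisson probabilities truncated to 0..N_BINS-1 and renormalized
    w = [exp(-lam) * lam ** k / factorial(k) for k in range(N_BINS)]
    s = sum(w)
    return [x / s for x in w]

# Hypothetical ground truth for one stimulus: conditionally independent counts
p1, p2 = poisson_pmf(6.0), poisson_pmf(9.0)
r1 = random.choices(range(N_BINS), weights=p1, k=N_TRIALS)
r2 = random.choices(range(N_BINS), weights=p2, k=N_TRIALS)

# Direct 2-D histogram: 250 trials spread over 400 bins -> very noisy
joint = [[0.0] * N_BINS for _ in range(N_BINS)]
for a, b in zip(r1, r2):
    joint[a][b] += 1.0 / N_TRIALS

# Product of 1-D histograms: each marginal uses all 250 trials
m1 = [r1.count(k) / N_TRIALS for k in range(N_BINS)]
m2 = [r2.count(k) / N_TRIALS for k in range(N_BINS)]

# Total-variation distance of each estimate from the true joint
tv_joint = 0.5 * sum(abs(joint[a][b] - p1[a] * p2[b])
                     for a in range(N_BINS) for b in range(N_BINS))
tv_prod = 0.5 * sum(abs(m1[a] * m2[b] - p1[a] * p2[b])
                    for a in range(N_BINS) for b in range(N_BINS))
print(f"total-variation error, joint estimate:   {tv_joint:.3f}")
print(f"total-variation error, product estimate: {tv_prod:.3f}")
```

The product-of-marginals estimate comes out far closer to the truth: each marginal pools all 250 trials over 20 bins, while the direct 2-D histogram must spread the same trials over 400 bins.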
Figure 2.
Correlations can exist without being important for decoding. a, Correlated response distributions for four stimuli, shown as solid lines. For each stimulus, the responses lie along the line segments indicated by the arrows. [Formally, p(r1, r2|si) ∝ δ(si - (r2 - air1)), where ai is the slope of the line segment, and there is an implicit cutoff when r1 is below some minimum or above some maximum.] If the stimulus is known, r1 predicts r2 and vice versa, making the responses perfectly correlated. Because the responses form disjoint sets, all responses are uniquely decodable. b, Independent response distributions for the same four stimuli, shown as open boxes (the correlated distributions are shown also, as dashed lines). For each stimulus, the responses lie inside the boxes indicated by the arrows. The boxes overlap, and, if a response were to occur in the overlap region, it would not be uniquely decodable, because it could have been produced by more than one stimulus. However, the responses never land in this region (because they always land on the dashed lines). Thus, a decoder built with no knowledge of the correlational structure would be able to decode the true responses perfectly. c, A very similar set of correlated distributions. A decoder with knowledge of the correlations would be able to decode all responses perfectly. d, The independent response distributions derived from c. The true responses can lie in the overlap region, and a decoder that had no knowledge of the correlational structure would not be able to decode such responses perfectly. Thus, the correlations here are clearly important for decoding.
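The claim that a correlation-blind decoder can still be perfect is concrete enough to simulate. Here is a minimal discrete sketch (our own two-stimulus stand-in for the four-stimulus geometry; all numbers hypothetical): each stimulus puts its responses on a line segment, the independent "boxes" [0,3]×[0,3] and [2,5]×[2,5] overlap, but no true response lands in the overlap.

```python
# Each stimulus's responses lie on an anti-diagonal segment; the segments
# avoid the region [2,3] x [2,3] where the independent boxes overlap.
S1 = [(k, 3 - k) for k in range(0, 4)]       # line inside box [0,3] x [0,3]
S2 = [(k, 7 - k) for k in range(2, 6)]       # line inside box [2,5] x [2,5]
support = {'s1': S1, 's2': S2}

def cond_marginals(points):
    # Empirical p(r1|s) and p(r2|s) from the (uniform) support points
    n = len(points)
    pr1, pr2 = {}, {}
    for a, b in points:
        pr1[a] = pr1.get(a, 0.0) + 1.0 / n
        pr2[b] = pr2.get(b, 0.0) + 1.0 / n
    return pr1, pr2

marg = {s: cond_marginals(pts) for s, pts in support.items()}

def decode_ignoring_correlations(r1, r2):
    # MAP under the conditionally independent model p(s) p(r1|s) p(r2|s)
    def score(s):
        pr1, pr2 = marg[s]
        return 0.5 * pr1.get(r1, 0.0) * pr2.get(r2, 0.0)
    return max(support, key=score)

correct = sum(decode_ignoring_correlations(r1, r2) == s
              for s, pts in support.items() for (r1, r2) in pts)
total = sum(len(pts) for pts in support.values())
print(correct, "of", total, "responses decoded correctly")
```

Every response that actually occurs is decoded correctly, even though the decoder was built from the single-neuron marginals alone: each true response has zero likelihood under the wrong stimulus's independent model.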
Figure 3.
ΔIshuffled can be both positive and negative when correlations are not important for decoding. a, Correlated response distributions for two stimuli, s1 and s2, which occur with equal probability. For each stimulus, the responses fall inside the boxes labeled by that stimulus. Because the responses are disjoint, all are uniquely decodable. b, Independent response distributions for the same stimuli. (The center boxes are offset so both can be seen.) Responses in the center box, which occur on one-quarter of the trials, provide no information about the stimulus. Thus, the independent responses provide less information than the true responses (because they are sometimes ambiguous), so ΔIshuffled > 0 (see Appendix A). However, as with Figure 2a, the true responses never land in the ambiguous region, so a decoder that has no knowledge of the correlations will decode exactly as well as one that has full knowledge of them. c, A different set of correlated response distributions, also for two stimuli. In this case, the responses land in the center box on one-half of the trials, and thus there is ambiguity about what the stimulus is on one-half the trials. d, The corresponding independent distribution (which is the same as in b). Here, the independent responses are ambiguous on only one-quarter of the trials, so they provide more information about the stimulus than the true responses. Thus, for this example, ΔIshuffled < 0 (see Appendix A). However, regardless of whether a decoder knows about the correlations, if a response lands in the overlap region, the stimulus probabilities are the same (one-half for each), so, as in a, knowledge of the correlations is not necessary for decoding.
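The box-overlap arithmetic can be put in numbers. The following is our own discretization of the panel a/b geometry (hypothetical numbers, not the paper's distribution): two equiprobable stimuli whose true responses sit on anti-diagonal segments, so the shuffled responses fall in the overlapping center region on one-quarter of trials.

```python
from collections import defaultdict
from math import log2

# s1 puts its responses on the segment r2 = 3 - r1, r1 in {0..3};
# s2 on r2 = 7 - r1, r1 in {2..5}. The independent "boxes" [0,3]^2 and
# [2,5]^2 overlap in [2,3]^2, which no true response ever enters.
p = defaultdict(float)
for k in range(0, 4):
    p['s1', k, 3 - k] = 1 / 8
for k in range(2, 6):
    p['s2', k, 7 - k] = 1 / 8

ps = defaultdict(float)
pm1 = defaultdict(float)   # p(s, r1)
pm2 = defaultdict(float)   # p(s, r2)
for (s, a, b), q in p.items():
    ps[s] += q
    pm1[s, a] += q
    pm2[s, b] += q

# Trial-shuffled (conditionally independent) distribution
p_ind = {}
for (s1, a), x in pm1.items():
    for (s2, b), y in pm2.items():
        if s1 == s2:
            p_ind[s1, a, b] = x * y / ps[s1]

def mi(joint):
    pS = defaultdict(float)
    pR = defaultdict(float)
    for (s, a, b), q in joint.items():
        pS[s] += q
        pR[a, b] += q
    return sum(q * log2(q / (pS[s] * pR[a, b])) for (s, a, b), q in joint.items())

pR = defaultdict(float)
pR_ind = defaultdict(float)
for (s, a, b), q in p.items():
    pR[a, b] += q
for (s, a, b), q in p_ind.items():
    pR_ind[a, b] += q

I, I_shuf = mi(p), mi(p_ind)
delta_I_shuffled = I - I_shuf
delta_I = sum(q * log2((q / pR[a, b]) / (p_ind[s, a, b] / pR_ind[a, b]))
              for (s, a, b), q in p.items())
print(I, I_shuf, delta_I_shuffled, delta_I)
```

This reproduces the caption's arithmetic: I = 1 bit, Ishuffled = 0.75 bits (the shuffled responses are ambiguous on one-quarter of trials), so ΔIshuffled = 0.25 > 0, yet ΔI = 0: a decoder that ignores the correlations loses nothing on the responses that actually occur.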
Figure 4.
A synergistic code in which correlations are not important for decoding. a, Correlated response distributions for three stimuli, s1, s2, and s3, which occur with equal probability. For each stimulus, the responses fall inside the boxes. Because the responses form disjoint sets, all responses are uniquely decodable. For this distribution, it is not hard to show that ΔIsynergy > 0 (see Appendix A). b, Independent response distributions for the same stimuli. (The boxes along the diagonal are offset so both can be seen.) If a response were to land in a box along the diagonal, it would not be uniquely decodable; it could have been produced by two stimuli. However, as with Figures 2a and 3a, the responses never occur along the diagonal. Thus, even if a decoder knew nothing at all about the correlational structure, it would be able to decode perfectly all responses that actually occur (which are the only ones that matter). The probability distributions for this figure were derived from Schneidman et al. (2003); see Discussion.
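The synergy claim can be checked with a numerical stand-in of our own (hypothetical numbers, not the paper's distribution): three equiprobable stimuli whose joint responses are disjoint and off-diagonal, while each single-neuron response is shared by two stimuli.

```python
from collections import defaultdict
from math import log2

# Each stimulus produces one of two off-diagonal responses; jointly the six
# responses are disjoint, but each value of r1 (or r2) alone is consistent
# with two different stimuli.
p = defaultdict(float)
for s, pts in {'s1': [(0, 1), (1, 0)],
               's2': [(1, 2), (2, 1)],
               's3': [(0, 2), (2, 0)]}.items():
    for a, b in pts:
        p[s, a, b] = 1 / 6

def mi(pairs):
    # pairs: dict (x, y) -> prob; returns I(X;Y) in bits
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), q in pairs.items():
        px[x] += q
        py[y] += q
    return sum(q * log2(q / (px[x] * py[y])) for (x, y), q in pairs.items())

joint = {(s, (a, b)): q for (s, a, b), q in p.items()}
n1 = defaultdict(float)   # (s, r1) marginal
n2 = defaultdict(float)   # (s, r2) marginal
for (s, a, b), q in p.items():
    n1[s, a] += q
    n2[s, b] += q

I_pair = mi(joint)
synergy = I_pair - mi(dict(n1)) - mi(dict(n2))
print(f"I(r1,r2;s) = {I_pair:.3f} bits, synergy = {synergy:.3f} bits")
```

Here the pair carries log2 3 ≈ 1.58 bits while each neuron alone carries log2 3 − 1 ≈ 0.58 bits, so the code is synergistic (ΔIsynergy = 2 − log2 3 ≈ 0.42 bits); yet, as in the caption, every response that actually occurs identifies its stimulus uniquely even under the independent model.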
Figure 5.
Highly synergistic codes are efficient but typically very difficult to decode. a, Two-neuron response distribution for five stimuli, color coded for clarity. The distribution is of the form p(s|r1, r2) ∝ δ(s - (r1 - r2)), with both r1 and r2 restricted to lie between 0 and 1. The stimulus, s, is a continuous variable that is uniformly distributed between -1 and 1. Observing both responses tells us exactly what the stimulus is, so the responses provide an infinite amount of information (the stimulus is specified with infinite precision). Observing any one response, however, provides only a finite amount of information about the stimulus. Consequently, ΔIsynergy = ∞. This coding scheme is advantageous because it can transmit an infinite amount of information and is easy to decode (s = r1 - r2). Note, however, that it requires perfect correlation, which is not biologically plausible. b, A distribution in which the r1-r2 plane was scrambled: it was divided into a 100 × 100 grid, and the squares in each column were randomly permuted. If we knew the scrambling algorithm, we could decode responses perfectly: s = Ô(r1, r2), where Ô is an operator that transforms r1 and r2 in scrambled coordinates to r1 - r2 in unscrambled ones. However, the decoder would have to store 100 log2(100!) bits just to unscramble the responses (log2(100!) bits per column), and, in general, for an n × n grid, it would have to store n log2(n!) ∼ n2log2(n/e) bits. Moreover, adding even a small amount of noise would destroy the ability of the responses to tell us anything about the stimulus. In the space of response distributions, non-smooth ones such as this are overwhelmingly more likely than the smooth one shown in a. Thus, minimizing redundancy (which leads to maximum synergy) would almost always produce an encoding scheme that is virtually impossible to decode.
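The storage figure in this caption follows from Stirling's approximation, log2 n! ≈ n log2(n/e). A quick numerical check (our own sketch; the helper names are hypothetical):

```python
from math import e, factorial, log2

def exact_bits(n):
    # n columns, each an arbitrary permutation of n cells: log2(n!) bits each
    return n * log2(factorial(n))

def stirling_bits(n):
    # Stirling: log2(n!) ~ n log2(n/e), so the total is ~ n^2 log2(n/e)
    return n * n * log2(n / e)

for n in (10, 100, 1000):
    print(n, round(exact_bits(n)), round(stirling_bits(n)))
```

Even for the 100 × 100 grid, the lookup table alone needs roughly 52,000 bits, and the quadratic growth in n makes larger grids hopeless; the Stirling estimate tracks the exact count to within about 1% at n = 100 and improves from there.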

