[Preprint]. 2023 Jul 12:2023.07.11.548615.
doi: 10.1101/2023.07.11.548615.

Resolving Non-identifiability Mitigates Bias in Models of Neural Tuning and Functional Coupling

Pratik Sachdeva et al. bioRxiv.

Abstract

In the brain, all neurons are driven by the activity of other neurons, some of which may be simultaneously recorded, but most of which are not. As such, models of neuronal activity need to account for both simultaneously recorded neurons and the influences of unmeasured neurons. This can be done by including model terms for observed external variables (e.g., tuning to stimuli) as well as terms for latent sources of variability. Determining the influence of groups of neurons on each other, relative to other influences, is important for understanding brain function. The parameters of statistical models fit to data are commonly used to gain insight into the relative importance of those influences. Scientific interpretation of models hinges upon unbiased parameter estimates. However, bias in inference is rarely evaluated, and sources of bias are poorly understood. Through extensive numerical study and analytic calculation, we show that common inference procedures and models are typically biased. We demonstrate that accurate parameter selection before estimation resolves model non-identifiability and mitigates bias. In diverse neurophysiology data sets, we found that common methods often overestimate the contribution of coupling to other neurons while underestimating tuning to exogenous variables. We explain heterogeneity in observed biases across data sets in terms of data statistics. Finally, counter to common intuition, we found that model non-identifiability contributes to bias, not variance, making it a particularly insidious form of statistical error. Together, our results identify the causes of statistical biases in common models of neural data, provide inference procedures to mitigate that bias, and reveal and explain the impact of those biases in diverse neural data sets.
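The direction of the bias described in the abstract (coupling overestimated, tuning underestimated) can be reproduced in a toy setting. The following is a minimal sketch, not the authors' model or inference procedure: it assumes a two-neuron linear-Gaussian system in which a target neuron depends on one other recorded neuron (coupling `a`), a stimulus `x` (tuning `b_target`), and a shared unobserved variable `z`. All names, parameter values, and the use of ordinary least squares are illustrative assumptions.

```python
import numpy as np

def simulate_cotu_bias(n_samples=50_000, a=0.5, b_target=1.0, b_other=1.0,
                       l_target=1.0, l_other=1.0, seed=0):
    """Fit a coupling + tuning regression while ignoring a shared latent.

    Hypothetical toy model (not taken from the paper):
        y_other  = b_other * x + l_other * z + noise
        y_target = a * y_other + b_target * x + l_target * z + noise
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)  # observed stimulus
    z = rng.standard_normal(n_samples)  # unobserved shared activity
    y_other = b_other * x + l_other * z + rng.standard_normal(n_samples)
    y_target = (a * y_other + b_target * x + l_target * z
                + rng.standard_normal(n_samples))

    # Ordinary least squares on the observed regressors only (z is omitted).
    design = np.column_stack([y_other, x])
    coef, *_ = np.linalg.lstsq(design, y_target, rcond=None)
    return {"a_true": a, "a_hat": coef[0],
            "b_true": b_target, "b_hat": coef[1]}

result = simulate_cotu_bias()
print(result)
```

Because the omitted latent `z` drives both neurons, the regression attributes part of its effect to the recorded neuron. With these default values the large-sample estimates work out to roughly 1.0 for coupling (true value 0.5) and 0.5 for tuning (true value 1.0); the finite-sample output will differ slightly.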

Figures

Figure 1. Neural activity depends on external stimuli, simultaneously recorded neurons, and unobserved neural activity.
Each row corresponds to a separate neuroscience dataset. a. Single-unit spikes recorded from macaque monkey primary visual cortex (red dot on the right) in response to drifting gratings (blue diagram on the left). b. Spike rasters for 20 distinct single units as a function of time from stimulus onset. c. The distribution of pairwise noise correlations across the neural population, calculated for each pair of units. d. Micro-electrocorticography (μ-ECoG) recordings (z-scored Hγ response) from rat primary auditory cortex (right) in response to tone pips at varying frequencies (left). e. The high-γ response for 20 different electrodes as a function of time from stimulus onset. f. The distribution of pairwise noise correlations for the μ-ECoG dataset. g. Single-unit spikes recorded from rat hippocampus (right) during a spatial decision-making task in a maze (left). h. Spike raster from 20 single units as a function of time in the maze. i. The distribution of pairwise noise correlations across the neural population, calculated for each pair of units.
Figure 2. Models of neural activity capturing the impact of tuning, functional coupling, and unobserved influences.
a. Neural datasets comprise recordings (electrode) from observed neurons y (center circle) that respond to an external stimulus x. The recording apparatus may fail to capture unobserved activity z (dashed circle). The recorded neuronal activity depends on the external stimulus (tuning), the within-population interactions (functional coupling), and the unobserved activity. b-d. Commonly used systems neuroscience models capturing tuning and functional coupling include b. the tuning model, where neuronal dependence on the external stimulus is modeled; c. the functional coupling model, where a neuron’s dependence on neighboring neurons is modeled; and d. the coupling and tuning (CoTu) model, which models both factors simultaneously. e. The static CoTuLa model extends the coupling and tuning model by simultaneously capturing the joint impact of external tuning x and unobserved activity z on the target neuron yi and non-target neurons y¬i. In the graphical model, z is a latent variable (light blue coloring). f. A dynamic extension of the static CoTuLa model, which captures the influence of the external stimuli and the latent variable on the neural population at each time point. The model also incorporates temporal correlations in both the stimulus and the latent variable.
Figure 3. Unbiased parameter estimation requires accurate model selection during inference of the CoTuLa model.
Each plot presents normalized bias as a function of the parameter-generating distributions’ means: the tuning mean (x-axes) and the coupling mean (y-axes). We examined three conditions of model selection: oracle selection (a-d), inferred selection (e-h), and no selection (i-l). In each, the top row depicts results for coupling parameters, while the bottom row depicts results for tuning parameters. Tuning and coupling biases are respectively plotted with the same colormaps, shown in the colorbars on the right of each subplot.
Figure 4. Sparse inference in the static CoTuLa model mitigates bias in neural data.
a-c. Results from monkey V1 (PVC) data. a. Example fitted tuning curves on single units using a tuning model (black), CoTu model (grey), and CoTuLa model (red). b. Tuning modulations obtained from the CoTu model (x-axis) compared to those obtained from the CoTuLa model (y-axis) across the single units in the population. Inset depicts the distribution (median and IQR) of tuning modulations for the tuning (black), CoTu (grey), and CoTuLa (red) models. Note the log scale on the axes. c. Comparison of coupling parameters between the CoTu (x-axis) and CoTuLa (y-axis) models. Points denote coupling parameters across models. Inset depicts the distribution (median and IQR) of coupling parameter magnitudes for the tuning (black), CoTu (grey), and CoTuLa (red) models. d-f. Same as a-c, but for rat μ-ECoG data from primary auditory cortex. d. Example fitted tuning curves on single electrodes. e. Comparison of tuning modulations. f. Comparison of coupling parameters. g. Fraction of variance in the neural responses captured by each term in the CoTu and CoTuLa models, shown in a stacked bar plot. h. Tuning-coupling ratios, calculated from the variance fractions, based on the CoTu and CoTuLa model inference procedures. Data are displayed as medians and IQR (interquartile range) over all neurons in the fit. Significance markers denote p < 10⁻³ for the Wilcoxon signed-rank test.
Figure 5. Temporal correlations in unobserved variability can create biases in the dynamical CoTuLa model.
Each panel visualizes the median and interquartile range of the normalized biases for the dynamical functional coupling (Co) and tuning (Tu) parameters of the model, with 50 realizations each. a. Schematic illustration of the origin of simultaneity in the dynamical model, in the presence of temporal correlation. b-e. Systematic biases in OLS estimates. Data were generated according to four different conditions of the CoTuLa model: b. no temporal correlation in either stimuli or noise, c. temporal correlation in the stimuli, d. temporal correlation in the noise, and e. temporal correlation in both stimuli and noise.
Figure 6. Sparse inference in the dynamical CoTuLa model mitigates biases in synthetic data.
Normalized biases for the coupling (Co.) and tuning (Tu.) parameters, under four different generative models with and without temporal correlations. We applied different inference procedures to the same set of data. a-c. CoTu model inference, with a. no selection, b. inferred selection, and c. oracle selection. d-f. Similar to a-c. but for CoTuLa model inference.
Figure 7. Sparse inference in the dynamical CoTuLa model can mitigate bias in neural data.
a-c. Data from primary visual cortex. a. Example tuning curves from the tuning-only (black), CoTu (grey), and CoTuLa (red) model inference procedures. b. Comparison of tuning modulations from CoTu and CoTuLa model inference procedures. Inset: box plots (median and IQR) of the distributions of tuning modulations for the three models. c. Comparison of coupling modulations. Scatter plot shows all elements of the coupling matrix. Inset: magnitudes of the non-zero coupling elements (zero elements excluded for visual clarity). d-f. Similar to a-c, but for data from rat hippocampus. g-i. Comparison of fits for synthetic, primary visual cortex, and hippocampus data. For the hippocampus data, results from early days (E, days 2–4) and late days (L, days 5–9) are shown separately. g. Fraction of variance in the neural responses captured by each term, shown in a stacked bar plot. h. Estimated strengths of coupling a, tuning correlation h, and latent correlation g. See text for details. i. Tuning-coupling ratios, calculated from the variance fractions, based on the CoTu and CoTuLa model inference procedures. Box plots show the median and the IQR.
Figure 8. Modulation of tuning, coupling and latent parameters across early and late phases in hippocampal experiment.
a. Day-by-day view of the variance fractions, showing large changes over the first few days of the experiment. b. Comparison of variance fractions between the early phase (days 2–4 since the animal was introduced to the experiment) and the late phase (days 5–9). Box plots show median (dot), mean (horizontal line) and the IQR. P values are from Wilcoxon rank-sum tests comparing distributions between early and late phases.
Figure 9. Structural non-identifiabilities within the static CoTuLa model contribute to consistently biased estimation.
a. A toy loss surface (z-axis) in two parameters (x- and y-axes). Red lines denote example identifiability subspaces for this loss surface. b-c. Biases calculated for coupling and tuning parameters obtained from CoTuLa inference on data generated from a sparse, identifiable CoTuLa model. CoTuLa inference was conducted without specification of a model support, and thus had structural non-identifiability. Biases are shown across 30 different initializations. Different colored histograms correspond to different means of the underlying parameter distributions. d. The identifiability family for the static CoTuLa model behaves like a truncated parabola when the latent dimension K=1. Black line denotes the identifiability family plotted as a function of principal axes P0 and P1, which depend on a, bi, and li. The black curve denotes parameter configurations for which the log-likelihood is unchanged. The blue surface denotes the point of truncation, where the private variance Ψi would become negative. e. The experiment set-up for examining structural non-identifiabilities. Orange curve denotes an identifiability family at initialization. Many models are initialized at different points along the identifiability family (denoted by ×). Parameter inference is performed using EM until a fitted identifiability family is reached (green curve). The fitted solutions (denoted by × on the green curve) cluster near the top of the identifiable class, where private variance is maximized. f. The fitted solutions (× marks) obtained empirically from the experiment described in e., but visualized with two principal components.
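The Figure 5 caption attributes systematic bias in OLS estimates to temporal correlation in unobserved variability. The sketch below illustrates that general mechanism with a deliberately simplified single-unit autoregressive model; it is not the dynamical CoTuLa model itself. The symbols `a` (coupling to the unit's own past), `b` (tuning weight), and `g` (autocorrelation of the unobserved noise), along with all parameter values, are assumptions chosen for illustration.

```python
import numpy as np

def ols_dynamic_fit(a=0.5, b=1.0, g=0.0, n_steps=20_000, seed=0):
    """Simulate y[t] = a*y[t-1] + b*x[t] + u[t] and fit (a, b) by OLS.

    Hypothetical toy model: u[t] = g*u[t-1] + e[t] is unobserved noise.
    g = 0 gives white noise; g > 0 gives temporally correlated noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_steps)  # white-noise stimulus
    e = rng.standard_normal(n_steps)  # innovations driving the hidden noise
    u = np.zeros(n_steps)
    y = np.zeros(n_steps)
    for t in range(1, n_steps):
        u[t] = g * u[t - 1] + e[t]
        y[t] = a * y[t - 1] + b * x[t] + u[t]

    # Regress y[t] on its own past and the current stimulus.
    design = np.column_stack([y[:-1], x[1:]])
    coef, *_ = np.linalg.lstsq(design, y[1:], rcond=None)
    return coef[0], coef[1]

a_hat_white, b_hat_white = ols_dynamic_fit(g=0.0)
a_hat_corr, b_hat_corr = ols_dynamic_fit(g=0.5)
print("white noise:      a_hat =", a_hat_white, " b_hat =", b_hat_white)
print("correlated noise: a_hat =", a_hat_corr, " b_hat =", b_hat_corr)
```

With white noise the lagged activity is uncorrelated with the current noise term, so the coupling estimate stays close to its true value. With autocorrelated noise, past activity carries information about the current noise, and the coupling estimate is pushed upward even though nothing about the fitting procedure has changed.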