This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2023 Jul 12:2023.07.11.548615.

doi: 10.1101/2023.07.11.548615.

Resolving Non-identifiability Mitigates Bias in Models of Neural Tuning and Functional Coupling

Pratik Sachdeva¹,², Ji Hyun Bak³,⁴, Jesse Livezey⁴, Christoph Kirst³,⁵,⁶, Loren Frank³,⁷,⁸, Sharmodeep Bhattacharyya⁹, Kristofer E Bouchard²,³,⁴,⁵,¹⁰

Affiliations

¹ Physics Department, UC Berkeley.
² Redwood Center for Theoretical Neuroscience, UC Berkeley.
³ Kavli Institute for Fundamental Neuroscience, UC San Francisco.
⁴ Biological Systems and Engineering Division, Lawrence Berkeley National Lab.
⁵ Scientific Data Division, Lawrence Berkeley National Lab.
⁶ Department of Anatomy, UC San Francisco.
⁷ Departments of Physiology and Psychiatry, UC San Francisco.
⁸ Howard Hughes Medical Institute.
⁹ Department of Statistics, Oregon State University.
¹⁰ Helen Wills Neuroscience Institute, UC Berkeley.

PMID: 37503030
PMCID: PMC10370036
DOI: 10.1101/2023.07.11.548615

Pratik Sachdeva et al. bioRxiv. 2023.

Abstract

In the brain, all neurons are driven by the activity of other neurons, some of which may be simultaneously recorded, but most of which are not. As such, models of neuronal activity need to account for simultaneously recorded neurons and the influences of unmeasured neurons. This can be done through inclusion of model terms for observed external variables (e.g., tuning to stimuli) as well as terms for latent sources of variability. Determining the influence of groups of neurons on each other relative to other influences is important for understanding brain function. The parameters of statistical models fit to data are commonly used to gain insight into the relative importance of those influences. Scientific interpretation of models hinges upon unbiased parameter estimates. However, evaluation of biased inference is rarely performed, and sources of bias are poorly understood. Through extensive numerical study and analytic calculation, we show that common inference procedures and models are typically biased. We demonstrate that accurate parameter selection before estimation resolves model non-identifiability and mitigates bias. In diverse neurophysiology data sets, we found that, in common methods, contributions of coupling to other neurons are often overestimated while tuning to exogenous variables is underestimated. We explain heterogeneity in observed biases across data sets in terms of data statistics. Finally, counter to common intuition, we found that model non-identifiability contributes to bias, not variance, making it a particularly insidious form of statistical error. Together, our results identify the causes of statistical biases in common models of neural data, provide inference procedures to mitigate that bias, and reveal and explain the impact of those biases in diverse neural data sets.
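The direction of the bias described above can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a toy linear-Gaussian model with one stimulus, one unobserved latent variable, one non-target unit, and one target unit, and all numeric values and names (`TRUE_COUPLING`, `ols_two_regressors`, and so on) are hypothetical. A coupling-and-tuning regression that ignores the latent variable is fit by ordinary least squares.

```python
import random
from math import sqrt


def ols_two_regressors(u, v, y):
    """Least-squares fit of y ~ cu*u + cv*v (no intercept) via the 2x2 normal equations."""
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * t for a, t in zip(u, y))
    svy = sum(b * t for b, t in zip(v, y))
    det = suu * svv - suv * suv
    return (svv * suy - suv * svy) / det, (suu * svy - suv * suy) / det


# Assumed (hypothetical) ground-truth parameters of the toy model.
TRUE_COUPLING = 0.5   # effect of the non-target unit on the target unit
TRUE_TUNING = 1.0     # effect of the stimulus on the target unit

rng = random.Random(0)
stimulus, y_other, y_target = [], [], []
for _ in range(20000):
    x = rng.gauss(0, 1)   # observed stimulus
    z = rng.gauss(0, 1)   # unobserved latent activity, shared by both units
    other = 0.8 * x + 1.0 * z + rng.gauss(0, sqrt(0.5))
    target = TRUE_COUPLING * other + TRUE_TUNING * x + 0.5 * z + rng.gauss(0, 1)
    stimulus.append(x)
    y_other.append(other)
    y_target.append(target)

# Fit coupling and tuning while ignoring the latent variable z.
coupling_hat, tuning_hat = ols_two_regressors(y_other, stimulus, y_target)
print(f"coupling: true {TRUE_COUPLING}, estimated {coupling_hat:.3f}")
print(f"tuning:   true {TRUE_TUNING}, estimated {tuning_hat:.3f}")
```

Under these assumed values, the shared latent drive is absorbed into the coupling term, so the coupling estimate lands above its true value and the tuning estimate below it. Collecting more samples does not shrink this gap, because it is bias rather than variance.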

Figures

**Figure 1. Neural activity depends on external stimuli, simultaneously recorded neurons, and unobserved neural activity.**
Each row corresponds to a separate neuroscience dataset. a. Single-unit spikes recorded from macaque monkey primary visual cortex (red dot on the right) in response to drifting gratings (blue diagram on the left). b. Spike rasters for 20 distinct single units as a function of time from the onset of stimulus. c. The distribution of pairwise noise correlations across the neural population, calculated for each pair of units. d. Micro-electrocorticography ($μ$-ECoG) recordings ($z$-scored $Hγ$ response) from rat primary auditory cortex (right) in response to tone pips at varying frequencies (left). e. The high-$γ$ response for 20 different electrodes as a function of time from stimulus onset. f. The distribution of pairwise noise correlations for the $μ$-ECoG dataset. g. Single-unit spikes recorded from rat hippocampus (right) during a spatial decision-making task in a maze (left). h. Spike raster from 20 single units as a function of time in the maze. i. The distribution of pairwise noise correlations across the neural population, calculated for each pair of units.
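The pairwise noise correlations in panels c, f, and i can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes noise correlation is computed as the Pearson correlation of trial-to-trial residuals after subtracting each unit's mean response to each stimulus (some analyses also z-score within each stimulus). The function name and the example data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def noise_correlation(stimuli, resp_i, resp_j):
    """Pearson correlation of two units' residuals around their per-stimulus means."""
    by_stimulus = defaultdict(list)
    for s, ri, rj in zip(stimuli, resp_i, resp_j):
        by_stimulus[s].append((ri, rj))

    res_i, res_j = [], []
    for trials in by_stimulus.values():
        mean_i = sum(t[0] for t in trials) / len(trials)
        mean_j = sum(t[1] for t in trials) / len(trials)
        # Removing the stimulus-conditioned mean leaves only trial-to-trial "noise".
        res_i.extend(t[0] - mean_i for t in trials)
        res_j.extend(t[1] - mean_j for t in trials)

    s_ij = sum(a * b for a, b in zip(res_i, res_j))
    s_ii = sum(a * a for a in res_i)
    s_jj = sum(b * b for b in res_j)
    return s_ij / sqrt(s_ii * s_jj)


# Hypothetical example: two stimuli, three trials each.
stimuli = [0, 0, 0, 90, 90, 90]
unit_a = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]
unit_b = [2.0, 3.0, 4.0, 9.0, 10.0, 11.0]   # opposite tuning, same trial-to-trial fluctuations
unit_c = [4.0, 3.0, 2.0, 11.0, 10.0, 9.0]   # opposite trial-to-trial fluctuations

nc_ab = noise_correlation(stimuli, unit_a, unit_b)
nc_ac = noise_correlation(stimuli, unit_a, unit_c)
```

In this example `unit_a` and `unit_b` prefer different stimuli yet fluctuate together from trial to trial, so their noise correlation is positive even though their tuning is opposite.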

**Figure 2. Models of neural activity capturing the impact of tuning, functional coupling, and unobserved influences.**
a. Neural datasets consist of recordings (electrode) from observed neurons $y$ (center circle) that respond to an external stimulus $x$. The recording apparatus may fail to capture unobserved activity $z$ (dashed circle). The recorded neuronal activity depends on the external stimulus (tuning), the within-population interactions (functional coupling), and the unobserved activity. **b-d.** Commonly used systems neuroscience models capturing tuning and functional coupling include the b. tuning model, where neuronal dependence on the external stimulus is modeled; c. functional coupling model, where a neuron’s dependence on neighboring neurons is modeled; and d. the coupling and tuning (CoTu) model, which models both factors simultaneously. e. The static CoTuLa model extends the tuning and coupling model by simultaneously capturing the joint impact of external tuning $x$ and unobserved activity $z$ on the target neuron $y_{i}$ and non-target neurons $y_{\neg i}$. In the graphical model, $z$ is a latent variable (light blue coloring). f. A dynamic extension of the static CoTuLa model, which captures the external stimuli and latent variable influence on the neural population at each time point. The model also incorporates temporal correlations in both the stimulus and the latent variable.

**Figure 3. Unbiased parameter estimation requires accurate model selection during inference of the CoTuLa model.**
Each plot presents normalized bias as a function of the parameter-generating distributions’ means: the tuning mean (x-axes) and the coupling mean (y-axes). We examined three conditions of model selection: oracle selection (**a-d.**), inferred selection (**e-h.**), and no selection (**i-l**). In each, the top row depicts results for coupling parameters, while the bottom row depicts results for tuning parameters. Tuning and coupling biases are respectively plotted with the same colormaps, shown in the colorbars on the right of each subplot.

**Figure 4. Sparse inference in the static CoTuLa model mitigates bias in neural data.**
**a-c.** Results from monkey V1 (PVC) data. a. Example fitted tuning curves on single units using a tuning model (black), CoTu model (gray) and CoTuLa model (red). b. Tuning modulations obtained from CoTu model (x-axis) compared to those obtained from CoTuLa model (y-axis) across the single units in the population. Inset depicts the distribution (median and IQR) of tuning modulations, for the tuning (black), CoTu (grey), and CoTuLa (red) models. Note the log-scale on the axes. c. Comparison of coupling parameters between the CoTu (x-axis) and CoTuLa (y-axis) models. Points denote coupling parameters across models. Inset depicts the distribution (median and IQR) of coupling parameter magnitudes, for the tuning (black), CoTu (grey), and CoTuLa (red) models. **d-f.** Same as **a-c.** but for rat primary auditory cortex ($μ$-ECoG) data. d. Example fitted tuning curves on single electrodes. e. Comparison of tuning modulations. f. Comparison of coupling parameters. g. Fraction of variance in the neural responses captured by each term in the CoTu and CoTuLa models, shown in a stacked bar plot. h. Tuning-coupling ratios, calculated from the variance fractions, based on the CoTu and CoTuLa model inference procedures. Data are displayed as medians and IQR (interquartile range) over all neurons in the fit. Significance markers denote $p < 10^{-3}$ for Wilcoxon signed-rank test.

**Figure 5. Temporal correlations in unobserved variability can create biases in the dynamical CoTuLa model.**
Each panel visualizes the median and interquartile range of the normalized biases for the dynamical functional coupling (Co) and tuning (Tu) parameters of the model, with 50 realizations each. a. Schematic illustration of the origin of simultaneity in the dynamical model, in the presence of temporal correlation. **b-e.** Systematic biases in OLS estimates. Data were generated according to four different conditions of the CoTuLa model: b. no temporal correlation in either stimuli or noise, c. temporal correlation in the stimuli, d. temporal correlation in the noise, and e. temporal correlation in both stimuli and noise.
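The effect of temporally correlated noise on OLS estimates can be sketched with a scalar toy model. This is an illustrative assumption, not the dynamical CoTuLa model itself: a single response depends on its own previous value (standing in for a lagged coupling term), on a white-noise stimulus, and on unobserved noise that follows a first-order autoregressive process. The function names, parameter names, and values are hypothetical.

```python
import random


def ols_two_regressors(u, v, y):
    """Least-squares fit of y ~ cu*u + cv*v (no intercept) via the 2x2 normal equations."""
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * t for a, t in zip(u, y))
    svy = sum(b * t for b, t in zip(v, y))
    det = suu * svv - suv * suv
    return (svv * suy - suv * svy) / det, (suu * svy - suv * suy) / det


def simulate_and_fit(noise_corr, coupling=0.5, tuning=1.0, n_steps=20000, seed=0):
    """Simulate y_t = coupling*y_{t-1} + tuning*x_t + n_t and fit it by OLS.

    The unobserved noise follows n_t = noise_corr*n_{t-1} + white noise.
    """
    rng = random.Random(seed)
    y_prev, n_prev = 0.0, 0.0
    lagged, stim, resp = [], [], []
    for _ in range(n_steps):
        x_t = rng.gauss(0, 1)
        n_t = noise_corr * n_prev + rng.gauss(0, 1)
        y_t = coupling * y_prev + tuning * x_t + n_t
        lagged.append(y_prev)
        stim.append(x_t)
        resp.append(y_t)
        y_prev, n_prev = y_t, n_t
    return ols_two_regressors(lagged, stim, resp)


coupling_white, tuning_white = simulate_and_fit(noise_corr=0.0)
coupling_corr, tuning_corr = simulate_and_fit(noise_corr=0.7)
print(f"white noise:      coupling {coupling_white:.3f}, tuning {tuning_white:.3f}")
print(f"correlated noise: coupling {coupling_corr:.3f}, tuning {tuning_corr:.3f}")
```

With white noise the lagged response is uncorrelated with the current noise, and OLS recovers the coupling. With correlated noise the lagged response carries information about the current noise, so the coupling estimate is inflated. The tuning estimate stays close to its true value here only because the toy stimulus is white and independent of the noise.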

**Figure 6. Sparse inference in the dynamical CoTuLa model mitigates biases in synthetic data.**
Normalized biases for the coupling (Co.) and tuning (Tu.) parameters, under four different generative models with and without temporal correlations. We applied different inference procedures to the same set of data. **a-c.** CoTu model inference, with a. no selection, b. inferred selection, and c. oracle selection. **d-f.** Similar to **a-c.** but for CoTuLa model inference.

**Figure 7. Sparse inference in the dynamical CoTuLa model can mitigate bias in neural data.**
**a-c.** Data from primary visual cortex. a. Example tuning curves from the tuning-only (black), CoTu (grey), and CoTuLa (red) model inference procedures. b. Comparison of tuning modulations from CoTu and CoTuLa model inference procedures. Inset: box plots (median and IQR) of the distributions of tuning modulations for the three models. c. Comparison of coupling modulations. Scatter plot shows all elements of the coupling matrix. Inset: magnitudes of the non-zero coupling elements (zero elements excluded for visual clarity). **d-f.** Similar to **a-c.**, but for data from rat hippocampus. **g-i.** Comparison of fits for synthetic, primary visual cortex and hippocampus data. For the hippocampus data, results from early days (E, days 2–4) and late days (L, days 5–9) are shown separately. g. Fraction of variance in the neural responses captured by each term, shown in a stacked bar plot. h. Estimated strengths of coupling $a$, tuning correlation $h$, and latent correlation $g$. See text for details. i. Tuning-coupling ratios, calculated from the variance fractions, based on the CoTu and CoTuLa model inference procedures. Box plots show the median and the IQR.

**Figure 8. Modulation of tuning, coupling and latent parameters across early and late phases in hippocampal experiment.**
a. Day-by-day view of the variance fractions, showing large changes over the first few days of the experiment. b. Comparison of variance fractions between the early phase (days 2–4 since the animal was introduced to the experiment) and the late phase (days 5–9). Box plots show median (dot), mean (horizontal line) and the IQR. P values are from Wilcoxon rank-sum tests comparing distributions between early and late phases.

**Figure 9. Structural non-identifiabilities within the static CoTuLa model contribute to consistently biased estimation.**
a. A toy loss surface (z-axis) in two parameters (x- and y-axes). Red lines denote example identifiability subspaces for this loss surface. **b-c.** Biases calculated for coupling and tuning parameters obtained from CoTuLa inference on data generated from a sparse, identifiable CoTuLa model. CoTuLa inference was conducted without specification of a model support, and thus had structural non-identifiability. Biases are shown across 30 different initializations. Different colored histograms correspond to different means of the underlying parameter distributions. d. The identifiability family for the static CoTuLa model behaves like a truncated parabola when the latent dimension $K = 1$. Black line denotes the identifiability family plotted as a function of principal axes $P_{0}$ and $P_{1}$, which depend on $a$, $b_{i}$, and $l_{i}$. The black curve denotes parameter configurations for which the log-likelihood is unchanged. The blue surface denotes the point of truncation, where the private variance $Ψ_{i}$ would become negative. e. The experiment set-up for examining structural non-identifiabilities. Orange curve denotes an identifiability family at initialization. Many models are initialized at different points along the identifiability family (denoted by ×). Parameter inference is performed using EM until a fitted identifiability family is reached (green curve). The fitted solutions (denoted by × on the green curve) cluster near the top of the identifiable class, where private variance is maximized. f. The fitted solutions (× marks) obtained empirically from the experiment described in e., but visualized with two principal components.
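The idea of an identifiability family in panel d can be sketched with a two-unit, single-latent toy version of a static coupling, tuning, and latent model. The parameterization below is an assumption made for illustration and may differ from the authors' formulation: `y_other = B*x + L*z + noise` and `y_target = a*y_other + b*x + l*z + noise`, with a standard-normal latent `z`. All names and values are hypothetical. Because the model is linear-Gaussian, the likelihood depends on the parameters only through the conditional mean slopes and the conditional covariance of the two units given the stimulus.

```python
from dataclasses import dataclass, replace
from math import isclose


@dataclass(frozen=True)
class ToyParams:
    a: float           # coupling: non-target unit -> target unit
    b: float           # tuning of the target unit
    B: float           # tuning of the non-target unit
    L: float           # latent loading of the non-target unit
    l: float           # latent loading of the target unit
    psi_other: float   # private variance of the non-target unit
    psi_target: float  # private variance of the target unit


def observable_moments(p):
    """Quantities that determine the Gaussian likelihood of (y_other, y_target) given x."""
    v = p.L ** 2 + p.psi_other                      # Var(y_other | x)
    return (
        p.B,                                        # slope of E[y_other | x]
        p.a * p.B + p.b,                            # slope of E[y_target | x]
        v,
        p.a * v + p.L * p.l,                        # Cov(y_other, y_target | x)
        p.a ** 2 * v + 2 * p.a * p.L * p.l + p.l ** 2 + p.psi_target,  # Var(y_target | x)
    )


def shift_along_family(p, delta):
    """Move the coupling by delta while keeping every observable moment fixed.

    Returns None past the truncation point, where the private variance
    of the target unit would have to be non-positive.
    """
    v = p.L ** 2 + p.psi_other
    a_new = p.a + delta
    b_new = p.b - delta * p.B                 # keeps the target's mean slope
    l_new = p.l - delta * v / p.L             # keeps the cross-covariance
    target_var = observable_moments(p)[4]
    psi_new = target_var - (a_new ** 2 * v + 2 * a_new * p.L * l_new + l_new ** 2)
    if psi_new <= 0:
        return None
    return replace(p, a=a_new, b=b_new, l=l_new, psi_target=psi_new)


truth = ToyParams(a=0.5, b=1.0, B=0.8, L=1.0, l=0.5, psi_other=0.5, psi_target=1.0)
alternative = shift_along_family(truth, delta=0.2)
same_fit = all(
    isclose(m, n) for m, n in zip(observable_moments(truth), observable_moments(alternative))
)
beyond_truncation = shift_along_family(truth, delta=3.0)
```

In this toy, `alternative` has a different coupling, tuning, and latent loading than `truth` but identical observable moments, so no amount of data distinguishes the two. The private variance required to stay on the family is a downward-opening quadratic in `delta`, which is why the family ends where that variance reaches zero.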

Grants and funding

R01 NS118648/NS/NINDS NIH HHS/United States
