Netw Neurosci. 2022 Oct 1;6(4):1243-1274.

doi: 10.1162/netn_a_00259. eCollection 2022.

Conservative significance testing of tripartite statistical relations in multivariate neural data

Aleksejs Fomins^{1,2}, Yaroslav Sych^{1,3,4}, Fritjof Helmchen^{1,2}

Affiliations

¹ Brain Research Institute, University of Zurich, Zurich, Switzerland.
² Neuroscience Center Zurich, University of Zurich, Switzerland.
³ Experimental Neurology Center, Department of Neurology, Inselspital University Hospital Bern, Bern, Switzerland.
⁴ Present address: Institute of Cellular and Integrative Neurosciences, University of Strasbourg and CNRS, Strasbourg, France.

PMID: 38800452
PMCID: PMC11117094
DOI: 10.1162/netn_a_00259

Aleksejs Fomins et al. Netw Neurosci. 2022.

Abstract

An important goal in systems neuroscience is to understand the structure of neuronal interactions, frequently approached by studying functional relations between recorded neuronal signals. Commonly used pairwise measures (e.g., correlation coefficient) offer limited insight, neither addressing the specificity of estimated neuronal interactions nor potential synergistic coupling between neuronal signals. Tripartite measures, such as partial correlation, variance partitioning, and partial information decomposition, address these questions by disentangling functional relations into interpretable information atoms (unique, redundant, and synergistic). Here, we apply these tripartite measures to simulated neuronal recordings to investigate their sensitivity to noise. We find that the considered measures are mostly accurate and specific for signals with noiseless sources but experience significant bias for noisy sources. We show that permutation testing of such measures results in high false positive rates even for small noise fractions and large data sizes. We present a conservative null hypothesis for significance testing of tripartite measures, which significantly decreases the false positive rate at a tolerable expense of increasing the false negative rate. We hope our study raises awareness about the potential pitfalls of significance testing and of interpretation of functional relations, offering both conceptual and practical advice.

Keywords: Functional connectivity; Multicollinearity; Partial information decomposition; Redundancy; Significance testing; Synergy.

Plain language summary

Tripartite functional relation measures enable the study of interesting effects in neural recordings, such as redundancy, functional connection specificity, and synergistic coupling. However, estimators of such relations are commonly validated using noiseless signals, whereas neural recordings typically contain noise. Here we systematically study the performance of tripartite estimators using simulated noisy neural signals. We demonstrate that permutation testing is not a robust procedure for inferring ground truth statistical relations from commonly used tripartite relation estimators. We develop an adjusted conservative testing procedure, reducing false positive rates of the studied estimators when applied to noisy data. Besides addressing significance testing, our results should aid in accurate interpretation of tripartite functional relations and functional connectivity.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1.**
(A) Sketch of partial information decomposition. Sketches of this form will be employed throughout this paper. The colors will always denote the corresponding information atoms U(X → Z|Y), U(Y → Z|X), R(X : Y → Z), S(X : Y → Z). The width of individual lines or triangles qualitatively indicates the magnitude of the effect. In this plot, all information atoms are shown with maximal magnitude for reference. (B) Example questions about tripartite relations that may be of interest in neuroscience. Left: Is the functional connection between X and Z specific with respect to the confounding variable Y? Middle: Are X, Y, and Z redundantly encoding the same information? Right: Could Z control synchronization between X and Y? (for example, if X and Y control forelimbs and hind limbs, respectively, and Z determines if the animal is currently running or resting). Note: the three sketches are made as a function of time for illustrative purposes only. In principle, information atoms can be computed across any data dimension. Here, we compute information atoms across trials.

**Figure 2.**
Noise in neuronal observables. A typical aim is the estimation of information atoms (blue arrows) between neuronal signals of interest (blue areas X*, Y*, and Z*) underlying the recorded data. However, the observables the experimenter has access to (black areas X, Y, and Z) typically are not the pure signals of interest. In the simplest case considered here, observables are corrupted by additive noise (red ν_x, ν_y, and ν_z). Blue arrows in the middle indicate tripartite interaction effects between the signals of interest (i.e., synergy).
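
The additive-noise setting described in this caption can be illustrated numerically. The snippet below is a minimal sketch, not the authors' code: the mixing rule `(1 - p) * signal + p * noise`, the Gaussian signal, and the helper name `add_noise` are all assumptions made for this example.

```python
import numpy as np


def add_noise(signal, noise_fraction, rng):
    """Hypothetical observable model: mix a pure signal with private Gaussian noise.

    noise_fraction = 0 returns the pure signal, noise_fraction = 1 returns pure noise.
    The linear mixing rule is an assumption of this sketch.
    """
    noise = rng.standard_normal(signal.shape)
    return (1 - noise_fraction) * signal + noise_fraction * noise


rng = np.random.default_rng(0)
x_true = rng.standard_normal(5000)  # stands in for the signal of interest X*

# Correlation between the observable X and the underlying X* for a few noise fractions
correlations = {}
for p in (0.0, 0.25, 0.5):
    x_obs = add_noise(x_true, p, rng)
    correlations[p] = np.corrcoef(x_true, x_obs)[0, 1]
    print(f"noise fraction {p:.2f}: corr(X, X*) = {correlations[p]:.3f}")
```

Under these assumptions (independent, unit-variance signal and noise), the expected correlation between the observable and the signal of interest is (1 - p) / sqrt((1 - p)^2 + p^2), so the observable tracks the signal less and less faithfully as the noise fraction p grows.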

**Figure 3.**
(A) A thought experiment setup. Left: Multivariate neuronal signals are recorded in a behaving test subject (courtesy of SciDraw). Middle: Neuronal signals X, Y, and Z are observed during N_tr trials with the same duration T and are plotted as a function of trial time for three example trials. Green vertical lines indicate a sample time step at which the analysis is performed. Right: 3D scatter plot of X, Y, and Z across trials sampled at the fixed time step t (green). 2D projections indicate that X correlates with Z (purple), while Y is uncorrelated with either X or Z. (B) A sketch of the simulation procedure. First, the ground truth model is used to generate multiple samples of the ground truth variables X*, Y*, Z*. Then, the observable model adds noise to the data, producing observables X, Y, Z. Finally, the measure is used to compute information atoms for the given data sample. (C) We explored four ground truth models (mRed, mUnq, mXOR, mSum), three observable models (PureSrc, NoisyX, Noisy), and four measures (PCorr, VP, BROJA PID, MMI PID), each of which reports four different information atoms (except PCorr, see below). In the observable model, green denotes pure variables (no unexplained variance), and yellow denotes noisy variables. All models had discrete and continuous versions.

**Figure 4.**
Performance of tripartite analysis measures on the PureSrc model. (A) PCorr for the pure source mUnq model. Plotted is the PCorr magnitude as a function of the noise fraction of the model. The red line is the critical value corresponding to a p value of 0.01 for permutation testing. For most noise fractions the information atom values are significant, correctly resulting in true positives. (B) Same as A, but for the mRed model. For all noise fractions, most of the estimated information atom values are not significant, correctly resulting in true negatives. (C) Variance partitioning redundant information atom for the pure source mUnq model. In this case, roughly 60% of false positive redundant information atoms are significant, far more than would be expected by chance. (D and E) Sketch of the detected information atoms for a noise fraction of 0.25 as a function of measure (rows) and ground truth model (columns). Line thickness indicates the fraction of significant information atoms (permutation test, p value 0.01). Emphasized in green are the theoretically expected results for the underlying ground truth model. All measures correctly identify true positives and true negatives in each model.

**Figure 5.**
Performance of tripartite analysis measures on model data with noisy source variables. (A) PCorr values as a function of the noise fraction using the Noisy discrete mRed model. The red line denotes the critical value (p value 0.01) based on a permutation test (same in B–D). The red dashed arrow indicates the transition from true negatives to false positives (same in B, C). (B) Same as A, but for VP U(X → Z|Y). (C) Same as A and B, but for BROJA PID S(X : Y → Z). (D) PCorr as a function of the data size N_tr for a fixed noise fraction of 0.25 using the Noisy mRed model. (E–H) Same as A–D but for continuous variable models. (I) Sketch of the detected information atoms for the Noisy discrete model at a noise fraction of 0.25. Line thickness indicates the fraction of significant information atoms (permutation test, p value 0.01). (J) Same as I, but for continuous variable models.
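
The effect summarized in this caption, a permutation test flagging a unique relation in purely redundant data once the sources are noisy, can be reproduced in a toy setting. The following is a minimal sketch under stated assumptions, not the authors' implementation: the continuous redundant model (all three observables mix one latent variable `t` with private Gaussian noise), the linear mixing rule, the choice to permute X across trials, and the helper names `partial_corr` and `permutation_p_value` are all hypothetical.

```python
import numpy as np


def partial_corr(x, z, y):
    """Pearson correlation of x and z after linearly regressing y out of both."""
    design = np.column_stack([np.ones_like(y), y])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_z = z - design @ np.linalg.lstsq(design, z, rcond=None)[0]
    return np.corrcoef(res_x, res_z)[0, 1]


def permutation_p_value(x, z, y, n_perm=500, rng=None):
    """One-sided permutation p value for |partial_corr|, shuffling x across trials."""
    rng = np.random.default_rng() if rng is None else rng
    observed = abs(partial_corr(x, z, y))
    null = np.array(
        [abs(partial_corr(rng.permutation(x), z, y)) for _ in range(n_perm)]
    )
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


rng = np.random.default_rng(0)
n_trials, p = 1000, 0.25  # p is the assumed noise fraction

# Redundant ground truth: X, Y, Z carry the same latent t, so X has no
# unique relation to Z once the noiseless Y is accounted for.
t = rng.standard_normal(n_trials)
x = (1 - p) * t + p * rng.standard_normal(n_trials)
y = (1 - p) * t + p * rng.standard_normal(n_trials)
z = (1 - p) * t + p * rng.standard_normal(n_trials)

pcorr = partial_corr(x, z, y)
p_value = permutation_p_value(x, z, y, rng=rng)
print(f"partial correlation = {pcorr:.3f}, permutation p value = {p_value:.4f}")
```

Because the observed Y is only a noisy proxy for the shared latent variable, regressing it out leaves residual shared variance between X and Z. The permutation null, which assumes no relation at all, then rejects even though the ground truth contains no unique atom.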

**Figure 6.**
(A) Algorithm to determine the adjusted critical value for redundant and synergistic information atoms. The function *threshold* finds the critical value for a given ground truth and observable model. The function *max_threshold* maximizes the critical value over all observable models. For unique information atoms, the same algorithm would iterate over a line p_x = p_y = p_z instead of a 3D grid. (B) Distribution of false positive S(X : Y → Z) (red curve) for the discrete MMI PID measure using the mRed model as a function of noise fraction along the line p_x = p_y, p_z = 0. Corresponding true positive R(X : Y → Z) values (green) are plotted for comparison. The vertical dashed line denotes the noise fraction with maximal expected false positive S(X : Y → Z) value. The horizontal dashed line denotes the 1% upper percentile of S(X : Y → Z) at that noise fraction, corresponding to the p value 0.01 critical value for H_adj. (C) Same as B, but for the continuous variable MMI PID measure.
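
The two-step procedure named in panel A can be sketched for the simplest case the caption mentions, a unique-type measure scanned along the line p_x = p_y = p_z. This is an illustrative sketch only: the function names `threshold` and `max_threshold` come from the caption, but their signatures, the use of partial correlation as the measure, the continuous redundant model, the linear noise mixing rule, and the noise grid are assumptions made here.

```python
import numpy as np


def partial_corr(x, z, y):
    """Pearson correlation of x and z after linearly regressing y out of both."""
    design = np.column_stack([np.ones_like(y), y])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_z = z - design @ np.linalg.lstsq(design, z, rcond=None)[0]
    return np.corrcoef(res_x, res_z)[0, 1]


def simulate_redundant(n_trials, p, rng):
    """Hypothetical redundant model: X, Y, Z mix one latent t with private noise."""
    t = rng.standard_normal(n_trials)
    x, y, z = ((1 - p) * t + p * rng.standard_normal(n_trials) for _ in range(3))
    return x, y, z


def threshold(p, n_trials=200, n_sim=200, alpha=0.01, rng=None):
    """Upper (1 - alpha) quantile of the false positive measure at noise fraction p."""
    rng = np.random.default_rng() if rng is None else rng
    values = []
    for _ in range(n_sim):
        x, y, z = simulate_redundant(n_trials, p, rng)
        values.append(abs(partial_corr(x, z, y)))
    return np.quantile(values, 1 - alpha)


def max_threshold(noise_fractions, **kwargs):
    """Most conservative critical value over the scanned observable models."""
    return max(threshold(p, **kwargs) for p in noise_fractions)


rng = np.random.default_rng(0)
# Scan the line p_x = p_y = p_z; p = 0 is skipped because X, Y, Z would coincide
# and the partial correlation would be undefined.
critical_value = max_threshold([0.25, 0.5, 0.75], rng=rng)
print(f"adjusted critical value for |PCorr|: {critical_value:.3f}")
```

The design choice is that the critical value is taken from the worst-case noise setting of a model known to produce false positives, rather than from a shuffled null. An observed value must therefore exceed what redundancy plus noise alone could plausibly generate, which lowers false positives at the cost of more false negatives.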

**Figure 7.**
Performance of tripartite analysis measures on model data with noisy source variables (Noisy model), tested against H_adj. The conservative test significantly reduces false positives in all measures and information atoms at the expense of increasing false negatives. (A–D) Discrete measure values as a function of noise fraction, corresponding exactly to Figure 5A–D. Purple lines denote the critical values due to H_adj. (E–H) Continuous measure values as a function of noise fraction, corresponding exactly to Figure 5E–H. Purple lines same as above. (I and J) Sketch of detected discrete-variable (I) and continuous-variable (J) information atoms. Same as in Figure 5I–J, except that the fractions of significant information atoms are estimated using the conservative critical values.

**Figure 8.**
Two different ground truth designs that can produce indistinguishable data. Three populations X, Y, and Z redundantly encode a latent variable T. In model 1, the population Y additionally encodes another latent variable V, whereas in model 2 the second latent variable is additionally encoded by X and Z.
