Netw Neurosci. 2022 Oct 1;6(4):1243-1274.
doi: 10.1162/netn_a_00259. eCollection 2022.

Conservative significance testing of tripartite statistical relations in multivariate neural data

Aleksejs Fomins et al. Netw Neurosci.

Abstract

An important goal in systems neuroscience is to understand the structure of neuronal interactions, frequently approached by studying functional relations between recorded neuronal signals. Commonly used pairwise measures (e.g., correlation coefficient) offer limited insight, neither addressing the specificity of estimated neuronal interactions nor potential synergistic coupling between neuronal signals. Tripartite measures, such as partial correlation, variance partitioning, and partial information decomposition, address these questions by disentangling functional relations into interpretable information atoms (unique, redundant, and synergistic). Here, we apply these tripartite measures to simulated neuronal recordings to investigate their sensitivity to noise. We find that the considered measures are mostly accurate and specific for signals with noiseless sources but experience significant bias for noisy sources. We show that permutation testing of such measures results in high false positive rates even for small noise fractions and large data sizes. We present a conservative null hypothesis for significance testing of tripartite measures, which significantly decreases false positive rate at a tolerable expense of increasing false negative rate. We hope our study raises awareness about the potential pitfalls of significance testing and of interpretation of functional relations, offering both conceptual and practical advice.

Keywords: Functional connectivity; Multicollinearity; Partial information decomposition; Redundancy; Significance testing; Synergy.

Plain language summary

Tripartite functional relation measures enable the study of interesting effects in neural recordings, such as redundancy, functional connection specificity, and synergistic coupling. However, estimators of such relations are commonly validated using noiseless signals, whereas neural recordings typically contain noise. Here we systematically study the performance of tripartite estimators using simulated noisy neural signals. We demonstrate that permutation testing is not a robust procedure for inferring ground truth statistical relations from commonly used tripartite relation estimators. We develop an adjusted conservative testing procedure, reducing false positive rates of the studied estimators when applied to noisy data. Besides addressing significance testing, our results should aid in accurate interpretation of tripartite functional relations and functional connectivity.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1.
(A) Sketch of partial information decomposition. Sketches of this form will be employed throughout this paper. The colors will always denote the corresponding information atoms (the two unique atoms, the redundant atom, and the synergistic atom). The width of individual lines or triangles qualitatively indicates the magnitude of the effect. In this plot, all information atoms are shown with maximal magnitude for reference. (B) Example questions about tripartite relations that may be of interest in neuroscience. Left: Is the functional connection between X and Z specific with respect to the confounding variable Y? Middle: Are X, Y, and Z redundantly encoding the same information? Right: Could Z control synchronization between X and Y? (For example, if X and Y control forelimbs and hind limbs, respectively, and Z determines if the animal is currently running or resting.) Note: the three sketches are made as a function of time for illustrative purposes only. In principle, information atoms can be computed across any data dimension. Here, we compute information atoms across trials.
Figure 2.
Noise in neuronal observables. A typical aim is the estimation of information atoms (blue arrows) between neuronal signals of interest (blue areas X*, Y*, and Z*) underlying the recorded data. However, the observables the experimenter has access to (black areas X, Y, and Z) typically are not the pure signals of interest. In the simplest case considered here, observables are corrupted by additive noise (red νx, νy, and νz). Blue arrows in the middle indicate tripartite interaction effects between the signals of interest (i.e., synergy).
Figure 3.
(A) A thought experiment setup. Left: Multivariate neuronal signals are recorded in a behaving test subject (courtesy of SciDraw). Middle: Neuronal signals X, Y, and Z are observed during Ntr trials with the same duration T and are plotted as a function of trial time for three example trials. Green vertical lines indicate a sample time step at which the analysis is performed. Right: 3D scatter plot of X, Y, and Z across trials sampled at the fixed time step t (green). 2D projections indicate that X correlates with Z (purple), while Y is uncorrelated with either X or Z. (B) A sketch of the simulation procedure. First, the ground truth model is used to generate multiple samples of the ground truth variables X*, Y*, Z*. Then, the observable model adds noise to the data, producing observables X, Y, Z. Finally, the measure is used to compute information atoms for the given data sample. (C) We explored four ground truth models (mRed, mUnq, mXOR, mSum), three observable models (PureSrc, NoisyX, Noisy), and four measures (PCorr, VP, BROJA PID, MMI PID), each of which reports four different information atoms (except PCorr, see below). In the observable model, green color denotes pure variables (no unexplained variance), and yellow denotes noisy variables. All models had discrete and continuous versions.
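The three-step procedure in panel B (ground truth model, observable model, measure) can be illustrated with a minimal Python sketch. The caption does not give the model equations, so everything here is an assumption made for illustration: the function names `simulate_mred` and `partial_corr`, the redundant model as three copies of one Gaussian latent variable, and the noise mixing rule `(1 - p) * signal + p * noise`.

```python
import numpy as np


def simulate_mred(n_trials, p_x, p_y, p_z, rng):
    """Assumed redundant model: X*, Y*, Z* are copies of one latent variable."""
    latent = rng.standard_normal(n_trials)  # ground truth shared by all three

    def observe(signal, noise_frac):
        # Assumed observable model: linear mix of signal and independent noise.
        noise = rng.standard_normal(n_trials)
        return (1.0 - noise_frac) * signal + noise_frac * noise

    return observe(latent, p_x), observe(latent, p_y), observe(latent, p_z)


def partial_corr(x, y, z):
    """Partial correlation between x and z after controlling for y."""
    r = np.corrcoef([x, y, z])  # rows are variables, columns are trials
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xz - r_xy * r_yz) / np.sqrt((1 - r_xy**2) * (1 - r_yz**2))


rng = np.random.default_rng(0)
x, y, z = simulate_mred(1000, 0.25, 0.25, 0.25, rng)
pcorr = partial_corr(x, y, z)
print(f"PCorr(X, Z | Y) = {pcorr:.3f}")
```

Under these assumed equations, every pairwise correlation is 0.9 in expectation at a noise fraction of 0.25, so the population partial correlation is 0.09 / 0.19, about 0.47, although no variable carries unique information about another. This is one way to see how noise in the sources can bias a tripartite measure.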
Figure 4.
Performance of tripartite analysis measures on PureSrc model. (A) PCorr for the pure source mUnq model. Plotted is the PCorr magnitude as function of noise fraction of the model. Red line is the critical value corresponding to p value of 0.01 for permutation testing. For most noise fractions the information atom values are significant, correctly resulting in true positives. (B) Same as A, but for the mRed model. For all noise fractions, most of the estimated information atom values are not significant, correctly resulting in true negatives. (C) Variance partitioning redundant information atom for the pure source mUnq model. In this case, roughly 60% of false positive redundant information atoms are significant, much more than reasonable to expect by chance. (D and E) Sketch of the detected information atoms for noise fraction of 0.25 as function of measure (rows) and ground truth model (columns). Line thickness indicates fraction of significant information atoms (permutation test, p value 0.01). Emphasized in green are the theoretically expected results for the underlying ground truth model. All measures correctly identify true positives and true negatives in each model.
Figure 5.
Performance of tripartite analysis measures on model data with noisy source variables. (A) PCorr values as function of the noise fraction using the Noisy discrete mRed model. Red line denotes critical value (p value 0.01) based on a permutation test (same in B–D). Red dashed arrow indicates transition from true negatives to false positives (same in B, C). (B) Same as A, but for VP U(XZ|Y). (C) Same as A and B, but for BROJA PID S(X : YZ). (D) PCorr as function of the data size Ntr for a fixed noise fraction of 0.25 using the Noisy mRed model. (E–H) Same as A–D but for continuous variable models. (I) Sketch of the detected information atoms for Noisy discrete model at noise fraction of 0.25. Line thickness indicates the fraction of significant information atoms (permutation test, p value 0.01). (J) Same as I, but for continuous variable models.
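The transition from true negatives to false positives under permutation testing can be reproduced in a small Python sketch. The details are assumptions, not the paper's implementation: the null distribution is built by shuffling X across trials, the redundant data use the same hypothetical linear noise mixing as a stand-in for the Noisy mRed model, and `permutation_critical_value` is an illustrative name.

```python
import numpy as np


def partial_corr(x, y, z):
    """Partial correlation between x and z after controlling for y."""
    r = np.corrcoef([x, y, z])
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xz - r_xy * r_yz) / np.sqrt((1 - r_xy**2) * (1 - r_yz**2))


def permutation_critical_value(x, y, z, rng, n_perm=500, alpha=0.01):
    """Critical |PCorr| under an assumed null that shuffles x across trials."""
    null = [abs(partial_corr(rng.permutation(x), y, z)) for _ in range(n_perm)]
    return np.quantile(null, 1.0 - alpha)


rng = np.random.default_rng(1)
n_trials, noise_frac = 1000, 0.25

# Assumed noisy redundant data: each observable is a noisy copy of one latent.
latent = rng.standard_normal(n_trials)
x, y, z = [
    (1.0 - noise_frac) * latent + noise_frac * rng.standard_normal(n_trials)
    for _ in range(3)
]

observed = abs(partial_corr(x, y, z))
critical = permutation_critical_value(x, y, z, rng)
significant = observed > critical
print(f"|PCorr| = {observed:.3f}, critical = {critical:.3f}, significant = {significant}")
```

In this sketch the permutation null only reflects sampling fluctuations around zero, so the noise-induced partial correlation exceeds the critical value and is reported as significant, even though the assumed ground truth is purely redundant.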
Figure 6.
(A) Algorithm to determine the adjusted critical value for redundant and synergistic information atoms. The function threshold finds the critical value for a given ground truth and observable model. Function max_threshold maximizes the critical value over all observable models. For unique information atoms, the same algorithm would iterate over a line px = py = pz instead of a 3D grid. (B) Distribution of false positive S(X : YZ) (red curve) for discrete MMI PID measure using mRed model as function of noise fraction along the line px = py, pz = 0. Corresponding true positive R(X : YZ) values (green) are plotted for comparison. Vertical dashed line denotes the noise fraction with maximal expected false positive S(X : YZ) value. Horizontal dashed line denotes the 1% upper percentile of S(X : YZ) at that noise fraction, corresponding to the p value 0.01 critical value for Hadj. (C) Same as B, but for the continuous variable MMI PID measure.
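The algorithm in panel A can be sketched in Python for the unique-atom case, where the caption says the search runs along the line px = py = pz. The function names `threshold` and `max_threshold` come from the caption; their bodies are assumptions, as are the use of partial correlation as the measure, the redundant ground truth with linear noise mixing, and the specific grid of noise fractions.

```python
import numpy as np


def partial_corr(x, y, z):
    """Partial correlation between x and z after controlling for y."""
    r = np.corrcoef([x, y, z])
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xz - r_xy * r_yz) / np.sqrt((1 - r_xy**2) * (1 - r_yz**2))


def simulate_mred(n_trials, noise_frac, rng):
    """Assumed redundant model with equal noise fraction on X, Y, and Z."""
    latent = rng.standard_normal(n_trials)
    return [
        (1.0 - noise_frac) * latent + noise_frac * rng.standard_normal(n_trials)
        for _ in range(3)
    ]


def threshold(noise_frac, n_trials, rng, n_sim=200, alpha=0.01):
    """Upper (1 - alpha) quantile of the false positive measure at one noise level."""
    values = [
        abs(partial_corr(*simulate_mred(n_trials, noise_frac, rng)))
        for _ in range(n_sim)
    ]
    return np.quantile(values, 1.0 - alpha)


def max_threshold(noise_fracs, n_trials, rng):
    """Most conservative critical value over all considered noise levels."""
    return max(threshold(p, n_trials, rng) for p in noise_fracs)


rng = np.random.default_rng(2)
# Noise fractions of exactly 0 are excluded: identical X and Y make PCorr undefined.
noise_fracs = np.linspace(0.1, 0.9, 5)
adjusted_critical = max_threshold(noise_fracs, n_trials=500, rng=rng)
print(f"Adjusted critical |PCorr| = {adjusted_critical:.3f}")
```

For redundant and synergistic atoms, the caption indicates that the same maximization would run over a 3D grid of (px, py, pz) instead of a single line; that extension would replace the one-dimensional `noise_fracs` with a nested iteration over three noise fractions.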
Figure 7.
Performance of tripartite analysis measures on model data with noisy source variables (Noisy model), tested against Hadj. The conservative test significantly reduces false positives in all measures and information atoms at the expense of increasing false negatives. (A–D) Discrete measure values as function of noise fraction, corresponding exactly to Figure 5A–D. Purple lines denote the critical values due to Hadj. (E–H) Continuous measure values as function of noise fraction, corresponding exactly to Figure 5E–H. Purple lines same as above. (I and J) Sketch of detected discrete-variable (I) and continuous-variable (J) information atoms. Same as in Figure 5I–J, except that the fractions of significant information atoms are estimated using the conservative critical values.
Figure 8.
Two different ground truth designs that can produce indistinguishable data. Three populations X, Y, and Z redundantly encode a latent variable T. In model 1, the population Y additionally encodes another latent variable V, whereas in model 2 the second latent variable is additionally encoded by X and Z.

