Estimating Information Theoretic Measures via Multidimensional Gaussianization
- PMID: 39527441
- DOI: 10.1109/TPAMI.2024.3495827
Abstract
Information theory is an outstanding framework for measuring uncertainty, dependence, and relevance in data and systems. It has several desirable properties for real-world applications: it naturally deals with multivariate data, can handle heterogeneous data, and yields interpretable measures. However, it has not been adopted by a wider audience because estimating information-theoretic measures from multidimensional data is a challenging problem due to the curse of dimensionality. We propose an indirect way of estimating information based on a multivariate iterative Gaussianization transform. The proposed method has a multivariate-to-univariate property: it reduces the challenging estimation of multivariate measures to a composition of marginal operations applied in each iteration of the Gaussianization. Therefore, the convergence of the resulting estimates depends on the convergence of well-understood univariate entropy estimates, and the global error depends linearly on the number of times the marginal estimator is invoked. We introduce Gaussianization-based estimates for Total Correlation, Entropy, Mutual Information, and Kullback-Leibler Divergence. Results on artificial data show that our approach is superior to previous estimators, particularly in high-dimensional scenarios. We also illustrate the method's performance in applications from different fields. We make the tools and datasets publicly available to provide a test bed for analyzing future methodologies.
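The multivariate-to-univariate idea can be illustrated with a minimal sketch for Total Correlation. This is not the authors' released code; it assumes a rank-based marginal Gaussianization, a PCA rotation, and SciPy's spacing-based univariate entropy estimator, and the function names `marginal_gaussianize` and `total_correlation` are hypothetical. Each iteration Gaussianizes the marginals (which leaves Total Correlation unchanged), rotates the data (which leaves joint entropy unchanged), and records how far the rotated marginals are from Gaussian using univariate entropy estimates only. The sum of these per-iteration reductions is the estimate.

```python
import numpy as np
from scipy.stats import differential_entropy, norm, rankdata


def marginal_gaussianize(x):
    """Map each column to an approximately standard normal via its empirical CDF."""
    n = x.shape[0]
    u = rankdata(x, axis=0) / (n + 1.0)  # ranks scaled into the open interval (0, 1)
    return norm.ppf(u)


def total_correlation(x, n_iters=5):
    """Estimate Total Correlation (in nats) by iterative Gaussianization.

    Assumed scheme (a sketch, not the paper's implementation):
    marginal Gaussianization followed by a PCA rotation, repeated n_iters times.
    Requires at least two columns.
    """
    x = np.asarray(x, dtype=float)
    tc = 0.0
    for _ in range(n_iters):
        z = marginal_gaussianize(x)
        # Orthogonal rotation from the eigenvectors of the sample covariance.
        _, v = np.linalg.eigh(np.cov(z, rowvar=False))
        y = z @ v
        # Reduction in Total Correlation for this iteration: the summed marginal
        # entropies before the rotation (Gaussian marginals) minus those after it.
        # Using the same univariate estimator on both sides lets part of its
        # bias cancel; this is a design choice of the sketch.
        h_before = differential_entropy(z, axis=0).sum()
        h_after = differential_entropy(y, axis=0).sum()
        tc += h_before - h_after
        x = y
    return tc


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = 0.8
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x = rng.multivariate_normal([0.0, 0.0], cov, size=4000)
    print("estimate:", total_correlation(x, n_iters=3))
    print("analytic:", -0.5 * np.log(1.0 - rho**2))
```

For a bivariate Gaussian with correlation 0.8 the analytic Total Correlation is -0.5 ln(1 - 0.64), about 0.511 nats, which gives a reference value for the estimate. Because every iteration calls the univariate estimator once per dimension, its errors accumulate with the number of iterations, so `n_iters` is best kept no larger than needed.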
