Ensemble estimators for multivariate entropy estimation
- PMID: 25897177
- PMCID: PMC4401872
- DOI: 10.1109/TIT.2013.2251456
Abstract
The problem of estimation of density functionals like entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffer from the curse of dimensionality, wherein the mean squared error (MSE) decays increasingly slowly as a function of the sample size T as the dimension d of the samples increases. In particular, the rate is often glacially slow, of order O(T^{-γ/d}), where γ > 0 is a rate parameter. Examples of such estimators include kernel density estimators, k-nearest neighbor (k-NN) density estimators, k-NN entropy estimators, intrinsic dimension estimators and other examples. In this paper, we propose a weighted affine combination of an ensemble of such estimators, where optimal weights can be chosen such that the weighted estimator converges at a much faster dimension invariant rate of O(T^{-1}). Furthermore, we show that these optimal weights can be determined by solving a convex optimization problem which can be performed offline and does not require training data. We illustrate the superior performance of our weighted estimator for two important applications: (i) estimating the Panter-Dite distortion-rate factor and (ii) estimating the Shannon entropy for testing the probability distribution of a random sample.
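To make the weighting idea concrete, here is a minimal sketch of how such offline weights might be computed. The abstract does not give the optimization problem, so everything specific here is an assumption for illustration: the function name `ensemble_weights`, the tuning parameters `params` (one per base estimator), and the bias basis functions l^(i/d) for i = 1, ..., d-1. Under those assumptions, the sketch picks the minimum 2-norm weight vector that sums to one and cancels each assumed bias term, which is a convex problem with a closed-form least-norm solution.

```python
import numpy as np

def ensemble_weights(params, d):
    """Minimum-norm affine weights that cancel assumed bias terms.

    params : tuning parameter of each base estimator (hypothetical name)
    d      : sample dimension
    Assumes the i-th bias term scales as params**(i/d); this form is an
    illustrative assumption, not taken from the abstract.
    """
    params = np.asarray(params, dtype=float)
    exponents = np.arange(1, d) / d                     # i/d for i = 1..d-1
    bias_rows = params[None, :] ** exponents[:, None]   # shape (d-1, L)
    # Row 0 enforces sum(w) = 1; remaining rows enforce zero weighted bias.
    A = np.vstack([np.ones((1, params.size)), bias_rows])
    b = np.zeros(A.shape[0])
    b[0] = 1.0
    # For an underdetermined system, lstsq returns the minimum 2-norm solution.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

params = np.linspace(1.0, 3.0, 20)   # 20 base estimators (illustrative values)
w = ensemble_weights(params, d=3)
# Given a vector `base_estimates` of the 20 base estimates,
# the ensemble estimate would be: np.dot(w, base_estimates)
```

Note that the weights depend only on `params` and `d`, never on the samples, which matches the abstract's point that they can be computed offline without training data.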