2013 Jul;59(7):4374-4388. doi: 10.1109/TIT.2013.2251456.

Ensemble estimators for multivariate entropy estimation

Kumar Sricharan et al. IEEE Trans Inf Theory. 2013 Jul.

Abstract

The problem of estimating density functionals such as entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffers from the curse of dimensionality, wherein the mean squared error (MSE) decays increasingly slowly as a function of the sample size T as the dimension d of the samples increases. In particular, the rate is often glacially slow, of order O(T^(-γ/d)), where γ > 0 is a rate parameter. Examples of such estimators include kernel density estimators, k-nearest neighbor (k-NN) density estimators, k-NN entropy estimators, and intrinsic dimension estimators, among others. In this paper, we propose a weighted affine combination of an ensemble of such estimators, where the weights can be chosen so that the weighted estimator converges at the much faster, dimension-invariant rate of O(T^(-1)). Furthermore, we show that these optimal weights can be determined by solving a convex optimization problem, which can be performed offline and does not require training data. We illustrate the superior performance of our weighted estimator for two important applications: (i) estimating the Panter-Dite distortion-rate factor and (ii) estimating the Shannon entropy for testing the probability distribution of a random sample.
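The weight-selection step can be sketched numerically. In a minimal reading of the abstract, each base estimator indexed by a parameter l contributes bias terms of order l^(i/d), and the weights w_l are chosen to sum to one while cancelling those leading bias terms, with minimal weight norm to control variance. The constraint set below (powers i = 1, ..., d-1) and the KKT-system solver are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def optimal_ensemble_weights(param_values, d):
    """Sketch of the ensemble weight optimization:
        minimize ||w||_2^2
        subject to  sum_l w_l = 1  and  sum_l w_l * l^(i/d) = 0, i = 1..d-1
    (bias-cancellation constraints; the exact constraint set used in the
    paper is an assumption here). Solved via the KKT linear system.
    """
    L = len(param_values)
    # Constraint matrix: first row enforces sum(w) = 1; remaining rows
    # cancel the assumed leading bias terms of order l^(i/d).
    A = np.vstack(
        [np.ones(L)]
        + [np.array([l ** (i / d) for l in param_values]) for i in range(1, d)]
    )
    b = np.zeros(d)
    b[0] = 1.0
    # KKT system for min ||w||^2 s.t. A w = b:
    #   [2I  A^T] [w ]   [0]
    #   [A    0 ] [mu] = [b]
    K = np.block([[2 * np.eye(L), A.T], [A, np.zeros((d, d))]])
    rhs = np.concatenate([np.zeros(L), b])
    sol = np.linalg.lstsq(K, rhs, rcond=None)[0]
    return sol[:L]

# The weighted estimator is then the affine combination of the base
# estimates: H_hat = sum_l w_l * H_hat_l.
```

Because the weights depend only on the parameters l and the dimension d, not on the data, this optimization can indeed be performed offline, as the abstract states.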


Figures

Figure 1
Variation of MSE of Panter-Dite factor estimates using the standard kernel plug-in estimator [14], truncated kernel plug-in estimator (3.3), histogram plug-in estimator [17], k-NN estimator [20], entropic graph estimator [18], and the weighted ensemble estimator (3.6). (a) MSE of Panter-Dite factor estimates as a function of sample size T: the proposed weighted estimator has the fastest rate of MSE convergence with respect to sample size T (d = 6). (b) MSE of Panter-Dite factor estimates as a function of dimension d: the MSE of the proposed weighted estimator has the slowest rate of growth with increasing dimension d (T = 3000).
Figure 2
Entropy estimates using the standard kernel plug-in estimator, truncated kernel plug-in estimator, and the weighted estimator, for random samples corresponding to hypotheses H0 and H1. The weighted estimator provides better discrimination ability by suppressing the bias, at the cost of some additional variance. (a) Entropy estimates for random samples corresponding to hypothesis H0 (experiments 1–500) and H1 (experiments 501–1000). (b) Histogram envelopes of entropy estimates for random samples corresponding to hypothesis H0 (blue) and H1 (red).
Figure 3
Comparison of ROC performance for the distribution testing problem. The weighted estimator uniformly outperforms the individual plug-in estimators.
Figure 4
Illustration for the proof of Lemma 4.

References

    1. Beirlant J, Dudewicz EJ, Györfi L, Van der Meulen EC. Nonparametric entropy estimation: An overview. Intl Journal of Mathematical and Statistical Sciences. 1997;6:17–40.
    2. Birge L, Massart P. Estimation of integral functions of a density. The Annals of Statistics. 1995;23(1):11–29.
    3. Costa JA, Hero AO. Geodesic entropic graphs for dimension and entropy estimation in manifold learning. IEEE Transactions on Signal Processing. 2004;52(8):2210–2221.
    4. Fukunaga K, Hostetler LD. Optimization of k-nearest-neighbor density estimates. IEEE Transactions on Information Theory. 1973.
    5. Giné E, Mason DM. Uniform in bandwidth estimation of integral functionals of the density function. Scandinavian Journal of Statistics. 2008;35:739–761.
