Kullback-Leibler Divergence Estimation of Continuous Distributions
 Proceedings of IEEE International Symposium on Information Theory
, 2008
Abstract—We present a method for estimating the KL divergence between continuous densities and we prove it converges almost surely. Divergence estimation is typically solved by estimating the densities first. Our main result shows this intermediate step is unnecessary and that the divergence can be estimated using either the empirical cdf or k-nearest-neighbour density estimation, which does not converge to the true measure for finite k. The convergence proof is based on describing the statistics of our estimator using waiting-times distributions, such as the exponential or Erlang. We illustrate the proposed estimators and show how they compare to existing methods based on density estimation, and we also outline how our divergence estimators can be used for solving the two-sample problem.
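As an illustration of the k-nearest-neighbour idea mentioned in the abstract above, the following is a minimal sketch for one-dimensional samples. It assumes the commonly used form of the estimator, (d/n) Σ log(s_k(x_i)/r_k(x_i)) + log(m/(n-1)) with d = 1, where r_k is the distance from x_i to its k-th neighbour among the other P-samples and s_k the distance to its k-th neighbour among the Q-samples. The function name `knn_kl_divergence_1d` and the brute-force neighbour search are illustrative choices, not taken from the paper.

```python
import math
import random


def knn_kl_divergence_1d(x, y, k=1):
    """Sketch of a k-NN estimate of D(P||Q) in nats from 1-D samples.

    x: list of floats drawn from P, y: list of floats drawn from Q.
    Assumes continuous data (no exact ties), so all distances are positive.
    """
    n, m = len(x), len(y)
    total = 0.0
    for i, xi in enumerate(x):
        # Distance to the k-th nearest neighbour among the OTHER x samples.
        r = sorted(abs(xi - xj) for j, xj in enumerate(x) if j != i)[k - 1]
        # Distance to the k-th nearest neighbour among the y samples.
        s = sorted(abs(xi - yj) for yj in y)[k - 1]
        total += math.log(s / r)
    # Dimension d = 1, so the factor d in front of the average is omitted.
    return total / n + math.log(m / (n - 1))


# Small demo: P = N(0, 1), Q = N(1, 1), whose closed-form divergence is 0.5 nats.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(300)]
ys = [random.gauss(1.0, 1.0) for _ in range(300)]
print(knn_kl_divergence_1d(xs, ys, k=5))
```

With a few hundred samples the printed value should land in the neighbourhood of 0.5; larger k trades a little bias for lower variance. No density is estimated explicitly at any point, which is the property the abstract emphasises.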
Schemes for Bi-Directional Modeling of Discrete Stationary Sources
, 2005
Adaptive models are developed to deal with bidirectional modeling of unknown discrete stationary sources, which can be generally applied to statistical inference problems such as noncausal universal discrete denoising that exploits bidirectional dependencies. Efficient algorithms for constructing those models are developed and implemented. Denoising is a primary focus of the application of those models, and we compare their performance to that of the DUDE algorithm [1] for universal discrete denoising.
Universal erasure entropy estimation
 In Proc. of the 2006 IEEE Intl. Symp. on Inform. Theory (ISIT’06)
, 2006
Abstract — Erasure entropy rate (introduced recently by Verdú and Weissman) differs from Shannon’s entropy rate in that the conditioning occurs with respect to both the past and the future, as opposed to only the past (or the future). In this paper, universal algorithms for estimating erasure entropy rate are proposed based on the basic and extended context-tree weighting (CTW) algorithms. Consistency results are shown for those CTW-based algorithms. Simulation results for those algorithms applied to Markov sources, tree sources and English texts are compared to those obtained by fixed-order plug-in estimators with different orders. An estimate of the erasure entropy of English texts based on the proposed algorithms is about 0.22 bits per letter, which can be compared to an estimate of about 1.3 bits per letter for the entropy rate of English texts by a similar CTW-based algorithm.
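The fixed-order plug-in baseline mentioned in the abstract above can be sketched as follows. This is only an illustrative reading of "plug-in": the empirical conditional entropy of a symbol given the `order` symbols before it and the `order` symbols after it. It is not the CTW-based algorithm the paper proposes, and the function name `plugin_erasure_entropy` is hypothetical.

```python
import math
import random
from collections import Counter


def plugin_erasure_entropy(seq, order):
    """Empirical H(X_i | order past symbols, order future symbols), in bits."""
    ctx_counts = Counter()    # counts of each two-sided context
    joint_counts = Counter()  # counts of each (context, centre symbol) pair
    for i in range(order, len(seq) - order):
        past = tuple(seq[i - order:i])
        future = tuple(seq[i + 1:i + 1 + order])
        ctx = (past, future)
        ctx_counts[ctx] += 1
        joint_counts[(ctx, seq[i])] += 1
    total = sum(ctx_counts.values())
    h = 0.0
    for (ctx, _sym), c in joint_counts.items():
        # Weight by the empirical joint probability; take the log of the
        # empirical conditional probability of the symbol given its context.
        h -= (c / total) * math.log2(c / ctx_counts[ctx])
    return h


# Demo on i.i.d. fair bits, where past and future carry no information
# about the centre symbol, so the estimate should be close to 1 bit.
random.seed(0)
bits = [random.randint(0, 1) for _ in range(20000)]
print(plugin_erasure_entropy(bits, order=1))
```

Conditioning on the past only (dropping `future` from the context) would give the corresponding plug-in estimate of the ordinary entropy rate, which is the comparison the abstract draws for English text.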
Universal estimation of erasure entropy
 IEEE Trans. Inf. Theory
Abstract—Erasure entropy rate differs from Shannon’s entropy rate in that the conditioning occurs with respect to both the past and the future, as opposed to only the past (or the future). In this paper, consistent universal algorithms for estimating erasure entropy rate are proposed based on the basic and extended context-tree weighting (CTW) algorithms. Simulation results for those algorithms applied to Markov sources, tree sources, and English texts are compared to those obtained by fixed-order plug-in estimators with different orders. Index Terms—Bidirectional context tree, context-tree weighting, data compression, entropy rate, universal algorithms, universal modeling.
Universal Estimation of Directed Information via Sequential Probability Assignments
Abstract—We propose four approaches to estimating the directed information rate between a pair of jointly stationary ergodic processes with the help of universal probability assignments. The four approaches yield estimators with different merits such as nonnegativity and boundedness. We establish consistency of these estimators in various senses and derive near-optimal rates of convergence in the minimax sense under mild conditions. The estimators carry over directly to estimating other information measures of stationary ergodic processes, such as entropy rate and mutual information rate, and provide alternatives to classical approaches in the existing literature. Guided by the theoretical results, we use context tree weighting as the vehicle for the implementations of the proposed estimators. Experiments on synthetic and real data are presented, demonstrating the potential of the proposed schemes in practice and the efficacy of directed information estimation as a tool for detecting and measuring causality and delay. Index Terms—Causal influence, context tree weighting, directed information, rate of convergence, universal probability assignment.
Quantization via empirical divergence maximization
 Signal Processing, IEEE Transactions on
Nonparametric change-point detection using string matching
, 2011
