Estimation of the information by an adaptive partitioning of the observation space
- IEEE Transactions on Information Theory
, 1999
Cited by 38 (0 self)
Abstract—We demonstrate that it is possible to approximate the mutual information arbitrarily closely in probability by calculating relative frequencies on appropriate partitions and achieving conditional independence on the rectangles of which the partitions are made. Empirical results, including a comparison with maximum-likelihood estimators, are presented. Index Terms—Data-dependent partitions, maximum-likelihood estimation, mutual information, nonparametric estimation.
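The idea of computing relative frequencies on the cells of a partition can be illustrated with a much simpler scheme than the adaptive one in the abstract. The sketch below is an assumption-laden stand-in: it uses a fixed equal-frequency grid (bins chosen by rank) rather than the paper's adaptive partitioning, and the names `rank_bins` and `mutual_information` are hypothetical.

```python
import math
import random
from collections import Counter


def rank_bins(values, n_bins):
    """Assign each value to one of n_bins equal-frequency bins by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(xs, ys, n_bins=8):
    """Plug-in mutual information (nats) from cell relative frequencies.

    Assumption: a fixed n_bins x n_bins grid, not an adaptive partition.
    """
    n = len(xs)
    bx, by = rank_bins(xs, n_bins), rank_bins(ys, n_bins)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(i,j) * log(p(i,j) / (p(i) p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(4000)]
noise = [rng.gauss(0, 1) for _ in range(4000)]
y_dep = [a + 0.5 * b for a, b in zip(x, noise)]

mi_indep = mutual_information(x, noise)  # independent pair: close to zero
mi_dep = mutual_information(x, y_dep)    # dependent pair: clearly positive
```

A fixed grid keeps the example short, but its bias grows with the number of cells; that trade-off is the kind of issue a data-dependent partition is meant to address.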
ICA Using Spacings Estimates of Entropy
- Journal of Machine Learning Research
, 2003
Cited by 36 (3 self)
This paper presents a new algorithm for the independent components analysis (ICA) problem based on an efficient entropy estimator. Like many previous methods, this algorithm directly minimizes the measure of departure from independence according to the estimated Kullback-Leibler divergence between the joint distribution and the product of the marginal distributions. We pair this approach with efficient entropy estimators from the statistics literature. In particular, the entropy estimator we use is consistent and exhibits rapid convergence. The algorithm based on this estimator is simple, computationally efficient, intuitively appealing, and outperforms other well known algorithms. In addition, the estimator's relative insensitivity to outliers translates into superior performance by our ICA algorithm on outlier tests. We present favorable comparisons to the Kernel ICA, FAST-ICA, JADE, and extended Infomax algorithms in extensive simulations. We also provide public domain source code for our algorithms.
Alpha-Divergence for Classification, Indexing and Retrieval
- UNIVERSITY OF MICHIGAN
, 2001
Cited by 35 (4 self)
Motivated by Chernoff's bound on asymptotic probability of error we propose the alpha-divergence measure and a surrogate, the alpha-Jensen difference, for feature classification, indexing and retrieval in image and other databases. The alpha-
Asymptotic Theory of Greedy Approximations to Minimal K-Point Random Graphs
Cited by 31 (13 self)
Let X_n = {x_1, ..., x_n} be an i.i.d. sample having multivariate distribution P. We derive a.s. limits for the power-weighted edge weight function of greedy approximations to a class of minimal graphs spanning k of the n samples. The class includes minimal k-point graphs constructed by the partitioning method of Ravi, Sundaram, Marathe, Rosenkrantz and Ravi [43] where the edge weight function satisfies the quasi-additive property of Redmond and Yukich [45]. In particular this includes greedy approximations to the k-point minimal spanning tree (k-MST), Steiner tree (k-ST), and the traveling salesman problem (k-TSP). An expression for the influence function of the minimal weight function is given which characterizes the asymptotic sensitivity of the graph weight to perturbations in the underlying distribution. The influence function takes a form which indicates that the k-point minimal graph in d > 1 dimensions has robustness properties in R^d which are analogous to those of rank order statistics in one dimension. A direct result of our theory is that the log-weight of the k-point minimal graph is a consistent nonparametric estimate of the Rényi entropy of the distribution P. Possible applications of this work include: analysis of random communication network topologies, estimation of the mixing coefficient in ε-contaminated mixture models, outlier discrimination and rejection, clustering and pattern recognition, robust non-parametric regression, two sample matching and image registration.
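To make the link between graph weight and entropy concrete, here is a minimal sketch for the simplest case only: the full minimal spanning tree over all n points (k = n), not the greedy k-point approximation studied in the paper. It assumes the commonly stated relation that the power-weighted MST length, divided by n^alpha with alpha = (d - gamma)/d, converges to a constant times the integral of f^alpha. That constant is not computed here, so the function returns the entropy estimate only up to an additive offset. The function names are hypothetical.

```python
import math
import random


def mst_power_weight(points, gamma=1.0):
    """Sum of edge_length**gamma over a Euclidean MST (Prim, O(n^2)).

    Assumes 0 < gamma, so the zero-length starting 'edge' contributes nothing.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # shortest distance from each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u] ** gamma
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def renyi_entropy_offset(points, gamma=1.0):
    """Renyi entropy estimate of order alpha = (d - gamma)/d, up to an
    additive constant that depends only on d and gamma (not computed here)."""
    n, d = len(points), len(points[0])
    alpha = (d - gamma) / d
    return math.log(mst_power_weight(points, gamma) / n ** alpha) / (1 - alpha)


rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(200)]
shrunk = [(0.25 * a, 0.25 * b) for a, b in pts]

# Shrinking the support by 0.25 in d = 2 lowers the entropy by 2 * log(4).
delta = renyi_entropy_offset(shrunk) - renyi_entropy_offset(pts)
```

Because the unknown constant cancels in differences, the sketch is still usable for comparing two samples of the same size and dimension.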
A nonparametric statistical method for image segmentation using information theory and curve evolution
- IEEE Trans. Image Processing
, 2005
Cited by 30 (0 self)
Abstract—In this paper, we present a new information-theoretic approach to image segmentation. We cast the segmentation problem as the maximization of the mutual information between the region labels and the image pixel intensities, subject to a constraint on the total length of the region boundaries. We assume that the probability densities associated with the image pixel intensities within each region are completely unknown a priori, and we formulate the problem based on nonparametric density estimates. Due to the nonparametric structure, our method does not require the image regions to have a particular type of probability distribution and does not require the extraction and use of a particular statistic. We solve the information-theoretic optimization problem by deriving the associated gradient flows and applying curve evolution techniques. We use level-set methods to implement the resulting evolution. The experimental results based on both synthetic and real images demonstrate that the proposed technique can solve a variety of challenging image segmentation problems. Furthermore, our method, which does not require any training, performs as well as methods based on training. Index Terms—Curve evolution, image segmentation, information theory, level-set methods, nonparametric density estimation.
Divergence estimation of continuous distributions based on data-dependent partitions
- IEEE Transactions on Information Theory
, 2005
Cited by 20 (3 self)
Abstract—We present a universal estimator of the divergence D(P‖Q) for two arbitrary continuous distributions P and Q satisfying certain regularity conditions. This algorithm, which observes independent and identically distributed (i.i.d.) samples from both P and Q, is based on the estimation of the Radon–Nikodym derivative dP/dQ via a data-dependent partition of the observation space. Strong convergence of this estimator is proved with an empirically equivalent segmentation of the space. This basic estimator is further improved by adaptive partitioning schemes and by bias correction. The application of the algorithms to data with memory is also investigated. In the simulations, we compare our estimators with the direct plug-in estimator and estimators based on other partitioning approaches. Experimental results show that our methods achieve the best convergence performance in most of the tested cases. Index Terms—Bias correction, data-dependent partition, divergence, Radon–Nikodym derivative, stationary and ergodic data, universal estimation of information measures.
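A rough one-dimensional sketch of a data-dependent partition may help here. It assumes the basic scheme of cutting the real line at order statistics of the Q sample so that each cell holds roughly the same number of Q points, then comparing the relative frequencies of the two samples per cell. The adaptive refinements and bias correction mentioned in the abstract are not attempted, the cell size `per_cell` defaults to a square-root rule chosen only for illustration, and the function names are hypothetical.

```python
import math
import random
from bisect import bisect_left


def cell_counts(sample, edges):
    """Count sample points per cell (-inf, e0], (e0, e1], ..., (e_last, inf)."""
    counts = [0] * (len(edges) + 1)
    for x in sample:
        counts[bisect_left(edges, x)] += 1
    return counts


def partition_divergence(p_sample, q_sample, per_cell=None):
    """Estimate D(P||Q) in nats for 1-D samples.

    Assumes continuous data (no ties), so every cell contains a Q point.
    """
    n, m = len(p_sample), len(q_sample)
    if per_cell is None:
        per_cell = max(1, round(math.sqrt(m)))  # illustrative default
    q_sorted = sorted(q_sample)
    # Cell boundaries: every per_cell-th order statistic of the Q sample,
    # never the largest point, so the last cell is not empty under Q.
    edges = q_sorted[per_cell - 1 : m - 1 : per_cell]
    p_counts = cell_counts(p_sample, edges)
    q_counts = cell_counts(q_sample, edges)
    return sum(
        (k / n) * math.log((k / n) / (c / m))
        for k, c in zip(p_counts, q_counts)
        if k > 0
    )


rng = random.Random(2)
q = [rng.gauss(0, 1) for _ in range(5000)]
p_shift = [rng.gauss(1, 1) for _ in range(5000)]  # true divergence is 0.5
p_same = [rng.gauss(0, 1) for _ in range(5000)]   # true divergence is 0

d_shift = partition_divergence(p_shift, q)
d_same = partition_divergence(p_same, q)
```

Tying the cells to the Q sample keeps every denominator away from zero, which is one practical reason to prefer a data-dependent partition over a fixed grid.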
Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?
- JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A
, 2006
Cited by 16 (5 self)
The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective.
Estimation of Rényi information divergence via pruned minimal spanning trees
- in IEEE Workshop on Higher Order Statistics, Caesaria
, 1999
Cited by 12 (5 self)
In this paper we develop robust estimators of the Rényi information divergence (I-divergence) given a reference distribution and a random sample from an unknown distribution. Estimation is performed by constructing a minimal spanning tree (MST) passing through the random sample points and applying a change of measure which flattens the reference distribution. In a mixture model where the reference distribution is contaminated by an unknown noise distribution one can use these results to reject noise samples by implementing a greedy algorithm for pruning the k-longest branches of the MST, resulting in a tree called the k-MST. We illustrate this procedure in the context of density discrimination and robust clustering for a planar mixture model.
A New Class Of Entropy Estimators For Multi-Dimensional Densities
, 2003
Cited by 10 (0 self)
We present a new class of estimators for approximating the entropy of multi-dimensional probability densities based on a sample of the density. These estimators extend the classic "m-spacing" estimators of Vasicek and others for estimating entropies of one-dimensional probability densities. Unlike plug-in estimators of entropy, which first estimate a probability density and then compute its entropy, our estimators avoid the difficult intermediate step of density estimation. For fixed dimension, the estimators are polynomial in the sample size. Similarities to consistent and asymptotically efficient one-dimensional estimators of entropy suggest that our estimators may share these properties.
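For reference, the classic one-dimensional m-spacing estimator that this abstract builds on can be sketched in a few lines. This is the textbook Vasicek form, not the multi-dimensional estimator of the paper: the average of log((n / 2m) * (x_(i+m) - x_(i-m))) over the sorted sample, with indices clamped at both ends. The default m = sqrt(n) is an assumption made for illustration, and the function name is hypothetical.

```python
import math
import random


def vasicek_entropy(sample, m=None):
    """m-spacing (Vasicek) estimate of differential entropy in nats.

    Assumes continuous data: tied values can give a zero spacing,
    for which the logarithm is undefined.
    """
    n = len(sample)
    if m is None:
        m = max(1, round(math.sqrt(n)))  # illustrative default
    xs = sorted(sample)
    total = 0.0
    for i in range(n):
        lo = xs[max(i - m, 0)]       # clamp to the smallest order statistic
        hi = xs[min(i + m, n - 1)]   # clamp to the largest order statistic
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n


rng = random.Random(3)
h_uniform = vasicek_entropy([rng.random() for _ in range(2000)])
h_gauss = vasicek_entropy([rng.gauss(0, 1) for _ in range(2000)])

# Reference values: 0 for Uniform(0, 1), 0.5 * log(2 * pi * e) for N(0, 1).
h_gauss_true = 0.5 * math.log(2 * math.pi * math.e)
```

No density is estimated at any point; the spacings between order statistics stand in for it, which is the property the abstract contrasts with plug-in estimators.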
Segmenting and tracking the left ventricle by learning the dynamics in cardiac images
- IN IPMI,
, 2005
Cited by 8 (1 self)
Having accurate left ventricle (LV) segmentations across a cardiac cycle provides useful quantitative (e.g. ejection fraction) and qualitative information for diagnosis of certain heart conditions. Existing LV segmentation techniques are founded mostly upon algorithms for segmenting static images. In order to exploit the dynamic structure of the heart in a principled manner, we approach the problem of LV segmentation as a recursive estimation problem. In our framework, LV boundaries constitute the dynamic system state to be estimated, and a sequence of observed cardiac images constitutes the data. By formulating the problem as one of state estimation, the segmentation at each particular time is based not only on the data observed at that instant, but also on predictions based on past segmentations. This requires a dynamical system model of the LV, which we propose to learn from training data through an information-theoretic approach. To incorporate the learned dynamic model into our segmentation framework and obtain predictions, we use ideas from particle filtering. Our framework uses a curve evolution method to combine such predictions with the observed images to estimate the LV boundaries at each time. We demonstrate the effectiveness of the proposed approach on a large set of cardiac images. We observe that our approach provides more accurate segmentations than those from static image segmentation techniques, especially when the observed data are of limited quality.

