Kernel measures of conditional dependence
In Adv. NIPS, 2008
Cited by 31 (24 self)
We propose a new measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces. Unlike previous kernel dependence measures, the proposed criterion does not depend on the choice of kernel in the limit of infinite data, for a wide class of kernels. At the same time, it has a straightforward empirical estimate with good convergence behaviour. We discuss the theoretical properties of the measure, and demonstrate its application in experiments.
ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context
BMC Bioinformatics, 2006
A class of Rényi information estimators for multidimensional densities
Annals of Statistics, 2008
Cited by 17 (1 self)
A class of estimators of the Rényi and Tsallis entropies of an unknown distribution f in R^m is presented. These estimators are based on the kth nearest-neighbor distances computed from a sample of N i.i.d. vectors with distribution f. We show that entropies of any order q, including Shannon’s entropy, can be estimated consistently with minimal assumptions on f. Moreover, we show that it is straightforward to extend the nearest-neighbor method to estimate the statistical distance between two distributions using one i.i.d. sample from each.
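A kth-nearest-neighbor Rényi entropy estimate of the kind this abstract describes might be sketched as follows. The formula used here is one commonly cited form and is an assumption, not quoted from the listing; the function name `renyi_entropy_knn` and the brute-force neighbor search are hypothetical choices for illustration only.

```python
from math import dist, gamma, log, pi


def renyi_entropy_knn(sample, q, k):
    """Sketch of a k-NN estimate of the Renyi entropy of order q (in nats).

    sample: list of equal-length tuples, treated as N points in R^m.
    Assumes q != 1, k > q - 1, k < N, and pairwise distinct points.
    """
    n, m = len(sample), len(sample[0])
    if q == 1 or k <= q - 1 or k >= n:
        raise ValueError("need q != 1 and q - 1 < k < N")
    # Volume of the unit ball in R^m.
    v_m = pi ** (m / 2) / gamma(m / 2 + 1)
    # Bias-correcting constant (assumed form).
    c_k = (gamma(k) / gamma(k + 1 - q)) ** (1 / (1 - q))
    acc = 0.0
    for i, x in enumerate(sample):
        # Brute-force distances from x to every other sample point.
        others = sorted(dist(x, y) for j, y in enumerate(sample) if j != i)
        rho = others[k - 1]  # distance to the k-th nearest neighbor
        zeta = (n - 1) * c_k * v_m * rho ** m
        acc += zeta ** (1 - q)
    return log(acc / n) / (1 - q)
```

For example, `renyi_entropy_knn([(0.0,), (1.0,), (2.0,)], q=2, k=2)` evaluates the estimate on three points on the real line. A k-d tree would replace the quadratic neighbor search for samples of realistic size.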
Hierarchical clustering based on mutual information
2003
Cited by 13 (0 self)
Motivation: Clustering is a frequently used concept in a variety of bioinformatical applications. We present a new method for hierarchical clustering of data called the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y, plus the MI between Z and the combined object (XY). Results: We use this both in the Shannon (probabilistic) version of information theory, where the “objects” are probability distributions represented by random samples, and in the Kolmogorov (algorithmic) version, where the “objects” are symbol sequences. We apply our method to the construction of mammal phylogenetic trees from mitochondrial DNA sequences and we reconstruct the fetal ECG from the output of independent components analysis (ICA) applied to the ECG of a pregnant woman. Availability: The programs for estimation of MI and for clustering (probabilistic version) are available
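The grouping property stated in this abstract, I(X,Y,Z) = I(X,Y) + I((X,Y),Z), can be checked numerically on a small discrete example. The sketch below is only an illustration: the joint distribution and the helper names `entropy` and `marginal` are hypothetical, and the three-way MI is taken to be the sum of the marginal entropies minus the joint entropy.

```python
from itertools import product
from math import log


def entropy(p):
    """Shannon entropy (nats) of a dict mapping outcomes to probabilities."""
    return -sum(v * log(v) for v in p.values() if v > 0)


def marginal(joint, axes):
    """Marginalize a joint dict keyed by tuples onto the given axes."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[a] for a in axes)
        out[key] = out.get(key, 0.0) + prob
    return out


# Hypothetical joint distribution over three binary variables (X, Y, Z).
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {xyz: w / total for xyz, w in zip(product((0, 1), repeat=3), weights)}


def H(axes):
    """Entropy of the marginal over the given axes (0 = X, 1 = Y, 2 = Z)."""
    return entropy(marginal(joint, axes))


mi_xy = H((0,)) + H((1,)) - H((0, 1))                  # I(X, Y)
mi_xy_z = H((0, 1)) + H((2,)) - H((0, 1, 2))           # I((X, Y), Z)
mi_xyz = H((0,)) + H((1,)) + H((2,)) - H((0, 1, 2))    # I(X, Y, Z)

# The two sides of the grouping property should agree up to rounding.
grouping_gap = abs(mi_xyz - (mi_xy + mi_xy_z))
```

The identity holds term by term because the joint entropy of (X, Y) cancels between the two pairwise quantities, so `grouping_gap` is zero up to floating-point error for any joint distribution.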
Separating a real-life nonlinear image mixture
Journal of Machine Learning Research, 2005
Cited by 12 (1 self)
Note: The pagination of this version is slightly different from the pagination of the version published in the Journal of Machine Learning Research and available on its web site. The contents are the same, however. When acquiring an image of a paper document, the image printed on the back page sometimes shows through. The mixture of the front- and back-page images thus obtained is markedly nonlinear, and thus constitutes a good real-life test case for nonlinear blind source separation. This paper addresses a difficult version of this problem, corresponding to the use of “onion skin” paper, which results in a relatively strong nonlinearity of the mixture that becomes close to singular in the lighter regions of the images. The separation is achieved through the MISEP technique, which is an extension of the well-known INFOMAX method. The separation results are assessed with objective quality measures. They show an improvement over the results obtained with linear separation, but have room for further improvement.
Nonlinear Multivariate Analysis of Neurophysiological Signals
Progress in Neurobiology, 2005
Cited by 10 (0 self)
Multivariate time series analysis is extensively used in neurophysiology with the aim of studying the relationship between simultaneously recorded signals. Recently, advances in information theory and nonlinear dynamical systems theory have allowed the study of various types of synchronization from time series. In this work, we first describe the multivariate linear methods most commonly used in neurophysiology and show that they can be extended to assess the existence of nonlinear interdependences between signals. We then review the concepts of entropy and mutual information, followed by a detailed description of nonlinear methods based on the concepts of phase synchronization, generalized synchronization and event synchronization. In all cases, we show how to apply these methods to study different kinds of neurophysiological data. Finally, we illustrate the use of multivariate surrogate data tests for the assessment of the strength (strong or weak) and the type (linear or nonlinear) of interdependence between neurophysiological signals.
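As one illustration of a phase synchronization index, the mean phase coherence of two phase series can be sketched as below. This particular index, the function name `mean_phase_coherence`, and the assumption that instantaneous phases have already been extracted (for instance with a Hilbert transform) are choices made for this sketch and are not taken from the abstract.

```python
import cmath
from math import pi


def mean_phase_coherence(phase_a, phase_b):
    """Length of the mean unit vector of the phase differences.

    Returns a value in [0, 1]: 1 for a constant phase difference,
    near 0 when the differences are spread evenly around the circle.
    """
    if not phase_a or len(phase_a) != len(phase_b):
        raise ValueError("phase series must be non-empty and equally long")
    resultant = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(resultant) / len(phase_a)


# Two hypothetical phase series with a fixed lag of 0.3 rad.
t = [0.1 * i for i in range(100)]
locked = mean_phase_coherence(t, [v + 0.3 for v in t])

# Phase differences spread evenly over the circle.
spread = mean_phase_coherence([0.0, pi / 2, pi, 3 * pi / 2], [0.0] * 4)
```

Here `locked` comes out at 1 up to rounding and `spread` at 0 up to rounding, the two extremes of the index.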
Nonlinear Extraction of Independent Components of Natural Images Using Radial Gaussianization
2009
Cited by 9 (3 self)
We consider the problem of efficiently encoding a signal by transforming it to a new representation whose components are statistically independent. A widely studied linear solution, known as independent component analysis (ICA), exists for the case when the signal is generated as a linear transformation of independent nongaussian sources. Here, we examine a complementary case, in which the source is nongaussian and elliptically symmetric. In this case, no invertible linear transform suffices to decompose the signal into independent components, but we show that a simple nonlinear transformation, which we call radial gaussianization (RG), is able to remove all dependencies. We then examine this methodology in the context of natural image statistics. We first show that distributions of spatially proximal bandpass filter responses are better described as elliptical than as linearly transformed independent sources. Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either nearby pairs or blocks of bandpass filter responses is significantly greater than that achieved by ICA. Finally, we show that the RG transformation may be closely approximated by divisive normalization, which has been used to model the nonlinear response properties of visual neurons.
A Nearest-Neighbor Approach to Estimating Divergence between Continuous Random Vectors
2006
Cited by 9 (1 self)
A method for divergence estimation between multidimensional distributions based on nearest neighbor distances is proposed. Given i.i.d. samples, both the bias and the variance of this estimator are proven to vanish as sample sizes go to infinity. In experiments on high-dimensional data, the nearest neighbor approach generally exhibits faster convergence compared to previous algorithms based on partitioning.
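A nearest-neighbor divergence estimate of this kind might look like the sketch below. It uses a widely quoted 1-nearest-neighbor form, (d/n) Σ log(ν_i/ρ_i) + log(m/(n−1)), which is assumed here rather than quoted from the paper; the function name `nn_divergence` and the brute-force search are hypothetical.

```python
from math import dist, log


def nn_divergence(xs, ys):
    """Sketch of a 1-NN estimate of the divergence D(P || Q), in nats.

    xs: n points drawn from P; ys: m points drawn from Q (tuples in R^d).
    Assumes n >= 2 and that no nearest-neighbor distance is zero.
    """
    n, m, d = len(xs), len(ys), len(xs[0])
    total = 0.0
    for i, x in enumerate(xs):
        # rho: distance from x to its nearest neighbor among the other xs.
        rho = min(dist(x, other) for j, other in enumerate(xs) if j != i)
        # nu: distance from x to its nearest neighbor among the ys.
        nu = min(dist(x, y) for y in ys)
        total += log(nu / rho)
    return d * total / n + log(m / (n - 1))
```

Calling `nn_divergence(xs, ys)` with two lists of coordinate tuples returns the estimate. The quadratic search is only suitable for small samples; a spatial index would be the natural replacement at scale.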
Estimation of Rényi entropy and mutual information based on generalized nearest-neighbor graphs
2010
Cited by 9 (2 self)
We present simple and computationally efficient nonparametric estimators of Rényi entropy and mutual information based on an i.i.d. sample drawn from an unknown, absolutely continuous distribution over R^d. The estimators are calculated as the sum of p-th powers of the Euclidean lengths of the edges of the ‘generalized nearest-neighbor’ graph of the sample and the empirical copula of the sample respectively. For the first time, we prove the almost sure consistency of these estimators and upper bounds on their rates of convergence, the latter under the assumption that the density underlying the sample is Lipschitz continuous. Experiments demonstrate their usefulness in independent subspace analysis.
Statistical modelling and steganalysis of DFT-based image steganography
Proc. of SPIE Electronic Imaging, 2006
Cited by 5 (1 self)
Note: This is a revised version of the original SPIE 2006 paper in which a mistake was made in normalizing features before feeding them to the classifier: features from cover images and features from stego images were normalized differently. Due to this mistake, extra information about image classes was introduced to classification and the result was exceptionally good, with a detection rate close to 100%. We correct the mistake in this revision and most changes are made at the end of Sec. 4 and in Sec. 5 to present the correct results. Corresponding changes are also made in the conclusion (Sec. 6) while other sections remain intact. An accurate statistical model of cover images is essential to the success of both steganography and steganalysis. We study the statistics of the full-frame two-dimensional discrete Fourier transform (DFT) coefficients of natural images and show that the independently and identically distributed model with unit exponential distribution is not a sufficiently accurate description of the statistics of normalized image periodograms. Consequently, the stochastic quantization index modulation (QIM) algorithm that aims at preserving this model is detectable in principle. To discriminate the resulting stego images from cover images, we train a learning system on them. Building upon a state-of-the-art steganalysis method using the statistical moments of wavelet characteristic functions, we propose new features that are more sensitive to data embedding. The addition of these features significantly improves the steganalyzer’s receiver operating characteristic (ROC) curve.

