
## ICA Using Spacings Estimates of Entropy (2003)

### Download Links

- [www.jmlr.org]
- [www.eecs.berkeley.edu]
- [www.ai.mit.edu]
- [vis-www.cs.umass.edu]
- [www.cs.umass.edu]
- [people.csail.mit.edu]
- [cs.umass.edu]
- [people.cs.umass.edu]
- [jmlr.csail.mit.edu]
- [www.kecl.ntt.co.jp]
- DBLP

### Other Repositories/Bibliography

Venue: Journal of Machine Learning Research

Citations: 68 (3 self)

### Citations

12168 | Elements of Information Theory
- Cover, Thomas
- 1991
Citation Context: ...if and only if all of the variables are mutually independent, we take (1) as a direct measure of mutual independence. As a function of X and W it is easily shown (cf. Cover and Thomas, 1991) that J(Y) = ∑_{i=1}^{D} H(Y_i) − H(X_1, …, X_D) − log(|W|). In particular, the change in the entropy of the joint distribution under a linear transformation is simply the logarithm of the Jacobian...

10534 | A mathematical theory of communication
- Shannon
- 1948
Citation Context: ...J(Y) = ∫ p(y_1, …, y_D) log [p(y_1, …, y_D) / (p(y_1) p(y_2) ⋯ p(y_D))] dy (1) = KL(p(y_1, …, y_D) || ∏_{i=1}^{D} p(y_i)) = ∑_{i=1}^{D} H(Y_i) − H(Y_1, …, Y_D). Here dy = dy_1 dy_2 ⋯ dy_D and H(Y) is the differential entropy (Shannon, 1948) of the continuous multi-dimensional random variable Y. The right side of (1) is the Kullback-Leibler divergence (Kullback, 1959), or relative entropy, between the joint density of {Y_1, …, Y_D}...

2229 | Survey on independent component analysis
- Hyvärinen
- 1999
Citation Context: ...it has been shown [3, 2] that one can recover the original sources up to a scaling and permutation provided that at most one of the underlying sources is Gaussian and the rest are non-Gaussian. See [4] for an extensive bibliography of the ICA problem. Many approaches start the analysis of the problem by considering the contrast function [2] J(Y) = ∫ p(y_1, …, y_D) log [p(y_1, …, y_D) / (p(y_1)...

1800 | Independent Component Analysis, a new concept
- Comon
- 1994
Citation Context: ...via a transformation W on observations of X, that is Y = WX = WAS = BS. Given the minimal statement of the problem, it has been shown (Benveniste et al., 1980, Comon, 1994) that one can recover the original sources up to a scaling and permutation provided that at most one of the underlying sources is Gaussian and the rest are non-Gaussian. Upon pre-whitening the observ...

1752 | Information theory and statistics
- Kullback
- 1959
Citation Context: ...We begin by setting up the problem and discussing aspects of the contrast function originally proposed by Comon (1994), which can be simplified to a sum of one-dimensional marginal entropies (cf. Kullback, 1959). In Section 2, we discuss entropy estimates based on order statistics. One method in particular satisfies our requirements, the m-spacings estimator (Vasicek, 1976). This entropy estimator leads natur...

853 | Fast and robust fixed-point algorithms for independent component analysis
- Hyvarinen
- 1999
Citation Context: ...et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2000), and correlation based methods (Jutten and Herault, 1991, Bach and Jordan, 2002). Many approaches start the analysis of the problem by considering the contrast function (Co...

612 | A new learning algorithm for blind signal separation
- Amari, Cichocki, et al.
- 1996
Citation Context: ...the input X for the algorithm. We then measured the “difference” between the true matrix A and the W returned by the algorithm, according to the well-known criterion (Amari error), due to Amari et al. [15]. Table 1 shows the mean results for each source density on each row, with N = 250 input points and 100 replications of each experiment. The best performing algorithm on each row is sh...
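The Amari error mentioned in this context can be sketched as follows. This is an illustrative form only: conventions differ on normalization (some divide by 2D, others by 2D(D−1)), so this may not match the exact variant used in the paper's experiments.

```python
import numpy as np

def amari_error(A, W):
    """Amari-style error between a true mixing matrix A and an
    estimated unmixing matrix W. It is zero iff W @ A is a scaled
    permutation matrix, i.e. the sources are perfectly recovered
    up to scaling and permutation. Normalization by 2*D is one
    common convention (variants exist)."""
    P = np.abs(W @ A)
    D = P.shape[0]
    row_term = (P.sum(axis=1) / P.max(axis=1) - 1.0).sum()
    col_term = (P.sum(axis=0) / P.max(axis=0) - 1.0).sum()
    return (row_term + col_term) / (2.0 * D)
```

A perfect unmixing, even with arbitrary scaling and permutation, scores (numerically) zero; any residual mixing gives a strictly positive score.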

551 | Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture
- Jutten, Hérault
- 1991
Citation Context: ...Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2000), and correlation based methods (Jutten and Herault, 1991, Bach and Jordan, 2002). Many approaches start the analysis of the problem by considering the contrast function (Comon, 1994) J(Y) = ∫ p(y_1, …, y_D) log [p(y_1, …, y_D) / (p(y_1) p(y_2) ⋯ p(y_D))] dy (...

453 | Kernel independent component analysis
- Bach, Jordan
- 2002
Citation Context: ...in O(KN log N). 3.2. Experiments in two dimensions. To test the algorithm experimentally, we performed a large set of experiments, largely drawn from the comprehensive tests developed by Bach and Jordan [10]. Our tests included comparisons with FastICA [11], the JADE algorithm [12], the extended Infomax algorithm [13], and KernelICA using the generalized variance [10]. For 18 different one-dimensional de...

367 | An information maximization approach to blind separation and blind deconvolution
- Bell, Sejnowski
- 1995
Citation Context: ...likelihood based methods (Pham et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2000), and correlation based methods (Jutten and Herault, 1991, Bach and Jordan, 2002). Many approaches start the analysis of the problem by considering the con...

306 | Independent Component Analysis Using an Extended Infomax Algorithm for Mixed Sub-Gaussian and Super-Gaussian Sources
- Lee, Girolami, et al.
- 1999
Citation Context: ...py, between the joint density of {Y_1, …, Y_D} and the product of its marginals. The utility of (1) for purposes of the ICA problem has been well documented in the literature (cf. Comon, 1994, Lee et al., 1999a). Briefly, we note that for mutually independent random variables Y_1, Y_2, …, Y_D we have J(Y) = ∫ p(y_1, y_2, …, y_D) log [p(y_1, y_2, …, y_D) / (p(y_1) p(y_2) ⋯ p(y_D))] dy = ∫ p(y_1, y_2...

265 | A First Course in Order Statistics
- Arnold, Balakrishnan, et al.
- 1992
Citation Context: ...sample of Z denoted by Z_1, Z_2, …, Z_N. The order statistics of a random sample of Z are simply the elements of the sample rearranged in non-decreasing order: Z_(1) ≤ Z_(2) ≤ … ≤ Z_(N) (cf. Arnold et al., 1992). A spacing of order m, or m-spacing, is then defined to be Z_(i+m) − Z_(i), for 1 ≤ i < i + m ≤ N. Finally, if m is a function of N, one may define the m_N-spacing as Z_(i+m_N) − Z_(i)...

249 | High-Order contrasts for independent component analysis. Tutorial NIPS*98
- Cardoso
- 1998
Citation Context: ...ble approaches can be roughly grouped into maximum likelihood based methods (Pham et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2000), and correlation based methods (Jutten and Herault, 1991, Bach and Jordan, 2002). Man...

192 | Jacobi Angles For Simultaneous Diagonalization
- Cardoso, Souloumiac
- 1996
Citation Context: ...roaches. Some of the more notable approaches can be roughly grouped into maximum likelihood based methods (Pham et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2000), and correlation based methods (Jutten and Herault, 1991, Bach and Jor...

133 | Separation of a mixture of independent sources through a maximum likelihood approach
- Pham, Garrat, et al.
- 1992
Citation Context: ...ward with a minimum of assumptions, it has been well studied, resulting in a vast array of approaches. Some of the more notable approaches can be roughly grouped into maximum likelihood based methods (Pham et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen...

104 | A unifying information-theoretic framework for independent component analysis
- Lee, Girolami, et al.
- 2000
Citation Context: ...py, between the joint density of {Y_1, …, Y_D} and the product of its marginals. The utility of (1) for purposes of the ICA problem has been well documented in the literature (cf. Comon, 1994, Lee et al., 1999a). Briefly, we note that for mutually independent random variables Y_1, Y_2, …, Y_D we have J(Y) = ∫ p(y_1, y_2, …, y_D) log [p(y_1, y_2, …, y_D) / (p(y_1) p(y_2) ⋯ p(y_D))] dy = ∫ p(y_1, y_2...

103 | A context-sensitive generalization of ICA
- Pearlmutter, Parra
- 1996
Citation Context: ...of assumptions, it has been well studied, resulting in a vast array of approaches. Some of the more notable approaches can be roughly grouped into maximum likelihood based methods (Pham et al., 1992, Pearlmutter and Parra, 1996), moment/cumulant based methods (Comon, 1994, Cardoso and Souloumiac, 1996, Cardoso, 1999a, Hyvärinen, 2001), entropy based methods (Bell and Sejnowski, 1995, Hyvärinen, 1999, Bercher and Vignat, 2...

99 | Robust identification of a nonminimum phase system: blind adjustment of a linear equalizer in data communications
- Benveniste, Goursat, et al.
- 1980
Citation Context: ...the mixing matrix via a transformation W on observations of X, that is Y = WX = WAS = BS. Given the minimal statement of the problem, it has been shown (Benveniste et al., 1980, Comon, 1994) that one can recover the original sources up to a scaling and permutation provided that at most one of the underlying sources is Gaussian and the rest are non-Gaussian. Upon pre-whiteni...

99 | New Approximations of Differential Entropy for Independent Component Analysis and Projection Pursuit - Hyvärinen - 1998

98 | A test for normality based on sample entropy
- Vasicek
- 1976
Citation Context: ...and the minimization of J(Y) reduces to finding W* = argmin_W H(Y_1) + ⋯ + H(Y_D). (6) To estimate this quantity, we adopt a different entropy estimator, almost identical to one described by Vasicek [6] and others in the statistics literature. These estimators, which are discussed below, are known as spacings estimates. 2. SPACINGS ESTIMATES OF ENTROPY. Consider a one-dimensional random variable Z...
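The spacings estimate of entropy discussed in this context can be sketched roughly as follows. This is an illustrative form, assuming the common variant Ĥ = mean of log((N+1)/m · (Z_(i+m) − Z_(i))); Vasicek's original estimator and later refinements differ in constants, boundary handling, and bias corrections.

```python
import numpy as np

def m_spacing_entropy(z, m=None):
    """Illustrative m-spacings estimate of differential entropy.
    Sorts the sample, forms the m-spacings Z_(i+m) - Z_(i) of the
    order statistics, and averages log((N+1)/m * spacing). For
    consistency one typically requires m -> inf and m/N -> 0 as
    N -> inf (e.g. m ~ sqrt(N))."""
    z = np.sort(np.asarray(z, dtype=float))
    n = len(z)
    if m is None:
        m = max(1, int(np.sqrt(n)))       # satisfies m -> inf, m/n -> 0
    spacings = z[m:] - z[:-m]             # Z_(i+m) - Z_(i), i = 1..n-m
    return float(np.mean(np.log((n + 1) / m * spacings)))
```

On a large Uniform(0,1) sample the estimate should be near 0, and on a standard normal sample near ½·log(2πe) ≈ 1.419, the true differential entropies.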

97 | Nonparametric entropy estimation: An overview
- Beirlant, Dudewicz, et al.
- 1997
Citation Context: ...which is robust to outliers. For this task, we turned to the statistics literature, where entropy estimators have been studied extensively (cf. Beirlant et al., 1997). 4. As the optimization landscape has potentially many local minima, we eschew gradient descent methods. The fact that the estimator is computationally efficient allows for a global search method. T...

58 | A fast fixed-point algorithm for independent component analysis
- Hyvärinen, Oja
- 1997
Citation Context: ...the algorithm experimentally, we performed a large set of experiments, largely drawn from the comprehensive tests developed by Bach and Jordan (2002). Our tests included comparisons with FastICA (Hyvärinen and Oja, 1997), the JADE algorithm (Cardoso, 1999b), the extended Infomax algorithm (Lee et al., 1999b), and two versions of KernelICA: KCCA and KGV (Bach and Jordan, 2002). For each of the 18 one-dimensional dens...

27 | Modern Concepts and Theorems of Mathematical Statistics
- Manoukian
- 1986
Citation Context: ...variable P(Z). P(Z), the random variable describing the "height" on the cumulative curve of a random draw from Z (as opposed to the function P(z)), is called the probability integral transform of Z (cf. Manoukian, 1986). Thus, the key insight is that the intervals between successive order statistics have the same expected probability mass. Using this idea, one can develop a simple entropy estimator. We start by app...
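The probability integral transform described in this context is easy to verify numerically: applying a continuous random variable's own CDF to its samples yields approximately Uniform(0,1) draws. A minimal sketch using the exponential distribution, whose CDF is 1 − e^(−z):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.exponential(size=100_000)   # Z ~ Exp(1)
u = 1.0 - np.exp(-z)                # P(Z): the CDF applied to its own samples

# u should behave like Uniform(0,1): mean 1/2, variance 1/12
print(u.mean(), u.var())
```

This is exactly why successive order statistics carve the distribution into intervals of equal expected probability mass, the property the spacings entropy estimator exploits.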

21 | Estimating the entropy of a signal with applications - Bercher, Vignat - 2000 |

16 | Limit theorems for sums of general functions of m-spacings
- Hall
- 1984
Citation Context: ...) − Z_(mi+1)). (6) Under the conditions that m, N → ∞, m/N → 0, (7) 4. The addition of a small constant renders this estimator weakly consistent for bounded densities under certain tail conditions (Hall, 1984). [Figure 1: histograms]

14 | Blind separation of instantaneous mixture of sources based on order statistics
- Pham
Citation Context: ...al conditions. For details of these results, see the review paper [1]. Perhaps the most important property of this estimator is that it is asymptotically efficient, as shown in [8]. We note that Pham [9] defined an ICA contrast function as a sum of terms very similar to (10). However, by choosing m = 1 as was done in that work, one no longer obtains a consistent estimator of entropy, and the efficien...

7 | Independent components analysis by direct entropy minimization
- Learned-Miller, Fisher III
- 2003
Citation Context: ...represents experiments in which two (generally different) source densities were chosen randomly from the set of 18 densities. 5. These densities and additional details of the experiments are described in [14]. 6. Alternatively, we could have applied a random non-singular matrix to the data, and then whitened the data, keeping track of the whitening matrix. For the size of N in this experiment, these two met...

4 | Asymptotically optimal estimation of nonlinear functionals
- Levit
- 1978
Citation Context: ...under a variety of general conditions. For details of these results, see the review paper [1]. Perhaps the most important property of this estimator is that it is asymptotically efficient, as shown in [8]. We note that Pham [9] defined an ICA contrast function as a sum of terms very similar to (10). However, by choosing m = 1 as was done in that work, one no longer obtains a consistent estimator of en...

1 | New approximations of differential entropy for independent component analysis and projection pursuit - Hyvärinen - 1997

1 | Fast and robust fixed-point algorithms for independent component analysis - Hyvärinen - 1999