Results 11–20 of 884
On the best rank-1 approximation of higher-order supersymmetric tensors
SIAM J. Matrix Anal. Appl., 2002
"... Abstract. Recently the problem of determining the best, in the leastsquares sense, rank1 approximation to a higherorder tensor was studied and an iterative method that extends the wellknown power method for matriceswasproposed for itssolution. Thishigherorder power method is also proposed for th ..."
Abstract

Cited by 76 (1 self)
 Add to MetaCart
(Show Context)
Abstract. Recently the problem of determining the best, in the least-squares sense, rank-1 approximation to a higher-order tensor was studied and an iterative method that extends the well-known power method for matrices was proposed for its solution. This higher-order power method is also proposed for the special but important class of supersymmetric tensors, with no change. A simplified version, adapted to the special structure of the supersymmetric problem, is deemed unreliable, as its convergence is not guaranteed. The aim of this paper is to show that a symmetric version of the above method converges under assumptions of convexity (or concavity) for the functional induced by the tensor in question, assumptions that are very often satisfied in practical applications. The use of this version entails significant savings in computational complexity as compared to the unconstrained higher-order power method. Furthermore, a novel method for initializing the iterative process is developed which has been observed to yield an estimate that lies closer to the global optimum than the initialization suggested before. Moreover, its proximity to the global optimum is a priori quantifiable. In the course of the analysis, some important properties that the supersymmetry of a tensor implies for its square matrix unfolding are also studied.
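The symmetric higher-order power iteration the abstract describes can be sketched for a third-order supersymmetric tensor as follows (a minimal NumPy illustration, not the paper's exact algorithm; in particular, the random initialization here stands in for the unfolding-based initialization the paper develops):

```python
import numpy as np

def symmetric_hopm(T, n_iter=200, tol=1e-12, seed=0):
    """Best rank-1 approximation lam * (x o x o x) of a cubic
    supersymmetric tensor T via symmetric higher-order power iteration."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    lam = 0.0
    for _ in range(n_iter):
        y = np.einsum('ijk,j,k->i', T, x, x)   # contract T along two modes
        x_new = y / np.linalg.norm(y)
        lam_new = np.einsum('ijk,i,j,k->', T, x_new, x_new, x_new)
        if abs(lam_new - lam) < tol:
            x, lam = x_new, lam_new
            break
        x, lam = x_new, lam_new
    return lam, x
```

On a tensor that is exactly rank-1 the iteration recovers the generating vector (up to sign) in one step.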
ICA Using Spacings Estimates of Entropy
Journal of Machine Learning Research, 2003
"... This paper presents a new algorithm for the independent components analysis (ICA) problem based on an efficient entropy estimator. Like many previous methods, this algorithm directly minimizes the measure of departure from independence according to the estimated KullbackLeibler divergence betwee ..."
Abstract

Cited by 74 (3 self)
 Add to MetaCart
(Show Context)
This paper presents a new algorithm for the independent components analysis (ICA) problem based on an efficient entropy estimator. Like many previous methods, this algorithm directly minimizes the measure of departure from independence according to the estimated Kullback-Leibler divergence between the joint distribution and the product of the marginal distributions. We pair this approach with efficient entropy estimators from the statistics literature. In particular, the entropy estimator we use is consistent and exhibits rapid convergence. The algorithm based on this estimator is simple, computationally efficient, intuitively appealing, and outperforms other well known algorithms. In addition, the estimator's relative insensitivity to outliers translates into superior performance by our ICA algorithm on outlier tests. We present favorable comparisons to the Kernel ICA, FastICA, JADE, and extended Infomax algorithms in extensive simulations. We also provide public domain source code for our algorithms.
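The m-spacing entropy estimator this line of work builds on can be sketched as follows (a minimal Vasicek-style version; the function name and the m ~ sqrt(n) default are illustrative choices, not the paper's exact tuning):

```python
import numpy as np

def spacing_entropy(x, m=None):
    """m-spacing estimate of differential entropy: average the log of
    scaled gaps between order statistics that are m apart."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if m is None:
        m = int(round(np.sqrt(n)))        # common choice: m ~ sqrt(n)
    gaps = x[m:] - x[:-m]                 # m-spacings of the sorted sample
    return float(np.mean(np.log((n + 1) / m * gaps)))
```

Because the estimator touches only sorted-sample gaps, it needs no density model and degrades gracefully with outliers, which is what the abstract's robustness claim rests on.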
Infinite Sparse Factor Analysis and Infinite Independent Components Analysis
"... Abstract. A nonparametric Bayesian extension of Independent Components Analysis (ICA) is proposed where observed data Y is modelled as a linear superposition, G, of a potentially infinite number of hidden sources, X. Whether a given source is active for a specific data point is specified by an infin ..."
Abstract

Cited by 60 (11 self)
 Add to MetaCart
(Show Context)
Abstract. A nonparametric Bayesian extension of Independent Components Analysis (ICA) is proposed where observed data Y is modelled as a linear superposition, G, of a potentially infinite number of hidden sources, X. Whether a given source is active for a specific data point is specified by an infinite binary matrix, Z. The resulting sparse representation allows increased data reduction compared to standard ICA. We define a prior on Z using the Indian Buffet Process (IBP). We describe four variants of the model, with Gaussian or Laplacian priors on X and the one- or two-parameter IBPs. We demonstrate Bayesian inference under these models using a Markov Chain Monte Carlo (MCMC) algorithm on synthetic and gene expression data and compare to standard ICA algorithms.
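The IBP prior on the binary activity matrix Z can be sampled with the usual "customers and dishes" scheme (a sketch of the one-parameter process only; the two-parameter variant the abstract mentions adds a second concentration parameter):

```python
import numpy as np

def sample_ibp(n, alpha, seed=0):
    """Draw Z (n customers x K dishes) from the one-parameter Indian
    Buffet Process: customer i takes existing dish k with probability
    m_k / i, then samples Poisson(alpha / i) brand-new dishes."""
    rng = np.random.default_rng(seed)
    rows, counts = [], []                 # counts[k] = customers with dish k
    for i in range(1, n + 1):
        row = [int(rng.random() < c / i) for c in counts]
        new = rng.poisson(alpha / i)      # new dishes for customer i
        row += [1] * new
        counts = [c + z for c, z in zip(counts, row)] + [1] * new
        rows.append(row)
    K = len(counts)
    return np.array([r + [0] * (K - len(r)) for r in rows], dtype=int)
```

Marginally, each customer's row sum is Poisson(alpha), while popular dishes are reused, which yields the sparse, exchangeable activity pattern the model exploits.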
Blind source separation using Rényi's mutual information
IEEE Signal Processing Letters, 2001
"... AbstractA blind source separation algorithm is proposed that is based on minimizing Renyi's mutual information by means of nonparametric probability density function (PDF) estimation. The twostqge process consists of spatial whitening and a series of Givens rotations and produces a cost func ..."
Abstract

Cited by 57 (20 self)
 Add to MetaCart
(Show Context)
Abstract—A blind source separation algorithm is proposed that is based on minimizing Rényi's mutual information by means of nonparametric probability density function (PDF) estimation. The two-stage process consists of spatial whitening and a series of Givens rotations and produces a cost function consisting only of marginal entropies. This formulation avoids the problems of PDF inaccuracy due to truncation of series expansion and the estimation of joint PDFs in high-dimensional spaces given the typical paucity of data. Simulations illustrate the superior efficiency, in terms of data length, of the proposed method compared to fast independent component analysis (FastICA), Comon's minimum mutual information, and Bell and Sejnowski's Infomax.
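The two-stage structure (spatial whitening followed by Givens rotations) can be sketched with two helpers (illustrative only; the entropy-based cost that selects the rotation angles is omitted):

```python
import numpy as np

def whiten(X):
    """Spatial whitening: decorrelate the mixtures X (channels x samples)
    and scale each decorrelated direction to unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    return (E / np.sqrt(d)) @ E.T @ Xc    # apply C^(-1/2), symmetric whitening

def givens(n, i, j, theta):
    """n x n planar (Givens) rotation acting on coordinates i and j."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = -s, s
    return G
```

After whitening, any product of Givens rotations keeps the data white, so the separation search reduces to rotation angles and the cost to marginal entropies, as the abstract states.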
Optimal Linear Representations of Images for Object Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
"... Linear representations of images are commonly used in object recognition; however, frequently used ones (namely, PCA, ICA, and FDA) are generally far from optimal in terms of actual recognition performance. We propose a (MonteCarlo) simulated annealing algorithm that leads to optimal linear represe ..."
Abstract

Cited by 55 (13 self)
 Add to MetaCart
Linear representations of images are commonly used in object recognition; however, frequently used ones (namely, PCA, ICA, and FDA) are generally far from optimal in terms of actual recognition performance. We propose a (Monte Carlo) simulated annealing algorithm that leads to optimal linear representations by maximizing the performance over subspaces. We illustrate its effectiveness using recognition experiments.
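A generic version of such a stochastic search over subspaces might look like this (a sketch only: `perf` stands in for the recognition-performance functional, and the near-identity random rotation is an illustrative proposal, not the paper's specific Monte Carlo move):

```python
import numpy as np

def anneal_subspace(perf, n, k, steps=500, step_size=0.05, t0=1.0, seed=0):
    """Simulated annealing over k-dimensional subspaces of R^n,
    maximizing perf(U) where U has orthonormal columns."""
    rng = np.random.default_rng(seed)
    U = np.linalg.qr(rng.standard_normal((n, k)))[0]
    f = perf(U)
    best_U, best_f = U, f
    for t in range(1, steps + 1):
        Q, R = np.linalg.qr(np.eye(n) + step_size * rng.standard_normal((n, n)))
        P = Q * np.sign(np.diag(R))       # small rotation near the identity
        U_new = P @ U                     # rotating preserves orthonormality
        f_new = perf(U_new)
        if f_new > f or rng.random() < np.exp((f_new - f) / (t0 / t)):
            U, f = U_new, f_new           # always uphill, sometimes downhill
            if f > best_f:
                best_U, best_f = U, f
    return best_U, best_f
```

Because proposals act by rotation, every visited point stays on the manifold of orthonormal bases, so no re-projection step is needed.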
Efficient Variant of Algorithm FastICA for Independent Component Analysis Attaining the Cramér-Rao Lower Bound
IEEE Trans. Neural Networks, 2006
"... Abstract—FastICA is one of the most popular algorithms for independent component analysis (ICA), demixing a set of statistically independent sources that have been mixed linearly. A key question is how accurate the method is for finite data samples. We propose an improved version of the FastICA alg ..."
Abstract

Cited by 54 (5 self)
 Add to MetaCart
(Show Context)
Abstract—FastICA is one of the most popular algorithms for independent component analysis (ICA), demixing a set of statistically independent sources that have been mixed linearly. A key question is how accurate the method is for finite data samples. We propose an improved version of the FastICA algorithm which is asymptotically efficient, i.e., its accuracy given by the residual error variance attains the Cramér–Rao lower bound (CRB). The error is thus as small as possible. This result is rigorously proven under the assumption that the probability distribution of the independent signal components belongs to the class of generalized Gaussian (GG) distributions with parameter α, denoted GG(α), for α > 2. We name the algorithm efficient FastICA (EFICA). Computational complexity of a Matlab implementation of the algorithm is shown to be only slightly (about three times) higher than that of the standard symmetric FastICA. Simulations corroborate these claims and show superior performance of the algorithm compared with the JADE algorithm of Cardoso and Souloumiac and the nonparametric ICA of Boscolo et al. on separating sources with distribution GG(α) with arbitrary α, as well as on sources with bimodal distribution, and a good performance in separating linearly mixed speech signals. Index Terms—Algorithm FastICA, blind deconvolution, blind source separation, Cramér–Rao lower bound (CRB), independent component analysis (ICA).
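For reference, the standard symmetric FastICA iteration that EFICA refines can be sketched as follows (tanh nonlinearity, whitened input assumed; this is the baseline method, not the EFICA refinement itself):

```python
import numpy as np

def fastica_symmetric(Z, n_iter=100, seed=0):
    """Symmetric FastICA on whitened data Z (components x samples),
    using g = tanh; returns an orthogonal demixing matrix W."""
    rng = np.random.default_rng(seed)
    d, n = Z.shape
    W = np.linalg.qr(rng.standard_normal((d, d)))[0]
    for _ in range(n_iter):
        Y = W @ Z
        g = np.tanh(Y)
        g_prime = 1.0 - g ** 2
        # fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, for all rows at once
        W = (g @ Z.T) / n - np.diag(g_prime.mean(axis=1)) @ W
        U, _, Vt = np.linalg.svd(W)       # symmetric orthogonalization
        W = U @ Vt
    return W
```

On whitened mixtures of super-Gaussian (e.g. Laplacian) sources the recovered components match the true sources up to permutation and sign.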
Bayesian Learning via Stochastic Gradient Langevin Dynamics
"... In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small minibatches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior ..."
Abstract

Cited by 50 (7 self)
 Add to MetaCart
(Show Context)
In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small minibatches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize. This seamless transition between optimization and Bayesian posterior sampling provides an inbuilt protection against overfitting. We also propose a practical method for Monte Carlo estimates of posterior statistics which monitors a "sampling threshold" and collects samples after it has been surpassed. We apply the method to three models: a mixture of Gaussians, logistic regression and ICA with natural gradients.
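The update rule itself is just a stochastic-gradient step plus injected Gaussian noise whose variance matches the step size (a sketch; here a full-data gradient of the log posterior stands in for the minibatch gradient the paper uses):

```python
import numpy as np

def sgld_step(theta, grad_log_post, eps, rng):
    """One stochastic gradient Langevin dynamics update with step size eps:
    the injected noise variance eps matches the gradient scaling eps/2."""
    noise = rng.normal(0.0, np.sqrt(eps), size=np.shape(theta))
    return theta + 0.5 * eps * grad_log_post(theta) + noise
```

With a fixed small step size the chain hovers near the target posterior; annealing eps toward zero removes the remaining discretization bias, which is the transition from optimization to sampling the abstract describes. For a standard normal target, grad_log_post is simply -theta.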
Bubbles: A Unifying Framework for Low-Level Statistical Properties of Natural Image Sequences
2003
"... This paper proposes a unifying framework for several models of the statistical structure of natural image sequences. The framework combines three properties: sparseness, temporal coherence, and energy correlations; these will be reviewed below. It leads to models where the joint activation of the li ..."
Abstract

Cited by 50 (7 self)
 Add to MetaCart
This paper proposes a unifying framework for several models of the statistical structure of natural image sequences. The framework combines three properties: sparseness, temporal coherence, and energy correlations; these will be reviewed below. It leads to models where the joint activation of the linear filters (simple cells) takes the form of "bubbles," which are regions of activity that are localized both in time and in space, space meaning the cortical surface or a grid on which the filters are arranged. The paper is organized as follows. First, we discuss the principal statistical properties of natural images investigated so far, and we examine how these can be used in the estimation of a linear image model (Section 2). Then we show how sparseness and temporal coherence can be combined in a single model, which is based on the concept of temporal bubbles, and attempt to demonstrate that this gives a better model of the outputs of Gabor-like linear filters than either of the criteria alone (Section 3). We extend the model to include topography as well, leading to the intuitive notion of spatiotemporal bubbles (Section 4). We also discuss the extensions of the framework to spatiotemporal receptive fields (Section 5). Finally, we discuss the utility of our model and its relation to other models (Section 6).
Denoising Source Separation
"... A new algorithmic framework called denoising source separation (DSS) is introduced. The main benefit of this framework is that it allows for easy development of new source separation algorithms which are optimised for specific problems. In this framework, source separation algorithms are constuct ..."
Abstract

Cited by 49 (7 self)
 Add to MetaCart
(Show Context)
A new algorithmic framework called denoising source separation (DSS) is introduced. The main benefit of this framework is that it allows for easy development of new source separation algorithms which are optimised for specific problems. In this framework, source separation algorithms are constructed around denoising procedures. The resulting algorithms can range from almost blind to highly specialised source separation algorithms. Both simple linear and more complex nonlinear or adaptive denoising schemes are considered. Some existing independent component analysis algorithms are reinterpreted within the DSS framework and new, robust blind source separation algorithms are suggested. Although DSS algorithms need not be explicitly based on objective functions, there is often an implicit objective function that is optimised. The exact relation between the denoising procedure and the objective function is derived and a useful approximation of the objective function is presented. In the experimental section, various DSS schemes are applied extensively to artificial data, to real magnetoencephalograms and to simulated CDMA mobile network signals. Finally, various extensions to the proposed DSS algorithms are considered. These include nonlinear observation mappings, hierarchical models and overcomplete, nonorthogonal feature spaces. With these extensions, DSS appears to have relevance to many existing models of neural information processing.
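The core DSS loop is a denoising-augmented power iteration on sphered (whitened) data (a one-unit sketch; `denoise` is any procedural denoiser encoding the source prior, here left generic):

```python
import numpy as np

def dss_one_unit(Z, denoise, n_iter=50, seed=0):
    """One-unit denoising source separation on sphered data Z
    (channels x samples): estimate, denoise, re-estimate, renormalize."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = w @ Z                  # current source estimate
        s_plus = denoise(s)        # denoising step encodes the source prior
        w = Z @ s_plus             # re-estimate the demixing vector
        w /= np.linalg.norm(w)
    return w, w @ Z
```

With a linear low-pass denoiser this reduces to a power iteration that extracts the slowest (most temporally coherent) source first, which illustrates the "almost blind to highly specialised" spectrum the abstract describes.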