Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces. (2000)

by A. Hyvärinen, P. O. Hoyer
Venue: Neural Computation
Results 1 - 10 of 201

FastISA: A fast fixed-point algorithm for independent subspace analysis

by Aapo Hyvärinen, Urs Köster
Cited by 651 (23 self)
Abstract not found

Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria

by Tuomas Virtanen - IEEE Trans. on Audio, Speech, and Language Processing, 2007
Cited by 189 (30 self)
Abstract—An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by using a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternately updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method achieves better separation quality than the previous algorithms. In particular, the temporal continuity criterion improved the detection of pitched musical sounds; the sparseness criterion did not produce significant improvements. Index Terms—Acoustic signal analysis, audio source separation, blind source separation, music, nonnegative matrix factorization, sparse coding, unsupervised learning.
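The cost described in this abstract is easy to state concretely: reconstruction error plus a temporal-continuity term (squared differences between adjacent gains) and a sparseness term (penalty on nonzero gains). Below is a minimal NumPy sketch. The weight names alpha and beta are illustrative rather than the paper's notation, and only the reconstruction term is minimized here (with the standard Lee-Seung multiplicative updates) while the full augmented cost is evaluated for monitoring; Virtanen's own algorithm derives modified multiplicative rules that descend the full cost.

```python
import numpy as np

def nmf_with_monitored_cost(X, n_components, n_iter=200,
                            alpha=0.1, beta=0.01, eps=1e-9, seed=0):
    """Factorize a magnitude spectrogram X (freq x frames) as X ~ W @ H.

    W holds fixed component spectra, H the time-varying gains.
    Only the reconstruction term is minimized (Lee-Seung updates);
    the continuity and sparseness terms are evaluated as defined in
    the abstract. alpha/beta are illustrative weights, not the paper's.
    """
    rng = np.random.default_rng(seed)
    F, T = X.shape
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # multiplicative gain update
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # multiplicative spectrum update
    recon = np.sum((X - W @ H) ** 2)               # reconstruction error
    continuity = np.sum(np.diff(H, axis=1) ** 2)   # squared adjacent-gain differences
    sparseness = np.sum(np.abs(H))                 # penalty on nonzero gains
    return W, H, recon + alpha * continuity + beta * sparseness
```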

Citation Context

...frequency line as a phase-invariant feature calculated in each frame. The factorization of the spectrogram can be seen as separation of phase-independent features into invariant feature subspaces [16]. Letting the magnitude or power spectrum in each frame be the observation, the separation can be done using basic ICA, as explained above. With this procedure the estimated gains of different component...
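As a toy illustration of the procedure this context describes (magnitude spectra of frames as observations, gains separated with plain ICA), here is a hedged sketch using SciPy and scikit-learn; the random stand-in signal, frame length, and component count are placeholders, not values from the paper.

```python
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(2 * fs)        # stand-in for a one-channel music signal

_, _, Z = stft(x, fs=fs, nperseg=512)  # complex spectrogram
mag = np.abs(Z)                        # magnitude spectrum: phase-invariant

# Each frame's magnitude spectrum is one observation; basic ICA then
# estimates the time-varying gains of the components.
ica = FastICA(n_components=5, random_state=0)
gains = ica.fit_transform(mag.T)       # shape: (frames, components)
```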

Representation learning: A review and new perspectives.

by Yoshua Bengio, Aaron Courville, Pascal Vincent - IEEE Trans. Pattern Analysis and Machine Intelligence, 2013
Cited by 173 (4 self)
Abstract—The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.

Citation Context

... the same irrespective of where a specific feature is located inside its pooling region. Empirically, the use of pooling seems to contribute significantly to improved classification accuracy in object classification tasks [133], [35], [36]. A successful variant of pooling connected to sparse coding is L2 pooling [103], [110], [122], for which the pool output is the square root of the possibly weighted sum of squares of filter outputs. Ideally, we would like to generalize feature pooling so as to learn what features should be pooled together, for example, as successfully done in several papers [100], [110], [122], [160], [53], [49], [76]. In this way, the pool output learns to be invariant to the variations captured by the span of the features pooled. 11.2.1 Patch-Based Training The simplest approach for learning a convolutional layer in an unsupervised fashion is patch-based training: simply feeding a generic unsupervised feature learning algorithm with local patches extracted at random positions of the inputs. The resulting feature extractor can then be swiped over the input to produce the convolutional feature maps. That map may be used as a new input for the next layer, and the opera...
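The L2 pooling rule quoted above (pool output = square root of the possibly weighted sum of squares of filter outputs) fits in a few lines; pool_index and weights are illustrative names, not notation from the review.

```python
import numpy as np

def l2_pool(filter_outputs, pool_index, weights=None):
    """L2 pooling: sqrt of the (optionally weighted) sum of squares
    of the filter outputs belonging to one pool."""
    z = np.asarray(filter_outputs, dtype=float)[list(pool_index)]
    w = np.ones_like(z) if weights is None else np.asarray(weights, dtype=float)
    return np.sqrt(np.sum(w * z ** 2))

# e.g. pooling filters 0-3 of a 10-filter response vector:
# l2_pool(np.random.randn(10), pool_index=range(4))
```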

Sparse deep belief net model for visual area V2

by Honglak Lee, Chaitanya Ekanadham, Andrew Y. Ng - Advances in Neural Information Processing Systems 20, 2008
Cited by 164 (19 self)
Abstract: Motivated in part by the hierarchical organization of the neocortex, a number of recently proposed algorithms have tried to learn hierarchical, or “deep,” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This thesis describes an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks described by Hinton et al. (2006). We learn two layers of representation in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented edge filters, similar to the Gabor functions known to model simple-cell receptive fields in area V1. Further, the second layer in our model encodes various combinations of the first-layer responses in the data. Specifically, it picks up both collinear (“contour”) features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex “corner” features matches well with the results from Ito & Komatsu’s study of neural responses to angular stimuli in area V2 of the macaque. This suggests that our sparse variant of deep belief networks holds promise for modeling higher-order features that are encoded in visual cortex. Conversely, one may also interpret the results reported here as suggesting that visual area V2 performs computations on its input similar to those performed in (sparse) deep belief networks. This plausible relationship generates some intriguing hypotheses about V2 computations. (This thesis is an extended version of an earlier paper by Honglak Lee, Chaitanya Ekanadham, and Andrew Ng titled “Sparse deep belief net model for visual area V2.”)
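The “sparse variant” of deep belief networks described here constrains hidden-unit activity. One common way to write such a constraint in this line of work is a regularizer that pulls each hidden unit's mean activation toward a small target rate; the sketch below is illustrative, and target and lam are hypothetical hyperparameter names.

```python
import numpy as np

def sparsity_penalty(hidden_probs, target=0.02, lam=1.0):
    """Penalty added to the layer's training objective: squared
    deviation of each hidden unit's mean activation (over a batch
    of inputs) from a low target firing rate."""
    mean_act = np.asarray(hidden_probs).mean(axis=0)   # (n_hidden,)
    return lam * np.sum((target - mean_act) ** 2)
```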

Citation Context

...dels [15, 6, 16] are able to learn features that are more complex than simple oriented bars. For example, hierarchical sparse models of natural images have accounted for complex cell receptive fields [17], topography [18, 6], collinearity and contour coding [19]. Other models, such as [20], have also been shown to give V1 complex cell-like properties. 2.2 Features in visual cortex area V2 It remains un...

A Fast Fixed-Point Algorithm for Independent Component Analysis of Complex Valued Signals

by Ella Bingham, Aapo Hyvärinen, 2000
Cited by 133 (1 self)
Separation of complex valued signals is a frequently arising problem in signal processing. For example, separation of convolutively mixed source signals involves computations on complex valued signals. In this article it is assumed that the original, complex valued source signals are mutually statistically independent, and the problem is solved by the independent component analysis (ICA) model. ICA is a statistical method for transforming an observed multidimensional random vector into components that are mutually as independent as possible. In this article, a fast fixed-point type algorithm that is capable of separating complex valued, linearly mixed source signals is presented and its computational efficiency is shown by simulations. Also, the local consistency of the estimator given by the algorithm is proved.
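For whitened, complex-valued observations the paper's one-unit fixed-point update has a compact form. The sketch below assumes whitened data and uses g(y) = 1/(a + y), one of the nonlinearities suggested for this setting; treat it as an illustration under those assumptions, not a reference implementation.

```python
import numpy as np

def complex_fastica_unit(X, n_iter=100, a=0.1, seed=0, tol=1e-9):
    """One-unit fixed-point update for complex ICA on whitened data
    X (dim x samples):
        w+ = E{x (w^H x)* g(|w^H x|^2)}
             - E{g(|w^H x|^2) + |w^H x|^2 g'(|w^H x|^2)} w
    followed by renormalization of w."""
    rng = np.random.default_rng(seed)
    d = X.shape[0]
    w = rng.standard_normal(d) + 1j * rng.standard_normal(d)
    w /= np.linalg.norm(w)
    g = lambda y: 1.0 / (a + y)
    dg = lambda y: -1.0 / (a + y) ** 2
    for _ in range(n_iter):
        wx = w.conj() @ X                   # w^H x for every sample
        y = np.abs(wx) ** 2
        w_new = (X * (wx.conj() * g(y))).mean(axis=1) - (g(y) + y * dg(y)).mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(np.abs(w_new.conj() @ w) - 1.0) < tol:   # direction converged
            return w_new
        w = w_new
    return w
```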

Learning Invariant Features through Topographic Filter Maps

by Koray Kavukcuoglu, Rob Fergus, Yann LeCun
Cited by 119 (20 self)
Several recently-proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier. The first stage is often composed of three main modules: (1) a bank of filters (often oriented edge detectors); (2) a non-linear transform, such as a point-wise squashing function, quantization, or normalization; (3) a spatial pooling operation which combines the outputs of similar filters over neighboring regions. We propose a method that automatically learns such feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together. The method automatically generates topographic maps of similar filters that extract features of orientations, scales, and positions. These similar filters are pooled together, producing locally-invariant outputs. The learned feature descriptors give results comparable to SIFT on image recognition tasks for which SIFT is well suited, and better results than SIFT on tasks for which SIFT is less well suited.
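The three-module pipeline enumerated in this abstract can be sketched directly. The version below is a generic single-image illustration with per-filter spatial pooling only, whereas the paper's contribution is to additionally learn which similar filters to pool together in a topographic map; the filter kernels, nonlinearity, and pool size are placeholders.

```python
import numpy as np
from scipy.signal import convolve2d

def extract_features(image, filters, pool=2):
    """(1) filter bank -> (2) point-wise squashing -> (3) spatial pooling."""
    maps = []
    for f in filters:
        r = convolve2d(image, f, mode="valid")   # (1) oriented-edge-style filter
        r = np.tanh(r)                           # (2) squashing nonlinearity
        h, w = (s - s % pool for s in r.shape)   # crop to a multiple of the pool
        r = r[:h, :w].reshape(h // pool, pool, w // pool, pool)
        maps.append(np.abs(r).max(axis=(1, 3)))  # (3) pool over neighborhoods
    return np.stack(maps)
```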

Citation Context

...d methods to learn pooled features in the context of computational models of the mammalian primary visual cortex. The idea relies on imposing sparsification criteria on small groups of filter outputs [10, 6, 8], which can be related to the Group Lasso method for regularization [27]. When the filters that are pooled together are organized in a regular array (1D or 2D), the filters form topographic maps in wh...
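The Group-Lasso connection mentioned here amounts to penalizing the sum of L2 norms over the groups of filter outputs being pooled, which drives whole groups (rather than individual filters) to zero; the group structure and names below are illustrative.

```python
import numpy as np

def group_sparsity_penalty(filter_outputs, groups):
    """Group-Lasso-style penalty: sum over groups of the L2 norm of
    the outputs in each group."""
    z = np.asarray(filter_outputs, dtype=float)
    return sum(np.linalg.norm(z[list(g)]) for g in groups)

# e.g. two pools of four filters each over an 8-filter response vector:
# group_sparsity_penalty(np.random.randn(8), [range(0, 4), range(4, 8)])
```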

A Two-Layer Sparse Coding Model Learns Simple and Complex Cell Receptive Fields and Topography From Natural Images

by Aapo Hyvärinen, Patrik O. Hoyer - Vision Research, 2001
Cited by 110 (16 self)
The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can also lead to the emergence of topography (columnar organization) and complex cell properties as well. These are obtained by maximizing the sparsenesses of locally pooled energies, which correspond to complex cell outputs. Thus we obtain a highly parsimonious model of how these properties of the visual cortex are adapted to the characteristics of the natural input.
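The principle summarized here (maximize the sparseness of locally pooled energies) can be written as an ISA-style objective: square the first-layer outputs, pool them locally, and apply a concave nonlinearity such as G(u) = -sqrt(u), maximized subject to a norm constraint on W. A hedged NumPy sketch follows, assuming whitened inputs; the pooling structure and names are illustrative.

```python
import numpy as np

def pooled_energy_objective(W, X, pools, eps=1e-9):
    """Sketch of the two-layer objective: simple-cell outputs s = W x
    are squared and pooled locally ("complex cell" energies), and
    sparseness is encouraged via G(u) = -sqrt(u) on each pooled
    energy. Assumes whitened X (dim x samples) and (near-)orthonormal
    rows of W; maximize the returned value with respect to W."""
    S = W @ X                          # simple-cell outputs
    E = S ** 2                         # energies
    total = 0.0
    for pool in pools:
        u = E[list(pool)].sum(axis=0)  # locally pooled energy per sample
        total += (-np.sqrt(u + eps)).sum()
    return total / X.shape[1]
```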

Document clustering using nonnegative matrix factorization.

by F. Shahnaz, M. Berry, P. Pauca, R. Plemmons - Information Processing and Management, 2006
Cited by 107 (8 self)
Abstract not found

Citation Context

...ing scheme, based on the study of neural networks has been suggested by Hoyer [6]. This scheme is applicable to the decomposition of datasets into independent feature subspaces by Hyvärinen and Hoyer [7]. The method proposed by Hoyer [6] has an important feature that enforces a statistical sparsity of the H matrix. As the sparsity of H increases, the basis vectors become more localized, i.e., the par...
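The "statistical sparsity of the H matrix" mentioned here is often quantified, in Hoyer's later work on sparse NMF, by a ratio of L1 to L2 norms; the sketch below shows that measure as one concrete way to express it, and is an assumption on my part rather than necessarily the formulation used in the cited [6].

```python
import numpy as np

def hoyer_sparseness(x, eps=1e-12):
    """Sparseness measure from Hoyer's sparse-NMF line of work:
    1 for a vector with a single nonzero entry, 0 for a vector with
    all entries equal. Applied row-wise to H, it quantifies how
    localized the component gains are."""
    x = np.abs(np.asarray(x, dtype=float))
    n = x.size
    l1, l2 = x.sum(), np.sqrt((x ** 2).sum()) + eps
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)
```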

Learning Optimized Features for Hierarchical Models of Invariant Object Recognition

by Heiko Wersing, Edgar Körner, 2002
Cited by 93 (28 self)
There is an ongoing debate over the capabilities of hierarchical neural feed-forward architectures for performing real-world invariant object recognition. Although a variety of hierarchical models exists, appropriate supervised and unsupervised learning methods are still an issue of intense research. We propose a feedforward model for recognition that shares components like weight-sharing, pooling stages, and competitive nonlinearities with earlier approaches, but focuses on new methods for learning optimal feature-detecting cells in intermediate stages of the hierarchical network.

Citation Context

...analysis (Bell & Sejnowski 1997). These cells perform the initial visual processing and are thus attributed to the initial stages in hierarchical processing. Extensions for complex cells (Hyvärinen & Hoyer 2000; Hoyer & Hyvärinen 2002) and color and stereo coding cells (Hoyer & Hyvärinen 2000) were shown. Recently, also principles of temporal stability or slowness have been proposed and applied to the learn...

A Multi-Layer Sparse Coding Network Learns Contour Coding From Natural Images

by Patrik O. Hoyer, Aapo Hyvärinen, 2002
Cited by 61 (10 self)
Abstract not found