Results 1  10
of
584
Linear spatial pyramid matching using sparse coding for image classification
 in IEEE Conference on Computer Vision and Pattern Recognition(CVPR
, 2009
"... Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algo ..."
Abstract

Cited by 497 (21 self)
 Add to MetaCart
(Show Context)
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multiscale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to stateoftheart performance on several benchmarks by using a single type of descriptors. 1.
From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
, 2007
"... A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combin ..."
Abstract

Cited by 427 (36 self)
 Add to MetaCart
(Show Context)
A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution? These questions have been answered positively and constructively in recent years, exposing a wide variety of surprising phenomena; in particular, the existence of easilyverifiable conditions under which optimallysparse solutions can be found by concrete, effective computational methods. Such theoretical results inspire a bold perspective on some important practical problems in signal and image processing. Several wellknown signal and image processing problems can be cast as demanding solutions of undetermined systems of equations. Such problems have previously seemed, to many, intractable. There is considerable evidence that these problems often have sparse solutions. Hence, advances in finding sparse solutions to underdetermined systems energizes research on such signal and image processing problems – to striking effect. In this paper we review the theoretical results on sparse solutions of linear systems, empirical
Image denoising by sparse 3D transformdomain collaborative filtering
 IEEE TRANS. IMAGE PROCESS
, 2007
"... We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call “groups.” Collaborative filtering is a special procedure d ..."
Abstract

Cited by 424 (32 self)
 Add to MetaCart
We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call “groups.” Collaborative filtering is a special procedure developed to deal with these 3D groups. We realize it using the three successive steps: 3D transformation of a group, shrinkage of the transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to colorimage denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves stateoftheart denoising performance in terms of both peak signaltonoise ratio and subjective visual quality.
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to ad ..."
Abstract

Cited by 330 (31 self)
 Add to MetaCart
(Show Context)
Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to adapt it to specific data. Variations of this problem include dictionary learning in signal processing, nonnegative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data demonstrating that it leads to stateoftheart performance in terms of speed and optimization for both small and large data sets.
Extracting and Composing Robust Features with Denoising Autoencoders
, 2008
"... Previous work has shown that the difficulties in learning deep generative or discriminative models can be overcome by an initial unsupervised learning step that maps inputs to useful intermediate representations. We introduce and motivate a new training principle for unsupervised learning of a repre ..."
Abstract

Cited by 251 (32 self)
 Add to MetaCart
Previous work has shown that the difficulties in learning deep generative or discriminative models can be overcome by an initial unsupervised learning step that maps inputs to useful intermediate representations. We introduce and motivate a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern. This approach can be used to train autoencoders, and these denoising autoencoders can be stacked to initialize deep architectures. The algorithm can be motivated from a manifold learning and information theoretic perspective or from a generative model perspective. Comparative experiments clearly show the surprising advantage of corrupting the input of autoencoders on a pattern classification benchmark suite.
Sparse representation for color image restoration
 the IEEE Trans. on Image Processing
, 2007
"... Sparse representations of signals have drawn considerable interest in recent years. The assumption that natural signals, such as images, admit a sparse decomposition over a redundant dictionary leads to efficient algorithms for handling such sources of data. In particular, the design of well adapted ..."
Abstract

Cited by 219 (30 self)
 Add to MetaCart
(Show Context)
Sparse representations of signals have drawn considerable interest in recent years. The assumption that natural signals, such as images, admit a sparse decomposition over a redundant dictionary leads to efficient algorithms for handling such sources of data. In particular, the design of well adapted dictionaries for images has been a major challenge. The KSVD has been recently proposed for this task [1], and shown to perform very well for various grayscale image processing tasks. In this paper we address the problem of learning dictionaries for color images and extend the KSVDbased grayscale image denoising algorithm that appears in [2]. This work puts forward ways for handling nonhomogeneous noise and missing information, paving the way to stateoftheart results in applications such as color image denoising, demosaicing, and inpainting, as demonstrated in this paper. EDICS Category: COLCOLR (Color processing) I.
Image SuperResolution via Sparse Representation
"... This paper presents a new approach to singleimage superresolution, based on sparse signal representation. Research on image statistics suggests that image patches can be wellrepresented as a sparse linear combination of elements from an appropriately chosen overcomplete dictionary. Inspired by th ..."
Abstract

Cited by 194 (9 self)
 Add to MetaCart
This paper presents a new approach to singleimage superresolution, based on sparse signal representation. Research on image statistics suggests that image patches can be wellrepresented as a sparse linear combination of elements from an appropriately chosen overcomplete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the lowresolution input, and then use the coefficients of this representation to generate the highresolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low resolution and high resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches which simply sample a large amount of image patch pairs, reducing the computation cost substantially. The effectiveness of such a sparsity prior is demonstrated for general image superresolution and also for the special case of face hallucination. In both cases, our algorithm can generate highresolution images that are competitive or even superior in quality to images produced by other similar SR methods, but with faster processing speed.
Sparse Representation For Computer Vision and Pattern Recognition
, 2009
"... Techniques from sparse signal representation are beginning to see significant impact in computer vision, often on nontraditional applications where the goal is not just to obtain a compact highfidelity representation of the observed signal, but also to extract semantic information. The choice of ..."
Abstract

Cited by 146 (9 self)
 Add to MetaCart
(Show Context)
Techniques from sparse signal representation are beginning to see significant impact in computer vision, often on nontraditional applications where the goal is not just to obtain a compact highfidelity representation of the observed signal, but also to extract semantic information. The choice of dictionary plays a key role in bridging this gap: unconventional dictionaries consisting of, or learned from, the training samples themselves provide the key to obtaining stateoftheart results and to attaching semantic meaning to sparse signal representations. Understanding the good performance of such unconventional dictionaries in turn demands new algorithmic and analytical techniques. This review paper highlights a few representative examples of how the interaction between sparse signal representation and computer vision can enrich both fields, and raises a number of open questions for further study.
Image superresolution as sparse representation of raw image patches
, 2008
"... This paper addresses the problem of generating a superresolution (SR) image from a single lowresolution input image. We approach this problem from the perspective of compressed sensing. The lowresolution image is viewed as downsampled version of a highresolution image, whose patches are assumed t ..."
Abstract

Cited by 135 (19 self)
 Add to MetaCart
(Show Context)
This paper addresses the problem of generating a superresolution (SR) image from a single lowresolution input image. We approach this problem from the perspective of compressed sensing. The lowresolution image is viewed as downsampled version of a highresolution image, whose patches are assumed to have a sparse representation with respect to an overcomplete dictionary of prototype signalatoms. The principle of compressed sensing ensures that under mild conditions, the sparse representation can be correctly recovered from the downsampled signal. We will demonstrate the effectiveness of sparsity as a prior for regularizing the otherwise illposed superresolution problem. We further show that a small set of randomly chosen raw patches from training images of similar statistical nature to the input image generally serve as a good dictionary, in the sense that the computed representation is sparse and the recovered highresolution image is competitive or even superior in quality to images produced by other SR methods.