Results 1–10 of 92
Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity
, 2010
Cited by 55 (8 self)
A general framework for solving image inverse problems is introduced in this paper. The approach is based on Gaussian mixture models, estimated via a computationally efficient MAP-EM algorithm. A dual mathematical interpretation of the proposed framework with structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared to traditional sparse inverse problem techniques. This interpretation also suggests an effective dictionary-motivated initialization for the MAP-EM algorithm. We demonstrate that in a number of image inverse problems, including inpainting, zooming, and deblurring, the same algorithm produces results that are equal to, often significantly better than, or within a very small margin of the best published ones, at a lower computational cost.
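As a rough illustration of the piecewise linear estimate this abstract describes (a sketch under assumed names and a simplified setup, not the authors' implementation): under a Gaussian mixture prior, each component yields a linear Wiener-style MAP estimate of x from y = Ax + n, and the component with the best evidence for y is selected per signal.

```python
import numpy as np

# Hypothetical sketch (names and setup assumed, not the paper's code):
# under a Gaussian mixture prior on x, the MAP estimate of x from
# y = A x + n, n ~ N(0, sigma2 I), is linear (Wiener-style) within each
# component; the component with the highest evidence for y is selected,
# making the overall estimator piecewise linear.
def gmm_map_estimate(y, A, mus, Sigmas, sigma2):
    best_x, best_score = None, -np.inf
    for mu, Sig in zip(mus, Sigmas):
        S = A @ Sig @ A.T + sigma2 * np.eye(A.shape[0])  # covariance of y
        x_hat = mu + Sig @ A.T @ np.linalg.solve(S, y - A @ mu)  # Wiener filter
        r = y - A @ mu
        # Gaussian log-evidence of y under this component (up to a constant)
        score = -0.5 * (r @ np.linalg.solve(S, r) + np.linalg.slogdet(S)[1])
        if score > best_score:
            best_x, best_score = x_hat, score
    return best_x
```

Selecting the best component per signal is what makes the overall estimator piecewise linear rather than globally linear.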
CHiLasso: A collaborative hierarchical sparse modeling framework
, 2010
Cited by 36 (6 self)
Sparse modeling is a powerful framework for data analysis and processing. Traditionally, encoding in this framework is performed by solving an ℓ1-regularized linear regression problem, commonly referred to as Lasso or basis pursuit. In this work we combine the sparsity-inducing property of the Lasso model at the individual feature level with the block-sparsity property of the Group Lasso model, where sparse groups of features are jointly encoded, obtaining a hierarchically structured sparsity pattern. This results in the Hierarchical Lasso (HiLasso), which shows important practical modeling advantages. We then extend this approach to the collaborative case, where a set of simultaneously coded signals shares the same sparsity pattern at the higher (group) level, but not necessarily at the lower (within-group) level, obtaining the collaborative HiLasso model (CHiLasso). Such signals then share the same active groups, or classes, but not necessarily the same active set. This model is very well suited for applications such as source identification and separation. An efficient optimization procedure, which guarantees convergence to the global optimum, is developed for these new models. The presentation of the new framework and optimization approach is complemented with experimental examples and theoretical results regarding recovery guarantees for the proposed models.
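The hierarchical regularizer described above combines an ℓ1 term with a group ℓ2 term, lam1·||x||_1 + lam2·Σ_g ||x_g||_2. A minimal sketch of its proximal operator (the standard sparse-group decomposition: element-wise soft-thresholding followed by group shrinkage; function names here are assumptions, not the paper's code):

```python
import numpy as np

# Hypothetical sketch of the HiLasso-style regularizer
# lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2, whose proximal operator is
# element-wise soft-thresholding followed by group-wise shrinkage
# (the standard sparse-group decomposition; names here are assumed).
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def hilasso_prox(x, groups, lam1, lam2):
    z = soft_threshold(x, lam1)           # within-group (element) sparsity
    out = np.zeros_like(z)
    for g in groups:                      # group-level sparsity
        norm_g = np.linalg.norm(z[g])
        if norm_g > lam2:
            out[g] = z[g] * (1.0 - lam2 / norm_g)
    return out
```

Groups whose shrunken norm falls below lam2 are zeroed entirely, which is what produces the group-level sparsity pattern on top of the element-level one.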
Beta-negative binomial process and Poisson factor analysis
 In AISTATS
, 2012
Cited by 22 (9 self)
A beta-negative binomial (BNB) process is proposed, leading to a beta-gamma-Poisson process, which may be viewed as a “multi-scoop” generalization of the beta-Bernoulli process. The BNB process is augmented into a beta-gamma-gamma-Poisson hierarchical structure, and applied as a nonparametric Bayesian prior for an infinite Poisson factor analysis model. A finite approximation for the beta process Lévy random measure is constructed for convenient implementation. Efficient MCMC computations are performed with data augmentation and marginalization techniques. Encouraging results are shown on document count matrix factorization.
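A minimal sketch of the gamma-Poisson augmentation underlying this construction: mixing a Poisson rate over a gamma distribution yields negative binomial counts (parameter values below are illustrative only):

```python
import numpy as np

# Minimal sketch of the gamma-Poisson augmentation underlying the
# beta-negative binomial construction: mixing a Poisson rate over a
# gamma distribution yields negative binomial counts. Parameter values
# below are illustrative only.
rng = np.random.default_rng(0)
rates = rng.gamma(shape=2.0, scale=1.0, size=100_000)  # gamma-distributed rates
counts = rng.poisson(rates)                             # negative binomial counts
# E[counts] = shape * scale = 2.0
```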
Learning sparse codes for hyperspectral imagery
 IEEE Journal of Selected Topics in Signal Processing
, 2011
Cited by 18 (2 self)
The spectral features in hyperspectral imagery (HSI) contain significant structure that, if properly characterized, could enable more efficient data acquisition and improved data analysis. Because most pixels contain reflectances of just a few materials, we propose that a sparse coding model is well matched to HSI data. Sparsity models consider each pixel as a combination of just a few elements from a larger dictionary, and this approach has proven effective in a wide range of applications. Furthermore, previous work has shown that optimal sparse coding dictionaries can be learned from a dataset with no other a priori information (in contrast to many HSI “endmember” discovery algorithms that assume the presence of pure spectra or side information). We modified an existing unsupervised learning approach and applied it to HSI data (with significant ground truth labeling) to learn an optimal sparse coding dictionary. Using this learned dictionary, we demonstrate three main findings: i) the sparse coding model learns spectral signatures of materials in the scene and locally approximates nonlinear manifolds for individual materials, ii) this learned dictionary can be used to infer HSI-resolution data with very high accuracy from simulated imagery collected at multispectral-level resolution, and iii) this learned dictionary improves the performance of a supervised classification algorithm, both in terms of the classifier complexity and generalization from very small training sets.
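Sparse coding of a pixel spectrum against a given dictionary can be sketched with orthogonal matching pursuit, a standard greedy coder used here purely for illustration (the paper's dictionary-learning procedure is not reproduced, and all names are assumptions):

```python
import numpy as np

# Sketch of sparse coding a single pixel spectrum with orthogonal matching
# pursuit (a standard greedy sparse coder, shown for illustration; the
# paper's dictionary-learning procedure is not reproduced).
def omp(y, D, k):
    residual, idx = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # most correlated atom
        idx.append(j)
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        residual = y - D[:, idx] @ coef             # re-fit on chosen atoms
    x = np.zeros(D.shape[1])
    x[idx] = coef
    return x
```

With k much smaller than the number of bands, the code x is the sparse representation of the pixel: only a few dictionary atoms are active.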
Dictionary Learning for Noisy and Incomplete Hyperspectral Images
, 2011
Cited by 14 (4 self)
We consider analysis of noisy and incomplete hyperspectral imagery, with the objective of removing the noise and inferring the missing data. The noise statistics may be wavelength-dependent, and the fraction of data missing (at random) may be substantial, including potentially entire bands, offering the potential to significantly reduce the quantity of data that needs to be measured. To achieve this objective, the imagery is divided into contiguous three-dimensional (3D) spatio-spectral blocks, of spatial dimension much less than the image dimension. It is assumed that each such 3D block may be represented as a linear combination of dictionary elements of the same dimension, plus noise, and the dictionary elements are learned in situ based on the observed data (no a priori training). The number of dictionary elements needed for representation of any particular block is typically small relative to the block dimensions, and all the image blocks are processed jointly (“collaboratively”) to infer the underlying dictionary. We address dictionary learning from a Bayesian perspective, considering two distinct means of imposing sparse dictionary usage. These models allow inference of the number of dictionary elements needed as well as the underlying wavelength-dependent noise statistics. It is demonstrated that drawing the dictionary elements from a Gaussian process prior, imposing structure on the wavelength dependence of the dictionary elements, yields significant advantages relative to the more conventional approach of using an i.i.d. Gaussian prior for the dictionary elements; this advantage is particularly evident in the presence of noise. The framework is demonstrated by processing hyperspectral imagery with a significant number of voxels missing uniformly at random, with imagery at specific wavelengths missing entirely, and in the presence of substantial additive noise.
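The contrast between the two dictionary priors can be sketched by drawing a single atom from each: a Gaussian process over wavelength (squared-exponential kernel, with an illustrative length scale of 0.1; all numbers are assumptions) produces an atom that varies smoothly across bands, while an i.i.d. Gaussian prior does not.

```python
import numpy as np

# Sketch of the two dictionary priors compared above: an atom drawn from a
# Gaussian process over wavelength (squared-exponential kernel, illustrative
# length scale 0.1) is smooth across bands, while an i.i.d. Gaussian atom
# is not. All numbers here are assumptions for illustration.
wavelengths = np.linspace(0.0, 1.0, 50)
diff = wavelengths[:, None] - wavelengths[None, :]
K = np.exp(-0.5 * diff**2 / 0.1**2)                 # squared-exponential kernel
rng = np.random.default_rng(0)
gp_atom = rng.multivariate_normal(np.zeros(50), K + 1e-8 * np.eye(50))
iid_atom = rng.standard_normal(50)                  # i.i.d. Gaussian baseline
```

The smoothness of the GP atom across adjacent wavelengths is the structure that, per the abstract, helps most in the presence of noise.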
Online Group-Structured Dictionary Learning
Cited by 14 (3 self)
We develop a dictionary learning method which is (i) online, (ii) enables overlapping group structures with (iii) non-convex sparsity-inducing regularization, and (iv) handles the partially observable case. Structured sparsity and the related group norms have recently gained widespread attention in group-sparsity regularized problems in the case when the dictionary is assumed to be known and fixed. However, when the dictionary also needs to be learned, the problem is much more difficult. Only a few methods have been proposed to solve this problem, and they can handle at most two of these four desirable properties. To the best of our knowledge, our proposed method is the first one that possesses all of these properties. We investigate several interesting special cases of our framework, such as online structured sparse non-negative matrix factorization, and demonstrate the efficiency of our algorithm with several numerical experiments.
Sparse Signal Recovery and Acquisition with Graphical Models: A review of a broad set of sparse models, analysis tools, and recovery algorithms within the graphical models formalism
, 2010
Cited by 14 (1 self)
Many applications in digital signal processing, machine learning, and communications feature a linear regression problem in which unknown data points, hidden variables, or code words are projected into a lower-dimensional space via y = Φx + n. (1) In the signal processing context, we refer to x ∈ R^N as the signal, y ∈ R^M as the measurements with M < N, Φ ∈ R^{M×N} as the measurement matrix, and n ∈ R^M as the noise. The measurement matrix Φ is a matrix with random entries in data streaming, an overcomplete dictionary of features in sparse Bayesian learning, or a code matrix in communications [1]–[3]. Extracting x from y in (1) is ill posed in general, since M < N and the measurement matrix Φ hence has a nontrivial null space; given any vector v in this null space, x + v defines a solution that produces the same observations y. Additional information is therefore necessary to distinguish the true x among the infinitely many possible solutions [1], [2], [4], [5]. It is now well known that sparse …
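The identifiability issue in (1) is easy to demonstrate numerically (dimensions below are illustrative): with M < N, Φ has a nontrivial null space, so adding any null-space vector to x leaves the measurements unchanged.

```python
import numpy as np

# Sketch of the identifiability issue in (1): with M < N, Phi has a
# nontrivial null space, so adding any null-space vector to x leaves the
# measurements unchanged. Dimensions here are illustrative.
rng = np.random.default_rng(1)
Phi = rng.standard_normal((3, 6))                    # M = 3 < N = 6
x_true = np.array([0.0, 2.0, 0.0, 0.0, -1.0, 0.0])  # sparse signal
y = Phi @ x_true                                     # noiseless measurements

_, _, Vt = np.linalg.svd(Phi)                        # last rows of Vt span null(Phi)
v = Vt[-1]
assert np.allclose(Phi @ (x_true + v), y)            # same observations
```

This is why the extra structure the article surveys, sparsity in particular, is needed to pick out the true x.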
Greedy Dictionary Selection for Sparse Representation
Cited by 13 (2 self)
We discuss how to construct a dictionary by selecting its columns from multiple candidate bases to allow sparse representation of signals. By sparse representation, we mean that only a few dictionary elements, compared to the ambient signal dimension, can be used to well approximate the signals. We formulate both the selection of the dictionary columns and the sparse representation of signals as a joint combinatorial optimization problem. The proposed combinatorial objective maximizes variance reduction over the set of signals by constraining the size of the dictionary as well as the number of dictionary columns that can be used to represent each signal. We show that if the columns of the candidate bases are incoherent, our objective function satisfies approximate submodularity. We exploit this property to develop efficient greedy algorithms with well-characterized theoretical performance. Applications of dictionary selection include denoising, inpainting, and compressive sensing. We evaluate our approach to reconstruct dictionaries from sparse samples, and also apply it to an image inpainting problem.
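The greedy strategy can be illustrated with a brute-force sketch (not the paper's submodularity-based algorithm; names are assumptions): at each step, add the candidate column that most reduces the total residual energy of the signal set.

```python
import numpy as np

# Hypothetical sketch of greedy dictionary-column selection: at each step,
# add the candidate column that most reduces the total residual energy
# (variance) of the signal set. This is a brute-force illustration, not the
# paper's submodularity-based algorithm.
def greedy_select(C, Y, k):
    """C: candidate columns (d x p); Y: signals (d x n); pick k columns."""
    chosen = []
    for _ in range(k):
        best_j, best_res = None, np.inf
        for j in range(C.shape[1]):
            if j in chosen:
                continue
            D = C[:, chosen + [j]]
            coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
            res = np.linalg.norm(Y - D @ coef) ** 2  # remaining signal energy
            if res < best_res:
                best_j, best_res = j, res
        chosen.append(best_j)
    return chosen
```

Approximate submodularity of the variance-reduction objective is what lets such a greedy scheme come with performance guarantees.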
Universal MAP estimation in compressed sensing
 In Proc. Allerton Conf. Communication, Control, and Computing
, 2011
Cited by 11 (6 self)
We study the compressed sensing (CS) estimation problem where an input is measured via a linear matrix multiplication under additive noise. While this setup usually assumes sparsity or compressibility in the observed signal during recovery, the signal structure that can be leveraged is often not known a priori. In this paper, we consider universal CS recovery, where the statistics of a stationary ergodic signal source are estimated simultaneously with the signal itself. We provide initial theoretical, algorithmic, and experimental evidence based on maximum a posteriori (MAP) estimation that shows the promise of universality in CS, particularly for low-complexity sources that do not exhibit standard sparsity or compressibility.
Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning
Cited by 11 (0 self)
We introduce a variational Bayesian inference algorithm which can be widely applied to sparse linear models. The algorithm is based on the spike-and-slab prior which, from a Bayesian perspective, is the gold standard for sparse inference. We apply the method to a general multi-task and multiple kernel learning model in which a common set of Gaussian process functions is linearly combined with task-specific sparse weights, thus inducing relations between the tasks. This model unifies several sparse linear models, such as generalized linear models, sparse factor analysis, and matrix factorization with missing values, so that the variational algorithm can be applied to all these cases. We demonstrate our approach in multi-output Gaussian process regression, multi-class classification, image processing applications, and collaborative filtering.
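A minimal sketch of the spike-and-slab prior itself (parameters illustrative): each weight is a Bernoulli "spike" indicator times a Gaussian "slab" value, so a fraction (1 − π) of the weights is exactly zero.

```python
import numpy as np

# Minimal sketch of the spike-and-slab prior: each weight is a Bernoulli
# "spike" indicator times a Gaussian "slab" value, so a fraction (1 - pi)
# of the weights is exactly zero. Parameters are illustrative.
def sample_spike_slab(n, pi, sigma, rng):
    spike = rng.random(n) < pi          # inclusion indicators
    slab = rng.normal(0.0, sigma, n)    # Gaussian values
    return spike * slab

rng = np.random.default_rng(0)
w = sample_spike_slab(10_000, 0.2, 1.0, rng)
```

Exact zeros, rather than merely small values as under an ℓ1-type prior, are why spike-and-slab is regarded as the gold standard for sparse inference.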