Results 1  10
of
55
Bayesian Compressive Sensing
, 2007
"... The data of interest are assumed to be represented as Ndimensional real vectors, and these vectors are compressible in some linear basis B, implying that the signal can be reconstructed accurately using only a small number M ≪ N of basisfunction coefficients associated with B. Compressive sensing ..."
Abstract

Cited by 327 (24 self)
 Add to MetaCart
(Show Context)
The data of interest are assumed to be represented as Ndimensional real vectors, and these vectors are compressible in some linear basis B, implying that the signal can be reconstructed accurately using only a small number M ≪ N of basisfunction coefficients associated with B. Compressive sensing is a framework whereby one does not measure one of the aforementioned Ndimensional signals directly, but rather a set of related measurements, with the new measurements a linear combination of the original underlying Ndimensional signal. The number of required compressivesensing measurements is typically much smaller than N, offering the potential to simplify the sensing system. Let f denote the unknown underlying Ndimensional signal, and g a vector of compressivesensing measurements, then one may approximate f accurately by utilizing knowledge of the (underdetermined) linear relationship between f and g, in addition to knowledge of the fact that f is compressible in B. In this paper we employ a Bayesian formalism for estimating the underlying signal f based on compressivesensing measurements g. The proposed framework has the following properties: (i) in addition to estimating the underlying signal f, “error bars ” are also estimated, these giving a measure of confidence in the inverted signal; (ii) using knowledge of the error bars, a principled means is provided for determining when a sufficient
Sparse Bayesian learning for basis selection
 IEEE Transactions on Signal Processing
, 2004
"... Abstract—Sparse Bayesian learning (SBL) and specifically relevance vector machines have received much attention in the machine learning literature as a means of achieving parsimonious representations in the context of regression and classification. The methodology relies on a parameterized prior tha ..."
Abstract

Cited by 144 (9 self)
 Add to MetaCart
(Show Context)
Abstract—Sparse Bayesian learning (SBL) and specifically relevance vector machines have received much attention in the machine learning literature as a means of achieving parsimonious representations in the context of regression and classification. The methodology relies on a parameterized prior that encourages models with few nonzero weights. In this paper, we adapt SBL to the signal processing problem of basis selection from overcomplete dictionaries, proving several results about the SBL cost function that elucidate its general behavior and provide solid theoretical justification for this application. Specifically, we have shown that SBL retains a desirable property of the 0norm diversity measure (i.e., the global minimum is achieved at the maximally sparse solution) while often possessing a more limited constellation of local minima. We have also demonstrated that the local minima that do exist are achieved at sparse solutions. Later, we provide a novel interpretation of SBL that gives us valuable insight into why it is successful in producing sparse representations. Finally, we include simulation studies comparing sparse Bayesian learning with Basis Pursuit and the more recent FOCal Underdetermined System Solver (FOCUSS) class of basis selection algorithms. These results indicate that our theoretical insights translate directly into improved performance. Index Terms—Basis selection, diversity measures, linear inverse problems, sparse Bayesian learning, sparse representations. I.
Fast Marginal Likelihood Maximisation for Sparse Bayesian Models
 Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics
, 2003
"... The 'sparse Bayesian' modelling approach, as exemplified by the 'relevance vector machine ', enables sparse classification and regression functions to be obtained by linearlyweighting a small nmnber of fixed basis functions from a large dictionary of potential candidates. S ..."
Abstract

Cited by 115 (0 self)
 Add to MetaCart
The 'sparse Bayesian' modelling approach, as exemplified by the 'relevance vector machine ', enables sparse classification and regression functions to be obtained by linearlyweighting a small nmnber of fixed basis functions from a large dictionary of potential candidates. Such a model conveys a nmnber of advantages over the related and very popular 'support vector machine', but the necessary 'training' procedure optimisation of the marginal likelihood function is typically much slower. We describe a new and highly accelerated algorithm which exploits recentlyelucidated properties of the marginal likelihood function to enable maximisation via a principled and efficient sequential addition and deletion of candidate basis functions.
Bayesian inference and optimal design in the sparse linear model
 Workshop on Artificial Intelligence and Statistics
"... The linear model with sparsityfavouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of ..."
Abstract

Cited by 110 (12 self)
 Add to MetaCart
The linear model with sparsityfavouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of Bayesian optimal design (or experiment planning), for which accurate estimates of uncertainty are essential. To this end, we employ expectation propagation approximate inference for the linear model with Laplace prior, giving new insight into numerical stability properties and proposing a robust algorithm. We also show how to estimate model hyperparameters by empirical Bayesian maximisation of the marginal likelihood, and propose ideas in order to scale up the method to very large underdetermined problems. We demonstrate the versatility of our framework on the application of gene regulatory network identification from microarray expression data, where both the Laplace prior and the active experimental design approach are shown to result in significant improvements. We also address the problem of sparse coding of natural images, and show how our framework can be used for compressive sensing tasks. Part of this work appeared in Seeger et al. (2007b). The gene network identification application appears in Steinke et al. (2007).
The generalized LASSO
 IEEE Transactions on Neural Networks
"... Abstract—In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a techn ..."
Abstract

Cited by 72 (0 self)
 Add to MetaCart
(Show Context)
Abstract—In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a technical nature: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper, we present a different class of kernel regressors that effectively overcome the above problems. We call this approach generalized LASSO regression. It has a clear probabilistic interpretation, can handle learning sets that are corrupted by outliers, produces extremely sparse solutions, and is capable of dealing with largescale problems. For regression functionals which can be modeled as iteratively reweighted leastsquares (IRLS) problems, we present a highly efficient algorithm with guaranteed global convergence. This defies a unique framework for sparse regression models in the very rich class of IRLS models, including various types of robust regression models and logistic regression. Performance studies for many standard benchmark datasets effectively demonstrate the advantages of this model over related approaches. Index Terms—Kernel regression, probabilistic interpretation, robust loss functions, sparisity, support vector machines (SVMs). tion by the following expansion, employing a Mercer kernel In the SVM regression setting, the approximation quality is measured in terms of aninsensitive loss function. Deviations smaller than a predefined value produce no costs at all, while larger deviations are penalized linearly for for. Implicit with this loss function is a model of noise in which the data are corrupted by uniform noise on the interval, cf. the discussion in Section IIIC. The model complexity is controlled by: 1) choosing an adequate kernel function, i.e., an adequate mapping from the input space into some feature space
Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning
 IEEE J. Sel. Topics Signal Process
, 2011
"... Abstract — We address the sparse signal recovery problem in the context of multiple measurement vectors (MMV) when elements in each nonzero row of the solution matrix are temporally correlated. Existing algorithms do not consider such temporal correlation and thus their performance degrades signific ..."
Abstract

Cited by 54 (14 self)
 Add to MetaCart
Abstract — We address the sparse signal recovery problem in the context of multiple measurement vectors (MMV) when elements in each nonzero row of the solution matrix are temporally correlated. Existing algorithms do not consider such temporal correlation and thus their performance degrades significantly with the correlation. In this work, we propose a block sparse Bayesian learning framework which models the temporal correlation. We derive two sparse Bayesian learning (SBL) algorithms, which have superior recovery performance compared to existing algorithms, especially in the presence of high temporal correlation. Furthermore, our algorithms are better at handling highly underdetermined problems and require less rowsparsity on the solution matrix. We also provide analysis of the global and local minima of their cost function, and show that the SBL cost function has the very desirable property that the global minimum is at the sparsest solution to the MMV problem. Extensive experiments also provide some interesting results that motivate future theoretical research on the MMV model.
Predictive Automatic Relevance Determination by Expectation Propagation
 in Proceedings of the 21st International Conference on Machine Learning
, 2004
"... In many realworld classification problems the input contains a large number of potentially irrelevant features. This paper proposes a new Bayesian framework for determining the relevance of input features. This approach extends one of the most successful Bayesian methods for feature selection and ..."
Abstract

Cited by 54 (13 self)
 Add to MetaCart
(Show Context)
In many realworld classification problems the input contains a large number of potentially irrelevant features. This paper proposes a new Bayesian framework for determining the relevance of input features. This approach extends one of the most successful Bayesian methods for feature selection and sparse learning, known as Automatic Relevance Determination (ARD). ARD finds the relevance of features by optimizing the model marginal likelihood, also known as the evidence. We show that this can lead to overfitting. To address this problem, we propose Predictive ARD based on estimating the predictive performance of the classifier. While the actual leaveoneout predictive performance is generally very costly to compute, the expectation propagation (EP) algorithm proposed by Minka provides an estimate of this predictive performance as a sideeffect of its iterations. We exploit this in our algorithm to do feature selection, and to select data points in a sparse Bayesian kernel classifier. Moreover, we provide two other improvements to previous algorithms, by replacing Laplace’s approximation with the generally more accurate EP, and by incorporating the fast optimization algorithm proposed by Faul and Tipping. Our experiments show that our method based on the EP estimate of predictive performance is more accurate on test data than relevance determination by optimizing the evidence.
Sparse probabilistic projections
"... We present a generative model for performing sparse probabilistic projections, which includes sparse principal component analysis and sparse canonical correlation analysis as special cases. Sparsity is enforced by means of automatic relevance determination or by imposing appropriate prior distributi ..."
Abstract

Cited by 41 (6 self)
 Add to MetaCart
(Show Context)
We present a generative model for performing sparse probabilistic projections, which includes sparse principal component analysis and sparse canonical correlation analysis as special cases. Sparsity is enforced by means of automatic relevance determination or by imposing appropriate prior distributions, such as generalised hyperbolic distributions. We derive a variational ExpectationMaximisation algorithm for the estimation of the hyperparameters and show that our novel probabilistic approach compares favourably to existing techniques. We illustrate how the proposed method can be applied in the context of cryptoanalysis as a preprocessing tool for the construction of template attacks. 1
Gene selection in cancer classification using sparse logistic regression with bayesian regularization
 Bioinformatics
, 2006
"... doi:10.1093/bioinformatics/btl386 ..."
Perspectives on sparse Bayesian learning
 Advances in Neural Information Processing Systems 16
, 2004
"... Recently, relevance vector machines (RVM) have been fashioned from a sparse Bayesian learning (SBL) framework to perform supervised learning using a weight prior that encourages sparsity of representation. The methodology incorporates an additional set of hyperparameters governing the prior, one for ..."
Abstract

Cited by 30 (3 self)
 Add to MetaCart
(Show Context)
Recently, relevance vector machines (RVM) have been fashioned from a sparse Bayesian learning (SBL) framework to perform supervised learning using a weight prior that encourages sparsity of representation. The methodology incorporates an additional set of hyperparameters governing the prior, one for each weight, and then adopts a specific approximation to the full marginalization over all weights and hyperparameters. Despite its empirical success however, no rigorous motivation for this particular approximation is currently available. To address this issue, we demonstrate that SBL can be recast as the application of a rigorous variational approximation to the full model by expressing the prior in a dual form. This formulation obviates the necessity of assuming any hyperpriors and leads to natural, intuitive explanations of why sparsity is achieved in practice. 1