Bayesian inference and optimal design in the sparse linear model
In Workshop on Artificial Intelligence and Statistics
Cited by 111 (13 self)
Abstract: The linear model with a sparsity-favouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of Bayesian optimal design (or experiment planning), for which accurate estimates of uncertainty are essential. To this end, we employ expectation propagation approximate inference for the linear model with Laplace prior, giving new insight into numerical stability properties and proposing a robust algorithm. We also show how to estimate model hyperparameters by empirical Bayesian maximisation of the marginal likelihood, and propose ideas for scaling the method up to very large underdetermined problems. We demonstrate the versatility of our framework on the application of gene regulatory network identification from microarray expression data, where both the Laplace prior and the active experimental design approach are shown to result in significant improvements. We also address the problem of sparse coding of natural images, and show how our framework can be used for compressive sensing tasks. Part of this work appeared in Seeger et al. (2007b). The gene network identification application appears in Steinke et al. (2007).
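The contrast this abstract draws between MAP sparsity and full posterior inference can be made concrete: for an orthonormal design with unit noise variance, the Laplace-prior MAP estimate reduces to soft-thresholding of the per-coefficient correlations. The sketch below is illustrative only (values and names are hypothetical, not from the paper):

```python
import math

def soft_threshold(z, t):
    """MAP estimate of one coefficient under a Laplace prior, assuming an
    orthonormal design and unit noise variance: shrink |z| by t, clip at 0."""
    return math.copysign(max(abs(z) - t, 0.0), z)

# Hypothetical correlations X^T y for three coefficients; tau sets sparsity.
z = [2.5, -0.3, 0.9]
tau = 0.5
map_est = [soft_threshold(zi, tau) for zi in z]
# The MAP point is exactly sparse (the second coefficient is zeroed out),
# but it carries no uncertainty information about the coefficients.
```

The MAP point being sparse yet devoid of error bars is precisely why the paper turns to expectation propagation to recover posterior uncertainties for optimal design.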
Sufficient conditions for convergence of the sum-product algorithm
IEEE Trans. IT, 2007
Cited by 62 (2 self)
Abstract: Novel conditions are derived that guarantee convergence ...
Approximations for Binary Gaussian Process Classification
Cited by 62 (2 self)
Abstract: We provide a comprehensive overview of many recent algorithms for approximate inference in Gaussian process models for probabilistic binary classification. The relationships between several approaches are elucidated theoretically, and the properties of the different algorithms are corroborated by experimental results. We examine both 1) the quality of the predictive distributions and 2) the suitability of the different marginal likelihood approximations for model selection (selecting hyperparameters), and compare to a gold standard based on MCMC. Interestingly, some methods produce good predictive distributions although their marginal likelihood approximations are poor. Strong conclusions are drawn about the methods: the Expectation Propagation algorithm is almost always the method of choice unless the computational budget is very tight. We also extend existing methods in various ways, and provide unifying code implementing all approaches. Keywords: Gaussian process priors, probabilistic classification, Laplace's approximation, expectation propagation, variational bounding, mean field methods, marginal likelihood (evidence).
Relational learning with Gaussian processes
In NIPS 19, 2007
Cited by 45 (10 self)
Abstract: Correlation between instances is often modelled via a kernel function using input attributes of the instances. Relational knowledge can further reveal additional pairwise correlations between variables of interest. In this paper, we develop a class of models which incorporates both reciprocal relational information and input attributes using Gaussian process techniques. This approach provides a novel nonparametric Bayesian framework with a data-dependent covariance function for supervised learning tasks. We also apply this framework to semi-supervised learning. Experimental results on several real-world data sets verify the usefulness of this algorithm.
Sufficient conditions for convergence of loopy belief propagation
In Proc. Conference on Uncertainty in Artificial Intelligence (UAI), 2005
Cited by 30 (3 self)
Abstract: We derive novel conditions that guarantee convergence of Loopy Belief Propagation (also known as the Sum-Product algorithm) to a unique fixed point. Our results are provably stronger than existing sufficient conditions. We show that the improvement can be quite substantial; in particular, for binary variables with (anti)ferromagnetic interactions, our conditions seem to be sharp.
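For binary pairwise models, sufficient conditions of this kind are typically phrased in terms of coupling strengths. A much-simplified version (ignoring local fields, and only a sketch of the general result) bounds the spectral radius of the matrix with entries tanh|J_ij| by 1. The coupling values below are hypothetical:

```python
import math

def spectral_radius(A, iters=200):
    """Estimate the spectral radius of a symmetric nonnegative matrix
    by plain power iteration with max-norm normalisation."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam

# Hypothetical pairwise couplings J_ij of a small binary model.
J = [[0.0, 0.2, 0.1],
     [0.2, 0.0, 0.3],
     [0.1, 0.3, 0.0]]
A = [[math.tanh(abs(x)) for x in row] for row in J]
rho = spectral_radius(A)
# rho < 1 indicates the (simplified) sufficient condition for a unique
# BP fixed point holds for this coupling matrix; local fields can only help.
```

This is only the flavour of the condition; the paper's actual results are sharper and handle local fields explicitly.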
Expectation propagation for exponential families
2005
Cited by 26 (4 self)
Abstract: This is a tutorial describing the Expectation Propagation (EP) algorithm for a general exponential family. Our focus is on simplicity of exposition. Although the overhead of translating a specific model into its exponential family representation can be considerable, many apparent complications of EP can simply be sidestepped by working in this canonical representation. Note: this material is extracted from the Appendix of my PhD thesis (see www.kyb.tuebingen.mpg.de/bs/people/seeger/papers/thesis.html). 1 Exponential Families. Definition 1 (Exponential Family): A set $\mathcal{F}$ of distributions with densities $P(x \mid \theta) = \exp(\theta^T \phi(x) - \Phi(\theta))$, $\theta \in \Theta$, where $\Phi(\theta) = \log \int \exp(\theta^T \phi(x)) \, d\mu(x)$ w.r.t. a base measure $\mu$, is called an exponential family. Here, $\theta$ are called natural parameters, $\Theta$ the natural parameter space, $\phi(x)$ the sufficient statistics, and $\Phi(\theta)$ the log partition function. Furthermore, $\eta = E_\theta[\phi(x)]$ are called moment parameters, where $E_\theta[\cdot]$ ...
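A minimal concrete instance of this definition is the Bernoulli family, with sufficient statistic φ(x) = x and log partition function Φ(θ) = log(1 + e^θ); the identity η = dΦ/dθ linking natural and moment parameters can be checked numerically:

```python
import math

# Bernoulli as an exponential family: phi(x) = x, Phi(theta) = log(1 + e^theta).
def log_partition(theta):
    return math.log1p(math.exp(theta))

def moment(theta):
    """Moment parameter eta = E_theta[phi(x)] = P(x = 1) = sigmoid(theta)."""
    return 1.0 / (1.0 + math.exp(-theta))

# Check eta = dPhi/dtheta by a central finite difference at an arbitrary theta.
theta, h = 0.7, 1e-5
numeric = (log_partition(theta + h) - log_partition(theta - h)) / (2 * h)
# numeric agrees with moment(theta) up to the finite-difference error
```

This gradient identity is exactly what EP's moment-matching step exploits in the canonical representation.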
Fixed points of generalized approximate message passing with arbitrary matrices
In Proc. ISIT, 2013
MIMO Detection for High-Order QAM Based on a Gaussian Tree Approximation
2011
Cited by 16 (3 self)
Abstract: This paper proposes a new detection algorithm for MIMO communication systems employing high-order QAM constellations. The factor graph that corresponds to this problem is very loopy; in fact, it is a complete graph. Hence, a straightforward application of the Belief Propagation (BP) algorithm yields very poor results. Our algorithm is based on an optimal tree approximation of the Gaussian density of the unconstrained linear system. The finite-set constraint is then applied to obtain a cycle-free discrete distribution. Simulation results show that, even though the approximation is not applied directly to the exact discrete distribution, applying the BP algorithm to the cycle-free factor graph outperforms current methods in terms of both performance and complexity. The improved performance of the proposed algorithm is demonstrated on the problem of MIMO detection.
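The standard ingredient behind an optimal Gaussian tree approximation is a maximum-weight spanning tree over pairwise Gaussian mutual informations, MI(ρ) = −½ log(1 − ρ²), in Chow–Liu style. A sketch with a hypothetical correlation matrix (not a claim about the paper's exact construction):

```python
import math

def gaussian_mi(rho):
    """Mutual information of a bivariate Gaussian pair with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)

def max_spanning_tree(W):
    """Prim's algorithm for a maximum-weight spanning tree; returns edges (i, j)."""
    n = len(W)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        _, i, j = max((W[i][j], i, j) for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Hypothetical correlation matrix of the unconstrained Gaussian solution.
rho = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.6],
       [0.1, 0.6, 1.0]]
W = [[gaussian_mi(rho[i][j]) if i != j else 0.0 for j in range(3)]
     for i in range(3)]
tree = max_spanning_tree(W)  # keeps the strongest dependencies: (0,1), (1,2)
```

The resulting tree is cycle-free, so imposing the finite QAM alphabet on it yields a discrete distribution on which BP is exact.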
Robust Gaussian Process Regression with a Student-t Likelihood
Cited by 12 (3 self)
Abstract: This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model, which has a non-log-concave likelihood. The challenge with the Student-t model is the analytically intractable inference, which is why several approximate methods have been proposed. Expectation propagation (EP) has been found to be a very accurate method in many empirical studies, but the convergence of EP is known to be problematic with models containing non-log-concave site functions. In this paper we illustrate the situations where standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters, and show that standard EP may not converge at the MAP values with some difficult data sets. We present a robust implementation which relies primarily on parallel EP updates and uses a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of EP is compared with Laplace, variational Bayes, and Markov chain Monte Carlo approximations. Keywords: Gaussian process, robust regression, Student-t distribution, approximate inference, expectation propagation.
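The non-log-concavity mentioned here is easy to verify directly: the curvature of the Student-t log-likelihood is negative near the mode but positive far out in the tails, which is what destabilises EP site updates. A numerical check (unit scale, illustrative ν):

```python
import math

def log_student_t(r, nu=4.0):
    """Unnormalised log Student-t likelihood as a function of the residual r."""
    return -0.5 * (nu + 1.0) * math.log1p(r * r / nu)

def second_derivative(f, r0, h=1e-4):
    """Central finite-difference estimate of f''(r0)."""
    return (f(r0 + h) - 2.0 * f(r0) + f(r0 - h)) / (h * h)

# Near the mode the log-likelihood is concave (negative curvature) ...
near = second_derivative(log_student_t, 0.0)
# ... but deep in the tails the curvature flips sign: non-log-concave.
far = second_derivative(log_student_t, 10.0)
```

A Gaussian likelihood, by contrast, has constant negative curvature everywhere, which is why EP on Gaussian sites never exhibits this failure mode.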
Variational and stochastic inference for Bayesian source separation
2007
Cited by 10 (3 self)
Abstract: We tackle the general linear instantaneous model (possibly underdetermined and noisy), where we model the source prior with a Student-t distribution. The conjugate-exponential characterisation of the t distribution as an infinite mixture of scaled Gaussians enables us to do efficient inference. We study two well-known inference methods, the Gibbs sampler and variational Bayes, for Bayesian source separation. We derive both techniques as local message passing algorithms to highlight their algorithmic similarities, and to contrast their different convergence characteristics and computational requirements. Our simulation results suggest that typical posterior distributions in source separation have multiple local maxima. Therefore, we propose a hybrid approach where we explore the state space with a Gibbs sampler and then switch to a deterministic algorithm. This approach seems able to combine the speed of the variational approach with the robustness of the Gibbs sampler.
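The scale-mixture characterisation used here says that a Student-t density is a Gamma mixture of Gaussians: integrating N(x; 0, 1/λ) against λ ~ Gamma(ν/2, ν/2) recovers the t density with ν degrees of freedom. A numerical check of that identity (grid size and tolerances are illustrative choices):

```python
import math

def student_t_pdf(x, nu):
    """Standard Student-t density with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def normal_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def gamma_pdf(lam, shape, rate):
    return (rate ** shape / math.gamma(shape)) * lam ** (shape - 1) * math.exp(-rate * lam)

def mixture_pdf(x, nu, n=20000, lam_max=40.0):
    """Integrate N(x; 0, 1/lam) over lam ~ Gamma(nu/2, nu/2) by a Riemann sum."""
    h = lam_max / n
    total = 0.0
    for i in range(1, n + 1):
        lam = i * h  # grid starts above lam = 0, where the integrand vanishes
        total += normal_pdf(x, 1.0 / lam) * gamma_pdf(lam, nu / 2, nu / 2) * h
    return total

nu, x = 4.0, 1.0
# mixture_pdf(x, nu) matches student_t_pdf(x, nu) up to quadrature error
```

Conditioning on λ makes every factor Gaussian, which is what gives both the Gibbs sampler and variational Bayes their tractable local updates in this model.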