Results 1–10 of 74
Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
Cited by 92 (34 self)

Abstract
Nonparametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this nonparametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be nonstationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches.
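The beta-process prior can be illustrated with its finite beta-Bernoulli approximation. The sketch below is a toy in NumPy, not the authors' Gibbs or variational implementation; the function name and the Beta(a/K, 1) parameterisation are our assumptions. It shows how, even as the truncation level K grows, each data item uses only a handful of dictionary atoms in expectation, which is how the prior lets the model infer an effective dictionary size.

```python
import numpy as np

def beta_bernoulli_codes(n, K, a=1.0, rng=None):
    """Finite beta-Bernoulli approximation to a beta-process prior on
    dictionary usage: pi_k ~ Beta(a/K, 1) per atom, z_nk ~ Bernoulli(pi_k).
    As K grows, the expected number of atoms used per item stays near a,
    so the effective dictionary size is inferred rather than fixed."""
    if rng is None:
        rng = np.random.default_rng()
    pi = rng.beta(a / K, 1.0, size=K)     # per-atom usage probabilities
    Z = rng.random((n, K)) < pi           # binary usage indicators
    return Z
```

A full model would pair each binary indicator with a Gaussian weight and an atom, so that an image patch is represented as D(z ∘ w); the sketch covers only the sparsity-inducing part of the prior.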
A Simple Algorithm for Nuclear Norm Regularized Problems
Cited by 49 (3 self)

Abstract
Optimization problems with a nuclear norm regularization, such as low-norm matrix factorizations, have seen many applications recently. We propose a new approximation algorithm building upon the recent sparse approximate SDP solver of (Hazan, 2008). The experimental efficiency of our method is demonstrated on large matrix completion problems such as the Netflix dataset. The algorithm comes with strong convergence guarantees, and can be interpreted as a first theoretically justified variant of Simon-Funk-type SVD heuristics. The method is free of tuning parameters, and very easy to parallelize.
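The core step in Hazan-style sparse SDP solvers is a Frank-Wolfe update: each iteration adds a single rank-1 atom built from the top singular pair of the negative gradient, so the iterate after k steps has rank at most k. A minimal sketch for masked matrix completion follows; it is our own simplification (the actual algorithm needs only an approximate top eigenvector, whereas we call a full SVD for brevity):

```python
import numpy as np

def frank_wolfe_completion(M, mask, tau, iters=200):
    """Sketch of a Frank-Wolfe / Hazan-type step for matrix completion:
    minimise 0.5 * ||mask * (X - M)||_F^2  subject to  ||X||_* <= tau.
    Each step mixes in the best rank-1 atom on the nuclear-norm ball."""
    X = np.zeros_like(M)
    for k in range(iters):
        G = mask * (X - M)                    # gradient of the masked loss
        U, _, Vt = np.linalg.svd(-G, full_matrices=False)
        S = tau * np.outer(U[:, 0], Vt[0])    # rank-1 extreme point of the ball
        alpha = 2.0 / (k + 2)                 # standard Frank-Wolfe step size
        X = (1 - alpha) * X + alpha * S
    return X
```

The only parameter is the radius of the nuclear-norm ball, which reflects the abstract's claim that the method is essentially free of tuning parameters.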
fLDA: Matrix Factorization through Latent Dirichlet Allocation
 In Proc. of WSDM ’10
, 2010
Cited by 44 (0 self)

Abstract
We propose fLDA, a novel matrix factorization method to predict ratings in recommender system applications where a “bag-of-words” representation for item metadata is natural. Such scenarios are commonplace in web applications like content recommendation, ad targeting and web search, where items are articles, ads and web pages respectively. Because of data sparseness, regularization is key to good predictive accuracy. Our method works by regularizing both user and item factors simultaneously through user features and the bag of words associated with each item. Specifically, each word in an item is associated with a discrete latent factor often referred to as the topic of the word; item topics are obtained by averaging topics across all words in an item. Then, a user's rating of an item is modeled as the user's affinity to the item's topics, where user affinity to topics (user factors) and topic assignments to words in items (item factors) are learned jointly in a supervised fashion. To avoid overfitting, user and item factors are regularized through Gaussian linear regression and Latent Dirichlet Allocation (LDA) priors respectively. We show our model is accurate, interpretable and handles both cold-start and warm-start scenarios seamlessly through a single model. The efficacy of our method is illustrated on benchmark datasets and a new dataset from Yahoo! Buzz, where fLDA provides superior predictive accuracy in cold-start scenarios and is comparable to state-of-the-art methods in warm-start scenarios. As a byproduct, fLDA also identifies interesting topics that explain user-item interactions. Our method also generalizes a recently proposed technique called supervised LDA (sLDA) to collaborative filtering applications. While sLDA estimates item topic vectors in a supervised fashion for a single regression, fLDA incorporates multiple regressions (one for each user) in estimating the item factors.
Bayesian Robust Principal Component Analysis
, 2010
Cited by 42 (4 self)

Abstract
A hierarchical Bayesian model is considered for decomposing a matrix into low-rank and sparse components, assuming the observed matrix is a superposition of the two. The matrix is assumed noisy, with unknown and possibly nonstationary noise statistics. The Bayesian framework infers an approximate representation for the noise statistics while simultaneously inferring the low-rank and sparse-outlier contributions; the model is robust to a broad range of noise levels, without having to change model hyperparameter settings. In addition, the Bayesian framework allows exploitation of additional structure in the matrix. For example, in video applications each row (or column) corresponds to a video frame, and we introduce a Markov dependency between consecutive rows in the matrix (corresponding to consecutive frames in the video). The properties of this Markov process are also inferred based on the observed matrix, while simultaneously denoising and recovering the low-rank and sparse components. We compare the Bayesian model to a state-of-the-art optimization-based implementation of robust PCA; considering several examples, we demonstrate competitive performance of the proposed model.
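For reference, the optimization-based robust PCA baseline alternates two proximal steps: singular-value thresholding for the low-rank part and entrywise soft-thresholding for the sparse part. The toy below is our own simplified block-coordinate variant of that baseline, not the paper's Bayesian sampler; the threshold scale `mu` is a heuristic assumption.

```python
import numpy as np

def shrink(X, t):
    """Entrywise soft-thresholding at level t."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Singular value thresholding: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, t)) @ Vt

def robust_pca(M, lam=None, mu=None, iters=100):
    """Toy alternating scheme for M ~ L (low-rank) + S (sparse):
    exact proximal updates for each block in turn."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # standard PCP weighting
    if mu is None:
        mu = 0.25 * np.abs(M).mean()     # heuristic threshold scale (assumption)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        L = svt(M - S, mu)               # low-rank update
        S = shrink(M - L, lam * mu)      # sparse-outlier update
    return L, S
```

Unlike the Bayesian model in the abstract, this sketch has fixed thresholds and no noise model, which is precisely the sensitivity the hierarchical approach is designed to avoid.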
Random function priors for exchangeable arrays with applications to graphs and relational data
 Advances in Neural Information Processing Systems
, 2012
Cited by 23 (2 self)

Abstract
A fundamental problem in the analysis of structured relational data like graphs, networks, databases, and matrices is to extract a summary of the common structure underlying relations between individual entities. Relational data are typically encoded in the form of arrays; invariance to the ordering of rows and columns corresponds to exchangeable arrays. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on the parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases.
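The representation theorem referred to here says an exchangeable array satisfies X_ij = f(U_i, U_j, U_ij) for i.i.d. uniform variables and a random function f. A minimal sketch for the graph case follows, with a fixed illustrative "graphon" W standing in for the GP-distributed random function; W and the function names are our own toy choices.

```python
import numpy as np

def sample_graph(W, n, rng):
    """Sample an exchangeable random graph in Aldous-Hoover form:
    draw U_i ~ Uniform(0,1) per node, then include edge (i, j)
    with probability W(U_i, U_j)."""
    U = rng.random(n)
    P = W(U[:, None], U[None, :])              # edge-probability matrix
    A = (rng.random((n, n)) < P).astype(int)   # independent Bernoulli draws
    A = np.triu(A, 1)                          # keep upper triangle only
    return A + A.T                             # symmetrise, no self-loops
```

In the paper's model, W itself is random (a transformed Gaussian process), so inference amounts to learning this function from an observed array.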
Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities
Cited by 21 (2 self)

Abstract
A common approach for handling the complexity and inherent ambiguities of 3D human pose estimation is to use pose priors learned from training data. Existing approaches, however, are either too simplistic (linear), too complex to learn, or can only learn latent spaces from “simple data”, i.e., single activities such as walking or running. In this paper, we present an efficient stochastic gradient descent algorithm that is able to learn probabilistic non-linear latent spaces composed of multiple activities. Furthermore, we derive an incremental algorithm for the online setting which can update the latent space without extensive relearning. We demonstrate the effectiveness of our approach on the task of monocular and multi-view tracking and show that our approach outperforms the state-of-the-art.
Mixed Membership Matrix Factorization
Cited by 18 (1 self)

Abstract
Discrete mixed membership modeling and continuous latent factor modeling (also known as matrix factorization) are two popular, complementary approaches to dyadic data analysis. In this work, we develop a fully Bayesian framework for integrating the two approaches into unified Mixed Membership Matrix Factorization (M3F) models. We introduce two M3F models, derive Gibbs sampling inference procedures, and validate our methods on the EachMovie, MovieLens, and Netflix Prize collaborative filtering datasets. We find that, even when fitting fewer parameters, the M3F models outperform state-of-the-art latent factor approaches on all benchmarks, yielding the greatest gains in accuracy on sparsely-rated, high-variance items.
A Comparative Study of Collaborative Filtering Algorithms, arXiv:1205.3193 [Online]
, 2012
Cited by 17 (4 self)

Abstract
Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed, and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collaborative filtering techniques – both classic and recent state-of-the-art – in a variety of experimental contexts. Specifically, we report conclusions controlling for number of items, number of users, sparsity level, performance criteria, and computational complexity. Our conclusions identify what algorithms work well and in what conditions, and contribute both to the industrial deployment of collaborative filtering algorithms and to the research community.
Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning
Cited by 12 (0 self)

Abstract
We introduce a variational Bayesian inference algorithm which can be widely applied to sparse linear models. The algorithm is based on the spike and slab prior which, from a Bayesian perspective, is the gold standard for sparse inference. We apply the method to a general multi-task and multiple kernel learning model in which a common set of Gaussian process functions is linearly combined with task-specific sparse weights, thus inducing relations between tasks. This model unifies several sparse linear models, such as generalized linear models, sparse factor analysis and matrix factorization with missing values, so that the variational algorithm can be applied to all these cases. We demonstrate our approach in multi-output Gaussian process regression, multi-class classification, image processing applications and collaborative filtering.
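The spike and slab prior itself is easy to state: each weight is an exact zero with probability 1 − π (the spike) and Gaussian otherwise (the slab). The sampler below illustrates only the prior, not the paper's variational algorithm; the function name and default values are our own.

```python
import numpy as np

def spike_slab_sample(d, pi=0.2, sigma=1.0, rng=None):
    """Draw a weight vector from a spike-and-slab prior:
    w_i = s_i * b_i with s_i ~ Bernoulli(pi) (spike at zero when s_i = 0)
    and b_i ~ N(0, sigma^2) (the slab)."""
    if rng is None:
        rng = np.random.default_rng()
    s = rng.random(d) < pi                  # inclusion indicators
    b = rng.normal(0.0, sigma, size=d)      # slab values
    return s * b
```

Unlike continuous shrinkage priors (e.g. the Laplace prior behind the lasso), draws contain exact zeros, which is why the abstract calls this the reference prior for sparse inference.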
Nonparametric Bayesian Matrix Completion
 In SAM
, 2010
Cited by 11 (1 self)

Abstract
The Beta-Binomial processes are considered for inferring missing values in matrices. The model moves beyond the low-rank assumption, modeling the matrix columns as residing in a nonlinear subspace. Large-scale problems are considered via efficient Gibbs sampling, yielding predictions as well as a measure of confidence in each prediction. Algorithm performance is considered for several datasets, with encouraging performance relative to existing approaches.