Results 1–10 of 28
Bilinear Generalized Approximate Message Passing
, 2013
Abstract

Cited by 8 (2 self)
Abstract—We extend the generalized approximate message passing (GAMP) approach, originally proposed for high-dimensional generalized-linear regression in the context of compressive sensing, to the generalized-bilinear case, which enables its application to matrix completion, robust PCA, dictionary learning, and related matrix-factorization problems. In the first part of the paper, we derive our Bilinear GAMP (BiG-AMP) algorithm as an approximation of the sum-product belief propagation algorithm in the high-dimensional limit, where central-limit-theorem arguments and Taylor-series approximations apply, and under the assumption of statistically independent matrix entries with known priors. In addition, we propose an adaptive damping mechanism that aids convergence at finite problem sizes, an expectation-maximization (EM)-based method to automatically tune the parameters of the assumed priors, and two rank-selection strategies. In the second part of the paper, we discuss the specializations of EM-BiG-AMP to the problems of matrix completion, robust PCA, and dictionary learning, and present the results of an extensive empirical study comparing EM-BiG-AMP to state-of-the-art algorithms on each problem. Our numerical results, using both synthetic and real-world datasets, demonstrate that EM-BiG-AMP yields excellent reconstruction accuracy (often best in class) while maintaining competitive runtimes and avoiding the need to tune algorithmic parameters.
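The generalized-bilinear observation model the abstract describes can be sketched in a few lines. Everything below (sizes, Gaussian priors, noise level, 50% sampling) is a hypothetical illustration, not the paper's experimental setup; matrix completion arises by observing only a subset of the entries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes for illustration only.
M, L, N = 50, 40, 5          # observe an M x L matrix; latent rank is N

# Statistically independent entrywise priors on the factors (here Gaussian).
A = rng.standard_normal((M, N))
X = rng.standard_normal((N, L))

# Generalized-bilinear observation: Z = A @ X passed entrywise through a
# noisy channel p(y_ml | z_ml); here simply additive Gaussian noise.
Z = A @ X
Y = Z + np.sqrt(0.01) * rng.standard_normal(Z.shape)

# Matrix completion corresponds to observing only a subset of the entries.
Omega = rng.random(Z.shape) < 0.5     # ~50% of entries observed
Y_observed = np.where(Omega, Y, np.nan)
```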
Non-Convex Rank Minimization via an Empirical Bayesian Approach
Abstract

Cited by 5 (3 self)
In many applications that require matrix solutions of minimal rank, the underlying cost function is non-convex, leading to an intractable, NP-hard optimization problem. Consequently, the convex nuclear norm is frequently used as a surrogate penalty term for matrix rank. The problem is that in many practical scenarios there is no longer any guarantee that we can correctly estimate generative low-rank matrices of interest, theoretical special cases notwithstanding. Consequently, this paper proposes an alternative empirical Bayesian procedure built upon a variational approximation that, unlike the nuclear norm, retains the same globally minimizing point estimate as the rank function under many useful constraints. However, locally minimizing solutions are largely smoothed away via marginalization, allowing the algorithm to succeed when standard convex relaxations completely fail. While the proposed methodology is generally applicable to a wide range of low-rank applications, we focus our attention on the robust principal component analysis (RPCA) problem, which involves estimating an unknown low-rank matrix with unknown sparse corruptions. Theoretical and empirical evidence is presented to show that our method is potentially superior to related MAP-based approaches, for which the convex principal component pursuit (PCP) algorithm (Candès et al., 2011) can be viewed as a special case.
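The rank/nuclear-norm relationship the abstract refers to is easy to see numerically: rank counts the nonzero singular values (an l0 penalty on the spectrum), while the nuclear norm sums them (the l1 relaxation). A minimal sketch on a synthetic rank-3 matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# A 20 x 20 matrix of exact rank 3.
X = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 20))

s = np.linalg.svd(X, compute_uv=False)

rank = int(np.sum(s > 1e-8))       # rank(X): count of nonzero singular values
nuclear_norm = float(np.sum(s))    # ||X||_*: sum of singular values

# rank is the l0 "norm" of the spectrum; the nuclear norm is its l1 norm,
# which is why the latter is the standard convex surrogate for the former.
```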
Bayesian Robust Matrix Factorization for Image and Video Processing
Abstract

Cited by 5 (0 self)
Matrix factorization is a fundamental problem that is often encountered in many computer vision and machine learning tasks. In recent years, enhancing the robustness of matrix factorization methods has attracted much attention in the research community. To benefit from the strengths of full Bayesian treatment over point estimation, we propose here a fully Bayesian approach to robust matrix factorization. For the generative process, the model parameters have conjugate priors and the likelihood (or noise model) takes the form of a Laplace mixture. For Bayesian inference, we devise an efficient sampling algorithm by exploiting a hierarchical view of the Laplace distribution. Besides the basic model, we also propose an extension which assumes that the outliers exhibit spatial or temporal proximity, as encountered in many computer vision applications. The proposed methods give competitive experimental results when compared with several state-of-the-art methods on benchmark image and video processing tasks.
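The "hierarchical view of the Laplace distribution" that such samplers exploit is the standard Gaussian scale-mixture representation: a Laplace variable is a zero-mean Gaussian whose variance is exponentially distributed. A minimal sanity-check sketch (the parameter values below are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# If v ~ Exponential(rate = 1/(2 b^2)) and x | v ~ N(0, v), then
# marginally x ~ Laplace(0, b).  Hierarchical samplers alternate between
# the latent scales v and the conditionally Gaussian values x.
b = 1.5
n = 200_000

v = rng.exponential(scale=2 * b**2, size=n)   # NumPy's scale = 1/rate
x = rng.normal(0.0, np.sqrt(v))

# Sanity check: Laplace(0, b) has variance 2 b^2.
sample_var = x.var()
```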
Global analytic solution of fully-observed variational Bayesian matrix factorization
 Journal of Machine Learning Research
Abstract

Cited by 4 (3 self)
The variational Bayesian (VB) approximation is known to be a promising approach to Bayesian estimation when the rigorous calculation of the Bayes posterior is intractable. The VB approximation has been successfully applied to matrix factorization (MF), offering automatic dimensionality selection for principal component analysis. Generally, finding the VB solution is a non-convex problem, and most methods rely on a local search algorithm derived through a standard procedure for the VB approximation. In this paper, we show that a better option is available for fully-observed VBMF—the global solution can be analytically computed. More specifically, the global solution is a reweighted SVD of the observed matrix, and each weight can be obtained by solving a quartic equation whose coefficients are functions of the observed singular value. We further show that the global optimal solution of empirical VBMF (where hyperparameters are also learned from the data) can also be analytically computed. We illustrate the usefulness of our results through experiments in multivariate analysis.
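The "reweighted SVD" structure of the solution can be sketched as follows. The soft-threshold weight used below is only a stand-in so the shape of the estimator is visible; the paper's actual weights come from solving a per-component quartic equation and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# A noisy low-rank observation.
Y = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 40))
Y += 0.1 * rng.standard_normal(Y.shape)

U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# The analytic solution has the form of a reweighted SVD,
#   X_hat = sum_k w_k * s_k * u_k v_k^T,   with weights w_k in [0, 1].
# The weight rule below is a hypothetical placeholder, not the paper's
# quartic-equation solution.
tau = 0.5 * s[0]
w = np.maximum(1.0 - tau / s, 0.0)

X_hat = (U * (w * s)) @ Vt    # scale the k-th singular direction by w_k s_k
```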
Global Solution of Fully-Observed Variational Bayesian Matrix Factorization is Column-Wise Independent
Abstract

Cited by 4 (3 self)
Variational Bayesian matrix factorization (VBMF) efficiently approximates the posterior distribution of factorized matrices by assuming matrix-wise independence of the two factors. A recent study on fully-observed VBMF showed that, under the stronger assumption that the two factorized matrices are column-wise independent, the global optimal solution can be analytically computed. However, it was not clear how restrictive the column-wise independence assumption is. In this paper, we prove that the global solution under matrix-wise independence is actually column-wise independent, implying that the column-wise independence assumption is harmless. A practical consequence of our theoretical finding is that the global solution under matrix-wise independence (which is the standard setup) can be obtained analytically, in a computationally very efficient way, without any iterative algorithms. We experimentally illustrate the advantages of using our analytic solution in probabilistic principal component analysis.
A Variational Approach for Sparse Component Estimation and Low-Rank Matrix Recovery
Abstract

Cited by 3 (3 self)
We propose a variational Bayesian algorithm for the estimation of the sparse component of an outlier-corrupted low-rank matrix, when linearly transformed composite data are observed. The model constitutes a generalization of robust principal component analysis. The problem considered herein is applicable in various practical scenarios, such as foreground detection in blurred and noisy video sequences and detection of network anomalies, among others. The proposed algorithm models the low-rank matrix and the sparse component using a hierarchical Bayesian framework, and employs a variational approach for inference of the unknowns. The effectiveness of the proposed algorithm is demonstrated using real-life experiments, and its performance improvement over regularization-based approaches is shown. Index Terms—Bayesian inference, variational approach, robust principal component analysis, foreground detection, network anomaly detection. This paper is organized as follows. In Section II we present the general data model and several areas of application. A brief overview of the related work in each of these areas is also provided. In Section III we introduce the proposed hierarchical Bayesian model. Details of the variational inference procedure are provided in Section IV. Numerical examples are presented in Section V. Finally, we draw concluding remarks in Section VI. Notation: Matrices and vectors are denoted by uppercase and lowercase boldface letters, respectively. vec(·), diag(·) and Tr(·) are the vectorization, diagonalization and trace operators, respectively. Given a matrix X, we denote by xi·, x·j and Xij its i-th row, j-th column and (i, j)-th element, respectively.
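The data model generalizing robust PCA can be sketched as a low-rank plus sparse composite passed through a linear transform. All specifics below (sizes, the blur-like transform H, outlier rate, noise level) are hypothetical illustrations, not the paper's setup; taking H to be the identity recovers classical robust PCA.

```python
import numpy as np

rng = np.random.default_rng(4)

m, n, r = 30, 30, 2

# Low-rank component.
L = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Sparse outlier component: a few large-magnitude entries.
S = np.zeros((m, n))
idx = rng.random((m, n)) < 0.05
S[idx] = 10.0 * rng.standard_normal(int(idx.sum()))

# Observed data: a linear transform of the composite L + S, plus noise.
# H here is a hypothetical near-identity transform for illustration.
H = np.eye(m) + 0.1 * rng.standard_normal((m, m))
Y = H @ (L + S) + 0.01 * rng.standard_normal((m, n))
```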
Sparse additive matrix factorization for robust PCA and its generalization
 In Proceedings of Fourth Asian Conference on Machine Learning
Abstract

Cited by 2 (2 self)
Principal component analysis (PCA) can be regarded as approximating a data matrix with a low-rank one by imposing sparsity on its singular values, and its robust variant further captures sparse noise. In this paper, we extend such sparse matrix learning methods and propose a novel unified framework called sparse additive matrix factorization (SAMF). SAMF systematically induces various types of sparsity by the so-called model-induced regularization in the Bayesian framework. We propose an iterative algorithm called the mean update (MU) for the variational Bayesian approximation to SAMF, which gives the global optimal solution for a large subset of parameters in each step. We demonstrate the usefulness of our method on artificial data and on foreground/background video separation.
Probabilistic lowrank subspace clustering
 In Advances in Neural Information Processing Systems 25
, 2012
Abstract

Cited by 2 (1 self)
In this paper, we consider the problem of clustering data points into low-dimensional subspaces in the presence of outliers. We pose the problem using a density estimation formulation with an associated generative model. Based on this probability model, we first develop an iterative expectation-maximization (EM) algorithm and then derive its global solution. In addition, we develop two Bayesian methods based on the variational Bayesian (VB) approximation, which are capable of automatic dimensionality selection. While the first method is based on an alternating optimization scheme for all unknowns, the second makes use of recent results in VB matrix factorization, leading to fast and effective estimation. Both methods are extended to handle sparse outliers for robustness and can handle missing values. Experimental results suggest that the proposed methods are very effective in subspace clustering and identifying outliers.
Pushing the limits of affine rank minimization by adapting probabilistic PCA.
 In Int. Conf.
, 2015
Abstract

Cited by 1 (1 self)
Many applications require recovering a matrix of minimal rank within an affine constraint set, with matrix completion a notable special case. Because the problem is NP-hard in general, it is common to replace the matrix rank with the nuclear norm, which acts as a convenient convex surrogate. While elegant theoretical conditions elucidate when this replacement is likely to be successful, they are highly restrictive, and convex algorithms fail when the ambient rank is too high or when the constraint set is poorly structured. Non-convex alternatives fare somewhat better when carefully tuned; however, convergence to locally optimal solutions remains a continuing source of failure. Against this backdrop we derive a deceptively simple and parameter-free probabilistic PCA-like algorithm that is capable, over a wide battery of empirical tests, of successful recovery even at the theoretical limit where the number of measurements equals the degrees of freedom in the unknown low-rank matrix. Somewhat surprisingly, this is possible even when the affine constraint set is highly ill-conditioned. While proving general recovery guarantees remains elusive for non-convex algorithms, Bayesian-inspired or otherwise, we nonetheless show conditions under which the underlying cost function has a unique stationary point located at the global optimum; no existing cost function we are aware of satisfies this property. The algorithm has also been successfully deployed on a computer vision application involving image rectification and a standard collaborative filtering benchmark.
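The "theoretical limit" mentioned in the abstract is the standard parameter count for low-rank matrices: an m x n matrix of rank r has r(m + n - r) degrees of freedom (mr parameters for one factor, nr for the other, minus r² for the invertible-transform ambiguity between them). The example sizes below are hypothetical:

```python
def lowrank_dof(m: int, n: int, r: int) -> int:
    """Degrees of freedom of the set of m x n matrices of rank r."""
    return r * (m + n - r)

# A 100 x 100 matrix of rank 5 is determined, at best, by as many generic
# affine measurements as it has degrees of freedom.
print(lowrank_dof(100, 100, 5))   # 5 * (100 + 100 - 5) = 975
```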
Variational Bayesian Methods For Multimedia Problems
, 2013
Abstract

Cited by 1 (1 self)
Abstract—In this paper we present an introduction to Variational Bayesian (VB) methods in the context of probabilistic graphical models, and discuss their application to multimedia-related problems. VB is a family of deterministic procedures for approximating probability distributions that offer distinct advantages over alternative approaches based on stochastic sampling and those providing only point estimates. VB inference is flexible enough to be applied to many practical problems, yet broad enough to subsume as special cases several alternative inference approaches, including Maximum A Posteriori (MAP) estimation and the Expectation-Maximization (EM) algorithm. In this paper we also show the connections between VB and other posterior approximation methods such as the marginalization-based Loopy Belief Propagation (LBP) and Expectation Propagation (EP) algorithms. Specifically, both VB and EP are variational methods that minimize functionals based on the Kullback-Leibler (KL) divergence. LBP, traditionally developed using graphical models, can also be viewed as a VB inference procedure. We present several multimedia-related applications illustrating the use and effectiveness of the VB algorithms discussed herein. We hope that by reading this tutorial the reader will obtain a general understanding of Bayesian methods and establish connections among popular algorithms used in practice. Index Terms—Bayes methods, graphical models, multimedia signal processing, variational Bayes, inverse problems.
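The abstract's point that VB and EP minimize different KL-divergence functionals rests on KL being asymmetric: VB minimizes KL(q || p) while EP minimizes KL(p || q), and the two directions generally select different approximations. This is easy to check numerically for univariate Gaussians, whose KL divergence has a closed form:

```python
import math

def kl_gauss(mu1: float, var1: float, mu2: float, var2: float) -> float:
    """KL( N(mu1, var1) || N(mu2, var2) ) for univariate Gaussians."""
    return (math.log(math.sqrt(var2 / var1))
            + (var1 + (mu1 - mu2) ** 2) / (2 * var2)
            - 0.5)

# The two directions of KL between the same pair of Gaussians differ,
# which is why VB's KL(q||p) and EP's KL(p||q) yield different fits.
a = kl_gauss(0.0, 1.0, 1.0, 2.0)   # KL(q || p)
b = kl_gauss(1.0, 2.0, 0.0, 1.0)   # KL(p || q)
```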