Results 1-10 of 19
The why and how of nonnegative matrix factorization
In: Regularization, Optimization, Kernels, and Support Vector Machines. Chapman & Hall/CRC, 2014
Fast Bregman Divergence NMF using Taylor Expansion and Coordinate Descent
Cited by 4 (0 self)
Abstract:
Nonnegative matrix factorization (NMF) provides a lower rank approximation of a matrix. Due to the nonnegativity imposed on the factors, it gives a latent structure that is often more physically meaningful than other lower rank approximations such as the singular value decomposition (SVD). Most of the algorithms proposed in the literature for NMF have been based on minimizing the Frobenius norm. This is partly due to the fact that the minimization problem based on the Frobenius norm provides much more flexibility in algebraic manipulation than other divergences. In this paper we propose a fast NMF algorithm that is applicable to general Bregman divergences. Through a Taylor series expansion of the Bregman divergences, we reveal a relationship between Bregman divergences and Euclidean distance. This key relationship provides a new direction for NMF algorithms with general Bregman divergences when combined with the scalar block coordinate descent method. The proposed algorithm generalizes several recently proposed methods for computation of NMF with Bregman divergences and is computationally faster than existing alternatives. We demonstrate the effectiveness of our approach with experiments conducted on artificial as well as real-world data.
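The scalar block coordinate descent method this abstract builds on can be sketched for the Frobenius-norm special case (the HALS-style update); this is a generic illustration under assumed conventions, not the paper's Bregman-divergence algorithm, and the function name and defaults are hypothetical:

```python
import numpy as np

def nmf_scalar_cd(A, k, n_iter=200, seed=0):
    """NMF via scalar block coordinate descent on ||A - W H||_F^2.
    Illustrative sketch only; the paper generalizes this idea to
    arbitrary Bregman divergences via Taylor expansion."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Update each column of W with H fixed (Gauss-Seidel order).
        HHt, AHt = H @ H.T, A @ H.T
        for j in range(k):
            num = AHt[:, j] - W @ HHt[:, j] + W[:, j] * HHt[j, j]
            W[:, j] = np.maximum(num / max(HHt[j, j], 1e-12), 0)
        # Symmetric update for each row of H with W fixed.
        WtW, WtA = W.T @ W, W.T @ A
        for j in range(k):
            num = WtA[j, :] - WtW[j, :] @ H + WtW[j, j] * H[j, :]
            H[j, :] = np.maximum(num / max(WtW[j, j], 1e-12), 0)
    return W, H
```

Each scalar block has a closed-form nonnegative minimizer, which is what makes the coordinate descent framework attractive here.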
H.: Fast rank-2 nonnegative matrix factorization for hierarchical document clustering
In: KDD '13: Proc. of the 19th ACM Int. Conf. on Knowledge Discovery and Data Mining, 2013
Cited by 4 (3 self)
Abstract:
Nonnegative matrix factorization (NMF) has been successfully used as a clustering method, especially for flat partitioning of documents. In this paper, we propose an efficient hierarchical document clustering method based on a new algorithm for rank-2 NMF. When the two-block coordinate descent framework of nonnegative least squares is applied to computing rank-2 NMF, each subproblem requires a solution for nonnegative least squares with only two columns in the matrix. We design the algorithm for rank-2 NMF by exploiting the fact that an exhaustive search for the optimal active set can be performed extremely fast when solving these NNLS problems. In addition, we design a measure based on the results of rank-2 NMF for determining which leaf node should be further split. On a number of text data sets, our proposed method produces high-quality tree structures in significantly less time compared to other methods such as hierarchical k-means, standard NMF, and latent Dirichlet allocation.
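The key subproblem named in the abstract, NNLS with a two-column matrix, can indeed be solved exactly by enumerating the four possible active sets. A minimal sketch of that textbook idea (the helper name is hypothetical, not the paper's implementation):

```python
import numpy as np

def nnls_two_cols(B, y):
    """Solve min_x ||B x - y||_2 with x >= 0 for a two-column B by
    exhaustively checking the four candidate active sets."""
    best_x, best_err = np.zeros(2), np.linalg.norm(y)  # both vars at 0
    # Exactly one variable free: 1-D nonnegative least squares.
    for i in (0, 1):
        b = B[:, i]
        xi = max(b @ y / (b @ b), 0.0)
        x = np.zeros(2)
        x[i] = xi
        err = np.linalg.norm(B @ x - y)
        if err < best_err:
            best_x, best_err = x, err
    # Both variables free: keep the unconstrained solution if feasible.
    x_ls, *_ = np.linalg.lstsq(B, y, rcond=None)
    if (x_ls >= 0).all() and np.linalg.norm(B @ x_ls - y) < best_err:
        best_x = x_ls
    return best_x
```

Since the optimal solution is the unconstrained optimum over some active set, checking all four candidates and keeping the feasible one with smallest residual is exact, and with only two columns this costs a handful of dot products.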
A globally convergent algorithm for nonconvex optimization based on block coordinate update. arXiv preprint arXiv:1410.1386, 2014
Cited by 4 (2 self)
Abstract:
Nonconvex optimization problems arise in many areas of computational science and engineering and are (approximately) solved by a variety of algorithms. Existing algorithms usually only have local convergence or subsequence convergence of their iterates. We propose an algorithm for a generic nonconvex optimization formulation, establish the convergence of its whole iterate sequence to a critical point along with a rate of convergence, and numerically demonstrate its efficiency. Specifically, we consider the problem of minimizing a nonconvex objective function. Its variables can be treated as one block or be partitioned into multiple disjoint blocks. It is assumed that each nondifferentiable component of the objective function or each constraint applies to one block of variables. The differentiable components of the objective function, however, can apply to one or multiple blocks of variables together. Our algorithm updates one block of variables at a time by minimizing a certain prox-linear surrogate. The order of update can be either deterministic or randomly shuffled in each round. In fact, our convergence analysis only needs each block to be updated at least once every fixed number of iterations. We obtain the convergence of the whole iterate sequence to a critical point under fairly loose conditions including, in particular, the Kurdyka-Łojasiewicz (KL) condition, which is satisfied by a broad class of nonconvex/nonsmooth applications. Of course, these results apply to convex optimization as well. We apply our convergence result to the coordinate descent method for nonconvex regularized linear regression and also to a modified rank-one residue iteration method for nonnegative matrix factorization. We show that both methods have global convergence. Numerically, we test our algorithm on nonnegative matrix and tensor factorization problems, with random shuffling enabled to help avoid local solutions.
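A concrete instance of the block update scheme described above is coordinate descent for l1-regularized least squares, one of the regression applications the abstract mentions; here the prox-linear step reduces to soft-thresholding. A standard sketch, not the paper's algorithm, with illustrative names and defaults:

```python
import numpy as np

def lasso_cd(A, b, lam, n_epochs=100, shuffle=True, seed=0):
    """Coordinate descent for 0.5*||Ax - b||^2 + lam*||x||_1.
    The per-round update order can be deterministic or shuffled,
    mirroring the deterministic-or-shuffled block updates above."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)
    r = b - A @ x  # running residual
    for _ in range(n_epochs):
        order = rng.permutation(n) if shuffle else range(n)
        for j in order:
            if col_sq[j] == 0:
                continue
            # Gradient w.r.t. x[j] with other coords fixed, then
            # soft-threshold (the prox of lam*|.|).
            rho = A[:, j] @ r + col_sq[j] * x[j]
            x_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += A[:, j] * (x[j] - x_new)
            x[j] = x_new
    return x
```

With lam = 0 this reduces to plain least squares, which gives a quick sanity check against a direct solver.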
Bounded Matrix Low Rank Approximation
Cited by 3 (1 self)
Abstract:
Matrix lower rank approximations such as nonnegative matrix factorization (NMF) have been successfully used to solve many data mining tasks. In this paper, we propose a new matrix lower rank approximation called Bounded Matrix Low Rank Approximation (BMA), which imposes a lower and an upper bound on every element of a lower rank matrix that best approximates a given matrix with missing elements. This new approximation models many real-world problems, such as recommender systems, and performs better than other methods, such as the singular value decomposition (SVD) or NMF. We present an efficient algorithm to solve BMA based on the coordinate descent method. BMA is different from NMF in that it imposes bounds on the approximation itself rather than on each of the low rank factors. We show that our algorithm is scalable for large matrices with missing elements on multi-core systems with low memory. We present substantial experimental results illustrating that the proposed method outperforms state-of-the-art algorithms for recommender systems.
Ellipsoidal Rounding for Nonnegative Matrix Factorization Under Noisy Separability, 2013
Cited by 3 (0 self)
Abstract:
We present a numerical algorithm for nonnegative matrix factorization (NMF) problems under noisy separability. An NMF problem under separability can be stated as one of finding all vertices of the convex hull of data points. The research interest of this paper is to find vectors as close to the vertices as possible in a situation in which noise is added to the data points. Our algorithm is designed to capture the shape of the convex hull of the data points by using its enclosing ellipsoid. We show that the algorithm has correctness and robustness properties from theoretical and practical perspectives; correctness here means that if the data points do not contain any noise, the algorithm can find the vertices of their convex hull; robustness means that if the data points contain noise, the algorithm can find the near-vertices. Finally, we apply the algorithm to document clustering and report the experimental results.
Alternating direction method of multipliers for nonnegative matrix factorization with the beta-divergence
In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014
Cited by 2 (0 self)
Abstract:
Nonnegative matrix factorization (NMF) is a popular method for learning interpretable features from nonnegative data, such as counts or magnitudes. Different cost functions are used with NMF in different applications. We develop an algorithm, based on the alternating direction method of multipliers, that tackles NMF problems whose cost function is a beta-divergence, a broad class of divergence functions. We derive simple, closed-form updates for the most commonly used beta-divergences. We demonstrate experimentally that this algorithm has faster convergence and yields superior results compared to state-of-the-art algorithms for this problem. Index Terms: nonnegative matrix factorization, beta-divergence, alternating direction method of multipliers.
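The ADMM splitting idea can be illustrated in its simplest NMF-adjacent form: the nonnegative least-squares subproblem with a quadratic (beta = 2) cost, where the nonnegativity constraint is handled by a projection step. This is a generic textbook sketch, not the paper's beta-divergence algorithm:

```python
import numpy as np

def nnls_admm(A, b, rho=1.0, n_iter=300):
    """ADMM for min 0.5*||Ax - b||^2 subject to x >= 0, via the
    splitting x = z with z constrained to the nonnegative orthant."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    # The x-update solves the same linear system every iteration,
    # so factor it once.
    L = np.linalg.cholesky(AtA + rho * np.eye(n))
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    for _ in range(n_iter):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        z = np.maximum(x + u, 0)  # projection onto x >= 0
        u = u + x - z             # scaled dual update
    return z
```

The closed-form x-update is what makes ADMM attractive here; for a general beta-divergence the quadratic term is replaced by divergence-specific updates, which is the contribution the abstract describes.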
Generalized Low Rank Models, 2014
Cited by 1 (1 self)
Abstract:
Principal components analysis (PCA) is a well-known technique for approximating a data set represented by a matrix with a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well-known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results. This manuscript is a draft. Comments sent to udell@stanford.edu are welcome.
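For the simplest member of this family, quadratic loss with ridge regularization on both factors, the fitting procedure reduces to alternating ridge-regression solves. An illustrative sketch under those assumptions (function name and defaults are hypothetical, and the framework of course supports many other losses and regularizers):

```python
import numpy as np

def glrm_quadratic(A, k, gamma=0.1, n_iter=100, seed=0):
    """Alternating minimization for the simplest generalized low rank
    model: min ||A - X Y||_F^2 + gamma*(||X||_F^2 + ||Y||_F^2)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((k, n))
    ridge = gamma * np.eye(k)
    for _ in range(n_iter):
        # Each step is a ridge regression with the other factor fixed.
        X = np.linalg.solve(Y @ Y.T + ridge, Y @ A.T).T
        Y = np.linalg.solve(X.T @ X + ridge, X.T @ A)
    return X, Y
```

Swapping the quadratic loss for a hinge, huber, or ordinal loss per column, and the ridge term for nonnegativity or sparsity regularizers, recovers the other special cases the abstract lists.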