Results 1 - 10 of 90
Symmetric Nonnegative Matrix Factorization for Graph Clustering
"... Nonnegative matrix factorization (NMF) provides a lower rank approximation of a nonnegative matrix, and has been successfully used as a clustering method. In this paper, we offer some conceptual understanding for the capabilities and shortcomings of NMF as a clustering method. Then, we propose Symme ..."
Abstract
-
Cited by 21 (4 self)
- Add to MetaCart
(Show Context)
Abstract: Nonnegative matrix factorization (NMF) provides a lower rank approximation of a nonnegative matrix, and has been successfully used as a clustering method. In this paper, we offer some conceptual understanding for the capabilities and shortcomings of NMF as a clustering method. Then, we propose Symmetric NMF (SymNMF) as a general framework for graph clustering, which inherits the advantages of NMF by enforcing nonnegativity on the clustering assignment matrix. Unlike NMF, however, SymNMF is based on a similarity measure between data points, and factorizes a symmetric matrix containing pairwise similarity values (not necessarily nonnegative). We compare SymNMF with the widely used spectral clustering methods, and give an intuitive explanation of why SymNMF captures the cluster structure embedded in the graph representation more naturally. In addition, we develop a Newton-like algorithm that exploits second-order information efficiently, so as to show the feasibility of SymNMF as a practical framework for graph clustering. Our experiments on artificial graph data, text data, and image data demonstrate the substantially enhanced clustering quality of SymNMF over spectral clustering and NMF. Therefore, SymNMF is able to achieve better clustering results on both linear and nonlinear manifolds, and serves as a potential basis for many extensions and applications.
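The core computational problem can be illustrated with a minimal sketch: minimize ||S - H H^T||_F^2 over H >= 0 and read cluster labels off the rows of H. The paper's own solver is Newton-like; the projected gradient loop below, with illustrative step size and iteration count, is only a stand-in to show the objective and the constraint.

```python
# A minimal SymNMF sketch: minimize ||S - H H^T||_F^2 subject to H >= 0.
# The paper develops a Newton-like solver; this projected gradient loop
# (step size and iteration count are illustrative assumptions) only
# demonstrates the objective and the nonnegativity constraint.
import numpy as np

def symnmf(S, k, n_iter=500, step=1e-3, seed=0):
    """Factor a symmetric similarity matrix S (n x n) into H @ H.T with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], k))
    for _ in range(n_iter):
        grad = 4.0 * (H @ (H.T @ H) - S @ H)  # gradient of ||S - HH^T||_F^2
        H = np.maximum(H - step * grad, 0.0)  # project back onto H >= 0
    return H

# Cluster labels are read off as the largest entry in each row of H.
S = np.array([[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]])
labels = symnmf(S, k=2).argmax(axis=1)
```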
Algorithms for nonnegative matrix and tensor factorizations: a unified view based on block coordinate descent framework
J. Glob. Optim., 2013
"... ..."
Lasso screening rules via dual polytope projection
arXiv:1211.3966, 2012
"... Lasso is a widely used regression technique to find sparse representations. When the di-mension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his collea ..."
Abstract
-
Cited by 18 (6 self)
- Add to MetaCart
(Show Context)
Abstract: Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues have proposed the SAFE rules which are able to quickly identify the inactive predictors, i.e., predictors that have 0 components in the solution vector. Then, the inactive predictors or features can be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors include the set of inactive constraints on the optimal dual solution. In this paper, we propose an efficient and effective screening rule via Dual Polytope Projections (DPP), which is mainly based on the uniqueness and nonexpansiveness of the optimal dual solution due to the fact that the feasible set in the dual space is a convex and closed polytope. Moreover, we show that our screening rule can be extended to identify inactive groups in group Lasso. To the best of our knowledge, there is currently no exact screening rule for group Lasso. We have evaluated our screening rule using synthetic and real data sets. Results show that our rule is more effective in identifying inactive predictors than existing state-of-the-art screening rules for Lasso.
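As a hedged illustration of the flavor of such a rule (this editor's reading of the basic DPP test, not code from the paper): starting from the known dual optimum theta0 = y / lambda_max, nonexpansiveness of the projection onto the dual polytope bounds how far theta(lambda) can drift, and any predictor whose worst-case correlation with the dual point stays below 1 is provably inactive.

```python
# A sketch of the basic DPP test as this entry describes it (an assumption,
# not code from the paper): theta0 = y / lambda_max is the known dual
# optimum, dual projections are nonexpansive, so |x_j^T theta(lam)| is
# bounded, and predictor j is certified inactive when the bound is below 1.
import numpy as np

def dpp_screen(X, y, lam):
    lam0 = np.max(np.abs(X.T @ y))       # lambda_max: everything inactive here
    theta0 = y / lam0                    # dual optimal solution at lambda_max
    drift = np.linalg.norm(y) * abs(1.0 / lam - 1.0 / lam0)
    bound = np.abs(X.T @ theta0) + np.linalg.norm(X, axis=0) * drift
    return bound < 1.0                   # True = coefficient certified zero

# Usage: drop the screened columns before handing X to any Lasso solver.
rng = np.random.default_rng(0)
X, y = rng.standard_normal((100, 500)), rng.standard_normal(100)
keep = ~dpp_screen(X, y, lam=0.8 * np.max(np.abs(X.T @ y)))
X_reduced = X[:, keep]
```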
Nonnegative Matrix Factorization: A Comprehensive Review
IEEE Trans. Knowledge and Data Eng., 2013
"... Nonnegative Matrix Factorization (NMF), a relatively novel paradigm for dimensionality reduction, has been in the ascendant since its inception. It incorporates the nonnegativity constraint and thus obtains the parts-based representation as well as enhancing the interpretability of the issue corres ..."
Abstract
-
Cited by 17 (2 self)
- Add to MetaCart
Abstract: Nonnegative Matrix Factorization (NMF), a relatively novel paradigm for dimensionality reduction, has been in the ascendant since its inception. It incorporates the nonnegativity constraint and thus obtains the parts-based representation as well as enhancing the interpretability of the issue correspondingly. This survey paper mainly focuses on the theoretical research into NMF over the last 5 years, where the principles, basic models, properties, and algorithms of NMF along with its various modifications, extensions, and generalizations are summarized systematically. The existing NMF algorithms are divided into four categories: Basic NMF (BNMF), …
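For reference, the "Basic NMF" model the survey starts from is commonly solved with the classic multiplicative updates for min ||V - W H||_F^2. The sketch below shows that well-known baseline, not code from the survey itself.

```python
# Classic multiplicative-update NMF for min ||V - W H||_F^2: the
# elementwise updates preserve nonnegativity automatically. A minimal
# sketch of the well-known baseline, not code from the survey.
import numpy as np

def nmf_mu(V, k, n_iter=200, eps=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # elementwise; eps avoids 0/0
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```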
An exploration of improving collaborative recommender systems via user-item subgroups
In Proc. of WWW, 2012
"... Collaborative filtering (CF) is one of the most successful recommendation approaches. It typically associates a user with a group of like-minded users based on their preferences over all the items, and recommends to the user those items enjoyed by others in the group. However we find that two users ..."
Abstract
-
Cited by 14 (1 self)
- Add to MetaCart
(Show Context)
Abstract: Collaborative filtering (CF) is one of the most successful recommendation approaches. It typically associates a user with a group of like-minded users based on their preferences over all the items, and recommends to the user those items enjoyed by others in the group. However, we find that two users with similar tastes on one item subset may have totally different tastes on another set. In other words, there exist many user-item subgroups, each consisting of a subset of items and a group of like-minded users on these items. It is more natural to make preference predictions for a user via the correlated subgroups than the entire user-item matrix. In this paper, to find meaningful subgroups, we formulate the Multiclass Co-Clustering (MCoC) problem and propose an effective solution to it. Then we propose a unified framework to extend the traditional CF algorithms by utilizing the subgroup information for improving their top-N recommendation performance. Our approach can be seen as an extension of traditional clustering CF models. Systematic experiments on three real-world data sets have demonstrated the effectiveness of our proposed approach.
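A toy computation makes the premise concrete: the same pair of users can be perfectly correlated on one item subset and anti-correlated on another, so a neighborhood formed over the full matrix blurs both signals. This illustrates only the motivation, not the paper's MCoC algorithm.

```python
# Toy check of the user-item subgroup premise (not the MCoC algorithm):
# one user pair agrees on items 0-2 and disagrees on items 3-5.
import numpy as np

def pearson(u, v, eps=1e-12):
    uc, vc = u - u.mean(), v - v.mean()
    return float(uc @ vc) / (np.linalg.norm(uc) * np.linalg.norm(vc) + eps)

R = np.array([[5, 4, 3, 1, 2, 1],    # rows = users, columns = items
              [5, 4, 3, 5, 4, 5]], dtype=float)
items_a, items_b = [0, 1, 2], [3, 4, 5]
print(pearson(R[0, items_a], R[1, items_a]))  # ~1.0: like-minded on subset A
print(pearson(R[0, items_b], R[1, items_b]))  # ~-1.0: opposite tastes on B
```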
Clustering by Nonnegative Matrix Factorization Using Graph Random Walk
"... Nonnegative Matrix Factorization (NMF) is a promising relaxation technique for clustering analysis. However, conventional NMF methods that directly approximate the pairwise similarities using the least square error often yield mediocre performance for data in curved manifolds because they can captur ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Nonnegative Matrix Factorization (NMF) is a promising relaxation technique for clustering analysis. However, conventional NMF methods that directly approximate the pairwise similarities using the least square error often yield mediocre performance for data in curved manifolds because they can capture only the immediate similarities between data samples. Here we propose a new NMF clustering method which replaces the approximated matrix with its smoothed version using random walk. Our method can thus accommodate farther relationships between data samples. Furthermore, we introduce a novel regularization in the proposed objective function in order to improve over spectral clustering. The new learning objective is optimized by a multiplicative Majorization-Minimization algorithm with a scalable implementation for learning the factorizing matrix. Extensive experimental results on real-world datasets show that our method has strong performance in terms of cluster purity.
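The smoothing step can be sketched as follows; the geometric-series random-walk smoother here ((1 - a) * sum_t a^t P^t, summed in closed form) is a standard choice assumed for illustration, not necessarily the paper's exact construction or its majorization-minimization solver.

```python
# One standard random-walk smoother, assumed here for illustration; the
# paper's exact construction and its MM optimizer differ.
import numpy as np

def smooth_similarity(A, alpha=0.8):
    """Smooth a nonnegative similarity matrix A (no isolated nodes) by random walk."""
    P = A / A.sum(axis=1, keepdims=True)               # row-stochastic transitions
    S = (1 - alpha) * np.linalg.inv(np.eye(len(A)) - alpha * P)
    return (S + S.T) / 2                               # symmetrize for factorization

# Multi-step paths accumulate, so points joined through a curved manifold
# become similar even without a direct edge; the result can then be fed
# to a symmetric NMF-style factorization.
```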
Transfer Sparse Coding for Robust Image Representation
2013
"... Sparse coding learns a set of basis functions such that each input signal can be well approximated by a linear combination of just a few of the bases. It has attracted in-creasing interest due to its state-of-the-art performance in BoW based image representation. However, when labeled and unlabeled ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Abstract: Sparse coding learns a set of basis functions such that each input signal can be well approximated by a linear combination of just a few of the bases. It has attracted increasing interest due to its state-of-the-art performance in BoW-based image representation. However, when labeled and unlabeled images are sampled from different distributions, they may be quantized into different visual words of the codebook and encoded with different representations, which may severely degrade classification performance. In this paper, we propose a Transfer Sparse Coding (TSC) approach to construct robust sparse representations for classifying cross-distribution images accurately. Specifically, we aim to minimize the distribution divergence between the labeled and unlabeled images, and incorporate this criterion into the objective function of sparse coding to make the new representations robust to the distribution difference. Experiments show that TSC can significantly outperform state-of-the-art methods on three types of computer vision datasets.
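The divergence criterion folded into the objective can be illustrated with an empirical Maximum Mean Discrepancy between source and target coding vectors; the linear-kernel version below is a simplified stand-in assumed for illustration, not the full TSC objective.

```python
# A simplified stand-in for the divergence criterion (assumption: an
# empirical linear-kernel MMD between source and target codes), not TSC.
import numpy as np

def linear_mmd2(Zs, Zt):
    """Squared MMD between source codes Zs (ns x k) and target codes Zt (nt x k)."""
    delta = Zs.mean(axis=0) - Zt.mean(axis=0)  # difference of empirical mean codes
    return float(delta @ delta)

# Weighting this term into the sparse-coding objective pulls the mean
# source and target codes together, making the representation transferable.
```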
The why and how of nonnegative matrix factorization
In Regularization, Optimization, Kernels, and Support Vector Machines, Chapman & Hall/CRC, 2014
"... ..."
(Show Context)
Tensor Factorization Using Auxiliary Information
"... Abstract. Most of the existing analysis methods for tensors (or multi-way arrays) only assume that tensors to be completed are of low rank. However, for example, when they are applied to tensor completion prob-lems, their prediction accuracy tends to be significantly worse when only limited entries ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Most of the existing analysis methods for tensors (or multi-way arrays) only assume that tensors to be completed are of low rank. However, for example, when they are applied to tensor completion problems, their prediction accuracy tends to be significantly worse when only limited entries are observed. In this paper, we propose to use relationships among data as auxiliary information in addition to the low-rank assumption to improve the quality of tensor decomposition. We introduce two regularization approaches using graph Laplacians induced from the relationships, and design iterative algorithms for approximate solutions. Numerical experiments on tensor completion using synthetic and benchmark datasets show that the use of auxiliary information improves completion accuracy over the existing methods based only on the low-rank assumption, especially when observations are sparse.
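The Laplacian regularizer induced from the relationships can be sketched for a single factor matrix U: tr(U^T L U) = 0.5 * sum_ij A_ij * ||U_i - U_j||^2 penalizes related objects that receive dissimilar factors. The matrix-level illustration below is an assumption made for brevity; the paper applies the regularizer inside tensor decomposition.

```python
# Matrix-level sketch of the graph-Laplacian regularizer (an assumption
# for brevity; the paper uses it inside tensor decomposition).
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A of a symmetric adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # chain 0-1-2
U = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])            # rows = embeddings
penalty = np.trace(U.T @ laplacian(A) @ U)  # large: linked rows 1 and 2 differ
```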