Results 1–10 of 106
Semi-Supervised Clustering via Matrix Factorization
, 2008
Cited by 30 (4 self)
Recent years have witnessed a surge of interest in semi-supervised clustering methods, which aim to cluster a data set under the guidance of supervisory information. Usually this supervisory information takes the form of pairwise constraints that indicate the similarity or dissimilarity between two points. In this paper, we propose a novel matrix-factorization-based approach for semi-supervised clustering. In addition, we extend our algorithm to co-cluster data sets of different types with constraints. Finally, experiments on UCI data sets and real-world Bulletin Board Systems (BBS) data sets show the superiority of our proposed method.
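The pairwise-constraint idea can be illustrated with a small sketch. This is not the paper's formulation; it is a generic penalized NMF in which must-link columns are pulled toward similar encodings and cannot-link columns are penalized for being similar, solved by projected gradient descent. The function name, step size, and penalty form are illustrative assumptions.

```python
import numpy as np

def constrained_nmf(X, k, must_link, cannot_link, lam=1.0,
                    step=1e-3, iters=500, seed=0):
    """Sketch: NMF with pairwise-constraint penalties via projected gradient.
    X (m x n) >= 0, columns are samples; must_link / cannot_link hold
    (i, j) column-index pairs."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        R = W @ H - X
        gW = 2 * R @ H.T
        gH = 2 * W.T @ R
        for i, j in must_link:            # pull constrained columns together
            d = H[:, i] - H[:, j]
            gH[:, i] += 2 * lam * d
            gH[:, j] -= 2 * lam * d
        for i, j in cannot_link:          # penalize similarity of the pair
            gH[:, i] += lam * H[:, j]
            gH[:, j] += lam * H[:, i]
        W = np.maximum(W - step * gW, 0.0)  # project back onto W >= 0
        H = np.maximum(H - step * gH, 0.0)
    return W, H
```

Cluster labels for the columns can then be read off as `H.argmax(axis=0)`.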
Weighted consensus clustering
, 2008
Cited by 29 (10 self)
Consensus clustering has emerged as an important extension of the classical clustering problem. We propose weighted consensus clustering, where each input clustering is weighted and the weights are determined so that the final consensus clustering provides a better-quality solution, in which clusters are better separated compared to standard consensus clustering. Theoretically, we show that a reformulation of the well-known L1-regularized LASSO problem is equivalent to the weight optimization of our weighted consensus clustering; our approach therefore yields sparse solutions, which may resolve the difficult situation in which the input clusterings diverge significantly. We also show that weighted consensus clustering resolves the redundancy problem that arises when many input clusterings are highly correlated. Detailed algorithms are given, and experiments demonstrate the effectiveness of weighted consensus clustering.
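As a minimal sketch of the consensus machinery (with the weights taken as given, whereas the paper optimizes them via the LASSO connection): each input clustering becomes a connectivity matrix, and the consensus is their weighted average.

```python
import numpy as np

def connectivity(labels):
    """Co-association matrix: entry (i, j) is 1 iff i and j share a cluster."""
    lab = np.asarray(labels)
    return (lab[:, None] == lab[None, :]).astype(float)

def weighted_consensus(partitions, weights):
    """Sketch: weighted average of the input clusterings' connectivity
    matrices. A final partition is then extracted from this matrix, e.g.
    by clustering it or thresholding it."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize weights to sum to 1
    return sum(wi * connectivity(p) for wi, p in zip(w, partitions))
```

Entries near 1 mean the inputs agree the pair belongs together; sparse weights let redundant or divergent inputs drop out entirely.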
Blind reflectometry
 In ECCV
, 2010
Cited by 28 (2 self)
Different materials reflect light in different ways, so reflectance is a useful surface descriptor. Existing systems for measuring reflectance are cumbersome, however, and although the process can be streamlined using cameras, projectors, and clever catadioptrics, it generally requires complex infrastructure. In this paper we propose a simpler method for inferring reflectance from images, one that eliminates the need for active lighting and exploits natural illumination instead. The method’s distinguishing property is its ability to handle a broad class of isotropic reflectance functions, including those that are neither radially symmetric nor well represented by low-parameter reflectance models. The key to the approach is a bivariate representation of isotropic reflectance that enables a tractable inference algorithm while maintaining generality. The resulting method requires only a camera, a light probe, and as little as one HDR image of a known, curved, homogeneous surface.
Convex sparse matrix factorizations
Cited by 25 (13 self)
We present a convex formulation of dictionary learning for sparse signal decomposition. Convexity is obtained by replacing the usual explicit upper bound on the dictionary size with a convex rank-reducing term similar to the trace norm. In particular, our formulation introduces an explicit trade-off between the size and the sparsity of the decomposition of rectangular matrices. Using a large set of synthetic examples, we compare the estimation abilities of the convex and non-convex approaches, showing that while the convex formulation has a single local minimum, it may in some cases perform worse than the local minima of the non-convex formulation.
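A hedged illustration of the rank-reducing idea: the proximal operator of the trace (nuclear) norm is singular-value soft-thresholding, shown below in isolation. The paper's full formulation couples such a term with sparsity of the decomposition; this snippet only demonstrates how the convex penalty shrinks rank.

```python
import numpy as np

def svt(X, lam):
    """Singular-value thresholding: prox of lam * (trace norm).
    Soft-thresholds the singular values, driving small ones to zero
    and hence reducing the rank of the result."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - lam, 0.0)          # shrink singular values
    return (U * s) @ Vt
```

Larger `lam` kills more singular values, trading reconstruction accuracy for lower rank, which is exactly the role the explicit dictionary-size bound played in the non-convex setting.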
Unsupervised sentiment analysis with emotional signals
 In Proceedings of the 22nd International Conference on World Wide Web (WWW ’13). ACM
Cited by 25 (4 self)
The explosion of social media services presents a great opportunity to understand public sentiment by analyzing their large-scale and opinion-rich data. In social media it is easy to amass vast quantities of unlabeled data but very costly to obtain sentiment labels, which makes unsupervised sentiment analysis essential for various applications. Traditional lexicon-based unsupervised methods struggle here because expressions in social media are unstructured, informal, and fast-evolving. Emoticons and product ratings are examples of emotional signals that are associated with the sentiments expressed in posts or words. Inspired by the wide availability of emotional signals in social media, we study the problem of unsupervised sentiment analysis with emotional signals. In particular, we investigate whether such signals can help sentiment analysis by providing a unified way to model the two main categories of emotional signals, i.e., emotion indication and emotion correlation, and we incorporate the signals into an unsupervised learning framework for sentiment analysis. In experiments, we compare the proposed framework with state-of-the-art methods on two Twitter datasets and empirically evaluate it to gain a deeper understanding of the effects of emotional signals.
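A toy illustration of the emotion-indication idea, far simpler than the paper's learning framework: emoticons act as noisy labels for the posts they appear in, word scores accumulate from those labels, and unlabeled text is scored by its words. The tokenization, signal sets, and scoring rule are all assumptions for the sketch.

```python
from collections import Counter

def emoticon_sentiment(posts, pos_signals={":)"}, neg_signals={":("}):
    """Sketch: build a word-sentiment lexicon from emoticon co-occurrence,
    then return a scorer for unlabeled text."""
    pos, neg = Counter(), Counter()
    for post in posts:
        toks = post.split()
        has_pos = any(t in pos_signals for t in toks)
        has_neg = any(t in neg_signals for t in toks)
        for t in toks:
            if t in pos_signals or t in neg_signals:
                continue                  # the signal itself is not a feature
            if has_pos:
                pos[t] += 1
            if has_neg:
                neg[t] += 1
    lexicon = {t: pos[t] - neg[t] for t in set(pos) | set(neg)}

    def score(text):
        """Positive score = positive sentiment, negative = negative."""
        return sum(lexicon.get(t, 0) for t in text.split())
    return score
```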
Linear and Nonlinear Projective Nonnegative Matrix Factorization
Cited by 19 (2 self)
A previously proposed variant of nonnegative matrix factorization (NMF), called Projective Nonnegative Matrix Factorization (PNMF), is analyzed here. The method approximately factorizes a projection matrix, minimizing the reconstruction error, into a positive low-rank matrix and its transpose. The dissimilarity between the original data matrix and its approximation can be measured by the Frobenius matrix norm or the modified Kullback-Leibler divergence. Both measures are minimized by multiplicative update rules, whose convergence is proven for the first time. Enforcing orthonormality in the basic objective is shown to lead to an even more efficient update rule, which also extends readily to nonlinear cases. The formulation of the PNMF objective is shown to be connected to a variety of existing nonnegative matrix factorization methods and clustering approaches. In addition, the derivation using Lagrange multipliers reveals the relation between reconstruction and sparseness. For kernel principal component analysis with the binary constraint, useful in graph partitioning problems, the nonlinear kernel PNMF provides a good approximation which outperforms an existing discretization approach. An empirical study on three real-world databases shows that PNMF achieves the best, or close to the best, clustering results. The proposed algorithm runs more efficiently than the compared nonnegative matrix factorization methods, especially for high-dimensional data. Moreover, contrary to basic NMF, the trained projection matrix can be readily applied to newly arriving samples and demonstrates good generalization.
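A sketch of PNMF under the Frobenius measure, X ≈ W Wᵀ X with W ≥ 0, using one published form of the multiplicative update; the per-iteration normalization is a common stabilization, and details vary across papers, so treat this as an assumption-laden reconstruction rather than the authors' exact rule.

```python
import numpy as np

def pnmf(X, k, iters=200, seed=0, eps=1e-9):
    """Sketch of Projective NMF: approximate X by W @ W.T @ X, W >= 0.
    Only A = X X^T is needed inside the loop."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    A = X @ X.T
    for _ in range(iters):
        num = 2.0 * (A @ W)
        den = W @ (W.T @ A @ W) + A @ W @ (W.T @ W) + eps
        W *= num / den                    # multiplicative update keeps W >= 0
        W /= np.linalg.norm(W, 2)         # stabilize the scale of W
    return W
```

Because only W is learned, a new sample x can be handled afterwards simply as W @ (W.T @ x), which is the generalization property the abstract highlights.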
Nonnegative Matrix Factorization: A Comprehensive Review
 IEEE Trans. Knowledge and Data Eng.
, 2013
Cited by 16 (2 self)
Nonnegative Matrix Factorization (NMF), a relatively novel paradigm for dimensionality reduction, has been in the ascendant since its inception. It incorporates the nonnegativity constraint and thus obtains a parts-based representation, correspondingly enhancing interpretability. This survey paper focuses mainly on the theoretical research into NMF over the last 5 years, in which the principles, basic models, properties, and algorithms of NMF, along with its various modifications, extensions, and generalizations, are summarized systematically. The existing NMF algorithms are divided into four categories: Basic NMF (BNMF), …
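The Basic NMF category the survey starts from is the classical two-factor model; a minimal sketch with the standard Lee-Seung multiplicative updates for the Frobenius objective min ‖X − WH‖²_F with W, H ≥ 0:

```python
import numpy as np

def nmf(X, k, iters=300, seed=0, eps=1e-9):
    """Basic NMF via Lee-Seung multiplicative updates.
    Each update is a ratio of nonnegative terms, so W and H stay >= 0
    and the Frobenius objective is nonincreasing."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The columns of W are the "parts" the abstract refers to; each column of X is approximated as a nonnegative combination of them.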
Descent methods for Nonnegative Matrix Factorization
, 2008
Cited by 16 (0 self)
In this paper, we present several descent methods that can be applied to nonnegative matrix factorization, and we analyze a recently developed fast block coordinate method. We also give a comparison of these different methods and show that the new block coordinate method has better properties in terms of approximation error and complexity. By interpreting this method as a rank-one approximation of the residue matrix, we extend it to nonnegative tensor factorization and introduce variants of the method that impose additional controllable constraints such as sparsity, discreteness, and smoothness.
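The rank-one-residue interpretation can be sketched as a HALS-style block coordinate scheme; this is an illustrative reconstruction, not necessarily the paper's exact algorithm. Each rank-one term uᵢvᵢᵀ is cyclically refit against the residue left by all the other terms.

```python
import numpy as np

def rank_one_residue_nmf(X, k, iters=100, seed=0, eps=1e-9):
    """Sketch: block coordinate NMF where each block is one rank-one
    factor, refit against the residue of the remaining factors."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    for _ in range(iters):
        for i in range(k):
            # residue with the i-th rank-one term removed
            R = X - U @ V.T + np.outer(U[:, i], V[:, i])
            # closed-form nonnegative least-squares update per block
            U[:, i] = np.maximum(R @ V[:, i], 0) / (V[:, i] @ V[:, i] + eps)
            V[:, i] = np.maximum(R.T @ U[:, i], 0) / (U[:, i] @ U[:, i] + eps)
    return U, V
```

Each block update has a closed form, which is what makes the block coordinate scheme cheap per iteration; the same residue view carries over to tensors by replacing the outer product with a higher-order one.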
Exemplar-based Visualization of Large Document Corpus
Cited by 15 (2 self)
With the rapid growth of the World Wide Web and electronic information services, text corpora are becoming available online at an incredible rate. By displaying text data in a logical layout (e.g., color graphs), text visualization offers a direct way to observe documents and to understand the relationships between them. In this paper, we propose a novel technique, Exemplar-based Visualization (EV), to visualize an extremely large text corpus. Capitalizing on recent advances in matrix approximation and decomposition, EV presents a probabilistic multidimensional projection model in the low-rank text subspace with a sound objective function. The proportion of each document attributable to the topics is obtained through iterative optimization and embedded in a low-dimensional space using parameter embedding. By selecting representative exemplars, we obtain a compact approximation of the data, which makes the visualization highly efficient and flexible. In addition, the selected exemplars neatly summarize the entire data set and greatly reduce the cognitive overload in the visualization, leading to an easier interpretation of a large text corpus. Empirically, we demonstrate the superior performance of EV through extensive experiments on publicly available text data sets. Index Terms—Exemplar, large-scale document visualization, multidimensional projection.
Learning the shared subspace for multi-task clustering and transductive transfer classification
 In Ninth IEEE International Conference on Data Mining (ICDM ’09)
, 2009
Cited by 13 (2 self)
There are many clustering tasks in the real world that are closely related, e.g., clustering the web pages of different universities. However, existing clustering approaches neglect this underlying relation and treat such tasks either individually or simply together. In this paper, we study a novel clustering paradigm, namely multi-task clustering, which performs multiple related clustering tasks together and exploits the relation between the tasks to enhance clustering performance. We aim to learn a subspace shared by all the tasks, through which knowledge can be transferred between them. The objective of our approach consists of two parts: (1) within-task clustering: clustering the data of each task individually in its input space; and (2) cross-task clustering: simultaneously learning the shared subspace and clustering the data of all the tasks together. We show that the objective can be solved by alternating minimization, whose convergence is theoretically guaranteed. Furthermore, we show that, given the labels of one task, our multi-task clustering method can be extended to transductive transfer classification (a.k.a. cross-domain classification, domain adaptation). Experiments on several cross-domain text data sets demonstrate that the proposed multi-task clustering greatly outperforms traditional single-task clustering methods, and that the transductive transfer classification method is comparable to, or even better than, several existing transductive transfer classification approaches. Keywords—multi-task clustering; transductive transfer classification; multi-task learning; transfer learning; cross-domain classification; domain adaptation.