Results 11–20 of 51
Convex non-negative matrix factorization in the wild
 in Proceedings of the 9th IEEE International Conference on Data Mining (ICDM-09)
Abstract

Cited by 8 (4 self)
Abstract—Nonnegative matrix factorization (NMF) has recently received a lot of attention in data mining, information retrieval, and computer vision. It factorizes a nonnegative input matrix V into two nonnegative matrix factors V = WH such that W describes "clusters" of the dataset. When analyzing genotypes, social networks, or images, it can be beneficial to ensure that W contains meaningful "cluster centroids", i.e., to restrict W to be convex combinations of data points. But how can we run this convex NMF in the wild, i.e., given millions of data points? Triggered by the simple observation that each data point is a convex combination of vertices of the data convex hull, we propose to restrict W further to be vertices of the convex hull. The benefits of this convex-hull NMF approach are twofold. First, the expected size of the convex hull of, for example, n random Gaussian points in the plane is Ω(√log n), i.e., the candidate set typically grows much more slowly than the data set. Second, distance-preserving low-dimensional embeddings allow one to compute candidate vertices efficiently. Our extensive experimental evaluation shows that convex-hull NMF compares favorably to convex NMF for large data sets, both in terms of speed and reconstruction quality. Moreover, we show that our method can easily be applied to large-scale, real-world data sets, in our case consisting of 1.6 million images and 150 million votes on World of Warcraft® guilds.
Keywords—data mining; matrix decomposition; data handling; non-negative matrix factorization; archetypal analysis; social network analysis
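The core observation above can be sketched in a few lines: take the convex-hull vertices as candidate basis rows, then fit nonnegative coefficients for every point. This is a minimal illustration, not the paper's implementation; it uses a transposed convention (points as rows, so V ≈ HW), 2-D uniform data instead of the paper's data sets, and an off-the-shelf NNLS solver.

```python
import numpy as np
from scipy.optimize import nnls
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
V = rng.random((500, 2))          # 500 nonnegative 2-D points, one per row

# Candidate basis: vertices of the data convex hull -- typically far fewer
# than n points, which is what makes the approach scale.
hull = ConvexHull(V)
W = V[hull.vertices]              # (k, 2): the k hull vertices

# Reconstruct every point as a nonnegative combination of hull vertices:
# V ~= H W with H >= 0, solved as one small NNLS problem per data point.
H = np.vstack([nnls(W.T, v)[0] for v in V])

err = np.linalg.norm(V - H @ W) / np.linalg.norm(V)
```

Since every point lies inside the hull, the reconstruction error is essentially zero while the basis stays tiny.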
A novel greedy algorithm for Nyström approximation
Abstract

Cited by 7 (0 self)
The Nyström method is an efficient technique for obtaining a low-rank approximation of a large kernel matrix based on a subset of its columns. The quality of the Nyström approximation depends heavily on the subset of columns used, which is usually selected by random sampling. This paper presents a novel recursive algorithm for calculating the Nyström approximation, and an effective greedy criterion for column selection. Further, a very efficient variant of greedy sampling is proposed, which works on random partitions of the data instances. Experiments on benchmark data sets show that the proposed greedy algorithms achieve significant improvements in approximating kernel matrices, with minimal run-time overhead.
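For reference, here is the plain Nyström approximation with uniform random column sampling, i.e., the baseline that the greedy selection criterion above is designed to improve on. This is a generic sketch (toy RBF kernel, made-up sizes), not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 5))

# Dense RBF kernel matrix: the object that becomes too large to form at scale.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 10.0)

# Nyström with uniform random column sampling.
m = 40
idx = rng.choice(K.shape[0], size=m, replace=False)
C = K[:, idx]                  # n x m block of sampled columns
W = K[np.ix_(idx, idx)]        # m x m intersection block
K_hat = C @ np.linalg.pinv(W) @ C.T

rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

Only the m sampled columns and the m×m block are ever needed; the quality of `K_hat` is exactly what a better choice of `idx` (greedy instead of random) improves.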
Global image denoising
 IEEE Trans. on Image Processing, 2014
Abstract

Cited by 7 (2 self)
Abstract—Most existing state-of-the-art image denoising algorithms are based on exploiting similarity between a relatively modest number of patches. These patch-based methods are strictly dependent on patch matching, and their performance is hamstrung by the ability to reliably find sufficiently similar patches. As the number of patches grows, a point of diminishing returns is reached where the performance improvement due to more patches is offset by the lower likelihood of finding sufficiently close matches. The net effect is that while patch-based methods, such as BM3D, are excellent overall, they are ultimately limited in how well they can do on (larger) images with increasing complexity. In this paper, we address these shortcomings by developing a paradigm for truly global filtering, where each pixel is estimated from all pixels in the image. Our objectives in this paper are twofold. First, we give a statistical analysis of our proposed global filter, based on a spectral decomposition of its corresponding operator, and we study the effect of truncating this spectral decomposition. Second, we derive an approximation to the spectral (principal) components using the Nyström extension. Using these, we demonstrate that this global filter can be implemented efficiently by sampling a fairly small percentage of the pixels in the image. Experiments illustrate that our strategy can effectively globalize any existing denoising filter to estimate each pixel using all pixels in the image, hence improving upon the best patch-based methods.
Index Terms—image denoising, non-local filters, Nyström extension, spatial domain filter, risk estimator.
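The "global filter + spectral truncation" idea can be illustrated on a 1-D toy signal. This is a schematic stand-in, not the paper's filter: a plain Gaussian affinity on position replaces the photometric/spatial affinities of a real image filter, and an exact eigendecomposition replaces the Nyström extension that the paper uses to make this tractable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
t = np.linspace(0.0, 1.0, n)
clean = np.sin(2 * np.pi * t)
noisy = clean + 0.3 * rng.standard_normal(n)

# "Global" filter: every sample is a weighted average of ALL samples.
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.02 ** 2))

# Symmetrized, normalized filter so a stable eigendecomposition applies.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
S = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# Spectral truncation: keep only the k leading eigenvectors of the filter.
evals, evecs = np.linalg.eigh(S)
k = 10
U, lam = evecs[:, -k:], evals[-k:]
denoised = U @ (lam * (U.T @ noisy))

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
```

The truncated filter keeps only the smooth leading modes, which is why sampling a small fraction of pixels (to approximate those modes) suffices in the full method.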
Laplacian Embedded Regression for Scalable Manifold Regularization
Abstract

Cited by 5 (0 self)
Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian SVM (LapSVM) and Laplacian Regularized Least Squares (LapRLS). However, most of these algorithms are limited to small-scale problems due to the high computational cost of the matrix inversion involved in the optimization problem. In this paper, we propose a novel framework called Laplacian Embedded Regression by introducing an intermediate decision variable into the manifold regularization framework. By using the ϵ-insensitive loss, we obtain the Laplacian Embedded SVR (LapESVR) algorithm, which inherits the sparse solution of SVR. Also, we derive Laplacian Embedded RLS (LapERLS), corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the sum of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are twofold: 1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately; 2) if the graph Laplacian matrix is sparse, we only need to invert a sparse matrix, which is much more efficient than inverting a dense one. Inspired by kernel PCA, we further propose to project the introduced decision variable onto a subspace spanned by a few eigenvectors of the graph Laplacian matrix, in order to better reflect the data manifold as well as accelerate the calculation of the graph kernel, allowing our methods to cope efficiently and effectively with large-scale semi-supervised learning problems. Extensive experiments on both toy and real-world data sets show the effectiveness and scalability of the proposed framework.
From technological networks to social networks
 IEEE J. Sel. Areas Commun., 2013
Abstract

Cited by 5 (0 self)
Social networks overlaid on technological networks account for a significant fraction of Internet use. Through graph-theoretic and functionality models, this paper examines social network analysis and potential implications for the design of technological networks, and vice versa. Such interplay between social networks and technological networks suggests new directions for future research in networking.
Maximum Variance Correction with Application to A∗ Search
Abstract

Cited by 5 (0 self)
In this paper we introduce Maximum Variance Correction (MVC), which finds large-scale feasible solutions to Maximum Variance Unfolding (MVU) by post-processing embeddings from any manifold learning algorithm. It increases the scale of MVU embeddings by several orders of magnitude and is naturally parallel. This unprecedented scalability opens up new avenues of application for manifold learning, in particular the use of MVU embeddings as effective heuristics to speed up A∗ search. We demonstrate unmatched reductions in search time across several non-trivial A∗ benchmark search problems and bridge the gap between the manifold learning literature and one of its most promising high-impact applications.
Joint Link Prediction and Attribute Inference using a Social-Attribute Network
Abstract

Cited by 3 (1 self)
The effects of social influence and homophily suggest that both network structure and node attribute information should inform the tasks of link prediction and node attribute inference. Recently, Yin et al. [Yin et al. 2010a; 2010b] proposed an attribute-augmented social network model, which we call the Social-Attribute Network (SAN), to integrate network structure and node attributes to perform both link prediction and attribute inference. They focused on generalizing the random walk with restart algorithm to the SAN framework and showed improved performance. In this paper, we extend the SAN framework with several leading supervised and unsupervised link prediction algorithms and demonstrate performance improvement for each algorithm on both link prediction and attribute inference. Moreover, we make the novel observation that attribute inference can help inform link prediction, i.e., link prediction accuracy is further improved by first inferring missing attributes. We comprehensively evaluate these algorithms and compare them with ...
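The augmentation idea is simple to demonstrate on a toy graph: attribute nodes are added alongside social nodes, so a standard unsupervised link-prediction score such as common neighbours automatically picks up shared attributes. All names below are illustrative, not from the paper's data.

```python
# Toy Social-Attribute Network (SAN): social nodes plus attribute nodes,
# where an edge user--attribute means the user has that attribute.
edges = [
    ("alice", "bob"), ("bob", "carol"), ("dave", "bob"),
    ("alice", "attr:ml"), ("carol", "attr:ml"), ("dave", "attr:ml"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def common_neighbors(u, v):
    """Unsupervised link-prediction score on the augmented graph; shared
    neighbours now include shared attribute nodes."""
    return len(adj[u] & adj[v])

# On the social graph alone, alice and carol share only bob; on the SAN
# they additionally share the ML attribute node, raising the score.
social_only = len({w for w in adj["alice"] & adj["carol"]
                   if not w.startswith("attr:")})
san_score = common_neighbors("alice", "carol")
```

The same augmentation works for any neighbourhood-based score, which is what lets the framework host several different link-prediction algorithms.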
Large-Scale SVD and Manifold Learning
Abstract

Cited by 3 (0 self)
This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and column-sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks, and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure from millions of high-dimensional face images. We address the computational challenges of nonlinear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the column-sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set.
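The two methods being compared can be sketched side by side on a toy kernel: Nyström builds its rank-k approximation from the small intersection block, while column sampling builds it from the SVD of the sampled columns themselves. This is a generic sketch with made-up sizes and a common sqrt(n/m) rescaling convention, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((400, 5))
K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / 10.0)
n = K.shape[0]

m, k = 60, 10
idx = rng.choice(n, size=m, replace=False)
C = K[:, idx]                 # n x m sampled columns
W = K[np.ix_(idx, idx)]       # m x m intersection block

# Nyström: rank-k reconstruction K ~= C pinv(W_k) C^T from the small block.
ew, Uw = np.linalg.eigh(W)
ew, Uw = ew[-k:], Uw[:, -k:]
K_nys = C @ (Uw / ew) @ Uw.T @ C.T

# Column sampling: rank-k reconstruction from the SVD of C itself, with a
# sqrt(n/m) rescaling of the retained singular values.
Uc, sc, _ = np.linalg.svd(C, full_matrices=False)
K_col = Uc[:, :k] @ np.diag(np.sqrt(n / m) * sc[:k]) @ Uc[:, :k].T

err_nys = np.linalg.norm(K - K_nys) / np.linalg.norm(K)
err_col = np.linalg.norm(K - K_col) / np.linalg.norm(K)
```

Which error is smaller depends on the spectrum of K and on the task, which is exactly the trade-off the paper analyzes.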
Locally Linear Landmarks for LargeScale Manifold Learning
Abstract

Cited by 2 (1 self)
Abstract. Spectral methods for manifold learning and clustering typically construct a graph weighted with affinities from a dataset and compute eigenvectors of a graph Laplacian. With large datasets, the eigendecomposition is too expensive, and it is usually approximated by solving for a smaller graph defined on a subset of the points (landmarks) and then applying the Nyström formula to estimate the eigenvectors over all points. This has the problem that the affinities between landmarks do not benefit from the remaining points and may represent the data poorly if few landmarks are used. We introduce a modified spectral problem that uses all data points by constraining the latent projection of each point to be a local linear function of the landmarks' latent projections. This constructs a new affinity matrix between landmarks that preserves manifold structure even with few landmarks, allows one to reduce the eigenproblem size, and defines a fast, nonlinear out-of-sample mapping.
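The constraint at the heart of this construction — each point's projection is a local linear function of landmark projections — amounts to computing local reconstruction weights Z with X ≈ Z·X_landmarks, in the spirit of LLE weights. The sketch below is illustrative only (random data, a simple affine solve, made-up sizes); the paper derives the weights and the resulting reduced eigenproblem more carefully.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, m = 300, 3, 20
X = rng.standard_normal((n, d))
landmarks = X[rng.choice(n, size=m, replace=False)]

# Local linear weights: write each point as an affine combination of its
# d+1 nearest landmarks (weights sum to 1).
q = d + 1
Z = np.zeros((n, m))
for i, x in enumerate(X):
    nbr = np.argsort(((landmarks - x) ** 2).sum(axis=1))[:q]
    A = np.vstack([landmarks[nbr].T, np.ones(q)])   # affine system
    b = np.append(x, 1.0)                           # match point, sum to 1
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    Z[i, nbr] = w

# The spectral problem is then solved on the m landmarks only; any landmark
# embedding Y_l extends to all n points as Y = Z @ Y_l. Sanity check: the
# weights reconstruct the points themselves from the landmarks.
recon_err = np.linalg.norm(Z @ landmarks - X) / np.linalg.norm(X)
```

Because Z is fixed before the eigenproblem is solved, the same m×n weight matrix also serves as the out-of-sample mapping mentioned in the abstract.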