Results 1–10 of 25
Revisiting the Nyström method for improved large-scale machine learning
"... We reconsider randomized algorithms for the lowrank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and pro ..."
Abstract

Cited by 34 (5 self)
 Add to MetaCart
(Show Context)
We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods, and they point to differences between uniform and nonuniform sampling methods based on leverage scores. We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds, e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error.
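The leverage-score sampling that this abstract contrasts with uniform sampling can be sketched in a few lines. This is an illustrative fragment, not the paper's implementation: the function names, the exact-SVD route to the scores, and the sampling interface are all assumptions.

```python
import numpy as np

def leverage_score_probs(K, k):
    """Rank-k leverage-score sampling probabilities for the columns of SPSD K."""
    U, _, _ = np.linalg.svd(K, hermitian=True)  # K is symmetric PSD
    lev = (U[:, :k] ** 2).sum(axis=1)           # leverage scores; they sum to k
    return lev / lev.sum()                      # normalize to probabilities

def sample_columns(K, k, m, seed=0):
    """Draw m column indices with probability proportional to leverage."""
    p = leverage_score_probs(K, k)
    rng = np.random.default_rng(seed)
    return rng.choice(K.shape[0], size=m, replace=False, p=p)
```

Columns with high leverage are "influential" for the top-k subspace, which is why nonuniform sampling by these scores can beat uniform sampling on matrices with heterogeneous structure.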
Clustered Nyström method for large scale manifold learning and dimension reduction
IEEE Transactions on Neural Networks, 2010
"... Abstract — Kernel (or similarity) matrix plays a key role in many machine learning algorithms such as kernel methods, manifold learning, and dimension reduction. However, the cost of storing and manipulating the complete kernel matrix makes it infeasible for large problems. The Nyström method is a p ..."
Abstract

Cited by 24 (6 self)
 Add to MetaCart
(Show Context)
Abstract — The kernel (or similarity) matrix plays a key role in many machine learning algorithms such as kernel methods, manifold learning, and dimension reduction. However, the cost of storing and manipulating the complete kernel matrix makes it infeasible for large problems. The Nyström method is a popular sampling-based low-rank approximation scheme for reducing the computational burden of handling large kernel matrices. In this paper, we analyze how the approximation quality of the Nyström method depends on the choice of landmark points, and in particular on the encoding power of the landmark points in summarizing the data. Our (non-probabilistic) error analysis justifies a “clustered Nyström method” that uses the k-means cluster centers as landmark points. Our algorithm can be applied to scale up a wide variety of algorithms that depend on the eigenvalue decomposition of the kernel matrix (or its variants), such as kernel principal component analysis, Laplacian eigenmaps, and spectral clustering, as well as those involving the kernel matrix inverse, such as least-squares support vector machines and Gaussian process regression. Extensive experiments demonstrate the competitive performance of our algorithm in both accuracy and efficiency. Index Terms — Dimension reduction, eigenvalue decomposition, kernel matrix, low-rank approximation, manifold learning,
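A minimal sketch of the clustered Nyström idea: run k-means, use the centers as landmarks, and build the Nyström approximation from the cross-kernel between data and landmarks. The RBF kernel choice, the plain Lloyd iterations, and all names here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def kmeans_landmarks(X, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns k cluster centers to use as landmarks."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def clustered_nystrom(X, k, gamma=1.0):
    """Nystrom approximation of the RBF kernel, landmarks = k-means centers."""
    Z = kmeans_landmarks(X, k)
    def rbf(A, B):
        d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    C = rbf(X, Z)                         # n x k cross-kernel
    W = rbf(Z, Z)                         # k x k landmark kernel
    return C @ np.linalg.pinv(W) @ C.T    # rank-<=k approximation of K
```

The paper's point is that landmarks summarizing the data well (cluster centers) yield a smaller approximation error than uniformly sampled landmarks.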
Large Scale Spectral Clustering with Landmark-Based Representation
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011
"... Spectral clustering is one of the most popular clustering approaches. Despite its good performance, it is limited in its applicability to largescale problems due to its high computational complexity. Recently, many approaches have been proposed to accelerate the spectral clustering. Unfortunately, ..."
Abstract

Cited by 23 (1 self)
 Add to MetaCart
Spectral clustering is one of the most popular clustering approaches. Despite its good performance, its applicability to large-scale problems is limited by its high computational complexity. Recently, many approaches have been proposed to accelerate spectral clustering. Unfortunately, these methods usually sacrifice a great deal of information from the original data, resulting in degraded performance. In this paper, we propose a novel approach, called Landmark-based Spectral Clustering (LSC), for large-scale clustering problems. Specifically, we select p (≪ n) representative data points as the landmarks and represent the original data points as linear combinations of these landmarks. The spectral embedding of the data can then be efficiently computed with the landmark-based representation. The proposed algorithm scales linearly with the problem size. Extensive experiments show the effectiveness and efficiency of our approach compared with the state-of-the-art methods.
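The landmark-based representation can be sketched as follows: each point is expressed through its few nearest landmarks, and the embedding comes from an SVD of that sparse representation. This is a hedged sketch under simplifying assumptions (random landmarks instead of the paper's choices, Gaussian affinities, illustrative names).

```python
import numpy as np

def lsc_embedding(X, p, r=3, dim=2, seed=0):
    """Sketch of a landmark-based spectral embedding: represent each point by
    its r nearest landmarks, then embed via SVD of the representation matrix."""
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), p, replace=False)]   # p landmarks (random here)
    d2 = ((X[:, None] - Z[None]) ** 2).sum(-1)    # n x p squared distances
    W = np.exp(-d2)                               # Gaussian affinities
    keep = np.argsort(d2, axis=1)[:, :r]          # r nearest landmarks per point
    M = np.zeros_like(W)
    rows = np.arange(len(X))[:, None]
    M[rows, keep] = W[rows, keep]                 # sparse representation
    M /= M.sum(axis=1, keepdims=True)             # rows sum to one
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :dim]                             # spectral embedding
```

Because M is n × p with p ≪ n, the SVD costs O(np²) rather than the O(n³) of eigendecomposing a full n × n affinity matrix, which is the source of the linear scaling claimed above.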
The spectral norm error of the naive Nyström extension, 2011
"... Abstract. The näıve Nyström extension forms a lowrank approximation to a positivesemidefinite matrix by uniformly randomly sampling from its columns. This paper provides the first relativeerror bound on the spectral norm error incurred in this process. This bound follows from a natural connecti ..."
Abstract

Cited by 22 (2 self)
 Add to MetaCart
(Show Context)
Abstract. The naïve Nyström extension forms a low-rank approximation to a positive-semidefinite matrix by uniformly randomly sampling from its columns. This paper provides the first relative-error bound on the spectral norm error incurred in this process. This bound follows from a natural connection between the Nyström extension and the column subset selection problem. The main tool is a matrix Chernoff bound for sampling without replacement.
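The naïve Nyström extension described above fits in a few lines: sample column indices uniformly, take the sampled columns C and the intersection block W, and form C W⁺ Cᵀ. A minimal sketch; the function name and interface are illustrative.

```python
import numpy as np

def nystrom(K, m, seed=0):
    """Naive Nystrom extension: approximate PSD K from m uniformly
    sampled columns as C @ pinv(W) @ C.T, a rank-<=m approximation."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(K.shape[0], size=m, replace=False)
    C = K[:, idx]                       # sampled columns, n x m
    W = K[np.ix_(idx, idx)]             # intersection block, m x m
    return C @ np.linalg.pinv(W) @ C.T
```

For a rank-one PSD matrix with no zero entries, a single sampled column already reconstructs K exactly, which illustrates why the error bounds are stated relative to the best low-rank approximation.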
Ensemble Nyström Method
"... A crucial technique for scaling kernel methods to very large data sets reaching or exceeding millions of instances is based on lowrank approximation of kernel matrices. We introduce a new family of algorithms based on mixtures of Nyström approximations, ensemble Nyström algorithms, that yield more ..."
Abstract

Cited by 22 (2 self)
 Add to MetaCart
(Show Context)
A crucial technique for scaling kernel methods to very large data sets, reaching or exceeding millions of instances, is based on low-rank approximation of kernel matrices. We introduce a new family of algorithms based on mixtures of Nyström approximations, ensemble Nyström algorithms, that yield more accurate low-rank approximations than the standard Nyström method. We give a detailed study of variants of these algorithms based on simple averaging, an exponential weight method, or regression-based methods. We also present a theoretical analysis of these algorithms, including novel error bounds guaranteeing a better convergence rate than the standard Nyström method. Finally, we report results of extensive experiments with several data sets containing up to 1M points, demonstrating the significant improvement over the standard Nyström approximation.
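The simplest of the mixtures described above, uniform averaging, can be sketched as follows: run the standard Nyström approximation p times on independent column samples and average the results. An illustrative sketch with assumed names, not the authors' code.

```python
import numpy as np

def nystrom_once(K, m, rng):
    """One standard Nystrom approximation from m uniform columns."""
    idx = rng.choice(K.shape[0], size=m, replace=False)
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.pinv(W) @ C.T

def ensemble_nystrom(K, m, p, seed=0):
    """Uniform mixture of p independent Nystrom approximations."""
    rng = np.random.default_rng(seed)
    return sum(nystrom_once(K, m, rng) for _ in range(p)) / p
```

The exponential-weight and regression-based variants mentioned in the abstract replace the uniform weights 1/p with learned mixture weights.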
A novel greedy algorithm for Nyström approximation
"... The Nyström method is an efficient technique for obtaining a lowrank approximation of a large kernel matrix based on a subset of its columns. The quality of the Nyström approximation highly depends on the subset of columns used, which are usually selected using random sampling. This paper presents ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
The Nyström method is an efficient technique for obtaining a low-rank approximation of a large kernel matrix based on a subset of its columns. The quality of the Nyström approximation depends heavily on the subset of columns used, which is usually selected by random sampling. This paper presents a novel recursive algorithm for calculating the Nyström approximation and an effective greedy criterion for column selection. Further, a very efficient variant of greedy sampling is proposed, which works on random partitions of the data instances. Experiments on benchmark data sets show that the proposed greedy algorithms achieve significant improvements in approximating kernel matrices, with minimal run-time overhead.
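One simple greedy criterion for column selection, close in spirit to what the abstract describes but not the paper's exact algorithm, is diagonal pivoting as in pivoted Cholesky: repeatedly pick the column whose residual diagonal entry is largest. A hedged sketch with illustrative names:

```python
import numpy as np

def greedy_nystrom_columns(K, m):
    """Greedily pick m columns of PSD K by largest residual diagonal
    (pivoted-Cholesky style). Returns indices and a factor L with
    K ~ L @ L.T; exact after m steps when m >= rank(K)."""
    n = K.shape[0]
    d = np.diag(K).astype(float)        # residual diagonal
    L = np.zeros((n, m))
    idx = []
    for t in range(m):
        i = int(np.argmax(d))           # greedy pivot choice
        idx.append(i)
        col = (K[:, i] - L[:, :t] @ L[i, :t]) / np.sqrt(d[i])
        L[:, t] = col
        d = d - col ** 2                # update residual diagonal
    return idx, L
```

Each step costs O(nm), so selecting m columns is O(nm²) and never forms the full residual matrix, which is what makes greedy selection competitive with random sampling in runtime.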
Efficient Algorithms and Error Analysis for the Modified Nyström Method
"... Many kernel methods suffer from high time and space complexities and are thus prohibitive in bigdata applications. To tackle the computational challenge, the Nyström method has been extensively used to reduce time and space complexities by sacrificing some accuracy. The Nyström method speedups ..."
Abstract

Cited by 5 (2 self)
 Add to MetaCart
Many kernel methods suffer from high time and space complexities and are thus prohibitive in big-data applications. To tackle the computational challenge, the Nyström method has been extensively used to reduce time and space complexities by sacrificing some accuracy. The Nyström method speeds up computation by constructing an approximation of the kernel matrix using only a few of its columns. Recently, a variant of the Nyström method called the modified Nyström method has demonstrated significant improvement over the standard Nyström method in approximation accuracy, both theoretically and empirically. In this paper, we propose two algorithms that make the modified Nyström method practical. First, we devise a simple column selection algorithm with a provable error bound; it is more efficient and easier to implement than the state-of-the-art algorithm, and nearly as accurate. Second, with the selected columns at hand, we propose an algorithm that computes the approximation with lower time complexity than the approach in previous work. Furthermore, we prove that the modified Nyström method is exact under certain conditions, and we establish a lower error bound for the modified Nyström method.
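The standard and modified Nyström methods differ only in the middle matrix of the C U Cᵀ form: given sampled columns C, the standard method uses U = W⁺ (the intersection block's pseudoinverse), while the modified method uses the Frobenius-optimal U = C⁺K(C⁺)ᵀ. A minimal sketch with assumed names:

```python
import numpy as np

def standard_nystrom(K, idx):
    """Standard Nystrom: C @ pinv(W) @ C.T with W the intersection block."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.pinv(W) @ C.T

def modified_nystrom(K, idx):
    """Modified Nystrom: U = pinv(C) @ K @ pinv(C).T minimizes
    ||K - C @ U @ C.T||_F over U for the given columns C."""
    C = K[:, idx]
    Cp = np.linalg.pinv(C)
    return C @ (Cp @ K @ Cp.T) @ C.T
```

By the optimality of U for fixed C, the modified approximation's Frobenius error can never exceed the standard one's for the same columns; the price is the extra multiplication against the full K, which is what the paper's second algorithm targets.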
Finding Planted Partitions in Nearly Linear Time using Arrested Spectral Clustering
"... We describe an algorithm for clustering using a similarity graph. The algorithm (a) runs in O(n log 3 n + m log n) time on graphs with n vertices and m edges, and (b) with high probability, finds all “large enough ” clusters in a random graph generated according to the planted partition model. We pr ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
(Show Context)
We describe an algorithm for clustering using a similarity graph. The algorithm (a) runs in O(n log³ n + m log n) time on graphs with n vertices and m edges, and (b) with high probability, finds all “large enough” clusters in a random graph generated according to the planted partition model. We provide lower bounds implying that our “large enough” constraint cannot be improved much, even by a computationally unbounded algorithm. We describe experiments running the algorithm and a few related algorithms on random graphs with partitions generated using a Chinese Restaurant Process, and some results of applying the algorithm to cluster DBLP titles.
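The planted partition model referenced above is easy to sample directly: within-cluster edges appear with probability p_in and cross-cluster edges with probability p_out < p_in. A hedged sketch; names and interface are illustrative.

```python
import numpy as np

def planted_partition(sizes, p_in, p_out, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from the planted partition
    model, with cluster sizes given by `sizes`. Returns (A, labels)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)        # per-pair edge probability
    coin = rng.random((n, n)) < probs
    A = np.triu(coin, 1)                       # sample each pair once
    A = (A | A.T).astype(int)                  # symmetrize, zero diagonal
    return A, labels
```

Setting p_in = 1 and p_out = 0 gives disjoint cliques, the noiseless extreme; recovery results like the one above concern how small the gap p_in − p_out and the cluster sizes can be while the partition remains findable.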