Results 1–10 of 28
Twice-Ramanujan sparsifiers
 In Proc. 41st STOC
, 2009
Abstract

Cited by 87 (12 self)
We prove that for every d > 1 and every undirected, weighted graph G = (V, E), there exists a weighted graph H with at most ⌈d|V|⌉ edges such that for every x ∈ ℝ^V, 1 ≤ (xᵀL_H x)/(xᵀL_G x) ≤ (d + 1 + 2√d)/(d + 1 − 2√d), where L_G and L_H are the Laplacian matrices of G and H, respectively.
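For any concrete pair (G, H), the two-sided bound above can be verified numerically: the extreme values of xᵀL_Hx / xᵀL_Gx over x outside the common null space are the extreme generalized eigenvalues of (L_H, L_G). A minimal NumPy check (an illustration of the guarantee, not the paper's sparsifier construction):

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian from a list of (u, v, weight) triples."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def spectral_ratio_bounds(Lg, Lh):
    """Min/max of x^T Lh x / x^T Lg x over x outside the null space of Lg."""
    evals, evecs = np.linalg.eigh(Lg)
    keep = evals > 1e-9                        # drop the all-ones null direction
    W = evecs[:, keep] / np.sqrt(evals[keep])  # whitening map for Lg
    ratios = np.linalg.eigvalsh(W.T @ Lh @ W)  # generalized eigenvalues
    return ratios.min(), ratios.max()
```

For example, comparing K₄ (all edge weights 1) against a 4-cycle reweighted to the same total edge weight (1.5 per edge) gives bounds 0.75 and 1.5.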
Fast Approximation of Matrix Coherence and Statistical Leverage
Abstract

Cited by 49 (11 self)
The statistical leverage scores of a matrix A are the squared row-norms of the matrix containing its (top) left singular vectors, and the coherence is the largest leverage score. These quantities are of interest in recently popular problems such as matrix completion and Nyström-based low-rank matrix approximation, as well as in large-scale statistical data analysis applications more generally; moreover, they are of interest since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary n × d matrix A, with n ≫ d, and that returns as output relative-error approximations to all n of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of n and d) in O(nd log n) time, as opposed to the O(nd²) time required by the naïve algorithm that involves computing an orthogonal basis for the range of A. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson–Lindenstrauss type ideas. Several practically important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with n ≈ d, and the extension to streaming environments.
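The O(nd²) baseline and the sketched approach can be compared directly. In the sketch below a plain Gaussian projection stands in for the subsampled randomized Hadamard transform the paper uses (and the second JL projection is omitted), so it illustrates the idea rather than achieving the stated O(nd log n) runtime:

```python
import numpy as np

def exact_leverage_scores(A):
    """Row norms squared of an orthonormal basis for range(A): O(nd^2)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def sketched_leverage_scores(A, r, rng):
    """Approximate leverage scores via a random sketch.

    Compute R from a QR of the small sketched matrix SA; the rows of
    A @ R^{-1} are then approximately orthonormal, so their squared
    norms estimate the leverage scores."""
    n, d = A.shape
    S = rng.standard_normal((r, n)) / np.sqrt(r)   # r x n Gaussian sketch
    _, R = np.linalg.qr(S @ A)                     # cheap r x d QR
    AR_inv = np.linalg.solve(R.T, A.T).T           # A @ R^{-1} via triangular solve
    return np.sum(AR_inv**2, axis=1)
```

With a sketch size r that is a modest multiple of d, the estimates land within a constant relative factor of the exact scores; the leverage scores always sum to rank(A).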
A unified framework for approximating and clustering data
, 2011
Abstract

Cited by 32 (6 self)
Given a set F of n positive functions over a ground set X, we consider the problem of computing x* that minimizes the expression ∑_{f∈F} f(x) over x ∈ X. A typical application is shape fitting, where we wish to approximate a set P of n elements (say, points) by a shape x from a (possibly infinite) family X of shapes. Here, each point p ∈ P corresponds to a function f such that f(x) is the distance from p to x, and we seek a shape x that minimizes the sum of distances from each point in P. In the k-clustering variant, each x ∈ X is a tuple of k shapes, and f(x) is the distance from p to its closest shape in x. Our main result is a unified framework for constructing coresets and approximate clustering for such general sets of functions. To achieve our results, we forge a link between the classic and well-defined notion of ε-approximations from the theory of PAC learning and VC dimension, and the relatively new (and not so consistent) paradigm of coresets, which are a kind of “compressed representation” of the input set F. Using traditional techniques, a coreset usually implies an LTAS (linear time approximation scheme) for the corresponding optimization problem, which can be computed in parallel, via one pass over the data, and using only polylogarithmic space (i.e., in the streaming model). For several function families F for which coresets are known not to exist, or for which the corresponding (approximate) optimization problems are hard, our framework yields bicriteria approximations, or coresets that are large but contained in a low-dimensional space. We demonstrate our unified framework by applying it to projective clustering problems. We obtain new coreset constructions and significantly smaller coresets over the ones that …
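The flavor of coreset construction via importance sampling can be shown in the simplest instance, the 1-mean problem with squared distances. The sensitivity bound below (1/n plus the point's share of the variance) is standard for this special case and only illustrates the idea, not the paper's general framework:

```python
import numpy as np

def one_mean_coreset(P, m, rng):
    """Sensitivity-sampling coreset for cost(x) = sum_p ||p - x||^2 (k = 1).

    s(p) = 1/n + ||p - mean||^2 / sum_q ||q - mean||^2 upper-bounds each
    point's maximal share of the cost over all centers x; sampling with
    probability proportional to s(p) and weighting by 1/(m * prob) keeps
    the cost estimator unbiased for every center."""
    n = len(P)
    mu = P.mean(axis=0)
    d2 = np.sum((P - mu)**2, axis=1)
    s = 1.0 / n + d2 / d2.sum()
    prob = s / s.sum()
    idx = rng.choice(n, size=m, p=prob)
    weights = 1.0 / (m * prob[idx])
    return P[idx], weights

def cost(P, x, w=None):
    d2 = np.sum((P - x)**2, axis=1)
    return d2.sum() if w is None else (w * d2).sum()
```

The weighted sample then approximates the cost of every candidate center simultaneously, which is exactly what lets a coreset replace the full input in downstream optimization.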
Lasserre hierarchy, higher eigenvalues, and approximation schemes for graph partitioning and quadratic integer programming with PSD objectives
 In Proceedings of the 52nd Annual Symposium on Foundations of Computer Science (FOCS)
, 2011
Abstract

Cited by 24 (3 self)
We present an approximation scheme for optimizing certain Quadratic Integer Programming problems with positive semidefinite objective functions and global linear constraints. This framework includes well-known graph problems such as Minimum graph bisection, Edge expansion, Uniform sparsest cut, and Small Set expansion, as well as the Unique Games problem. These problems are notorious for the existence of huge gaps between the known algorithmic results and NP-hardness results. Our algorithm is based on rounding semidefinite programs from the Lasserre hierarchy, and the analysis uses bounds for low-rank approximations of a matrix in Frobenius norm using columns of the matrix. For all the above graph problems, we give an algorithm running in time n^{O(r/ε²)} with approximation ratio (1 + ε)/min{1, λ_r}, where λ_r is the r-th smallest eigenvalue of the normalized graph Laplacian L. In the case of graph bisection and small set expansion, the number of vertices in the cut is within lower-order terms of the stipulated bound. Our results imply a (1 + O(ε))-factor approximation in time n^{O(r …
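Because the guarantee degrades as λ_r shrinks, a natural first step on a concrete graph is to inspect the spectrum of the normalized Laplacian L = I − D^{−1/2}AD^{−1/2}. A small NumPy helper (assuming a dense adjacency matrix with no isolated vertices):

```python
import numpy as np

def normalized_laplacian_eigs(A):
    """Eigenvalues (ascending) of L = I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)                       # degrees; assumed all positive
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - Dinv_sqrt @ A @ Dinv_sqrt
    return np.linalg.eigvalsh(L)
```

On the complete graph K_n the nonzero eigenvalues are all n/(n−1), so even small r already gives λ_r bounded away from zero and hence a good approximation ratio.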
Simple and deterministic matrix sketching
 CoRR
Abstract

Cited by 18 (2 self)
We adapt a well-known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix A ∈ ℝ^{n×m} one after the other in a streaming fashion. For ℓ = ⌈1/ε⌉ it maintains a sketch matrix B ∈ ℝ^{ℓ×m} such that for any unit vector x, ‖Ax‖² ≥ ‖Bx‖² ≥ ‖Ax‖² − ε‖A‖²_F. Sketch updates per row in A require amortized O(mℓ) operations. This gives the first algorithm whose error guarantee decreases proportionally to 1/ℓ using O(mℓ) space. Prior-art algorithms produce bounds proportional to 1/√ℓ. Our experiments corroborate that the faster convergence rate is observed in practice. The presented algorithm also stands out in that it is deterministic, simple to implement, and elementary to prove. Regardless of streaming aspects, the algorithm can be used to compute a (1 + ε′)-approximation to the best rank-k approximation of any matrix A ∈ ℝ^{n×m}. This requires O(mnℓ′) operations and O(mℓ′) space, where ℓ′ =
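The sketching loop is short enough to transcribe directly. The version below shrinks by the smallest squared singular value at every insertion, which matches the ℓ = ⌈1/ε⌉ guarantee quoted above; the amortized O(mℓ) update cost comes from a variant that frees half the sketch rows per SVD, so this transcription favors clarity over speed:

```python
import numpy as np

def frequent_directions(A, ell):
    """Deterministic sketch B (ell x m) of A, processed one row at a time."""
    _, m = A.shape
    B = np.zeros((ell, m))
    for a in A:
        B[-1] = a                                 # last row is zero after each shrink
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2                        # smallest squared singular value
        s_shrunk = np.sqrt(np.maximum(s**2 - delta, 0.0))
        B = s_shrunk[:, None] * Vt                # diag(s') @ Vt; last row is zero
    return B
```

Because every shrink removes exactly ℓ·δ from the sketch's squared Frobenius norm, the total shrinkage is at most ‖A‖²_F/ℓ, which yields the quoted inequality for every unit vector x.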
Improving CUR Matrix Decomposition and the Nyström Approximation via Adaptive Sampling
Abstract

Cited by 16 (3 self)
The CUR matrix decomposition and the Nyström approximation are two important low-rank matrix approximation techniques. The Nyström method approximates a symmetric positive semidefinite matrix in terms of a small number of its columns, while CUR approximates an arbitrary data matrix by a small number of its columns and rows. Thus, CUR decomposition can be regarded as an extension of the Nyström approximation. In this paper we establish a more general error bound for the adaptive column/row sampling algorithm, based on which we propose more accurate CUR and Nyström algorithms with expected relative-error bounds. The proposed CUR and Nyström algorithms also have low time complexity and can avoid maintaining the whole data matrix in RAM. In addition, we give theoretical analysis for the lower error bounds of the standard Nyström method and the ensemble Nyström method. The main theoretical results established in this paper are novel, and our analysis makes no special assumption on the data matrices.
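The algebraic shape of a CUR approximation is easy to exhibit. The sketch below samples by squared column/row norms and sets U = C⁺AR⁺, so that CUR projects A onto the spans of the chosen columns and rows; the paper's adaptive sampling chooses columns and rows more carefully to obtain relative-error bounds:

```python
import numpy as np

def simple_cur(A, c, r, rng):
    """Bare-bones CUR: norm-based column/row sampling, U = pinv(C) A pinv(R).

    C U R = (C C^+) A (R^+ R) is A projected onto span(C) and row-span(R),
    so the approximation is exact whenever the sampled columns and rows
    capture the column and row spaces of A."""
    col_p = np.sum(A**2, axis=0); col_p /= col_p.sum()
    row_p = np.sum(A**2, axis=1); row_p /= row_p.sum()
    C = A[:, rng.choice(A.shape[1], size=c, replace=False, p=col_p)]
    R = A[rng.choice(A.shape[0], size=r, replace=False, p=row_p), :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R
```

On an exactly rank-k matrix, sampling comfortably more than k columns and rows recovers A up to floating-point error, which makes the projection structure easy to sanity-check.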
Relative errors for deterministic low-rank matrix approximations
 In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
, 2014
Iterative row sampling
 In 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS)
, 2013
Abstract

Cited by 10 (1 self)
There has been significant interest and progress recently in algorithms that solve regression problems involving tall and thin matrices in input-sparsity time. These algorithms find a shorter equivalent of an n × d matrix, where n ≫ d, which allows one to solve a poly(d)-sized problem instead. In practice, the best performances are often obtained by invoking these routines in an iterative fashion. We show these iterative methods can be adapted to give theoretical guarantees comparable to and better than the current state of the art. Our approaches are based on computing the importance of the rows, known as leverage scores, in an iterative manner. We show that alternating between computing a short matrix estimate and finding more accurate approximate leverage scores leads to a series of geometrically smaller instances. This gives an algorithm that runs in O(nnz(A) + d^{ω+θ}ε^{−2}) time for any θ > 0, where the d^{ω+θ} term is comparable to the cost of solving a regression problem on the small approximation. Our results are built upon the close connection between randomized matrix algorithms, iterative methods, and graph sparsification.
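The role leverage scores play here is to drive row sampling that preserves xᵀAᵀAx up to small relative error. The sketch below uses exact scores computed by QR, i.e., exactly the expensive step whose cheap iterative approximation is the paper's contribution:

```python
import numpy as np

def leverage_row_sample(A, m, rng):
    """Sample m rows with probability proportional to leverage scores,
    rescaled by 1/sqrt(m p_i) so that E[As^T As] = A^T A."""
    Q, _ = np.linalg.qr(A)
    lev = np.sum(Q**2, axis=1)       # exact leverage scores, O(nd^2)
    p = lev / lev.sum()
    idx = rng.choice(len(A), size=m, p=p)
    return A[idx] / np.sqrt(m * p[idx])[:, None]
```

With m = O(d log d / ε²) rows the sampled matrix is a spectral approximation of A, so regression on the small matrix approximates regression on the original.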