Results 1–10 of 28
Sparser Johnson-Lindenstrauss Transforms
Cited by 28 (8 self)
We give two different constructions for dimensionality reduction in ℓ2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1 + ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide sub-constant sparsity for all values of parameters. Both constructions are also very simple: a vector can be embedded in two for loops. Such distributions can be used to speed up applications where ℓ2 dimensionality reduction is used.
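The "two for loops" claim above is easy to make concrete. Below is a minimal illustrative sketch of a sparse sign embedding in that spirit: each column of Φ gets exactly s non-zero entries, each ±1/√s, in random rows. The specific parameters (m = 256, s = 8) and the exact distribution over non-zero positions are chosen for this demo and are not taken from the paper.

```python
import numpy as np

def sparse_jl(n, m, s, rng):
    """Build an m x n sparse sign embedding: each column has exactly s
    non-zero entries, each +-1/sqrt(s), placed in s distinct random rows.
    Illustrative toy version, not the paper's exact distribution."""
    Phi = np.zeros((m, n))
    for j in range(n):                          # outer loop: columns
        rows = rng.choice(m, size=s, replace=False)
        signs = rng.choice([-1.0, 1.0], size=s)
        Phi[rows, j] = signs / np.sqrt(s)       # inner loop (vectorized)
    return Phi

rng = np.random.default_rng(0)
n, m, s = 1000, 256, 8
Phi = sparse_jl(n, m, s, rng)
x = rng.standard_normal(n)
ratio = np.linalg.norm(Phi @ x) / np.linalg.norm(x)  # should be close to 1
```

Embedding a vector costs O(s · n) rather than O(m · n), which is the point of column sparsity.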
Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression
, 2012
Cited by 24 (4 self)
Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for common linear algebra problems. We show that, given a matrix A ∈ R^{n×d} with n ≫ d and a p ∈ [1, 2), with a constant probability, we can construct a low-distortion embedding matrix Π ∈ R^{O(poly(d))×n} that embeds A_p, the ℓ_p subspace spanned by A's columns, into (R^{O(poly(d))}, ‖·‖_p); the distortion of our embeddings is only O(poly(d)), and we can compute ΠA in O(nnz(A)) time, i.e., input-sparsity time. Our result generalizes the input-sparsity time ℓ_2 subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for ℓ_2. These input-sparsity time ℓ_p embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as (1 ± ε)-distortion ℓ_p subspace embedding and relative-error ℓ_p regression. For ℓ_2, we show that a (1 + ε)-approximate solution to the ℓ_2 regression problem specified by the matrix A and a vector b ∈ R^n can be computed in O(nnz(A) + d^3 log(d/ε)/ε^2) time; and for ℓ_p, via a subspace-preserving sampling procedure, we show that a (1 ± ε)-distortion embedding of A_p into R^{O(poly(d))} can be computed in O(nnz(A) · log n) time, and we also show that a (1 + ε)-approximate solution to the ℓ_p regression problem min_{x∈R^d} ‖Ax − b‖_p can be computed in O(nnz(A) · log n + poly(d) log(1/ε)/ε^2) time. Moreover, we can also improve the embedding dimension, or equivalently the sample size, to O(d^{3+p/2} log(1/ε)/ε^2) without increasing the complexity.
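The ℓ_2 case referenced above (the Clarkson–Woodruff input-sparsity embedding) is a CountSketch: each row of A is hashed to one of m buckets with a random sign, so ΠA costs one pass over the non-zeros of A. The sketch below solves a least-squares problem on the sketched system; the sketch size m = 40·d² and the test problem are demo choices, not constants from the paper.

```python
import numpy as np

def countsketch_apply(A, m, rng):
    """Apply a CountSketch embedding: each row of A is added, with a
    random +-1 sign, into one of m buckets. Runs in O(nnz(A)) time
    conceptually (vectorized here with np.add.at)."""
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)  # unbuffered row scatter-add
    return SA

rng = np.random.default_rng(1)
n, d = 5000, 10
A = rng.standard_normal((n, d))
b = A @ np.ones(d) + 0.01 * rng.standard_normal(n)

# Sketch the augmented system [A | b] once, then solve the small problem.
SAb = countsketch_apply(np.hstack([A, b[:, None]]), m=40 * d * d, rng=rng)
x_sketch, *_ = np.linalg.lstsq(SAb[:, :d], SAb[:, d], rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The sketched solution's residual ‖A x_sketch − b‖_2 should be within a small constant factor of the optimal residual.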
Relative errors for deterministic low-rank matrix approximations
 In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
, 2014
New constructions of RIP matrices with fast multiplication and fewer rows
, 2012
Cited by 11 (3 self)
In compressed sensing, the restricted isometry property (RIP) is a sufficient condition for the efficient reconstruction of a nearly k-sparse vector x ∈ C^d from m linear measurements Φx. It is desirable for m to be small, and for Φ to support fast matrix-vector multiplication. In this work, we give a randomized construction of RIP matrices Φ ∈ C^{m×d}, preserving the ℓ_2 norms of all k-sparse vectors with distortion 1 + ε, where the matrix-vector multiply Φx can be computed in nearly linear time. The number of rows m is on the order of ε^{-2} k log d log^2(k log d). Previous analyses of constructions of RIP matrices supporting fast matrix-vector multiplies, such as the sampled discrete Fourier matrix, required m to be larger by roughly a log k factor. Supporting fast matrix-vector multiplication is useful for iterative recovery algorithms which repeatedly multiply by Φ or Φ*. Furthermore, our construction, together with a connection between RIP matrices and the Johnson-Lindenstrauss lemma in [Krahmer-Ward, SIAM J. Math. Anal. 2011], implies fast Johnson-Lindenstrauss embeddings with asymptotically fewer rows than previously known. Our approach is a simple twist on previous constructions. Rather than choosing the rows of the embedding matrix to be rows sampled from some larger structured matrix (such as the discrete Fourier transform or a random circulant matrix), we instead choose each row of the embedding matrix to be a linear combination of a small number of rows of the original matrix, with random sign flips as coefficients. The main tool in our analysis is a recent bound for the supremum of certain types of Rademacher chaos processes in [Krahmer-Mendelson-Rauhut, arXiv abs/1207.0235].
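The "linear combination of a few rows with random signs" idea can be sketched directly. The toy below combines B rows of the unitary DFT per embedding row and scales so that E‖Φx‖² = ‖x‖²; it builds Φ densely for clarity, whereas the real construction applies Φx via an FFT in nearly linear time. B, m, and the normalization are demo choices, not the paper's parameters.

```python
import numpy as np

def combined_dft_rows(d, m, B, rng):
    """Each of the m rows of Phi is a random +-1 combination of B distinct
    rows of the d x d unitary DFT matrix, normalized so that
    E||Phi x||_2^2 = ||x||_2^2. Illustrative dense version."""
    F = np.fft.fft(np.eye(d)) / np.sqrt(d)      # unitary DFT matrix
    Phi = np.zeros((m, d), dtype=complex)
    for i in range(m):
        rows = rng.choice(d, size=B, replace=False)
        signs = rng.choice([-1.0, 1.0], size=B)
        Phi[i] = (signs @ F[rows]) / np.sqrt(B)
    return Phi * np.sqrt(d / m)                 # isotropy normalization

rng = np.random.default_rng(2)
d, m, B, k = 512, 128, 4, 5
Phi = combined_dft_rows(d, m, B, rng)

# Empirically check the norm of random k-sparse vectors is preserved on average.
ratios = []
for _ in range(50):
    x = np.zeros(d)
    x[rng.choice(d, size=k, replace=False)] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2)
mean_ratio = float(np.mean(ratios))
```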
Iterative row sampling
 In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS)
, 2013
Cited by 11 (1 self)
There has been significant interest and progress recently in algorithms that solve regression problems involving tall and thin matrices in input-sparsity time. These algorithms find a shorter equivalent of an n × d matrix where n ≫ d, which allows one to solve a poly(d)-sized problem instead. In practice, the best performance is often obtained by invoking these routines in an iterative fashion. We show that these iterative methods can be adapted to give theoretical guarantees comparable to, and better than, the current state of the art. Our approach is based on computing the importance of the rows, known as leverage scores, in an iterative manner. We show that alternating between computing a short matrix estimate and finding more accurate approximate leverage scores leads to a series of geometrically smaller instances. This gives an algorithm that runs in O(nnz(A) + d^{ω+θ} ε^{-2}) time for any θ > 0, where the d^{ω+θ} term is comparable to the cost of solving a regression problem on the small approximation. Our results are built upon the close connection between randomized matrix algorithms, iterative methods, and graph sparsification.
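One round of the leverage-score sampling that the abstract alternates with can be sketched as follows. For the demo we compute exact scores via a QR factorization (the paper's point is to approximate them iteratively instead); t = 600 and the test sizes are demo choices.

```python
import numpy as np

def leverage_scores(A):
    """Exact leverage scores: squared row norms of Q, where A = QR.
    They sum to rank(A); the demo uses the exact scores, whereas the
    paper computes iteratively refined approximations."""
    Q, _ = np.linalg.qr(A)
    return (Q ** 2).sum(axis=1)

def row_sample(A, probs, t, rng):
    """Sample t rows with replacement, rescaling row i by 1/sqrt(t*p_i)
    so that E[As^T As] = A^T A."""
    idx = rng.choice(A.shape[0], size=t, p=probs)
    return A[idx] / np.sqrt(t * probs[idx])[:, None]

rng = np.random.default_rng(3)
A = rng.standard_normal((4000, 8))
tau = leverage_scores(A)
As = row_sample(A, tau / tau.sum(), t=600, rng=rng)

# The short matrix As spectrally approximates A: As^T As is close to A^T A.
rel = np.linalg.norm(As.T @ As - A.T @ A, 2) / np.linalg.norm(A.T @ A, 2)
```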
Single Pass Spectral Sparsification in Dynamic Streams
Cited by 7 (1 self)
We present the first single-pass algorithm for computing spectral sparsifiers of graphs in the dynamic semi-streaming model. Given a single pass over a stream containing insertions and deletions of edges to a graph G, our algorithm maintains a randomized linear sketch of the incidence matrix of G into dimension O(ε^{-2} n polylog(n)). Using this sketch, at any point, the algorithm can output a (1 ± ε) spectral sparsifier for G with high probability. While O(ε^{-2} n polylog(n))-space algorithms are known for computing cut sparsifiers in dynamic streams [AGM12b, GKP12] and spectral sparsifiers in insertion-only streams [KL11], prior to our work, the best known single-pass algorithm for maintaining spectral sparsifiers in dynamic streams required sketches of dimension Ω(ε^{-2} n^{5/3}) [AGM14]. To achieve our result, we show that, using a coarse sparsifier of G and a linear sketch of G's incidence matrix, it is possible to sample edges by effective resistance, obtaining a spectral …
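The effective resistances used for sampling above can be computed offline from the Laplacian pseudoinverse, which makes for a compact sanity check (the paper's contribution is recovering such samples from a linear sketch instead; this demo computes them directly). On the complete graph K_n every edge has effective resistance exactly 2/n.

```python
import numpy as np

def effective_resistances(edges, n):
    """Effective resistance R_uv = L+_uu + L+_vv - 2 L+_uv for each edge,
    via the pseudoinverse of the graph Laplacian L = B^T B (B is the
    signed edge-vertex incidence matrix). Offline demo computation."""
    B = np.zeros((len(edges), n))
    for i, (u, v) in enumerate(edges):
        B[i, u], B[i, v] = 1.0, -1.0
    Lp = np.linalg.pinv(B.T @ B)
    return np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])

# Complete graph on n vertices: every effective resistance equals 2/n.
n = 8
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
R = effective_resistances(edges, n)
```

Sampling each edge with probability proportional to R_e (uniform for K_n) is the classical route to a spectral sparsifier.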
Principal Component Analysis and Higher Correlations for Distributed Data
Cited by 7 (2 self)
We consider algorithmic problems in the setting in which the input data has been partitioned arbitrarily on many servers. The goal is to compute a function of all the data, and the bottleneck is the communication used by the algorithm. We present algorithms for two illustrative problems on massive data sets: (1) computing a low-rank approximation of a matrix A = A_1 + A_2 + ... + A_s, with matrix A_t stored on server t; and (2) computing a function of a vector a_1 + a_2 + ... + a_s, where server t has the vector a_t; this includes the well-studied special case of computing frequency moments and separable functions, as well as higher-order correlations such as the number of subgraphs of a specified type occurring in a graph. For both problems we give algorithms with nearly optimal communication, and in particular the only dependence on n, the size of the data, is in the number of bits needed to represent indices and words (O(log n)).
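The core trick behind communication-efficient protocols for problem (2) is linearity: if every server applies the same random sketch S to its local vector, the sum of the sketches is the sketch of the sum, so only the small sketches travel. The AMS-style F_2 sketch below illustrates this; the sizes (n, m, 4 servers) and the dense sign matrix are demo choices, not the paper's protocol.

```python
import numpy as np

# F2 estimation for a vector split across servers: each server sends
# only S @ a_t (m numbers); by linearity sum_t S a_t = S (sum_t a_t).
rng = np.random.default_rng(4)
n, servers, m = 2000, 4, 1000
S = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)  # shared randomness

local_vecs = [rng.standard_normal(n) for _ in range(servers)]
total = sum(local_vecs)

sketch = sum(S @ a for a in local_vecs)   # communication: servers * m words
est = float(np.linalg.norm(sketch) ** 2)  # estimates ||sum_t a_t||_2^2
true = float(np.linalg.norm(total) ** 2)
```

Communication is O(s · m) words instead of O(s · n), with relative error on the order of 1/√m.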
Toward a unified theory of sparse dimensionality reduction in Euclidean space, arXiv:1311.2542
Cited by 6 (2 self)
Let Φ ∈ R^{m×n} be a sparse Johnson-Lindenstrauss transform [KN14] with s non-zeroes per column. For a subset T of the unit sphere and ε ∈ (0, 1/2) given, we study the settings of m, s required to ensure E_Φ sup_{x∈T} |‖Φx‖_2^2 − 1| < ε, i.e. so that Φ preserves the norm of every x ∈ T simultaneously and multiplicatively up to 1 + ε. We introduce a new complexity parameter, which depends on the geometry of T, and show that it suffices to choose s and m such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense Φ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also …
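When T is the set of unit vectors of a d-dimensional subspace, the supremum above equals the spectral deviation ‖(ΦU)ᵀ(ΦU) − I‖_2 for an orthonormal basis U of the subspace, which makes it easy to probe empirically. The sketch below does exactly that for a sparse sign Φ; the parameters (n, d, m, s) are demo choices, not values prescribed by the theorem.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, m, s = 2000, 5, 400, 8

# T = unit vectors of a random d-dimensional subspace, with basis U.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))

# Sparse sign Phi with s non-zeros per column, each +-1/sqrt(s).
Phi = np.zeros((m, n))
for j in range(n):
    rows = rng.choice(m, size=s, replace=False)
    Phi[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)

# sup over x in T of |  ||Phi x||_2^2 - 1  | equals this spectral norm.
M = Phi @ U
sup_dev = float(np.linalg.norm(M.T @ M - np.eye(d), 2))
```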
Optimal CUR matrix decompositions
 In Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC)
, 2014
Sparsity lower bounds for dimensionality reducing maps
 In arXiv:1211.0995v1
, 2012
Cited by 5 (0 self)
We give near-tight lower bounds for the sparsity required in several dimensionality-reducing linear maps. First, consider the Johnson-Lindenstrauss (JL) lemma, which states that for any set of n vectors in R^d there is a matrix A ∈ R^{m×d} with m = O(ε^{-2} log n) such that mapping by A preserves pairwise Euclidean distances of these n vectors up to a 1 ± ε factor. We show that there exists a set of n vectors such that any such matrix A with at most s non-zero entries per column must have s = Ω(ε^{-1} log n / log(1/ε)) as long as m < O(n / log(1/ε)). This bound improves the lower bound of Ω(min{ε^{-2}, ε^{-1}√(log_m d)}) by [Dasgupta-Kumar-Sarlós, STOC 2010], which only held against the stronger property of distributional JL, and only against a certain restricted class of distributions. Meanwhile, our lower bound is against the JL lemma itself, with no restrictions. Our lower bound matches the sparse Johnson-Lindenstrauss upper bound of [Kane-Nelson, SODA 2012] up to an O(log(1/ε)) factor. Next, we show that any m×n matrix with the k-restricted isometry property (RIP) with constant distortion must have at least Ω(k log(n/k)) non-zeroes per column if m = O(k log(n/k)), the optimal number of rows for RIP matrices, and k < n / polylog n. This improves the previous lower bound of Ω(min{k, n/m}) by [Chandar, 2010] and shows that for virtually all k it is impossible to have a sparse RIP matrix with an optimal number of rows. Both lower bounds above also offer a tradeoff between sparsity and the number of rows. Lastly, we show that any oblivious distribution over subspace-embedding matrices with 1 non-zero per column that preserves distances in a d-dimensional subspace up to a constant factor must have at least Ω(d^2) rows. This matches one of the upper bounds in [Nelson-Nguyễn, 2012] and shows the impossibility of obtaining the best of both constructions in that work, namely 1 non-zero per column and Õ(d) rows.