Results 1–10 of 31
Optimal detection of sparse principal components in high dimension
, 2013
"... We perform a finite sample analysis of the detection levels for sparse principal components of a highdimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NPcomplete in general, and we describe a computationally ..."
Abstract

Cited by 42 (4 self)
We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near-optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.
Complexity theoretic lower bounds for sparse principal component detection
 In COLT 2013 – The 26th Conference on Learning Theory
, 2013
"... In the context of sparse principal component detection, we bring evidence towards the existence of a statistical price to pay for computational efficiency. We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based ..."
Abstract

Cited by 31 (3 self)
In the context of sparse principal component detection, we bring evidence towards the existence of a statistical price to pay for computational efficiency. We measure the performance of a test by the smallest signal strength that it can detect, and we propose a computationally efficient method based on semidefinite programming. We also prove that the statistical performance of this test cannot be strictly improved by any computationally efficient method. Our results can be viewed as complexity-theoretic lower bounds, conditional on the assumption that some instances of the planted clique problem cannot be solved in randomized polynomial time.
Clustering Sparse Graphs
, 2012
"... We develop a new algorithm to cluster sparse unweighted graphs – i.e. partition the nodes into disjoint clusters so that there is higher density within clusters, and low across clusters. By sparsity we mean the setting where both the incluster and across cluster edge densities are very small, possi ..."
Abstract

Cited by 25 (6 self)
We develop a new algorithm to cluster sparse unweighted graphs – i.e. partition the nodes into disjoint clusters so that there is higher density within clusters and lower density across clusters. By sparsity we mean the setting where both the in-cluster and across-cluster edge densities are very small, possibly vanishing in the size of the graph. Sparsity makes the problem noisier, and hence more difficult to solve. Any clustering involves a trade-off between minimizing two kinds of errors: missing edges within clusters and present edges across clusters. Our insight is that in the sparse case, these must be penalized differently. We analyze our algorithm's performance on the natural, classical and widely studied "planted partition" model (also called the stochastic block model); we show that our algorithm can cluster sparser graphs, and with smaller clusters, than all previous methods. This is seen empirically as well.
On the number of iterations for Dantzig-Wolfe optimization and packing-covering approximation algorithms
 In Proceedings of the 7th International IPCO Conference
, 1999
"... We start with definitions given by Plotkin, Shmoys, and Tardos [16]. Given A ∈ IR m×n, b ∈ IR m and a polytope P ⊆ IR n,thefractional packing problem is to find an x ∈ P such that Ax ≤ b if such an x exists. An ɛapproximate solution to this problem is an x ∈ P such that Ax ≤ (1 + ɛ)b. Anɛrelaxed d ..."
Abstract

Cited by 23 (2 self)
 Add to MetaCart
We start with definitions given by Plotkin, Shmoys, and Tardos [16]. Given A ∈ ℝ^{m×n}, b ∈ ℝ^m and a polytope P ⊆ ℝ^n, the fractional packing problem is to find an x ∈ P such that Ax ≤ b, if such an x exists. An ε-approximate solution to this problem is an x ∈ P such that Ax ≤ (1 + ε)b. An ε-relaxed decision ...
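The packing definitions above translate directly into code. A minimal sketch (the function name and the toy instance are ours, not from the paper): checking whether a candidate x is an ε-approximate solution is a row-by-row comparison of Ax against (1 + ε)b.

```python
def is_eps_approximate(A, b, x, eps):
    """Check the ε-approximate packing condition: (Ax)_i <= (1 + eps) * b_i for every row i."""
    for row, bi in zip(A, b):
        if sum(a * xj for a, xj in zip(row, x)) > (1.0 + eps) * bi:
            return False
    return True

# Toy instance: 2 constraints, 2 variables; assume x lies in the polytope P.
A = [[1.0, 2.0], [3.0, 1.0]]
b = [2.0, 2.0]
x = [0.5, 0.8]
print(is_eps_approximate(A, b, x, eps=0.1))  # second row gives 2.3 > 2.2, so not 0.1-approximate
```

The same x does pass at eps = 0.2, illustrating how the relaxation parameter trades off constraint violation.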
Finding Hidden Cliques in Linear Time with High Probability
"... We are given a graph G with n vertices, where a random subset of k vertices has been made into a clique, and the remaining edges are chosen independently with probability 1 2, k). The hidden clique problem is to design an algorithm that finds the kclique in polynomial time with high probability. An ..."
Abstract

Cited by 21 (0 self)
We are given a graph G with n vertices, where a random subset of k vertices has been made into a clique and the remaining edges are chosen independently with probability 1/2; this random graph model is denoted G(n, 1/2, k). The hidden clique problem is to design an algorithm that finds the k-clique in polynomial time with high probability. An algorithm due to Alon, Krivelevich and Sudakov [3] uses spectral techniques to find the hidden clique with high probability when k = c√n for a sufficiently large constant c > 0. Recently, an algorithm that solves the same problem was proposed by Feige and Ron [14]. It has the advantages of being simpler and more intuitive, and of an improved running time of O(n²). However, the analysis in [14] gives a success probability of only 2/3. In this paper we present a new algorithm for finding hidden cliques that both runs in time O(n²) and has a failure probability that is less than polynomially small.
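The random graph model G(n, 1/2, k) described in this abstract can be sampled directly. A minimal sketch (the function name and parameters are ours): plant a clique on a uniformly random k-subset of vertices and draw every other edge independently with probability 1/2.

```python
import random

def sample_planted_clique(n, k, seed=None):
    """Sample from G(n, 1/2, k): an Erdős–Rényi G(n, 1/2) graph with a
    clique planted on a uniformly random k-subset of the vertices."""
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            # Clique pairs are always edges; all other pairs appear w.p. 1/2.
            if (u in clique and v in clique) or rng.random() < 0.5:
                edges.add((u, v))
    return edges, clique

edges, clique = sample_planted_clique(n=30, k=6, seed=0)
# Every pair inside the planted set is an edge by construction.
assert all((min(u, v), max(u, v)) in edges
           for u in clique for v in clique if u != v)
```

Detection algorithms like those of [3] and [14] take only `edges` as input; `clique` is the hidden ground truth used to verify their output.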
Finding hidden cliques in linear time
 In 21st International Meeting on Probabilistic, Combinatorial, and Asymptotic Methods in the Analysis of Algorithms (AofA’10), Discrete Math. Theor. Comput. Sci. Proc., AM
, 2010
"... In the hidden clique problem, one needs to find the maximum clique in an nvertex graph that has a clique of size k but is otherwise random. An algorithm of Alon, Krivelevich and Sudakov that is based on spectral techniques is known to solve this problem (with high probability over the random choice ..."
Abstract

Cited by 17 (0 self)
In the hidden clique problem, one needs to find the maximum clique in an n-vertex graph that has a clique of size k but is otherwise random. An algorithm of Alon, Krivelevich and Sudakov that is based on spectral techniques is known to solve this problem (with high probability over the random choice of input graph) when k ≥ c√n for a sufficiently large constant c. In this manuscript we present a new algorithm for finding hidden cliques. It too provably works when k > c√n for a sufficiently large constant c. However, our algorithm has the advantage of being much simpler (no use of spectral techniques), running faster (linear time), and experiments show that the leading constant c is smaller than in the spectral approach. We also present linear-time algorithms that experimentally find even smaller hidden cliques, though it remains open whether any of these algorithms finds hidden cliques of size o(√n).
Incoherence-optimal matrix completion
, 2013
"... This paper considers the matrix completion problem. We show that it is not necessary to assume joint incoherence, which is a standard but unintuitive and restrictive condition that is imposed by previous studies. This leads to a sample complexity bound that is orderwise optimal with respect to the ..."
Abstract

Cited by 16 (3 self)
This paper considers the matrix completion problem. We show that it is not necessary to assume joint incoherence, which is a standard but unintuitive and restrictive condition imposed by previous studies. This leads to a sample complexity bound that is order-wise optimal with respect to the incoherence parameter (as well as to the rank r and the matrix dimension n, except for a log n factor). As a consequence, we improve the sample complexity of recovering a semidefinite matrix from O(nr² log² n) to O(nr log² n), and the highest allowable rank from Θ(√n / log n) to Θ(n / log² n). The key step in the proof is to obtain new bounds on the ℓ∞,2-norm, defined as the maximum of the row and column norms of a matrix. To demonstrate the applicability of our techniques, we discuss extensions to SVD projection, semi-supervised clustering and structured matrix completion. Finally, we turn to the low-rank-plus-sparse matrix decomposition problem, and show that the joint incoherence condition is unavoidable here, conditioned on computational complexity assumptions on the classical planted clique problem. This means that it is intractable in general to separate a rank-ω(√n) positive semidefinite matrix and a sparse matrix.
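The ℓ∞,2-norm used in this abstract — the maximum over the Euclidean norms of all rows and all columns — is simple to compute. A minimal sketch (the function name and the example matrix are ours):

```python
def linf2_norm(M):
    """ℓ∞,2-norm as described in the abstract above: the maximum over the
    Euclidean norms of all rows and all columns of M (a list of rows)."""
    def norm(v):
        return sum(x * x for x in v) ** 0.5
    row_norms = [norm(r) for r in M]
    col_norms = [norm(c) for c in zip(*M)]
    return max(row_norms + col_norms)

M = [[3.0, 0.0],
     [0.0, 4.0],
     [0.0, 0.0]]
print(linf2_norm(M))  # prints 4.0: the row (0, 4) and column (0, 4, 0) attain the maximum
```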
Computing equilibria: A computational complexity perspective
, 2009
"... Computational complexity is the subfield of computer science that rigorously studies the intrinsic difficulty of computational problems. This survey explains how complexity theory defines “hard problems”; applies these concepts to several equilibrium computation problems; and discusses implications ..."
Abstract

Cited by 13 (2 self)
Computational complexity is the subfield of computer science that rigorously studies the intrinsic difficulty of computational problems. This survey explains how complexity theory defines “hard problems”; applies these concepts to several equilibrium computation problems; and discusses implications for computation, games, and behavior. We assume ...
Statistical Algorithms and a Lower Bound for Detecting Planted Cliques
"... We introduce a framework for proving lower bounds on computational problems over distributions, based on defining a restricted class of algorithms called statistical algorithms. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given ..."
Abstract

Cited by 12 (2 self)
We introduce a framework for proving lower bounds on computational problems over distributions, based on defining a restricted class of algorithms called statistical algorithms. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given function on a sample drawn randomly from the input distribution, rather than directly accessing samples. Our definition captures most natural algorithms of interest in theory and in practice, e.g., moments-based methods, local search, standard iterative methods for convex optimization, MCMC and simulated annealing. Our definition and techniques are inspired by, and generalize, the statistical query model in learning theory [35]. For well-known problems over distributions, we give lower bounds on the complexity of any statistical algorithm. These include an exponential lower bound for moment maximization in ℝⁿ, and a nearly optimal lower bound for detecting planted bipartite clique distributions (or planted dense subgraph distributions) when the planted clique has size O(n^{1/2−δ}) for any constant δ > 0. Variants of the latter problem have been assumed to be hard in order to prove hardness of other problems and for cryptographic applications. Our lower bounds provide concrete evidence ...
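The restricted oracle access that defines a statistical algorithm — an estimate of the expectation of a chosen function, never the raw samples — can be sketched as follows. This is a simplified stand-in using empirical averaging; the names and the way tolerance is modeled are ours, not the paper's formal definition.

```python
import random

def make_stat_oracle(sample, t, seed=None):
    """Return a STAT-style oracle: given a function f with values in [0, 1],
    answer with an empirical estimate of E[f(X)] computed from t fresh
    samples of the input distribution (the algorithm never sees them)."""
    rng = random.Random(seed)
    def oracle(f):
        return sum(f(sample(rng)) for _ in range(t)) / t
    return oracle

# Example distribution: a biased coin with P(X = 1) = 0.7.
oracle = make_stat_oracle(lambda rng: 1 if rng.random() < 0.7 else 0,
                          t=10_000, seed=1)
est = oracle(lambda x: x)  # estimate of E[X], whose true value is 0.7
print(round(est, 2))
```

A statistical algorithm in this framework may issue only such queries, which is what makes lower bounds on its complexity provable.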