Finding Hidden Cliques in Linear Time with High Probability
Cited by 21 (0 self)
We are given a graph G with n vertices, where a random subset of k vertices has been made into a clique, and the remaining edges are chosen independently with probability 1/2. This random graph model is denoted G(n, 1/2, k). The hidden clique problem is to design an algorithm that finds the k-clique in polynomial time with high probability. An algorithm due to Alon, Krivelevich and Sudakov [3] uses spectral techniques to find the hidden clique with high probability when k = c√n for a sufficiently large constant c > 0. Recently, an algorithm that solves the same problem was proposed by Feige and Ron [14]. It has the advantages of being simpler and more intuitive, and of an improved running time of O(n^2). However, the analysis in [14] gives a success probability of only 2/3. In this paper we present a new algorithm for finding hidden cliques that both runs in time O(n^2) and has a failure probability that is less than polynomially small.
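The random graph model described in this abstract can be illustrated with a short sampler. This is only a sketch of the model, not of the paper's algorithm; the function names `sample_hidden_clique` and `is_clique` and the parameter choices are assumptions made for illustration.

```python
import random
from itertools import combinations


def sample_hidden_clique(n, k, seed=None):
    """Sample a graph on n vertices with a planted k-clique.

    A uniformly random k-subset is made into a clique; every other
    pair of vertices is joined independently with probability 1/2.
    Returns (adjacency sets, planted clique). Names are hypothetical.
    """
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    adj = [set() for _ in range(n)]
    for u, v in combinations(range(n), 2):
        # Pairs inside the planted set are always edges.
        if (u in clique and v in clique) or rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    return adj, clique


def is_clique(adj, vertices):
    """Check that every pair of the given vertices is adjacent."""
    return all(v in adj[u] for u, v in combinations(sorted(vertices), 2))


adj, clique = sample_hidden_clique(n=200, k=30, seed=0)
print(len(clique), is_clique(adj, clique))
```

A sampler like this is mainly useful for testing a recovery procedure against known ground truth, since the planted set is returned alongside the graph.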
Statistical Algorithms and a Lower Bound for Detecting Planted Cliques
Cited by 12 (2 self)
We introduce a framework for proving lower bounds on computational problems over distributions, based on defining a restricted class of algorithms called statistical algorithms. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given function on a sample drawn randomly from the input distribution, rather than directly accessing samples. Our definition captures most natural algorithms of interest in theory and in practice, e.g., moments-based methods, local search, standard iterative methods for convex optimization, MCMC and simulated annealing. Our definition and techniques are inspired by and generalize the statistical query model in learning theory [35]. For well-known problems over distributions, we give lower bounds on the complexity of any statistical algorithm. These include an exponential lower bound for moment maximization in R^n, and a nearly optimal lower bound for detecting planted bipartite clique distributions (or planted dense subgraph distributions) when the planted clique has size O(n^(1/2−δ)) for any constant δ > 0. Variants of the latter have been assumed to be hard to prove hardness for other problems and for cryptographic applications. Our lower bounds provide concrete evidence
Finding Hidden Cliques of Size √(N/e) in Nearly Linear Time
, 2013
Cited by 11 (3 self)
Consider an Erdős-Rényi random graph in which each edge is present independently with probability 1/2, except for a subset C_N of the vertices that form a clique (a completely connected subgraph). We consider the problem of identifying the clique, given a realization of such a random graph. The best known algorithm provably finds the clique in linear time with high probability, provided |C_N| ≥ 1.261√N [YDP11]. Spectral methods can be shown to fail on cliques smaller than √N. In this paper we describe a nearly linear time algorithm that succeeds with high probability for |C_N| ≥ (1 + ε)√(N/e) for any ε > 0. This is the first algorithm that provably improves over spectral methods. We further generalize the hidden clique problem to other background graphs (the standard case corresponding to the complete graph on N vertices). For large girth regular graphs of degree (Δ + 1) we prove that 'local' algorithms succeed if |C_N| ≥ (1 + ε)N/√(eΔ) and fail if |C_N| ≤ (1 − ε)N/√(eΔ).
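To see how the clique-size thresholds quoted in this abstract compare, the constants can be placed side by side. This is plain arithmetic on the figures given above and adds no claim beyond them.

```latex
\[
  \sqrt{N/e} \;=\; e^{-1/2}\sqrt{N} \;\approx\; 0.6065\,\sqrt{N}
  \;<\; \sqrt{N} \;<\; 1.261\,\sqrt{N}.
\]
```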
Community Detection in Sparse Random Networks
, 2013
Cited by 7 (0 self)
We consider the problem of detecting a tight community in a sparse random network. This is formalized as testing for the existence of a dense random subgraph in a random graph. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph on N vertices and with connection probability p0; under the alternative, there is an unknown subgraph on n vertices where the connection probability is p1 > p0. In (Arias-Castro and Verzelen, 2012), we focused on the asymptotically dense regime where p0 is large enough that log(1 ∨ (n p0)^(−1)) = o(log(N/n)). We consider here the asymptotically sparse regime where p0 is small enough that log(N/n) = O(log(1 ∨ (n p0)^(−1))). As before, we derive information theoretic lower bounds, and also establish the performance of various tests. Compared to our previous work (Arias-Castro and Verzelen, 2012), the arguments for the lower bounds are based on the same technology, but are substantially more technical in the details; also, the methods we study are different: besides a variant of the scan statistic, we study other statistics such as the size of the largest connected component, the number of triangles, the eigengap of the adjacency matrix, etc. Our detection bounds are sharp, except in the Poisson regime where we were not able to fully characterize the constant arising in the bound.
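One of the statistics named in this abstract, the number of triangles, is easy to compute on a sampled graph. The sketch below assumes a simple sampler for the null and alternative models described above. The names `sample_graph` and `count_triangles` are hypothetical, and no rejection threshold is given because the abstract does not state one.

```python
import random
from itertools import combinations


def sample_graph(N, p0, n=0, p1=0.0, seed=None):
    """Sample a graph on N vertices with background edge probability p0.

    If n > 0, a random n-subset (the "community") uses probability p1
    for its internal pairs instead. n = 0 gives the null model.
    """
    rng = random.Random(seed)
    community = set(rng.sample(range(N), n))
    adj = [set() for _ in range(N)]
    for u, v in combinations(range(N), 2):
        p = p1 if (u in community and v in community) else p0
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj, community


def count_triangles(adj):
    """Count triangles by summing common neighbours over all edges."""
    total = 0
    for u, v in combinations(range(len(adj)), 2):
        if v in adj[u]:
            total += len(adj[u] & adj[v])
    # Each triangle is seen once from each of its three edges.
    return total // 3


null_adj, _ = sample_graph(N=300, p0=0.02, seed=0)
alt_adj, _ = sample_graph(N=300, p0=0.02, n=30, p1=0.5, seed=0)
print(count_triangles(null_adj), count_triangles(alt_adj))
```

Comparing the two printed counts gives a rough feel for how a dense planted subgraph inflates the triangle count relative to the null model.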
Statistical and computational tradeoffs in estimation of sparse principal components
, 2014
Cited by 6 (3 self)
In recent years, Sparse Principal Component Analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or subgaussian classes. In this paper we show that, under a widely-believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem. More precisely, working with new, larger classes satisfying a Restricted Covariance Concentration condition, we show that no randomised polynomial time algorithm can achieve the minimax optimal rate. On the other hand, we also study a (polynomial time) variant of the well-known semidefinite relaxation estimator, and show that it attains essentially the optimal rate among all randomised polynomial time algorithms.
Finding hidden cliques of size √(N/e) in nearly linear time
 Foundations of Computational Mathematics
, 2014
On the Hardness of Signaling
, 2014
Cited by 3 (2 self)
There has been a recent surge of interest in the role of information in strategic interactions. Much of this work seeks to understand how the realized equilibrium of a game is influenced by uncertainty in the environment and the information available to players in the game. Lurking beneath this literature is a fundamental, yet largely unexplored, algorithmic question: how should a “market maker” who is privy to additional information, and equipped with a specified objective, inform the players in the game? This is an informational analogue of the mechanism design question, and views the information structure of a game as a mathematical object to be designed, rather than an exogenous variable. We initiate a complexity-theoretic examination of the design of optimal information structures in general Bayesian games, a task often referred to as signaling. We focus on one of the simplest instantiations of the signaling question: Bayesian zero-sum games, and a principal who must choose an information structure maximizing the equilibrium payoff of one of the players. In this setting, we show that optimal signaling is computationally intractable, and in some cases hard to approximate, assuming that it is hard to recover a planted clique from an Erdős-Rényi random graph. This is despite the fact that equilibria in these games are computable in polynomial time, and therefore suggests that the hardness of optimal signaling is a distinct phenomenon from the hardness of equilibrium computation. Necessitated by the non-local nature of information structures, en route to our results we prove an “amplification lemma” for the planted clique problem which may be of independent interest. Specifically, we show that even if we plant many cliques in an Erdős-Rényi random graph, so much so that most nodes in the graph are in some planted clique, recovering a constant fraction of the planted cliques is no easier than the traditional planted clique problem.
Robust convex relaxation for the planted clique and densest k-subgraph problems: additional proofs
, 2013
Cited by 3 (0 self)
We consider the problem of identifying the densest k-node subgraph in a given graph. We write this problem as an instance of rank-constrained cardinality minimization and then relax using the nuclear and ℓ1 norms. Although the original combinatorial problem is NP-hard, we show that the densest k-subgraph can be recovered from the solution of our convex relaxation for certain program inputs. In particular, we establish exact recovery in the case that the input graph contains a single planted clique plus noise in the form of corrupted adjacency relationships. We consider two constructions for this noise. In the first, noise is introduced by an adversary deterministically deleting edges within the planted clique and placing diversionary edges. In the second, these edge corruptions are performed at random. Analogous recovery guarantees for identifying the densest subgraph of fixed size in a bipartite graph are also established, and results of numerical simulations for randomly generated graphs are included to demonstrate the efficacy of our algorithm.