Results 1–10 of 94
Consistency of spectral clustering
, 2004
Abstract

Cited by 572 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about the consistency of most clustering algorithms. In this paper we investigate the consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is consistent only under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that the methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
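The normalized Laplacian construction the abstract refers to can be sketched as follows. This is a minimal illustration only, not the paper's algorithm; the function name and the two-triangle example graph are invented for the sketch:

```python
import numpy as np

def normalized_spectral_embedding(W, k):
    """Embed nodes via the k smallest eigenvectors of the symmetric
    normalized graph Laplacian L_sym = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    return vecs[:, :k]

# Two triangles joined by one weak edge: the second eigenvector
# (the Fiedler vector) separates the two clusters by sign.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
W[2, 3] = W[3, 2] = 0.01
emb = normalized_spectral_embedding(W, 2)
```

Clustering the rows of `emb` (e.g., with k-means) then recovers the two groups; the consistency question the paper studies is whether this procedure stabilizes as the sample size grows.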
Randomized Gossip Algorithms
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2006
Abstract

Cited by 532 (5 self)
Motivated by applications to sensor, peer-to-peer, and ad hoc networks, we study distributed algorithms, also known as gossip algorithms, for exchanging information and for computing in an arbitrarily connected network of nodes. The topology of such networks changes continuously as new nodes join and old nodes leave the network. Algorithms for such networks need to be robust against changes in topology. Additionally, nodes in sensor networks operate under limited computational, communication, and energy resources. These constraints have motivated the design of “gossip” algorithms: schemes which distribute the computational burden and in which a node communicates with a randomly chosen neighbor. We analyze the averaging problem under the gossip constraint for an arbitrary network graph, and find that the averaging time of a gossip algorithm depends on the second largest eigenvalue of a doubly stochastic matrix characterizing the algorithm. Designing the fastest gossip algorithm corresponds to minimizing this eigenvalue, which is a semidefinite program (SDP). In general, SDPs cannot be solved in a distributed fashion; however, exploiting problem structure, we propose a distributed subgradient method that solves the optimization problem over the network. The relation of averaging time to the second largest eigenvalue naturally relates it to the mixing time of a random walk with transition probabilities derived from the gossip algorithm. We use this connection to study the performance and scaling of gossip algorithms on two popular networks: Wireless Sensor Networks, which are modeled as Geometric Random Graphs, and the Internet graph under the so-called Preferential Connectivity (PC) model.
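The pairwise exchange the abstract describes can be sketched as a toy simulation. The ring topology, step count, and function name below are illustrative choices, not taken from the paper:

```python
import random

def gossip_average(values, neighbors, steps, seed=0):
    """Randomized pairwise gossip: at each step a random node averages
    its value with a randomly chosen neighbor. The sum (and hence the
    network average) is preserved at every step."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(steps):
        i = rng.randrange(len(x))
        j = rng.choice(neighbors[i])
        avg = (x[i] + x[j]) / 2.0
        x[i] = x[j] = avg
    return x

# Ring of 4 nodes; the true average of the initial values is 4.0
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
x = gossip_average([0.0, 4.0, 8.0, 4.0], neighbors, steps=500)
```

How quickly all values collapse to the average is exactly the "averaging time" the paper bounds via the second largest eigenvalue of the expected update matrix.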
Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
 Journal of Machine Learning Research
, 2006
Abstract

Cited by 92 (10 self)
This paper introduces a novel spectral framework for solving Markov decision processes (MDPs) by jointly learning representations and optimal policies. The major components of the framework described in this paper include: (i) a general scheme for constructing representations or basis functions by diagonalizing symmetric diffusion operators; (ii) a specific instantiation of this approach where global basis functions called proto-value functions (PVFs) are formed using the eigenvectors of the graph Laplacian on an undirected graph formed from state transitions induced by the MDP; (iii) a three-phased procedure called representation policy iteration (RPI), comprising a sample collection phase, a representation learning phase that constructs basis functions from samples, and a final parameter estimation phase that determines an (approximately) optimal policy within the (linear) subspace spanned by the (current) basis functions; (iv) a specific instantiation of the RPI framework using least-squares policy iteration (LSPI) as the parameter estimation method; (v) several strategies for scaling the proposed approach to large discrete and continuous state spaces, including the Nyström extension for out-of-sample interpolation of eigenfunctions, and the use of Kronecker sum factorization to construct compact eigenfunctions in product spaces such as factored MDPs; and (vi) finally, a series of illustrative discrete and continuous control tasks, which both illustrate the concepts and provide a benchmark for evaluating the proposed approach. Many challenges remain to be addressed in scaling the proposed framework to large MDPs, and several elaborations of the proposed framework are briefly summarized at the end.
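Item (ii) can be sketched on a tiny example: proto-value functions are simply the smoothest eigenvectors of the combinatorial graph Laplacian of the state graph. The 5-state chain MDP below is an invented example, not one of the paper's benchmarks:

```python
import numpy as np

def proto_value_functions(adj, k):
    """Return the k smoothest eigenvectors of the combinatorial graph
    Laplacian L = D - W of the state graph, for use as basis functions
    in value-function approximation."""
    W = np.asarray(adj, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)    # eigenvalues ascending: smoothest first
    return vecs[:, :k]

# 5-state chain MDP: each state transitions to its neighbors
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
pvf = proto_value_functions(W, 3)
```

A value function is then approximated as a linear combination of these columns, with the weights fit in the parameter estimation phase (e.g., by LSPI).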
Matrix Approximation and Projective Clustering via Volume Sampling
, 2006
Abstract

Cited by 90 (3 self)
Frieze, Kannan, and Vempala (JACM 2004) proved that a small sample of rows of a given matrix A spans the rows of a low-rank approximation D that minimizes ‖A − D‖_F within a small additive error, and the sampling can be done efficiently using just two passes over the matrix. In this paper, we generalize this result in two ways. First, we prove that the additive error drops exponentially by iterating the sampling in an adaptive manner (adaptive sampling). Using this result, we give a pass-efficient algorithm for computing a low-rank approximation with reduced additive error. Our second result is that there exist k rows of A whose span contains the rows of a multiplicative (k + 1)-approximation to the best rank-k matrix; moreover, this subset can be found by sampling k-subsets of rows from a natural distribution (volume sampling). Combining volume sampling with adaptive sampling yields the existence of a set of k + k(k + 1)/ε rows whose span contains the rows of a multiplicative (1 + ε)-approximation. This leads to a PTAS for the following NP-hard …
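The squared-norm row-sampling distribution behind the Frieze–Kannan–Vempala result can be sketched in a few lines. This is a simplified single-round illustration of the basic distribution, not the paper's adaptive or volume sampling, and the rank-1 matrix is a toy example:

```python
import numpy as np

def length_squared_row_sample(A, s, seed=0):
    """Sample s rows of A with probability proportional to their squared
    norms, rescaled so the sketch S satisfies E[S^T S] = A^T A."""
    rng = np.random.default_rng(seed)
    p = (A ** 2).sum(axis=1)
    p = p / p.sum()
    idx = rng.choice(len(A), size=s, replace=True, p=p)
    return A[idx] / np.sqrt(s * p[idx, None])

A = np.outer(np.arange(1.0, 5.0), np.ones(3))   # a rank-1 4x3 toy matrix
S = length_squared_row_sample(A, s=2)
```

Projecting the rows of A onto the span of the sampled rows gives the additive-error low-rank approximation; adaptive sampling repeats this on the residual to shrink the error exponentially.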
On the Geometry of Differential Privacy
, 2009
Abstract

Cited by 89 (5 self)
We consider the noise complexity of differentially private mechanisms in the setting where the user asks d linear queries f: ℜ^n → ℜ non-adaptively. Here, the database is represented by a vector in ℜ^n and proximity between databases is measured in the ℓ1 metric. We show that the noise complexity is determined by two geometric parameters associated with the set of queries. We use this connection to give tight upper and lower bounds on the noise complexity for any d ≤ n. We show that for d random linear queries of sensitivity 1, it is necessary and sufficient to add ℓ2 error Θ(min{d√d/ε, d√(log(n/d))/ε}) to achieve ε-differential privacy. Assuming the truth of a deep conjecture from convex geometry, known as the Hyperplane conjecture, we can extend our results to arbitrary linear queries, giving nearly matching upper and lower bounds. Our bound translates to error O(min{d/ε, √(d log(n/d))/ε}) per answer. The best previous upper bound (Laplacian mechanism) gives a bound of O(min{d/ε, √n/ε}) per answer, while the best known lower bound was Ω(√d/ε). In contrast, our lower bound is strong enough to separate the concept of differential privacy from the notion of approximate differential privacy, where an upper bound of O(√d/ε) can be achieved.
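The baseline mechanism mentioned in the comparison can be sketched as follows. This is a generic illustration of Laplace noise addition; the function name and the simple composition-style scale b = d · sensitivity / ε are my assumptions, not code or analysis from the paper:

```python
import math
import random

def laplace_mechanism(answers, sensitivity, epsilon, seed=0):
    """Answer d queries under ε-differential privacy by adding independent
    Laplace noise of scale b = d * sensitivity / epsilon to each answer
    (naive composition across the d queries)."""
    rng = random.Random(seed)
    b = len(answers) * sensitivity / epsilon
    noisy = []
    for a in answers:
        u = rng.random() - 0.5                 # uniform on (-0.5, 0.5)
        # inverse-CDF sample from Laplace(0, b)
        noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
        noisy.append(a + noise)
    return noisy

# With a very large epsilon the added noise is negligible
out = laplace_mechanism([1.0, 2.0, 3.0], sensitivity=1.0, epsilon=1e9)
```

The paper's contribution is showing that the right noise level is governed by the geometry of the query set, improving on this per-coordinate Laplace baseline in the regime d ≪ n.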
Decentralized estimation and control of graph connectivity in mobile sensor networks
 in American Control Conference
, 2008
Abstract

Cited by 61 (0 self)
The ability of a robot team to reconfigure itself is useful in many applications: for metamorphic robots to change shape, for swarm motion towards a goal, for biological systems to avoid predators, or for mobile buoys to clean up oil spills. In many situations, auxiliary constraints, such as connectivity between team members or limits on the maximum hop-count, must be satisfied during reconfiguration. In this paper, we show that both the estimation and control of the graph connectivity can be accomplished in a decentralized manner. We describe a decentralized estimation procedure that allows each agent to track the algebraic connectivity of a time-varying graph. Based on this estimator, we further propose a decentralized gradient controller for each agent to maintain global connectivity during motion.
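The quantity each agent tracks, the algebraic connectivity λ₂, is easy to compute centrally as a reference point. The sketch below is a centralized computation, not the paper's decentralized estimator, and the example graphs are invented:

```python
import numpy as np

def algebraic_connectivity(W):
    """Second-smallest eigenvalue of the graph Laplacian L = D - W;
    strictly positive if and only if the graph is connected."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return np.sort(np.linalg.eigvalsh(L))[1]

# Path graph 0-1-2 is connected; dropping the edge 1-2 disconnects node 2.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
disconnected = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
```

A gradient controller of the kind the abstract describes moves agents so as to keep this eigenvalue bounded away from zero as the graph evolves.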
Controlling connectivity of dynamic graphs
 Proc. 44th IEEE Conference on Decision and Control and European Control Conference,
, 2005
Abstract

Cited by 60 (5 self)
Dynamic networks have recently emerged as an efficient way to model various forms of interaction within teams of mobile agents, such as sensing and communication. This article focuses on the use of graphs as models of wireless communications. In this context, graphs have been used widely in the study of robotic and sensor networks and have provided an invaluable modeling framework to address a number of coordinated tasks ranging from exploration, surveillance, and reconnaissance to cooperative construction and manipulation. In fact, these success stories have almost always relied on efficient information exchange and coordination between the members of the team, as seen, e.g., in the case of distributed state agreement, where multi-hop communication has been proven necessary for convergence and performance guarantees.
Hierarchical Spatial Gossip for Multi-Resolution Representations in Sensor Networks
, 2007
Abstract

Cited by 25 (10 self)
In this paper we propose a lightweight algorithm for constructing multi-resolution data representations for sensor networks. We compute, at each sensor node u, O(log n) aggregates about exponentially enlarging neighborhoods centered at u. The i-th aggregate is the aggregated data among nodes approximately within 2^i hops of u. We present a scheme, named the hierarchical spatial gossip algorithm, to extract and construct these aggregates, for all sensors simultaneously, with a total communication cost of O(n polylog n). The hierarchical gossip algorithm adopts atomic communication steps, with each node choosing to exchange information with a node distance d away with probability 1/d^3. The attractiveness of the algorithm stems from its simplicity, low communication cost, distributed nature, and robustness to node failures and link failures. Besides the natural applications of multi-resolution data summaries in data validation and information mining, we also demonstrate the application of the precomputed spatial multi-resolution data summaries in answering range queries efficiently.
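The distance-weighted partner choice at the heart of spatial gossip can be sketched as below. This is a toy illustration assuming partner probability proportional to 1/d³ over known node positions; the function name and the three-node layout are invented:

```python
import random

def pick_gossip_partner(u, positions, seed=0):
    """Choose a gossip partner for node u with probability proportional
    to 1/d^3, where d is the Euclidean distance to the candidate node."""
    rng = random.Random(seed)
    others, weights = [], []
    for v, p in positions.items():
        if v == u:
            continue
        d = ((p[0] - positions[u][0]) ** 2 + (p[1] - positions[u][1]) ** 2) ** 0.5
        others.append(v)
        weights.append(1.0 / d ** 3)
    return rng.choices(others, weights=weights, k=1)[0]

# Node 1 is 1 unit away, node 2 is 10 units away (1000x less likely)
positions = {0: (0, 0), 1: (1, 0), 2: (10, 0)}
partner = pick_gossip_partner(0, positions)
```

The 1/d³ falloff makes nearby exchanges dominate while still allowing occasional long-range hops, which is what keeps the total communication cost near-linear.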
Optimization and Analysis of Distributed Averaging with Short Node Memory
Abstract

Cited by 23 (6 self)
Distributed averaging describes a class of network algorithms for the decentralized computation of aggregate statistics. Initially, each node has a scalar data value, and the goal is to compute the average of these values at every node (the so-called average consensus problem). Nodes iteratively exchange information with their neighbors and perform local updates until the value at every node converges to the initial network average. Much previous work has focused on algorithms where each node maintains and updates a single value; every time an update is performed, the previous value is forgotten. Convergence to the average consensus is achieved asymptotically. The convergence rate is fundamentally limited by network connectivity, and it can be prohibitively slow on topologies such as grids and random geometric graphs, even if the update rules are optimized. In this paper, we provide the first theoretical demonstration that adding a local prediction component to the update rule can significantly improve the convergence rate of distributed averaging algorithms. We focus on the case where the local predictor is a linear combination of the node’s current and previous values (i.e., two memory taps), and our update rule computes a combination of the predictor and the usual weighted linear combination of values received from neighbouring nodes. We derive the optimal mixing parameter for combining the predictor with the neighbors’ values, and conduct a theoretical analysis of the improvement in convergence rate that can be achieved using this acceleration methodology. For a chain topology on N nodes, this leads to a factor of N improvement over standard consensus, and for a two-dimensional grid, our approach achieves a factor of √N improvement, in terms of the number of iterations required to reach a prescribed level of accuracy.
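The flavor of a two-memory-tap update can be sketched with a heavy-ball-style recursion. This stand-in recursion and the parameter β = 0.2 are illustrative assumptions, not the paper's update rule or its optimized mixing parameter:

```python
import numpy as np

def two_tap_consensus(x0, W, beta, steps):
    """Distributed averaging that keeps one extra memory tap:
    x(t+1) = (1 + beta) * W @ x(t) - beta * x(t-1).
    With beta = 0 this reduces to standard consensus x(t+1) = W @ x(t)."""
    x_prev = np.asarray(x0, dtype=float)
    x = W @ x_prev
    for _ in range(steps - 1):
        x, x_prev = (1 + beta) * (W @ x) - beta * x_prev, x
    return x

# 3-node chain with a symmetric doubly stochastic weight matrix,
# so the fixed point is the average (here 3.0) at every node
W = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
x = two_tap_consensus([0.0, 3.0, 6.0], W, beta=0.2, steps=60)
```

Each per-node step still uses only values received from neighbors plus one remembered value, which is why the acceleration remains fully decentralized.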