Results 1 - 10 of 359
Statistical properties of community structure in large social and information networks
Cited by 246 (14 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
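As a rough illustration of the conductance measure named in this abstract, the sketch below scores a node set by the number of edges leaving it divided by the smaller of the two edge volumes. This is one common definition and is assumed here rather than taken from the paper; the function name `conductance`, the adjacency-dict representation, and the toy graph `two_triangles` are all hypothetical.

```python
def conductance(adj, nodes):
    """Conductance of `nodes` in an undirected graph.

    `adj` maps each node to a list of its neighbours (assumed format).
    Returns cut(S) / min(vol(S), vol(V \\ S)); lower means more community-like.
    """
    inside = set(nodes)
    # Edges with exactly one endpoint inside the set.
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    # Volume = sum of degrees.
    vol_inside = sum(len(adj[u]) for u in inside)
    vol_outside = sum(len(neigh) for neigh in adj.values()) - vol_inside
    smaller = min(vol_inside, vol_outside)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller


# Toy graph (an assumption for illustration): two triangles, 0-1-2 and 3-4-5,
# joined by the single edge 2-3.
two_triangles = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

# One triangle: a single cut edge against a volume of 7 on each side.
print(conductance(two_triangles, {0, 1, 2}))
# A single node: every incident edge is cut.
print(conductance(two_triangles, {0}))
```

A community profile in the spirit of the abstract would record, for each set size, the lowest conductance found among sets of that size. Finding that minimum exactly is hard in general, which is presumably why the second entry below mentions approximation algorithms for graph partitioning.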
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
2008
Cited by 208 (17 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and on-line social networks, to technological and information networks and
Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes
In Proceedings of the 22nd Annual IEEE Conference on Computational Complexity, 2007
Cited by 120 (7 self)
We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of right-hand vertices are polynomially close to optimal, whereas the previous constructions of Ta-Shma, Umans, and Zuckerman (STOC ’01) required at least one of these to be quasipolynomial in the optimal. Our expanders have a short and self-contained description and analysis, based on the ideas underlying the recent list-decodable error-correcting codes of Parvaresh and Vardy (FOCS ’05). Our expanders can be interpreted as near-optimal “randomness condensers,” that reduce the task of extracting randomness from sources of arbitrary min-entropy rate to extracting randomness from sources of min-entropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. (STOC ’03) and improving upon it when the error parameter is small (e.g. 1/poly(n)).
Naïve Learning in Social Networks and the Wisdom of Crowds
2010
Cited by 98 (1 self)
We study learning in a setting where agents receive independent noisy signals about the true value of a variable and then communicate in a network. They naïvely update beliefs by repeatedly taking weighted averages of neighbors’ opinions. We show that all opinions in a large society converge to the truth if and only if the influence of the most influential agent vanishes as the society grows. We also identify obstructions to this, including prominent groups, and provide structural conditions on the network ensuring efficient learning. Whether agents converge to the truth is unrelated to how quickly consensus is approached. (JEL D83, D85, Z13)
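The naïve updating rule described in this abstract, repeated weighted averaging of opinions, can be sketched in a few lines. The sketch assumes a row-stochastic trust matrix in which agents may also place weight on their own opinion. The names `average_once` and `iterate_beliefs`, the matrix `trust`, and the starting beliefs are hypothetical choices for illustration, not taken from the paper.

```python
def average_once(trust, beliefs):
    """One round: each agent replaces its belief with a weighted average.

    `trust[i][j]` is the weight agent i places on agent j; each row is
    assumed to sum to 1.
    """
    return [sum(w * b for w, b in zip(row, beliefs)) for row in trust]


def iterate_beliefs(trust, beliefs, rounds):
    """Apply the averaging rule `rounds` times and return the final beliefs."""
    for _ in range(rounds):
        beliefs = average_once(trust, beliefs)
    return beliefs


# Hypothetical three-agent line network: the middle agent listens to both
# neighbours, the end agents listen only to themselves and the middle agent.
trust = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
initial = [1.0, 0.0, 0.0]

final = iterate_beliefs(trust, initial, rounds=200)
print(final)
```

In this toy network the left eigenvector of `trust` for eigenvalue 1 is (0.25, 0.5, 0.25), so the three beliefs approach a consensus that weights the middle agent's initial opinion twice as heavily as either neighbour's. This is the sense of "influence" in the abstract's condition that the most influential agent's influence must vanish as the society grows.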
Twice-Ramanujan sparsifiers
In Proc. 41st STOC, 2009
Cited by 88 (12 self)
We prove that for every $d > 1$ and every undirected, weighted graph $G = (V, E)$, there exists a weighted graph $H$ with at most $\lceil d|V| \rceil$ edges such that for every $x \in \mathbb{R}^V$,
$$1 \le \frac{x^T L_H x}{x^T L_G x} \le \frac{d + 1 + 2\sqrt{d}}{d + 1 - 2\sqrt{d}},$$
where $L_G$ and $L_H$ are the Laplacian matrices of $G$ and $H$, respectively.
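To get a feel for the upper bound in this abstract, the ratio (d + 1 + 2√d)/(d + 1 − 2√d), it helps to note that the numerator and denominator are both perfect squares in √d. The worked value below uses d = 4, an arbitrary choice made only for illustration.

```latex
\[
\frac{d + 1 + 2\sqrt{d}}{d + 1 - 2\sqrt{d}}
  = \left( \frac{\sqrt{d} + 1}{\sqrt{d} - 1} \right)^{2},
\qquad
d = 4:\quad \frac{4 + 1 + 4}{4 + 1 - 4} = 9 .
\]
```

Under that choice, the statement guarantees a graph H with at most ⌈4|V|⌉ edges whose Laplacian quadratic form stays within a factor of 9 of that of G. Increasing d brings the factor closer to 1, at the cost of more edges.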
Optimal and scalable distribution of content updates over a mobile social network
In Proc. IEEE INFOCOM, 2009
"... Number: CR-PRL-2008-08-0001 ..."
Arithmetic Circuits: a survey of recent results and open questions
Cited by 62 (5 self)
A large class of problems in symbolic computation can be expressed as the task of computing some polynomials, and arithmetic circuits form the most standard model for studying the complexity of such computations. This algebraic model of computation has attracted a large amount of research in the last five decades, partially due to its simplicity and elegance. Since it is a more structured model than Boolean circuits, one could hope that the fundamental problems of theoretical computer science, such as separating P from NP, will be easier to solve for arithmetic circuits. However, in spite of the apparent simplicity and the vast amount of mathematical tools available, no major breakthrough has been seen. In fact, all the fundamental questions are still open for this model as well. Nevertheless, there has been a lot of progress in the area and beautiful results have been found, some in the last few years. As examples we mention the connection between polynomial identity testing and lower bounds of Kabanets and Impagliazzo, the lower bounds of Raz for multilinear formulas, and two new approaches for proving lower bounds: Geometric Complexity Theory and Elusive Functions. The goal of this monograph is to survey the field of arithmetic circuit complexity, focusing mainly on what we find to be the most interesting and accessible research directions. We aim to cover the main results and techniques, with an emphasis on works from the last two decades. In particular, we
Efficient and Robust Compressed Sensing using Optimized Expander Graphs
Cited by 47 (6 self)
Expander graphs have been recently proposed to construct efficient compressed sensing algorithms. In particular, it has been shown that any n-dimensional vector that is k-sparse can be fully recovered using O(k log n) measurements and only O(k log n) simple recovery iterations. In this paper we improve upon this result by considering expander graphs with expansion coefficient beyond 3/4 and show that, with the same number of measurements, only O(k) recovery iterations are required, which is a significant improvement when n is large. In fact, full recovery can be accomplished by at most 2k very simple iterations. The number of iterations can be reduced arbitrarily close to k, and the recovery algorithm can be implemented very efficiently using a simple priority queue with total recovery time O(n log(n/k)). We also show that by tolerating a small penalty on the number of measurements, and not on the number of recovery iterations, one can use the efficient construction of a family of expander graphs to come up with explicit measurement matrices for this method. We compare our result with other recently developed expander-graph-based methods and argue that it compares favorably both in terms of the number of required measurements and in terms of the time complexity and the simplicity of recovery. Finally we will show how our analysis extends to give a robust algorithm that finds the position and sign of the k significant elements of an almost k-sparse signal and then, using very simple optimization techniques, finds a k-sparse signal which is close to the best k-term approximation of the original signal.
Naïve Learning in Social Networks: Convergence, Influence, and the Wisdom of Crowds
2007
Cited by 45 (3 self)
We study learning and influence in a setting where agents communicate according to an arbitrary social network and naïvely update their beliefs by repeatedly taking weighted averages of their neighbors’ opinions. A focus is on conditions under which beliefs of all agents in large societies converge to the truth, despite their naïve updating. We show that this happens if and only if the influence of the most influential agent in the society is vanishing as the society grows. Using simple examples, we identify two main obstructions which can prevent this. By ruling out these obstructions, we provide general structural conditions on the social network that are sufficient for convergence to truth. In addition, we show how social influence changes when some agents redistribute their trust, and we provide a complete characterization of the social networks for which there is a convergence of beliefs. Finally, we survey some recent structural results on the speed of convergence and relate these to issues of segregation, polarization and propaganda.