Modeling Information Propagation with Survival Theory
, 2013
Cited by 14 (5 self)
Networks provide a ‘skeleton’ for the spread of contagions such as information, ideas, behaviors, and diseases. Often the networks over which contagions diffuse are unobserved and must be inferred. Here we apply survival theory to develop general additive and multiplicative risk models under which the network inference problems can be solved efficiently by exploiting their convexity. Our additive risk model generalizes several existing network inference models, and we show that all of these models are particular cases of our more general model. Our multiplicative model allows for modeling scenarios in which a node can either increase or decrease the risk of activation of another node, in contrast with previous approaches, which consider only positive risk increments. We evaluate the performance of our network inference algorithms on large synthetic and real cascade datasets, and show that our models are able to predict the length and duration of cascades in real data.
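As a rough illustration of the additive risk idea, the sketch below assumes the simplest possible case: each already-activated parent j contributes a constant hazard alpha_j to node i from its own activation time onward, so the total hazard on i is the sum of the parents' rates. The function name, the constant-rate assumption, and the input layout are illustrative choices and are not taken from the paper.

```python
import math

def additive_risk_log_likelihood(t_i, parents):
    """Log-likelihood that node i activates at time t_i (hypothetical helper).

    parents: list of (t_j, alpha_j) pairs, where t_j is the activation time
    of parent j and alpha_j >= 0 is its assumed constant hazard on node i.
    """
    # Only parents activated strictly before t_i with a positive rate matter.
    active = [(t_j, a_j) for t_j, a_j in parents if t_j < t_i and a_j > 0]
    if not active:
        return float("-inf")  # nothing could have activated node i
    # Additive risk: the hazard at t_i is the sum of the parents' hazards.
    total_hazard = sum(a_j for _, a_j in active)
    # Cumulative hazard: each parent has acted on i for (t_i - t_j) time units.
    cumulative = sum(a_j * (t_i - t_j) for t_j, a_j in active)
    # Survival theory: density = hazard * survival, and survival = exp(-cumulative).
    return math.log(total_hazard) - cumulative
```

Under these assumptions the expression is concave in the rates (a log of a linear function minus a linear function), which is the kind of structure that makes the inference problem convex.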
Time varying graphs and social network analysis: Temporal indicators and metrics
 Artificial Intelligence and Simulation of Behaviour (AISB
, 2011
Cited by 14 (6 self)
Most instruments (formalisms, concepts, and metrics) for social network analysis fail to capture the dynamics of these networks. Typical systems exhibit different scales of dynamics, ranging from the fine-grain dynamics of interactions (which recently led researchers to consider temporal versions of distance, connectivity, and related indicators), to the evolution of network properties over longer periods of time. This paper proposes a general formal approach to study networks’ structural evolution for both atemporal and temporal indicators, based respectively on sequences of static graphs and sequences of time-varying graphs that cover successive time windows. All the concepts and indicators, some of which are new, are expressed using a time-varying graph formalism recently proposed in [10]. Experimental results from applying atemporal metrics to a portion of the arXiv scientific community are provided.
Constructing and Sampling Graphs with a Prescribed Joint Degree Distribution
Cited by 14 (0 self)
One of the most influential recent results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real-life networks because they differ on other important metrics such as conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e., the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue of further research. Important tools for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution. This is the main topic of this paper, and we study the problem from both a theoretical and a practical perspective. We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via end point switches. We empirically evaluate the mixing time of this Markov Chain using experiments based on the autocorrelation of each edge. These experiments show that our Markov Chain mixes quickly on these real graphs, allowing our techniques to be used in practice.
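To make the notion of a joint degree distribution concrete, the following minimal sketch counts, for an undirected simple graph given as an edge list, how many edges connect a node of degree k to a node of degree l. The function name and the edge-list input format are assumptions made for illustration; this is not the construction or sampling algorithm from the paper.

```python
from collections import Counter

def joint_degree_counts(edges):
    """Count edges by the degrees of their endpoints (hypothetical helper).

    edges: iterable of (u, v) pairs describing an undirected simple graph.
    Returns a Counter mapping (k, l) with k <= l to the number of edges
    whose endpoints have degrees k and l.
    """
    edges = list(edges)
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter()
    for u, v in edges:
        # Sort the pair so that (k, l) and (l, k) land in the same bin.
        k, l = sorted((degree[u], degree[v]))
        counts[(k, l)] += 1
    return counts
```

Dividing each count by the total number of edges gives the probability that a randomly selected edge joins nodes of degree k and l.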
Network archaeology: uncovering ancient networks from present-day interactions
 PLoS Comput Biol
Cited by 13 (5 self)
What proteins interacted in a long-extinct ancestor of yeast? How have different members of a protein complex assembled together over time? Our ability to answer such questions has been limited by the unavailability of ancestral protein-protein interaction (PPI) networks. To overcome this limitation, we propose several novel algorithms to reconstruct the growth history of a present-day network. Our likelihood-based method finds a probable previous state of the graph by applying an assumed growth model backwards in time. This approach retains node identities so that the history of individual nodes can be tracked. Using this methodology, we estimate protein ages in the yeast PPI network that are in good agreement with sequence-based estimates of age and with structural features of protein complexes. Further, by comparing the quality of the inferred histories for several different growth models (duplication-mutation with complementarity, forest fire, and preferential attachment), we provide additional evidence that a duplication-based model captures many features of PPI network growth better than models designed to mimic social network growth. From the reconstructed history, we model the arrival time of extant and ancestral interactions and predict that complexes have significantly rewired over time and that new edges tend to form within existing complexes. We also hypothesize a distribution of per-protein duplication rates, track the change of the network’s clustering coefficient, and predict paralogous relationships between extant proteins that are likely to be complementary to the relationships inferred using sequence alone. Finally, we infer plausible parameters for
COUNTING TRIANGLES IN MASSIVE GRAPHS WITH MAPREDUCE
, 2013
Cited by 12 (4 self)
Graphs and networks are used to model interactions in a variety of contexts. There is a growing need to quickly assess the characteristics of a graph in order to understand its underlying structure. Some of the most useful metrics are triangle-based and give a measure of the connectedness of mutual friends. This is often summarized in terms of clustering coefficients, which measure the likelihood that two neighbors of a node are themselves connected. Computing these measures exactly for large-scale networks is prohibitively expensive in both memory and time. However, a recent wedge sampling algorithm has proved successful in efficiently and accurately estimating clustering coefficients. In this paper, we describe how to implement this approach in MapReduce to deal with extremely massive graphs. We show results on publicly available networks, the largest of which has 132M nodes and 4.7B edges, as well as artificially generated networks (using the Graph500 benchmark), the largest of which has 240M nodes and 8.5B edges. We can estimate the clustering coefficient by degree bin (e.g., we use exponential binning) and the number of triangles per bin, as well as the global clustering coefficient and total number of triangles, in an average of 0.33 sec. per million edges plus overhead (approximately 225 sec. total for our configuration). The technique can also be used to study triangle statistics such as the ratio of the highest and lowest degree, and we highlight differences between social and nonsocial networks. To the best of our knowledge, these are the largest triangle-based graph computations published to date.
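The wedge sampling idea mentioned above can be sketched on a single machine as follows. A wedge is a path of length two centered at some node; the global clustering coefficient is the fraction of wedges whose two end points are also connected. This is a minimal in-memory sketch, not the MapReduce implementation described in the paper, and the function name and adjacency-dictionary input are assumptions made for illustration.

```python
import random

def wedge_sample_clustering(adj, num_samples, rng):
    """Estimate the global clustering coefficient by wedge sampling.

    adj: dict mapping each node to the set of its neighbors
         (undirected simple graph, hypothetical input format).
    rng: a random.Random instance, passed in for reproducibility.
    """
    # Only nodes of degree >= 2 are the center of at least one wedge.
    centers = [v for v in adj if len(adj[v]) >= 2]
    if not centers:
        return 0.0
    # A node of degree d is the center of d * (d - 1) / 2 wedges, so
    # weighting centers this way picks wedges uniformly at random.
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    closed = 0
    for _ in range(num_samples):
        v = rng.choices(centers, weights=weights, k=1)[0]
        a, b = rng.sample(sorted(adj[v]), 2)  # two distinct neighbors of v
        if b in adj[a]:  # the wedge a - v - b is closed by edge (a, b)
            closed += 1
    return closed / num_samples
```

The estimate is the fraction of sampled wedges that are closed; its accuracy depends on the number of samples rather than on the size of the graph, which is what makes the approach attractive for very large inputs.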
Changepoint detection over graphs with the spectral scan statistic
 arXiv preprint arXiv:1206.0773
, 2012
Cited by 11 (6 self)
We consider the changepoint detection problem of deciding, based on noisy measurements, whether an unknown signal over a given graph is constant or is instead piecewise constant over two induced subgraphs of relatively low cut size. We analyze the corresponding generalized likelihood ratio (GLR) statistic and relate it to the problem of finding a sparsest cut in a graph. We develop a tractable relaxation of the GLR statistic based on the combinatorial Laplacian of the graph, which we call the spectral scan statistic, and analyze its properties. We show how its performance as a testing procedure depends directly on the spectrum of the graph, and use this result to explicitly derive its asymptotic properties on a few graph topologies. Finally, we demonstrate both theoretically and by simulations that the spectral scan statistic can outperform naive testing procedures based on edge thresholding and χ² testing.
Tied Kronecker Product Graph Models to Capture Variance in Network Populations
Cited by 11 (6 self)
Much of the past work on mining and modeling networks has focused on understanding the observed properties of single example graphs. However, in many real-life applications it is important to characterize the structure of populations of graphs. In this work, we investigate the distributional properties of Kronecker product graph models (KPGMs) [1]. Specifically, we examine whether these models can represent the natural variability in graph properties observed across multiple networks and find, surprisingly, that they cannot. By considering KPGMs from a new viewpoint, we can show the reason for this lack of variance theoretically, which is primarily due to the generation of each edge independently from the others. Based on this understanding we propose a generalization of KPGMs that uses tied parameters to increase the variance of the model, while preserving the expectation. We then show experimentally that our mixed-KPGM can adequately capture the natural variability across a population of networks.
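For readers unfamiliar with KPGMs, the sketch below shows the standard (untied) construction the abstract refers to: a small initiator matrix of probabilities is raised to a Kronecker power, and each edge is then drawn independently with the resulting probability. It does not implement the tied or mixed variants proposed in the paper. The function names, the example initiator values, and the list-of-lists matrix format are assumptions made for illustration.

```python
import random

def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def kron_power(theta, k):
    """k-th Kronecker power of the initiator matrix theta (k >= 1)."""
    result = theta
    for _ in range(k - 1):
        result = kron(result, theta)
    return result

def sample_graph(probs, rng):
    """Draw a 0/1 adjacency matrix, sampling every edge independently."""
    return [[1 if rng.random() < p else 0 for p in row] for row in probs]

# Example: a 2x2 initiator (values chosen arbitrarily) gives a 4x4 model.
theta = [[0.9, 0.5], [0.5, 0.1]]
edge_probs = kron_power(theta, 2)
graph = sample_graph(edge_probs, random.Random(0))
```

Because every entry of the adjacency matrix is an independent Bernoulli draw, graph statistics such as the edge count concentrate tightly around their expectation, which is the lack of variance the abstract attributes to independent edge generation.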
Uncover Topic-Sensitive Information Diffusion Networks
 In AISTATS, 2012b
Cited by 11 (6 self)
Analyzing the spreading patterns of memes with respect to their topic distributions and the underlying diffusion network structures is an important task in social network analysis. This task often becomes very challenging because the underlying diffusion networks are hidden and the topic-specific transmission rates are also unknown. In this paper, we propose a continuous-time model, TOPICCASCADE, for topic-sensitive information diffusion networks, and infer the hidden diffusion networks and the topic-dependent transmission rates from the observed time stamps and contents of cascades. One attractive property of the model is that its parameters can be estimated via a convex optimization, which we solve with an efficient proximal gradient based block coordinate descent (BCD) algorithm. In both synthetic and real-world data, we show that our method significantly improves over the previous state-of-the-art models in terms of both recovering the hidden diffusion networks and predicting the transmission times of memes.
Graphlet decomposition of a weighted network
, 2012
Cited by 10 (3 self)
We introduce the graphlet decomposition of a weighted network, which encodes a notion of social information based on social structure. We develop a scalable algorithm, which combines EM with Bron-Kerbosch in a novel fashion, for estimating the parameters of the model underlying graphlets using one network sample. We explore theoretical properties of graphlets, including computational complexity, redundancy and expected accuracy. We test graphlets on synthetic data, and we analyze messaging on Facebook and crime associations in the 19th century.
Transforming Graph Data for Statistical Relational Learning
, 2012
Cited by 10 (4 self)
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities and results of SRL algorithms. We introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. More specifically, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.