Results 11–20 of 55
Feature-enhanced probabilistic models for diffusion network inference
 In European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD '12)
, 2012
Abstract

Cited by 8 (0 self)
Cascading processes, such as disease contagion, viral marketing, and information diffusion, are a pervasive phenomenon in many types of networks. The problem of devising intervention strategies to facilitate or inhibit such processes has recently received considerable attention. However, a major challenge is that the underlying network is often unknown. In this paper, we revisit the problem of inferring latent network structure given observations from a diffusion process, such as the spread of trending topics in social media. We define a family of novel probabilistic models that can explain recurrent cascading behavior, and take into account not only the time differences between events but also a richer set of additional features. We show that MAP inference is tractable and can therefore scale to very large real-world networks. Further, we demonstrate the effectiveness of our approach by inferring the underlying network structure of a subset of the popular Twitter following network by analyzing the topics of a large number of messages posted by users over a 10-month period. Experimental results show that our models accurately recover the links of the Twitter network, and significantly improve the performance over previous models based entirely on time.
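As a rough illustration of the kind of feature-enhanced transmission model this abstract describes, one can combine an exponential time-decay term with a logistic function of edge features. The functional form, feature, and parameter names below are illustrative assumptions, not the paper's actual model:

```python
import math

def transmission_score(time_diff, features, weights, bias=0.0, decay=1.0):
    """Hypothetical score that node u infected node v: a logistic
    feature term modulated by an exponential time-difference decay."""
    logit = bias + sum(w * f for w, f in zip(weights, features))
    p_feat = 1.0 / (1.0 + math.exp(-logit))   # feature-based link probability
    p_time = math.exp(-decay * time_diff)     # penalty for large time gaps
    return p_feat * p_time

# Toy example: two candidate parents for an infected node, scored by
# their time gap and a single (assumed) shared-topic feature.
p_a = transmission_score(time_diff=0.5, features=[1.0], weights=[2.0])
p_b = transmission_score(time_diff=3.0, features=[0.0], weights=[2.0])
assert p_a > p_b  # closer in time and sharing a feature -> more likely parent
```

Under a model of this shape, each score factors into the cascade likelihood, and MAP inference over the weights selects the most plausible edges.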
Topology Discovery of Sparse Random Graphs With Few Participants
Abstract

Cited by 8 (3 self)
We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain end-to-end measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sublinear edit-distance guarantee using a sublinear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sublinear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of end-to-end information along all the paths between the participants.
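A minimal sketch of the measurement setup: end-to-end hop counts between participants on an unweighted graph, as a simplified stand-in for the paper's random delay measurements (the graph and function names are illustrative). Hidden nodes are traversed by the shortest paths but never measured directly:

```python
from collections import deque

def participant_distances(adj, participants):
    """End-to-end hop counts between participant pairs via BFS.
    Only participant-to-participant distances are reported; hidden
    nodes carry the path but yield no measurements of their own."""
    dist = {}
    for src in participants:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for dst in participants:
            if dst != src and dst in seen:
                dist[(src, dst)] = seen[dst]
    return dist

# Path graph 0-1-2-3 where only the endpoints 0 and 3 participate;
# nodes 1 and 2 are hidden but still carry the measurement path.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
d = participant_distances(adj, [0, 3])
assert d[(0, 3)] == 3
```

Topology discovery then amounts to inverting such a distance table back to an edge set, which is what the paper's guarantees quantify.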
Submodular Inference of Diffusion Networks from Multiple Trees
, 2012
Abstract

Cited by 7 (3 self)
Diffusion and propagation of information, influence and diseases take place over increasingly larger networks. We observe when a node copies information, makes a decision or becomes infected, but networks are often hidden or unobserved. Since networks are highly dynamic, changing and growing rapidly, we only observe a relatively small set of cascades before a network changes significantly. Scalable network inference based on a small cascade set is then necessary for understanding the rapidly evolving dynamics that govern diffusion. In this article, we develop a scalable approximation algorithm with provable near-optimal performance based on submodular maximization which achieves high accuracy in such scenarios, solving an open problem first introduced by Gomez-Rodriguez et al. (2010). Experiments on synthetic and real diffusion data show that our algorithm in practice achieves an optimal trade-off between accuracy and running time.
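The submodular-maximization machinery this abstract relies on can be illustrated with the textbook greedy algorithm for monotone submodular functions under a cardinality constraint, shown here on a toy coverage objective. This is the generic technique with its classic (1 − 1/e) guarantee, not the paper's specific algorithm:

```python
def greedy_max_coverage(sets, k):
    """Plain greedy for monotone submodular maximization under a
    cardinality constraint: repeatedly pick the element with the
    largest marginal gain (here, newly covered items)."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(sets, key=lambda s: len(sets[s] - covered))
        if not sets[best] - covered:
            break  # no remaining marginal gain
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Hypothetical candidate edges, each "explaining" a set of cascades.
candidate_edges = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
}
picked, covered = greedy_max_coverage(candidate_edges, k=2)
assert covered == {1, 2, 3, 4, 5, 6, 7}
```

Network-inference objectives of this family are monotone submodular in the selected edge set, which is what makes the greedy approximation guarantee carry over.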
Learning Diffusion Probability based on Node Attributes in Social Networks
Abstract

Cited by 7 (1 self)
Information diffusion over a social network is analyzed by modeling the successive interactions of neighboring nodes as probabilistic processes of state changes. We address the problem of estimating the parameters (diffusion probability and time-delay parameter) of the probabilistic model as a function of the node attributes from the observed diffusion data, by formulating it as a maximum likelihood problem. We show that the parameters are obtained by an iterative updating algorithm which is efficient and guaranteed to converge. We tested the performance of the learning algorithm on three real-world networks assuming the attribute dependency, and confirmed that the dependency can be correctly learned. We further show that the influence degree of each node based on the link-dependent diffusion probabilities is substantially different from that obtained assuming a uniform diffusion probability approximated by the average of the true link-dependent diffusion probabilities.
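As a simplified stand-in for the paper's iterative estimator: if the time-delay component is ignored and every activation attempt is fully observed, the maximum-likelihood estimate of an independent-cascade diffusion probability per attribute value reduces to a success frequency (the data and attribute names below are made up for illustration):

```python
def attribute_diffusion_mle(trials):
    """MLE of a per-attribute diffusion probability from fully observed
    activation attempts: successes / attempts for each attribute value."""
    counts = {}
    for attr, succeeded in trials:
        succ, total = counts.get(attr, (0, 0))
        counts[attr] = (succ + int(succeeded), total + 1)
    return {attr: succ / total for attr, (succ, total) in counts.items()}

# Each trial: (attribute of the link, did the activation succeed?)
trials = [("friend", True), ("friend", True), ("friend", False),
          ("stranger", False), ("stranger", True)]
probs = attribute_diffusion_mle(trials)
assert abs(probs["friend"] - 2 / 3) < 1e-9
```

The paper's actual algorithm handles time delays and partial observability, which is why it needs an iterative update rather than this closed form; the frequency estimate only conveys what "diffusion probability as a function of node attributes" means.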
Trace Complexity of Network Inference
, 2013
Abstract

Cited by 5 (1 self)
The network inference problem consists of reconstructing the edge set of a network given traces representing the chronology of infection times as epidemics spread through the network. This problem is a paradigmatic representative of prediction tasks in machine learning that require deducing a latent structure from observed patterns of activity in a network, and that often require an unrealistically large amount of resources (e.g., available data or computational time). A fundamental question is to understand which properties we can predict with a reasonable degree of accuracy given the available resources, and which we cannot. We define the trace complexity as the number of distinct traces required to achieve high fidelity in reconstructing the topology of the unobserved network or, more generally, some of its properties. We give algorithms that are competitive with, while being simpler and more efficient …
Predicting Information Diffusion on Social Networks with Partial Knowledge
Abstract

Cited by 3 (0 self)
Models of information diffusion and propagation over large social media usually rely on a Closed World Assumption: information can only propagate along the network's relational structure, it cannot come from external sources, and the network structure is assumed to be fully known by the model. These assumptions are unrealistic for many propagation processes extracted from social websites. We address the problem of predicting information propagation when the network diffusion structure is unknown, and without making any closed-world assumption. Instead of modeling a diffusion process, we propose to directly predict the final propagation state of the information over a whole user set. We describe a general model, able to learn to predict which users are the most likely to be contaminated by the information given an initial state of the network. Different instances are proposed and evaluated on artificial datasets.
Link prediction in graphs with autoregressive features
 In Neural Information Processing Systems
, 2012
Topology Discovery of Sparse Random Graphs With Few Participants
, 2011
Abstract

Cited by 3 (0 self)
We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain end-to-end measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sublinear edit-distance guarantee using a sublinear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sublinear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of end-to-end information along all the paths between the participants.
Parameter Learning for Latent Network Diffusion
Abstract

Cited by 2 (1 self)
Diffusion processes in networks are increasingly used to model dynamic phenomena such as the spread of information, wildlife, or social influence. Our work addresses the problem of learning the underlying parameters that govern such a diffusion process by observing the times at which nodes become active. A key advantage of our approach is that, unlike previous work, it can tolerate missing observations for some nodes in the diffusion process. Having incomplete observations is characteristic of offline networks used to model the spread of wildlife. We develop an EM algorithm to address parameter learning in such settings. Since both the E and M steps are computationally challenging, we employ a number of optimization methods, such as nonlinear and difference-of-convex programming, to address these challenges. Evaluation of the approach on the Red-cockaded Woodpecker conservation problem shows that it is highly robust and accurately learns parameters in various settings, even with more than 80% missing data.
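A minimal EM sketch in the spirit of this abstract, assuming exponentially distributed activation delays and treating nodes never observed activating as right-censored at the observation horizon. This is a far simpler setting than the paper's latent-network model, but it shows the E-step/M-step pattern for handling missing observations:

```python
def em_exponential_rate(observed, censored_at, iters=100):
    """Estimate the rate of exponential activation delays when some
    delays are only known to exceed a censoring horizon (the node was
    never seen activating before observation stopped)."""
    lam = 1.0  # initial guess for the rate
    n = len(observed) + len(censored_at)
    for _ in range(iters):
        # E-step: by memorylessness, a delay censored at t has
        # expected value t + 1/lam under the current estimate.
        expected_total = sum(observed) + sum(t + 1.0 / lam for t in censored_at)
        # M-step: closed-form MLE of an exponential rate.
        lam = n / expected_total
    return lam

# Three observed delays; two nodes never activated within horizon 2.0.
lam = em_exponential_rate(observed=[0.5, 1.0, 1.5], censored_at=[2.0, 2.0])
assert lam > 0
```

For this toy model the EM fixed point matches the known censored-data MLE (number of uncensored delays divided by total observed time), which is a useful sanity check; the paper's E and M steps have no such closed forms, hence its use of nonlinear and difference-of-convex programming.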