Results 1-10 of 116
Supervised Random Walks: Predicting and Recommending Links in Social Networks
Cited by 147 (3 self)
Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future, or which existing interactions we are missing. Although this problem has been extensively studied, the challenge of how to effectively combine the information from the network structure with rich node and edge attribute data remains largely open. We develop an algorithm based on Supervised Random Walks that naturally combines the information from the network structure with node- and edge-level attributes. We achieve this by using these attributes to guide a random walk on the graph. We formulate a supervised learning task where the goal is to learn a function that assigns strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. We develop an efficient training algorithm to directly learn the edge strength estimation function. Our experiments on the Facebook social graph and large collaboration networks show that our approach outperforms state-of-the-art unsupervised approaches as well as approaches that are based on feature extraction.
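As a rough illustration of how edge attributes can guide a random walk, the sketch below scores each edge with a logistic function of its features and then runs a random walk with restart from a source node. The logistic form, the restart probability `alpha`, the fixed weight vector, and all function names are assumptions made for this example; the supervised step that learns the weights from future links is not shown.

```python
import math


def edge_strength(features, weights):
    """Assumed logistic edge strength: 1 / (1 + exp(-w . features))."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def random_walk_with_restart(edges, weights, source, alpha=0.3, iters=100):
    """Approximate visiting probabilities of a walk restarting at `source`.

    edges: dict mapping a directed edge (u, v) to its feature list.
    weights: parameter vector of the edge strength function (hypothetical,
             fixed here rather than learned).
    """
    nodes = sorted({n for edge in edges for n in edge})
    strength = {e: edge_strength(f, weights) for e, f in edges.items()}

    # Total outgoing strength per node, used to normalise transitions.
    out_total = {n: 0.0 for n in nodes}
    for (u, _), a in strength.items():
        out_total[u] += a

    p = {n: (1.0 if n == source else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        # Follow an edge with probability proportional to its strength.
        for (u, v), a in strength.items():
            nxt[v] += (1.0 - alpha) * p[u] * a / out_total[u]
        # Restart mass, plus mass stranded on nodes without out-edges.
        dangling = sum(p[n] for n in nodes if out_total[n] == 0.0)
        nxt[source] += alpha + (1.0 - alpha) * dangling
        p = nxt
    return p
```

Nodes with a higher visiting probability would be ranked as more likely future neighbours of `source`; raising the weight on a feature shifts probability mass toward nodes reached through edges that carry that feature.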
On the Convexity of Latent Social Network Inference
Cited by 55 (4 self)
In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We consider contagions propagating over the edges of an unobserved social network, where we only observe the times when nodes became infected, but not who infected them. Given such node infection times, we then identify the optimal network that best explains the observed data. We present a maximum likelihood approach based on convex programming with an l1-like penalty term that encourages sparsity. Experiments on real and synthetic data reveal that our method near-perfectly recovers the underlying network structure as well as the parameters of the contagion propagation model. Moreover, our approach scales well, as it can infer optimal networks of thousands of nodes in a matter of minutes.
Limiting the Spread of Misinformation in Social Networks
Cited by 54 (2 self)
In this work, we study the notion of competing campaigns in a social network. By modeling the spread of influence in the presence of competing campaigns, we provide necessary tools for applications such as emergency response where the goal is to limit the spread of misinformation. We study the problem of influence limitation where a “bad” campaign starts propagating from a certain node in the network and use the notion of limiting campaigns to counteract the effect of misinformation. The problem can be summarized as identifying a subset of individuals that need to be convinced to adopt the competing (or “good”) campaign so as to minimize the number of people that adopt the “bad” campaign at the end of both propagation processes. We show that this optimization problem is NP-hard and provide approximation guarantees for a greedy solution for various definitions of this problem by proving that they are submodular. Although the greedy algorithm is a polynomial-time algorithm, for today’s large-scale social networks even this solution is computationally very expensive. Therefore, we study the performance of the degree centrality heuristic as well as other heuristics that have implications for our specific problem. The experiments on a number of close-knit regional networks obtained from the Facebook social network show that in most cases inexpensive heuristics do in fact compare well with the greedy approach.
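The degree centrality heuristic mentioned in this abstract can be sketched in a few lines: rank nodes by their number of neighbours and take the top k as seeds for the limiting campaign. The function name, the undirected edge-list input, the `exclude` parameter, and the tie-breaking rule are assumptions made for this illustration, not details taken from the paper.

```python
from collections import defaultdict


def top_degree_nodes(edges, k, exclude=()):
    """Return the k highest-degree nodes of an undirected edge list.

    exclude: nodes that may not be chosen, e.g. the node where the
             competing campaign started (hypothetical parameter).
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    excluded = set(exclude)
    candidates = [n for n in degree if n not in excluded]
    # Highest degree first; ties broken by label so results are repeatable.
    candidates.sort(key=lambda n: (-degree[n], str(n)))
    return candidates[:k]
```

Unlike a greedy search, this ranking needs only one pass over the edges and no simulation of the propagation process, which is why it is inexpensive on large graphs.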
Detecting and tracking political abuse in social media
In Proceedings of ICWSM, 2011
Cited by 35 (5 self)
We study astroturf political campaigns on microblogging platforms: politically motivated individuals and organizations that use multiple centrally controlled accounts to create the appearance of widespread support for a candidate or opinion. We describe a machine learning framework that combines topological, content-based and crowdsourced features of information diffusion networks on Twitter to detect the early stages of viral spreading of political misinformation. We present promising preliminary results with better than 96% accuracy in the detection of astroturf content in the run-up to the 2010 U.S. midterm elections.
Influence maximization in continuous time diffusion networks
arXiv preprint arXiv:1205.1682, 2012
Cited by 24 (6 self)
The problem of finding the optimal set of source nodes in a diffusion network that maximizes the spread of information, influence, and diseases in a limited amount of time depends dramatically on the underlying temporal dynamics of the network. However, this still remains largely unexplored to date. To this end, given a network and its temporal dynamics, we first describe how continuous-time Markov chains allow us to analytically compute the average total number of nodes reached by a diffusion process starting in a set of source nodes. We then show that selecting the set of most influential source nodes in the continuous-time influence maximization problem is NP-hard and develop an efficient approximation algorithm with provable near-optimal performance. Experiments on synthetic and real diffusion networks show that our algorithm outperforms other state-of-the-art algorithms by at least ∼20% and is robust across different network topologies.
How to win friends and influence people, truthfully: Influence maximization mechanisms for social networks
In WSDM, 2012
Cited by 24 (3 self)
Throughout the past decade there has been extensive research on algorithmic and data mining techniques for solving the problem of influence maximization in social networks: if one can incentivize a subset of individuals to become early adopters of a new technology, which subset should be selected so that the word-of-mouth effect in the social network is maximized? Despite the progress in modeling and techniques, the incomplete information aspect of the problem has been largely overlooked. While data can often provide the network structure and influence patterns may be observable, the inherent cost individuals have to become early adopters is difficult to extract. In this paper we introduce mechanisms that elicit individuals’ costs while providing desirable approximation guarantees in some of the most well-studied models of social network influence. We follow the mechanism design framework, which advocates for allocation and payment schemes that incentivize individuals to report their true information. We also performed experiments using the Mechanical Turk platform and social network data to provide evidence of the framework’s effectiveness in practice.
Scalable influence estimation in continuous-time diffusion networks
 In
, 2013
Cited by 23 (6 self)
If a piece of information is released from a media site, can we predict whether it may spread to one million web pages, in a month? This influence estimation problem is very challenging since both the time-sensitive nature of the task and the requirement of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with |V| nodes and |E| edges to an accuracy of ε using n = O(1/ε²) randomizations and, up to logarithmic factors, O(n|E| + n|V|) computations. When used as a subroutine in a greedy influence maximization approach, our proposed algorithm is guaranteed to find a set of C nodes with an influence of at least (1 − 1/e) OPT − 2Cε, where OPT is the optimal value. Experiments on both synthetic and real-world data show that the proposed algorithm can easily scale up to networks of millions of nodes while significantly improving over previous state-of-the-art methods in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence.
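The greedy influence maximization loop that an influence estimator plugs into can be sketched as follows. This is a generic greedy selection, not the paper's randomized estimator: the `influence` callback is a stand-in supplied by the caller, and the function name and parameters are assumptions made for this example.

```python
def greedy_select(candidates, budget, influence):
    """Pick up to `budget` nodes, each time adding the largest marginal gain.

    influence: callable mapping a list of nodes to an estimated influence
               value (hypothetical stand-in for any influence estimator).
    """
    chosen = []
    remaining = list(candidates)
    for _ in range(min(budget, len(remaining))):
        base = influence(chosen)
        # Node whose addition increases the estimated influence the most.
        best = max(remaining, key=lambda n: influence(chosen + [n]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

For a monotone submodular influence function evaluated exactly, this loop gives the classical (1 − 1/e) approximation; when the influence values are only estimates, the guarantee weakens by an additive error term, as in the bound quoted in the abstract.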
Structure and Dynamics of Information Pathways in Online Media
Cited by 22 (1 self)
Diffusion of information, spread of rumors and infectious diseases are all instances of stochastic processes that occur over the edges of an underlying network. Often the networks over which contagions spread are unobserved, and such networks are often dynamic and change over time. In this paper, we investigate the problem of inferring dynamic networks based on information diffusion data. We assume there is an unobserved dynamic network that changes over time, while we observe the results of a dynamic process spreading over the edges of the network. The task then is to infer the edges and the dynamics of the underlying network. We develop an online algorithm that relies on stochastic convex optimization to efficiently solve the dynamic network inference problem. We apply our algorithm to information diffusion among 3.3 million mainstream media and blog sites and experiment with more than 179 million different pieces of information spreading over the network in a one-year period. We study the evolution of information pathways in the online media space and report several findings. Information pathways for general recurrent topics are more stable across time than for ongoing news events. Clusters of news media sites and blogs often emerge and vanish in a matter of days for ongoing news events. Major social movements and events involving the civil population, such as the Libyan civil war or the Syrian uprising, lead to an increased number of information pathways among blogs as well as an overall increase in the network centrality of blogs and social media sites.
Detecting and tracking the spread of astroturf memes in microblog streams
arXiv preprint arXiv:1011.3768, 2010
Cited by 22 (3 self)
Online social media are complementing and in some cases replacing person-to-person social interaction and redefining the diffusion of information. In particular, microblogs have become crucial grounds on which public relations, marketing, and political battles are fought. We introduce an extensible framework that will enable the real-time analysis of meme diffusion in social media by mining, visualizing, mapping, classifying, and modeling massive streams of public microblogging events. We describe a Web service that leverages this framework to track political memes in Twitter and help detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections. We present some cases of abusive behaviors uncovered by our service. Finally, we discuss promising preliminary results on the detection of suspicious memes via supervised learning based on features extracted from the topology of the diffusion networks, sentiment analysis, and crowdsourced annotations.
Learning Networks of Heterogeneous Influence
In NIPS, 2012a
Cited by 19 (7 self)
Information, disease, and influence diffuse over networks of entities in both natural systems and human society. Analyzing these transmission networks plays an important role in understanding the diffusion processes and predicting future events. However, the underlying transmission networks are often hidden and incomplete, and we observe only the time stamps when cascades of events happen. In this paper, we address the challenging problem of uncovering the hidden network only from the cascades. The structure discovery problem is complicated by the fact that the influence between networked entities is heterogeneous, which cannot be described by a simple parametric model. Therefore, we propose a kernel-based method which can capture a diverse range of different types of influence without any prior assumption. In both synthetic and real cascade data, we show that our model can better recover the underlying diffusion network and drastically improve the estimation of the transmission functions among networked entities.