Scalable influence estimation in continuous-time diffusion networks
- 2013
Cited by 23 (6 self)
If a piece of information is released from a media site, can we predict whether it may spread to one million web pages in a month? This influence estimation problem is very challenging since both the time-sensitive nature of the task and the requirement of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with |V| nodes and |E| edges to an accuracy of ε using n = O(1/ε²) randomizations and, up to logarithmic factors, O(n|E| + n|V|) computations. When used as a subroutine in a greedy influence maximization approach, our proposed algorithm is guaranteed to find a set of C nodes with an influence of at least (1 − 1/e) OPT − 2Cε, where OPT is the optimal value. Experiments on both synthetic and real-world data show that the proposed algorithm can easily scale up to networks of millions of nodes while significantly improving over the previous state of the art in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence.
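The greedy influence maximization loop mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's randomized estimator: the influence function below is a hypothetical one-hop coverage count, and the names `greedy_select`, `toy_influence`, and `REACH` are assumptions made for illustration only. With an exact oracle for a monotone submodular function, greedy selection achieves the classic (1 − 1/e) factor; the abstract's additional slack term accounts for using estimated rather than exact influence.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_select(
    candidates: Iterable[str],
    influence: Callable[[FrozenSet[str]], float],
    budget: int,
) -> List[str]:
    """Pick up to `budget` nodes, each time adding the node with the largest marginal gain."""
    chosen: List[str] = []
    remaining = set(candidates)
    current = influence(frozenset())
    for _ in range(min(budget, len(remaining))):
        best_node, best_gain = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(remaining):
            gain = influence(frozenset(chosen + [node])) - current
            if gain > best_gain:
                best_node, best_gain = node, gain
        chosen.append(best_node)
        remaining.remove(best_node)
        current += best_gain
    return chosen


# Hypothetical toy data: the set of nodes each seed reaches (assumption, not from the paper).
REACH = {
    "a": {"a", "b", "c"},
    "b": {"b", "c"},
    "c": {"c", "d"},
    "d": {"d"},
}


def toy_influence(seeds: FrozenSet[str]) -> float:
    """Stand-in influence: number of distinct nodes covered by the seed set."""
    covered = set()
    for seed in seeds:
        covered |= REACH[seed]
    return float(len(covered))


selected = greedy_select(REACH.keys(), toy_influence, budget=2)
print(selected, toy_influence(frozenset(selected)))
```

In a setting like the one the abstract describes, `toy_influence` would be replaced by an estimator of expected spread within a time window, and the cost of each oracle call is what makes a scalable estimator matter.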
Shaping Social Activity by Incentivizing Users
Cited by 10 (7 self)
Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, and exogenous events, where users take actions due to drives external to the network. How much external drive should be provided to each user so that the network activity can be steered towards a target state? In this paper, we model social events using multivariate Hawkes processes, which can capture both endogenous and exogenous event intensities, and derive a time-dependent linear relation between the intensity of exogenous events and the overall network activity. Exploiting this connection, we develop a convex optimization framework for determining the level of external drive required for the network to reach a desired activity level. We experiment with event data gathered from Twitter and show that our method can steer the activity of the network more accurately than alternatives.
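To make the split between exogenous and endogenous intensity concrete, here is a minimal sketch of a multivariate Hawkes intensity. The abstract does not specify a kernel, so the exponential decay kernel, the function name `hawkes_intensity`, and all parameter values are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def hawkes_intensity(
    t: float,
    mu: Sequence[float],
    alpha: Sequence[Sequence[float]],
    decay: float,
    events: Sequence[Tuple[float, int]],
) -> List[float]:
    """Intensity of every user at time t.

    mu[i]       exogenous (external) base rate of user i
    alpha[i][j] how strongly an event by user j excites user i (endogenous part)
    decay       rate of the assumed exponential kernel exp(-decay * (t - s))
    events      past events as (time, user) pairs
    """
    lam = list(mu)  # start from the exogenous drive
    for s, j in events:
        if s < t:  # only strictly earlier events contribute
            kernel = math.exp(-decay * (t - s))
            for i in range(len(mu)):
                lam[i] += alpha[i][j] * kernel
    return lam


# Hypothetical two-user network.
mu = [0.5, 0.2]
alpha = [[0.0, 0.8], [0.3, 0.0]]
events = [(0.0, 0), (1.0, 1)]
rates = hawkes_intensity(2.0, mu, alpha, decay=1.0, events=events)
print(rates)
```

Because `mu` enters the intensity additively, raising a user's exogenous rate shifts activity in a way that is linear in `mu`, which is the kind of structure the abstract exploits when it formulates steering as a convex problem.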
SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity
Cited by 1 (0 self)
Social networking websites allow users to create and share content. Big information cascades of post resharing can form as users of these sites reshare others' posts with their friends and followers. One of the central challenges in understanding such cascading behaviors is in forecasting information outbreaks, where a single post becomes widely popular by being reshared by many users. In this paper, we focus on predicting the final number of reshares of a given post. We build on the theory of self-exciting point processes to develop a statistical model that allows us to make accurate predictions. Our model requires no training or expensive feature engineering. It results in a simple and efficiently computable formula that allows us to answer questions, in real time, such as: Given a post's resharing history so far, what is our current estimate of its final number of reshares? Is the post resharing cascade past the initial stage of explosive growth? And, which posts will be the most reshared in the future? We validate our model using one month of complete Twitter data and demonstrate a strong improvement in predictive accuracy over existing approaches. Our model gives only 15% relative error in predicting the final size of an average information cascade after observing it for just one hour.
Hawkes Processes with Stochastic Excitations
We propose an extension to Hawkes processes by treating the levels of self-excitation as a stochastic differential equation. Our new point process allows better approximation in application domains where events and intensities accelerate each other with correlated levels of contagion. We generalize a recent algorithm for simulating draws from Hawkes processes whose levels of excitation are stochastic processes, and propose a hybrid Markov chain Monte Carlo approach for model fitting. Our sampling procedure scales linearly with the number of required events and does not require stationarity of the point process. A modular inference procedure consisting of a combination of Gibbs and Metropolis-Hastings steps is put forward. We recover expectation maximization as a special case. Our general approach is illustrated for contagion following geometric Brownian motion and exponential Langevin dynamics.
Linear processes in high-dimension: phase space and critical properties
In this work we investigate the generic properties of a stochastic linear model in the regime of high dimensionality. We consider in particular the Vector AutoRegressive model (VAR) and the multivariate Hawkes process. We analyze both deterministic and random versions of these models, showing the existence of a stable and an unstable phase. We find that along the transition region separating the two regimes, the correlations of the process decay slowly, and we characterize the conditions under which these slow correlations are expected to become power laws. We check our findings with numerical simulations showing remarkable agreement with our predictions. We finally argue that real systems with a strong degree of self-interaction are naturally characterized by this type of slow relaxation of the correlations.
Multistage Campaigning in Social Networks
We consider the problem of how to optimize multi-stage campaigning over social networks. The dynamic programming framework is employed to balance the high present reward and large penalty on low future outcome in the presence of extensive uncertainties. In particular, we establish theoretical foundations of optimal campaigning over social networks where the user activities are modeled as a multivariate Hawkes process, and we derive a time-dependent linear relation between the intensity of exogenous events and several commonly used objective functions of campaigning. We further develop a convex dynamic programming framework for determining the optimal intervention policy that prescribes the required level of external drive at each stage for the desired campaigning result. Experiments on both synthetic data and the real-world MemeTracker dataset show that our algorithm can steer the user activities for optimal campaigning much more accurately than baselines.
Modeling Tweet Arrival Times using Log-Gaussian Cox Processes
Research on modeling time series text corpora has typically focused on predicting what text will come next, but less well studied is predicting when the next text event will occur. In this paper we address the latter case, framed as modeling continuous inter-arrival times under a log-Gaussian Cox process, a form of inhomogeneous Poisson process which captures the varying rate at which the tweets arrive over time. In an application to rumour modeling of tweets surrounding the 2014 Ferguson riots, we show how inter-arrival times between tweets can be accurately predicted, and that incorporating textual features further improves predictions.
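As a rough illustration of an inhomogeneous Poisson process with a time-varying arrival rate, the sketch below samples arrival times by thinning. It is not the paper's model: in a log-Gaussian Cox process the rate is itself random (the exponential of a Gaussian process), whereas here a fixed, hypothetical `toy_rate` stands in for one realization of that rate. The function names and all numeric values are assumptions.

```python
import math
import random
from typing import Callable, List


def sample_inhomogeneous_poisson(
    rate: Callable[[float], float],
    rate_max: float,
    horizon: float,
    rng: random.Random,
) -> List[float]:
    """Sample arrival times on (0, horizon) by thinning.

    Candidates come from a homogeneous process with rate `rate_max`; each
    candidate at time t is kept with probability rate(t) / rate_max.
    Requires rate(t) <= rate_max for all t in the window.
    """
    times: List[float] = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # gap to the next candidate
        if t >= horizon:
            return times
        if rng.random() * rate_max <= rate(t):
            times.append(t)


def toy_rate(t: float) -> float:
    """Hypothetical smoothly varying arrival rate, bounded in [0.5, 3.5]."""
    return 2.0 + 1.5 * math.sin(t)


arrivals = sample_inhomogeneous_poisson(toy_rate, 3.5, 20.0, random.Random(0))
gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
print(len(arrivals), "arrivals; first gaps:", gaps[:3])
```

The `gaps` list corresponds to the inter-arrival times that the abstract sets out to predict; they are shorter on average where the rate is high and longer where it is low.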
Video Popularity Prediction by Sentiment Propagation via Implicit Network
Video popularity prediction plays a foundational role in many aspects of life, such as recommendation systems and investment consulting. Because of its technological and economic importance, this problem has been extensively studied for years. However, four constraints have limited the usability of most related works. First, most feature-oriented models are inadequate in the social media environment, because many videos are published with no specific content features, such as a strong cast or a famous script. Second, many studies assume that there is a linear correlation between view counts from early and later days, but this is not the case in every scenario. Third, numerous works take only view counts into consideration and discount the associated sentiments, even though it is public opinion that directly drives a video's final success or failure. Fourth, many related approaches rely on a network topology, but such topologies are unavailable in many situations. Here, we propose a Dual Sentimental Hawkes Process (DSHP) to cope with all of the problems above. DSHP's innovations are reflected in three ways: (1) it breaks the "Linear Correlation" assumption and implements a Hawkes process; (2) it reveals deeper factors that affect a video's popularity; and (3) it is topology free. We evaluate DSHP on four types of videos (Movies, TV Episodes, Music Videos, and Online News) and compare its performance against 6 widely used models: Translation Model, Multiple Linear Regression, KNN Regression, ARMA, Reinforced Poisson Process, and Univariate Hawkes Process. Our model outperforms all of the others, which indicates a promising application prospect.
Algorithms, Theory
We present in this paper a framework to model information diffusion in social networks based on linear multivariate Hawkes processes. Our model exploits the effective broadcasting times of information by users, which guarantees a more realistic view of the information diffusion process. The proposed model takes into consideration not only interactions between users but also interactions between topics, which provides a deeper analysis of influences in social networks. We provide an estimation algorithm based on non-negative matrix factorization techniques, which together with a dimensionality reduction argument is able to discover, in addition, the latent community structure of the social network. We also provide several numerical results of our method.