Results 11–20 of 51
Structure and Dynamics of Information Pathways in Online Media
Abstract

Cited by 21 (1 self)
Diffusion of information, spread of rumors, and spread of infectious diseases are all instances of stochastic processes that occur over the edges of an underlying network. Often the networks over which contagions spread are unobserved, and such networks are frequently dynamic and change over time. In this paper, we investigate the problem of inferring dynamic networks from information diffusion data. We assume there is an unobserved dynamic network that changes over time, while we observe the results of a dynamic process spreading over its edges. The task is then to infer the edges and the dynamics of the underlying network. We develop an online algorithm that relies on stochastic convex optimization to efficiently solve the dynamic network inference problem. We apply our algorithm to information diffusion among 3.3 million mainstream media and blog sites and experiment with more than 179 million different pieces of information spreading over the network in a one-year period. We study the evolution of information pathways in the online media space and find interesting insights. Information pathways for general recurrent topics are more stable across time than those for ongoing news events. Clusters of news media sites and blogs often emerge and vanish in a matter of days for ongoing news events. Major social movements and events involving the civil population, such as the Libyan civil war or the Syrian uprising, lead to an increase in information pathways among blogs, as well as an overall increase in the network centrality of blogs and social media sites.
MLbase: A Distributed Machine-learning System
Abstract

Cited by 19 (2 self)
Machine learning (ML) and statistical techniques are key to transforming big data into actionable knowledge. In spite of the modern primacy of data, the complexity of existing ML algorithms is often overwhelming—many users do not understand the tradeoffs and challenges of parameterizing and choosing between different learning techniques. Furthermore, existing scalable systems that support machine learning are typically not accessible to ML researchers without a strong background in distributed systems and low-level primitives. In this work, we present our vision for MLbase, a novel system harnessing the power of machine learning for both end-users and ML researchers. MLbase provides (1) a simple declarative way to specify ML tasks, (2) a novel optimizer to select and dynamically adapt the choice of learning algorithm, (3) a set of high-level operators to enable ML researchers to scalably implement a wide range of ML methods without deep systems knowledge, and (4) a new runtime optimized for the data-access patterns of these high-level operators.
Communication-Efficient Algorithms for Statistical Optimization
Abstract

Cited by 19 (4 self)
We study two communication-efficient algorithms for distributed statistical optimization on large-scale data. The first algorithm is an averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as O(N^{-1} + (N/m)^{-2}). Whenever m ≤ √N, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of the bootstrap. Requiring only a single round of communication, it has mean-squared error that decays as O(N^{-1} + (N/m)^{-3}), and so is more robust to the amount of parallelization. We complement our theoretical results with experiments on large-scale problems from the internet search domain. In particular, we show that our methods efficiently solve an advertisement prediction problem from the Chinese SoSo Search Engine, which consists of N ≈ 2.4×10^8 samples and d ≥ 700,000 dimensions.
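The paper's first scheme is simple enough to simulate in a few lines. The sketch below splits the samples across m simulated machines, solves each local subproblem, and averages the estimates; the least-squares subproblem and all problem sizes are illustrative choices, not the paper's experimental setup:

```python
import numpy as np

def average_mixture(X, y, m, solve):
    """Distribute the N samples evenly to m 'machines', minimize
    separately on each subset, then average the m estimates."""
    X_parts = np.array_split(X, m)
    y_parts = np.array_split(y, m)
    estimates = [solve(Xi, yi) for Xi, yi in zip(X_parts, y_parts)]
    return np.mean(estimates, axis=0)

def least_squares(Xi, yi):
    # Local subproblem: ordinary least squares (any ERM solver would do).
    return np.linalg.lstsq(Xi, yi, rcond=None)[0]

rng = np.random.default_rng(0)
N, d, m = 4000, 5, 8
w_true = rng.normal(size=d)
X = rng.normal(size=(N, d))
y = X @ w_true + 0.1 * rng.normal(size=N)

w_avg = average_mixture(X, y, m, least_squares)
print(np.linalg.norm(w_avg - w_true))  # error comparable to a centralized fit
```

Since m = 8 is well below √N ≈ 63 here, the abstract's condition for matching the centralized rate holds comfortably.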
Trading computation for communication: Distributed stochastic dual coordinate ascent
 in NIPS
, 2013
Abstract

Cited by 14 (2 self)
We present and study a distributed optimization algorithm based on a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often perform better than stochastic gradient descent methods for regularized loss minimization problems, yet they have received little study in a distributed framework. We make progress along this line by presenting a distributed stochastic dual coordinate ascent algorithm for a star network, with an analysis of the tradeoff between computation and communication. We verify our analysis with experiments on real data sets. Moreover, we compare the proposed algorithm with distributed stochastic gradient descent methods and the distributed alternating direction method of multipliers for optimizing SVMs in the same distributed framework, and observe competitive performance.
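As a rough illustration of the idea (not the paper's exact algorithm or analysis), the sketch below simulates SDCA for a hinge-loss SVM on a star network: each of K workers runs local dual coordinate updates, and a hub averages the workers' primal increments once per communication round. The data, parameters, and the conservative averaging step at the hub are all illustrative assumptions:

```python
import numpy as np

def dist_sdca_svm(X, y, K=4, rounds=30, local_iters=100, lam=0.01, seed=0):
    """Simulated distributed SDCA for min_w lam/2*||w||^2 + mean(hinge).
    Each round: every worker runs local dual coordinate updates against
    its copy of w; the hub averages the primal increments (one message
    per worker per round) and broadcasts the new w."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    parts = np.array_split(np.arange(n), K)   # each worker owns a data shard
    alpha = np.zeros(n)                        # dual variables, one per sample
    w = np.zeros(d)
    for _ in range(rounds):
        increments = np.zeros(d)
        for idx in parts:                      # workers run conceptually in parallel
            w_local = w.copy()
            for _ in range(local_iters):
                i = rng.choice(idx)
                margin = y[i] * (X[i] @ w_local)
                # Closed-form SDCA step for the hinge loss; alpha_i stays in [0, 1].
                new_ai = np.clip(alpha[i] + lam * n * (1.0 - margin) / (X[i] @ X[i]),
                                 0.0, 1.0)
                w_local += (new_ai - alpha[i]) * y[i] * X[i] / (lam * n)
                alpha[i] = new_ai
            increments += w_local - w
        w = w + increments / K                 # hub: average increments, broadcast
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = np.sign(X @ np.array([1.0, -2.0]))         # linearly separable labels
w = dist_sdca_svm(X, y)
acc = np.mean(np.sign(X @ w) == y)
```

Raising `local_iters` trades more local computation for fewer communication rounds, which is the tradeoff the abstract refers to.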
Parallel MCMC via Weierstrass sampler. arXiv preprint arXiv:1312.4605
, 2013
Asynchronous stochastic coordinate descent: Parallelism and convergence properties
Online Distributed Optimization via Dual Averaging
 IEEE Conference on Decision and Control
, 2013
Abstract

Cited by 4 (2 self)
Abstract—This paper presents a regret analysis for a distributed online optimization problem computed over a network of agents. The goal is to distributively optimize a global objective function that decomposes into a sum of convex cost functions, one per agent. Since the agents face uncertainties in the environment, their cost functions change at each time step. The regret of an algorithm is the difference between the cost of the sequence of decisions it generates and the cost of the best fixed decision in hindsight. We extend a distributed algorithm based on dual subgradient averaging to the online setting. The proposed algorithm yields an upper bound on regret as a function of the underlying network topology, specifically its connectivity. A model for distributed sensor estimation is proposed and the corresponding simulation results are presented.
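A minimal simulation of the distributed dual-averaging idea, under illustrative assumptions: each agent's time-varying cost is a scalar quadratic drawn around a common target, the network is a 4-agent ring, and the mixing matrix and step-size schedule are made up for the example:

```python
import numpy as np

def distributed_dual_averaging(thetas, P, alpha=0.5):
    """Agents i = 0..n-1 suffer costs f_{i,t}(x) = 0.5*(x - thetas[t,i])^2.
    Each step: mix dual (gradient-sum) variables with neighbors via the
    doubly stochastic matrix P, then take a dual-averaging primal step."""
    T, n = thetas.shape
    z = np.zeros(n)                    # dual variables
    x = np.zeros(n)                    # each agent's decision
    total_loss = 0.0
    for t in range(T):
        total_loss += 0.5 * np.sum((x - thetas[t]) ** 2)
        grads = x - thetas[t]          # gradient of each agent's current cost
        z = P @ z + grads              # consensus step on the duals
        x = -(alpha / np.sqrt(t + 1)) * z   # prox step with psi(x) = x^2/2
    return x, total_loss

# Ring of 4 agents with self-loops; rows and columns of P sum to 1.
n, T = 4, 2000
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[i, (i - 1) % n] = 0.25
    P[i, (i + 1) % n] = 0.25

rng = np.random.default_rng(2)
thetas = 1.0 + 0.1 * rng.normal(size=(T, n))   # noisy per-agent targets
x, total_loss = distributed_dual_averaging(thetas, P)
```

The agents end up near consensus close to the global minimizer (the mean of all targets); how fast the mixed duals agree depends on the connectivity of P, which is where the network topology enters the regret bound.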
Petuum: A framework for iterative-convergent distributed ML
 In arXiv:1312.7651
, 2013
GPU asynchronous stochastic gradient descent to speed up neural network training
 CoRR
Abstract

Cited by 4 (0 self)
The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision. These results have largely come from computational breakthroughs of two forms: model parallelism, e.g. GPU-accelerated training, which has seen quick adoption in computer vision circles, and data parallelism, e.g. asynchronous SGD (ASGD), which has been deployed at large scale mostly in industry. We report early experiments with a system that makes use of both model parallelism and data parallelism, which we call GPU ASGD. We show that with GPU ASGD it is possible to speed up training of large convolutional neural networks useful for computer vision. We believe GPU ASGD will make it possible to train larger networks on larger training sets in a reasonable amount of time.
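The data-parallel half of this design can be caricatured with lock-free asynchronous SGD on shared parameters. The sketch below uses Python threads and a least-squares model purely for illustration; real ASGD systems shard data across machines and train neural networks:

```python
import threading
import numpy as np

def async_sgd(X, y, n_workers=4, epochs=5, lr=0.01):
    """Hogwild-style asynchronous SGD for least squares: workers share
    one parameter vector and apply lock-free in-place updates."""
    n, d = X.shape
    w = np.zeros(d)
    shards = np.array_split(np.arange(n), n_workers)

    def worker(idx):
        rng = np.random.default_rng(int(idx[0]))
        for _ in range(epochs):
            for i in rng.permutation(idx):
                grad = (X[i] @ w - y[i]) * X[i]
                w[:] -= lr * grad       # no lock: updates may interleave

    threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 2.0])
X = rng.normal(size=(1000, 3))
y = X @ w_true + 0.1 * rng.normal(size=1000)
w = async_sgd(X, y)
```

Occasional lost updates from the unlocked writes act like extra gradient noise; for sparse or well-conditioned problems this barely slows convergence, which is what makes lock-free data parallelism attractive.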