Pregel: A System for Large-Scale Graph Processing
In SIGMOD, 2010
Cited by 472 (0 self)
Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs (in some cases billions of vertices, trillions of edges) poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
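The iteration model in this abstract can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions, not the Pregel API: the function name `run_supersteps`, the adjacency-dict graph format, and the choice of maximum-value propagation as the example program are all hypothetical. It shows the synchronous pattern in which messages sent in one iteration are only read in the next one.

```python
from collections import defaultdict


def run_supersteps(graph, values, max_supersteps=50):
    """Propagate the largest value through a directed graph, one iteration at a time.

    graph  : dict mapping each vertex to a list of out-neighbors (assumed format)
    values : dict mapping each vertex to its initial numeric value
    Returns the final values and the number of iterations executed.
    """
    values = dict(values)          # work on a copy of the vertex state
    inbox = defaultdict(list)      # messages readable in the current iteration
    active = set(graph)            # every vertex is active in iteration 0
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = defaultdict(list)  # messages that become readable next iteration
        for v in active:
            best = max([values[v]] + inbox[v])
            changed = best > values[v]
            values[v] = best
            # Send only in the first iteration or after learning a larger value;
            # a vertex that stays silent is effectively inactive until it gets mail.
            if superstep == 0 or changed:
                for w in graph.get(v, ()):
                    outbox[w].append(best)
        inbox = outbox
        active = set(outbox)        # only vertices with incoming messages run next
        superstep += 1
    return values, superstep


if __name__ == "__main__":
    g = {"a": ["b"], "b": ["a", "d"], "c": ["b", "d"], "d": ["c"]}
    v = {"a": 3, "b": 6, "c": 2, "d": 1}
    print(run_supersteps(g, v))
```

The loop ends once no vertex has sent a message, which stands in for the point where every vertex has become inactive. A real deployment would partition vertices across machines and hide message delivery behind the framework.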
Hundreds of Impossibility Results for Distributed Computing
Distributed Computing, 2003
Cited by 52 (5 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
An Algorithm for the Asynchronous Write-All Problem Based on Process Collision
2001
Cited by 33 (3 self)
In this paper we present a rather straightforward algorithm to solve the Write-All problem on an asynchronous PRAM, i.e. a machine on which the processes can be stopped and restarted at will. This means that it is also suitable for all other fault models as mentioned in Kanellakis and Shvartsman, page 13 [9]. Using different terminology we can say that our algorithm is wait-free, which means that each non-faulty process will be able to finish the whole task within a predetermined number of steps, independent of the actions (or failures) of other processes.
Online Scheduling of Parallel Programs on Heterogeneous Systems with Applications to Cilk
Theory of Computing Systems, Special Issue on SPAA, 2002
Cited by 26 (2 self)
We study the problem of executing parallel programs, in particular Cilk programs, on a collection of processors of different speeds. We consider a model in which each processor maintains an estimate of its own speed, where communication between processors has a cost, and where all scheduling must be online. This problem has been considered previously in the fields of asynchronous parallel computing and scheduling theory. Our model is a bridge between the assumptions in these fields. We provide a new, more accurate analysis of an old scheduling algorithm called the maximum utilization scheduler. Based on this analysis, we generalize this scheduling policy and define the high utilization scheduler. We next focus on the Cilk platform and introduce a new algorithm for scheduling Cilk multithreaded parallel programs on heterogeneous processors. This scheduler is inspired by the high utilization scheduler and is modified to fit in a Cilk context. A crucial aspect of our algorithm is that it keeps the original spirit of the Cilk scheduler. In fact, when our new algorithm runs on homogeneous processors, it exactly mimics the dynamics of the original Cilk scheduler.
Dynamic Load Balancing with Group Communication
6th International Colloquium on Structural Information and Communication Complexity, 1996
Cited by 21 (9 self)
This work considers the problem of efficiently performing a set of tasks using a network of processors in the setting where the network is subject to dynamic reconfigurations, including partitions and merges. A key challenge for this setting is the implementation of dynamic load balancing that reduces the number of tasks that are performed redundantly because of the reconfigurations. We explore new approaches for load balancing in dynamic networks that can be employed by applications using a group communication service. The group communication services that we consider include a membership service (establishing new groups to reflect dynamic changes) but do not include maintenance of a primary component. For the n-processor, n-task load balancing problem defined in this work, the following specific results are obtained. For the case of fully dynamic changes including fragmentation and merges we show that the termination time of any online task assignment algorithm is greater than the termination time of an offline task assignment algorithm by a factor greater than n/12. We present a load balancing algorithm that guarantees completion of all tasks in all fragments
Reliably executing tasks in the presence of untrusted entities
Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems, 2006
Cited by 18 (5 self)
In this work we consider a distributed system formed by a master processor and a collection of n processors (workers) that can execute tasks; worker processors are untrusted and might act maliciously. The master assigns tasks to workers to be executed. Each task returns a binary value, and we want the master to accept only correct values with high probability. Furthermore, we assume that the service provided by the workers is not free; for each task that a worker is assigned, the master is charged one work unit. Therefore, considering a single task assigned to several workers, our goal is to have the master accept the correct value of the task with high probability, with the smallest possible amount of work (the number of workers to which the master assigns the task). We explore two ways of bounding the number of faulty processors: (a) a fixed bound f < n/2 on the maximum number of workers that may fail, and (b) a probability p < 1/2 of any processor being faulty (each processor is faulty with probability p, independently of the other processors). Our work demonstrates that it is possible to obtain high probability of correct acceptance with low work. In particular, by considering both mechanisms of bounding the number of malicious workers, we first show lower bounds on the minimum amount of (expected) work required, so that any algorithm accepts the correct value with probability of success 1 − ε, where ε ≪ 1 (e.g., 1/n). Then we develop and analyze two algorithms, each using a different decision strategy, and show that both algorithms obtain the same probability of success 1 − ε, and in doing so, they require similar upper bounds on the (expected) work. Furthermore, under certain conditions, these upper bounds are asymptotically optimal with respect to our lower bounds.
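To make the trade-off between work and success probability concrete, the following sketch computes how reliable a plain majority vote is under fault model (b). It is an illustration only: majority voting is assumed here as one possible decision strategy and is not claimed to be either of the paper's two algorithms. The function names are hypothetical, and faulty workers are assumed to always return the wrong bit.

```python
from math import comb


def majority_success_probability(k, p):
    """Probability that a strict majority of k workers returns the correct bit.

    Assumes each worker is faulty independently with probability p and that a
    faulty worker always answers incorrectly (a worst-case assumption).
    """
    if k % 2 == 0:
        raise ValueError("use an odd number of workers to avoid ties")
    need = k // 2 + 1              # smallest number of correct answers that wins
    q = 1.0 - p                    # probability that a single worker is correct
    return sum(comb(k, i) * q**i * p**(k - i) for i in range(need, k + 1))


def workers_needed(p, epsilon, k_max=10001):
    """Smallest odd k whose majority vote succeeds with probability >= 1 - epsilon.

    The returned k is also the work charged to the master for this task.
    Returns None if no k up to k_max suffices.
    """
    for k in range(1, k_max + 1, 2):
        if majority_success_probability(k, p) >= 1.0 - epsilon:
            return k
    return None


if __name__ == "__main__":
    print(majority_success_probability(3, 0.25))
    print(workers_needed(0.25, 0.01))
```

Because p < 1/2, adding workers pushes the success probability towards 1, which is why a target of 1 − ε can be met with a bounded amount of work.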
Work-Competitive Scheduling for Cooperative Computing with Dynamic Groups
SIAM Journal on Computing, 2005
Cited by 18 (5 self)
The problem of cooperatively performing a set of t tasks in a decentralized computing environment subject to failures is one of the fundamental problems in distributed computing. The setting with partitionable networks is especially challenging, as algorithmic solutions must accommodate the possibility that groups of processors become disconnected (and, perhaps, reconnected) during the computation. The efficiency of task-performing algorithms is often assessed in terms of work: the total number of tasks, counting multiplicities, performed by all of the processors during the computation. In general, the scenario where the processors are partitioned into g disconnected components causes any task-performing algorithm to have work Ω(t · g) even if each group of processors performs no more than the optimal number of Θ(t) tasks. Given that such pessimistic lower bounds apply to any scheduling algorithm, we pursue a competitive analysis. Specifically, this paper studies a simple randomized scheduling algorithm for p asynchronous processors, connected by a dynamically changing communication medium, to complete t known tasks. The performance of this algorithm is compared against that of an omniscient offline algorithm with full knowledge of the future changes in the communication medium. The paper describes a notion of computation width, which associates a natural number with a history of changes in the communication medium, and shows both upper and lower bounds on work-competitiveness in terms of this quantity. Specifically, it is shown that the simple randomized algorithm obtains the competitive ratio (1 + cw/e), where cw is the computation width and e is the base of the natural logarithm (e = 2.7182...); this competitive ratio is then shown to be tight.
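The two quantities named in this abstract are easy to evaluate numerically. The sketch below is a worked illustration, not code from the paper: the function names are hypothetical, and the partition bound is computed with the hidden constant of Ω(t · g) taken as 1 purely for illustration.

```python
import math


def competitive_ratio(cw):
    """Competitive ratio 1 + cw/e stated in the abstract, for computation width cw."""
    return 1.0 + cw / math.e


def partitioned_work(t, g):
    """Work when g disconnected groups each perform all t tasks.

    Illustrates the order of the Omega(t * g) bound, assuming constant factor 1.
    """
    return t * g


if __name__ == "__main__":
    for cw in (0, 1, 4, 16):
        print(cw, round(competitive_ratio(cw), 3))
    print(partitioned_work(t=1000, g=8))
```

The ratio grows linearly in the computation width, so histories of communication changes with small width leave the randomized algorithm close to the offline optimum.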
Distributed Cooperation during the Absence of Communication
2001
Cited by 17 (9 self)
This paper presents a study of a distributed cooperation problem under the assumption that processors may not be able to communicate for a prolonged time. The problem for n processors is defined in terms of t tasks that need to be performed efficiently and that are known to all processors. The results of this study characterize the ability of the processors to schedule their work so that when some processors establish communication, the wasted (redundant) work these processors have collectively performed prior to that time is controlled. The lower bound for wasted work presented here shows that for any set of schedules there are two processors such that when they complete t1 and t2 tasks respectively the number of redundant tasks is Ω(t1 t2 / t). For n = t and for schedules longer than √n, the number of redundant tasks for two or more processors must be at least 2. The upper bound on pairwise waste for schedules of length √n is shown to be 1. Our efficient deterministic schedule construction is motivated by design theory. To obtain linear length schedules, a novel deterministic and efficient construction is given. This construction has the property that pairwise wasted work increases gracefully as processors progress through their schedules. Finally our analysis of a random scheduling solution shows that with high probability pairwise waste is well behaved at all times: specifically, two processors having completed t1 and t2 tasks, respectively, are guaranteed to have no more than t1 t2 / t + Δ redundant tasks, where Δ = O(log n + √(t1 t2 / t) · √(log n)).
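The leading term t1 t2 / t of the random-schedule bound can be checked with a short simulation. This is a sketch under stated assumptions rather than the paper's construction: each processor is assumed to follow an independent, uniformly random ordering of the t tasks, so the tasks it has completed form a uniform random subset. The function names are hypothetical.

```python
import random


def expected_overlap(t, t1, t2):
    """Mean number of tasks done by both processors: t1 * t2 / t."""
    return t1 * t2 / t


def simulate_overlap(t, t1, t2, trials=2000, seed=0):
    """Average redundant work of two processors with independent random schedules."""
    rng = random.Random(seed)
    tasks = range(t)
    total = 0
    for _ in range(trials):
        done_a = set(rng.sample(tasks, t1))   # first t1 tasks of a random order
        done_b = set(rng.sample(tasks, t2))   # first t2 tasks of another random order
        total += len(done_a & done_b)         # tasks performed twice are wasted
    return total / trials


if __name__ == "__main__":
    print(expected_overlap(100, 10, 20))
    print(simulate_overlap(100, 10, 20))
```

The simulated average should sit close to t1 t2 / t. The additive term in the abstract accounts for how far a single pair of schedules can deviate from that mean.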
Cooperative Computing with Fragmentable and Mergeable Groups
J. Discrete Algorithms, 2000
Cited by 16 (6 self)
This work considers the problem of performing a set of N tasks on a set of P cooperating message-passing processors (P ≤ N). The processors use a group communication service (GCS) to coordinate their activity in the setting where dynamic changes in the underlying network topology cause the processor groups to change over time. GCSs have been recognized as effective building blocks for fault-tolerant applications in such settings. Our results explore the efficiency of fault-tolerant cooperative computation using GCSs. Prior investigation of this area by Dolev et al. [8] focused on competitive lower bounds, non-redundant task allocation schemes and work-efficient algorithms in the presence of fragmentation regroupings. In this work we investigate work-efficient and message-efficient algorithms for fragmentation and merge regroupings. We present an algorithm that uses GCSs and implements a coordinator-based strategy. This algorithm is motivated by the results in [8]. It achieves similar work complexity of O(N · f + N) for fragmentations, where f is the number of new groups created by dynamic fragmentations.
The Complexity of Synchronous Iterative Do-All with Crashes
2001
Cited by 14 (5 self)
Do-All is the problem of performing N tasks in a distributed system of P failure-prone processors [9]. Many distributed and parallel algorithms have been developed for this basic problem and several algorithm simulations have been developed by iterating Do-All algorithms. The efficiency of the solutions for Do-All is measured in terms of work complexity where all processing steps taken by the processors are counted. Work is ideally expressed as a function of N, P, and f, the number of processor crashes. However the known lower bounds and the upper bounds for extant algorithms do not adequately show how work depends on f. We present the first non-trivial lower bounds for Do-All that capture the dependence of work on N, P and f. For the model of computation where processors are able to make perfect load-balancing decisions locally, we also present matching upper bounds. Thus we give the first complete analysis of Do-All for this model. We define the r-iterative Do-All problem that abstracts the repeated use of Do-All such as found in algorithm simulations. Our f-sensitive analysis enables us to derive a tight bound for r-iterative Do-All work (that is stronger than the r-fold work complexity of a single Do-All). Our approach that models perfect load-balancing allows for the analysis of specific algorithms to be divided into two parts: (i) the analysis of the cost of tolerating failures while performing work, and (ii) the analysis of the cost of implementing load-balancing. We demonstrate the utility and generality of this approach by improving the analysis of two known efficient algorithms. We give an improved analysis of an efficient message-passing algorithm (algorithm AN [5]). We also derive a new and complete analysis of the best known Do-All algorithm for...