Adversarial contention resolution for simple channels
- In: 17th Annual Symposium on Parallelism in Algorithms and Architectures
, 2005
Cited by 49 (1 self)
This paper analyzes the worst-case performance of randomized backoff on simple multiple-access channels. Most previous analysis of backoff has assumed a statistical arrival model. For batched arrivals, in which all n packets arrive at time 0, we show the following tight high-probability bounds. Randomized binary exponential backoff has makespan Θ(n lg n), and more generally, for any constant r, r-exponential backoff has makespan Θ(n log^(lg r) n). Quadratic backoff has makespan Θ((n/lg n)^(3/2)), and more generally, for r > 1, r-polynomial backoff has makespan Θ((n/lg n)^(1+1/r)). Thus, for batched inputs, both exponential and polynomial backoff are highly sensitive to backoff constants. We exhibit a monotone superpolynomial subexponential backoff algorithm, called loglog-iterated backoff, that achieves makespan Θ(n lg lg n / lg lg lg n). We provide a matching lower bound showing that this strategy is optimal among all monotone backoff algorithms. Of independent interest is that this lower bound was proved with a delay sequence argument. In the adversarial-queuing model, we present the following stability and instability results for exponential backoff and loglog-iterated backoff. Given a (λ,T)-stream, in which at most n = λT packets arrive in any interval of size T, exponential backoff is stable for arrival rates of λ = O(1/lg n) and unstable for arrival rates of λ = Ω(lg lg n / lg n); loglog-iterated backoff is stable for arrival rates of λ = O(1/(lg lg n lg n)) and unstable for arrival rates of λ = Ω(1/lg n). Our instability results show that bursty input is close to being worst-case for exponential backoff and variants and that even small bursts can create instabilities in the channel.
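The batched-arrival setting in the abstract above can be explored with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes a windowed model in which window i has r^i slots, every unsent packet picks one slot uniformly at random in each window, and a slot succeeds only when exactly one packet chose it. The function name `backoff_makespan` and its parameters are hypothetical.

```python
import random


def backoff_makespan(n, r=2, seed=0):
    """Simulate windowed r-exponential backoff for n packets that all arrive at time 0.

    Assumed model (not from the paper's text): window i has r**i slots, each
    unsent packet picks one slot uniformly at random per window, and a slot
    delivers a packet only if exactly one packet picked it.
    Returns the number of slots elapsed when the last packet is delivered.
    """
    rng = random.Random(seed)
    remaining = n   # packets not yet delivered
    elapsed = 0     # slots used so far
    window = 1      # size of the current window
    while remaining > 0:
        # Count how many packets chose each slot of this window.
        counts = {}
        for _ in range(remaining):
            slot = rng.randrange(window)
            counts[slot] = counts.get(slot, 0) + 1
        # Slots chosen by exactly one packet are successful transmissions.
        successes = [s for s, c in counts.items() if c == 1]
        remaining -= len(successes)
        if remaining == 0:
            # Finished inside this window: stop at the last successful slot.
            elapsed += max(successes) + 1
        else:
            elapsed += window
        window *= r
    return elapsed


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, backoff_makespan(n, r=2), backoff_makespan(n, r=4))
```

Running the script for a range of n and comparing r = 2 against r = 4 gives a rough feel for how the makespan depends on the backoff constant; a single seeded run says nothing about the high-probability bounds themselves.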
An optical simulation of shared memory
, 1994
Cited by 34 (3 self)
We present a work-optimal randomized algorithm for simulating a shared memory machine (PRAM) on an optical communication parallel computer (OCPC). The OCPC model is motivated by the potential of optical communication for parallel computation. The memory of an OCPC is divided into modules, one module per processor. Each memory module only services a request on a timestep if it receives exactly one memory request. Our algorithm simulates each step of an n lg lg n-processor EREW PRAM on an n-processor OCPC in O(lg lg n) expected delay. (The probability that the delay is longer than this is at most n^(-α) for any constant α.) The best previous simulation, due to Valiant, required Θ(lg n) expected delay.
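The OCPC service rule stated in the abstract above (a module services a request only when it receives exactly one in a timestep) can be sketched as a single step. This is a minimal illustration, not the paper's simulation algorithm; the function name `ocpc_step` and the dictionary representation of requests are assumptions made for the example.

```python
from collections import Counter


def ocpc_step(requests):
    """Apply the OCPC rule to one timestep.

    `requests` maps a processor id to the memory module it addresses
    (a hypothetical representation chosen for this sketch).
    Returns the set of processors whose request is serviced, i.e. those
    whose target module received exactly one request.
    """
    load = Counter(requests.values())  # requests received per module
    return {p for p, module in requests.items() if load[module] == 1}


if __name__ == "__main__":
    # Processors 0 and 1 collide on module 1; processor 2 is alone on module 0.
    print(ocpc_step({0: 1, 1: 1, 2: 0}))
```

Processors whose requests collide get no service in that step and must retry, which is the contention that the simulation in the abstract has to resolve.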
On Contention Resolution Protocols and Associated Probabilistic Phenomena
- IN PROCEEDINGS OF THE 26TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
, 1994
Contention Resolution with Constant Expected Delay
Cited by 29 (3 self)
We study the contention resolution problem in a multiple-access channel such as the Ethernet...
Delayed path coupling and generating random permutations via distributed stochastic processes
, 1999
Cited by 18 (3 self)
We analyze various stochastic processes for generating permutations almost uniformly at random in distributed and parallel systems. All our protocols are simple, elegant and are based on performing disjoint transpositions executed in parallel. The challenging problem of our concern is to prove that the output configurations in our processes reach almost uniform probability distribution very rapidly, i.e. in a (low) polylogarithmic time. For the analysis of the aforementioned protocols we develop a novel technique, called delayed path coupling, for proving rapid mixing of Markov chains. Our approach is an extension of the path coupling method of Bubley and Dyer. We apply delayed path coupling to three stochastic processes for generating random permutations. For one
Randomized communication in radio networks
- HANDBOOK OF RANDOMIZED COMPUTING
, 2001
Cited by 17 (0 self)
A communication network is called a radio network if its nodes exchange messages in the following restricted way. First, a send operation performed by a node delivers copies of the same message to all directly reachable nodes. Secondly, a node can successfully receive an incoming message only if exactly one of its neighbors sent a message in that step. It is this semantics of how ports at nodes send and receive messages that defines the networks rather than the fact that only radio waves are used as a medium of communication; but if that is the case then just a single frequency is used. We discuss algorithmic aspects of exchanging information in such networks, concentrating on distributed randomized protocols. Specific problems and solutions depend a lot on the topology of the underlying reachability graph and how much the nodes know about it. In single-hop networks each pair of nodes can communicate directly. This kind of network is also known as the multiple access channel. Popular
Scheduling Parallel Communication: The h-Relation Problem
- IN PROC. OF THE 20TH INTERNATIONAL SYMP. ON MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE, LNCS 969
, 1995
Cited by 16 (0 self)
This paper is concerned with the efficient scheduling and routing of point-to-point messages in a distributed computing system with n processors. We examine the h-relation problem, a routing problem where each processor has at most h messages to send and at most h messages to receive. Communication is carried out in rounds. Direct communication is possible from any processor to any other, and in each round a processor can send one message and receive one message. The off-line version of the problem arises when every processor knows the source and destination of every message. In this case the messages can be routed in at most h rounds. More interesting, and more typical, is the on-line version, in which each processor has knowledge only of h and of the destinations of those messages which it must send. The on-line version of the problem is the focus of this paper. The difficulty of the h-relation problem stems from message conflicts, in which two or more messages are se...
Contention Resolution in Hashing Based Shared Memory Simulations
, 2000
Cited by 11 (3 self)
In this paper we study the problem of simulating shared memory on the distributed memory machine (DMM). Our approach uses multiple copies of shared memory cells, distributed among the memory modules of the DMM via universal hashing. The main aim is to design strategies that resolve contention at the memory modules. Extending results and methods from random graphs and very fast randomized algorithms, we present new simulation techniques that enable us to improve the previously best results exponentially. In particular, we show that an n-processor CRCW PRAM can be simulated by an n-processor DMM with delay O(log log log n log* n), with high probability. Next we describe a general technique that can be used to turn these simulations into time-processor optimal ones, in the case of EREW PRAMs to be simulated. We obtain a time-processor optimal simulation of an (n log log log n log* n)-processor EREW PRAM on an n-processor DMM with delay O(log log log n log* n), with high probability. When an (n log log log n log* n)-processor CRCW PRAM is simulated, the delay is larger only by a log* n factor. We further demonstrate that the simulations presented cannot be significantly improved using our techniques. We show an Ω(log log log n / log log log log n) lower bound on the expected delay for a class of PRAM simulations, called topological simulations, that covers all previously known simulations as well as the simulations presented in the paper.
Fast Deterministic Simulation of Computations on Faulty Parallel Machines
- in Proc. of the 3rd Ann. European Symp. on Algorithms, 1995, Springer Verlag LNCS 979
, 1995
Cited by 10 (4 self)
A method of deterministic simulation of fully operational parallel machines on the analogous machines prone to errors is developed. The simulation is presented for the exclusive-read exclusive-write (EREW) PRAM and the Optical Communication Parallel Computer (OCPC), but it applies to a large class of parallel computers. It is shown that simulations of operational multiprocessor machines on faulty ones can be performed with logarithmic slowdown in the worst case. More precisely, we prove that both a PRAM with a bounded fraction of faulty processors and memory cells and an OCPC with a bounded fraction of faulty processors can simulate deterministically their fault-free counterparts with O(log n) slowdown and preprocessing done in time O(log^2 n). The fault model is as follows. The faults are deterministic (worst-case distribution) and static (do not change in the course of a computation). If a processor attempts to communicate with some other processor (in the case of an OCPC) or re...
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
- Journal of Parallel and Distributed Computing
Cited by 8 (0 self)
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N^α / log N processors. Such a parallel computation is cost optimal and matches the performance of PRAM. Furthermore, our parallelization on a DMPC can be made fully scalable, that is, for all 1 ≤ p ≤ N^α / log N, multiplying two N × N matrices can be performed by a DMPC with p processors in O(N^α / p) time, i.e., linear speedup and cost optimality can be achieved in the range [1..N^α / log N]. This unifies all known algorithms for matrix multiplication on DMPC, standard or non-standard, sequential or parallel. Extensions of our methods and results to other parallel systems are also presented. The above claims result in significant progress in scalable parallel matrix multiplication (as well as solving many other important problems) on distributed memory systems, both theoretically and practically.