R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. Algorithmica, 16(4/5):517--542, 1996.
....time units, i.e. they can become rather large as the arrival interval approaches the limit of $1/D$. Nonredundant random placement has been proposed for the parallel file system RAMA [6]. Combining random placement and redundancy was first considered in parallel computing for PRAM emulation [7] and online load balancing [8]. For scheduling disk accesses, these techniques have been used for multimedia applications [9, 10, 11, 12, 13, 14]. These papers use shortest queue, do not specify the scheduling algorithm, or schedule large batches in a synchronous fashion. Some RAID arrays use ....
....that some requests are executed on both disks. All these theoretical difficulties led us to adopt a mostly simulation based approach in this paper. Several scheduling algorithms are known which reduce the maximal delay for scheduling a batch of $|R|$ requests to $O(|R|/D)$ with high probability [7], i.e. independent of D. Korst [11] explained how an optimal schedule for batches of requests can be computed using maximum flow computations, and in [2] it is shown that the maximal delay is bounded by $\lceil |R|/D \rceil + 1$ for optimal schedules with high probability. Further generalizations for batched ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide, "Efficient PRAM simulation on a distributed memory machine," in 24th ACM Symp. on Theory of Computing, May 1992, pp. 318--326.
....class of service time distributions, i.e. increasing with likelihood ratio distributions. However, the SQ scheme seems more suitable for centralized systems, while for distributed systems randomized load balancing schemes are well applicable. Static randomized algorithms have been studied in [18] where Karp et al. show that two hash functions instead of one provide an exponential improvement in the maximum load of a hash bucket. In [19] and [20] dynamic randomized schemes are considered. The supermarket model is analyzed and an important result is derived. In this model, customers ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide, "Efficient PRAM simulation on a distributed memory machine," in Proc. 24th ACM Symp. Theory of Computing, Victoria, BC, Canada, May 1992, pp. 318--326.
....the goal of improving productivity of parallel computing involves using the synchronous PRAM model as the programming paradigm and then efficiently simulating PRAM programs on realistic machines. Simulations of PRAM on asynchronous parallel machines have been studied for over a decade now [2, 3, 4, 5, 9, 13, 14, 18, 20, 21, 22, 23, 27, 29, 34, 35]. Simplifying considerably, simulations assume that there is a machine with p asynchronous fault prone processors and the machine is to simulate a program written for n synchronous fault free processors. The simulations use three main ideas: idempotence, load balancing, and synchronization. ....
Karp, R.M., Luby, M., Meyer auf der Heide, F.: Efficient PRAM Simulation on a Distributed Memory Machine. Algorithmica, Vol. 16 (1996) 517--542 (Preliminary version: STOC'92)
....allocate a request. Many studies have focused on the strategy of using a subset of the load information available. This involves first randomly choosing a small number, k, of homogeneous servers and then choosing the least loaded server from within that set [Mit96], [ELZ86], [VDK96], [ABKU94], [KLH92]. In particular, for homogeneous systems, Mitzenmacher [Mit96] studies the tradeoffs of various choices of k and various degrees of staleness of load information reported. As the degree of staleness increases, smaller values of k are preferable. Genova et al. [GC00] propose an algorithm, which ....
R. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM Simulation on a Distributed Memory Machine. In Twenty-fourth ACM Symposium on Theory of Computing, 1992.
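The strategy quoted in the context above (sample k servers at random, then assign the request to the least loaded of them) can be illustrated with a small balls-into-bins simulation. This is only a sketch under simple assumptions: loads are exact and never stale, requests never leave, and the names `place_balls`, `num_balls`, `num_bins`, and `k` are hypothetical and do not come from any of the cited papers.

```python
import random


def place_balls(num_balls, num_bins, k, rng):
    """Place balls one at a time.

    Each ball samples k bins uniformly at random (with replacement)
    and is put into the least loaded of the sampled bins.
    Returns the final list of bin loads.
    """
    loads = [0] * num_bins
    for _ in range(num_balls):
        candidates = [rng.randrange(num_bins) for _ in range(k)]
        # Ties are broken in favour of the first sampled candidate.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    one_choice = place_balls(10_000, 10_000, 1, rng)
    two_choice = place_balls(10_000, 10_000, 2, rng)
    print("max load, k=1:", max(one_choice))
    print("max load, k=2:", max(two_choice))
```

With k = 1 this is ordinary random placement; with k = 2 it is the two-choice scheme that several of the contexts on this page attribute to the cited paper and to Azar et al. In runs of this sketch the maximum load for k = 2 is typically noticeably smaller than for k = 1.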
....is the one that achieves the best bound. 3.4 Scheduling on a distributed architecture We consider here scheduling of an Athapascan program on a distributed architecture with p identical processors. The shared memory on this architecture is emulated with the help of universal hashing functions [13]. The delay occurring for any access is bounded by h(p). In order to obtain efficient emulations (h(p) constant or very small compared to p), a slackness strategy [15, 17] is used. It consists in emulating a q(p) PRAM on the distributed architecture, q(p) being large enough compared to p. In the ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM Simulation on a Distributed Memory Machine. Algorithmica, 16:517--542, 1996.
....bin and where $D = L - N/M$. 1.2 Related work Let us compare our results for the random arc allocation to the well known results for balls into bins processes. These processes are among the most intensively studied stochastic processes in the context of algorithm analysis (e.g. [14, 22, 16, 12, 2, 23, 15, 19, 4]). The simplest balls into bins process assumes that N balls are placed i.u.r. into M bins [14, 22, 16]. Balls into bins are the special case of chains into bins where all chains consist of a single ball, i.e. n = N. We get M ln M r logM if N =W(M logM) 6) The Bounds (5) and ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In 24th ACM Symp. on Theory of Computing, pages 318--326, May 1992.
....Average delays behave like $\Theta(1/\epsilon)$, i.e. they can become rather large as $\epsilon$ approaches zero. Nonredundant random placement has been proposed for the parallel file system RAMA [22]. Combining random placement and redundancy was first considered in parallel computing for PRAM emulation [17] and online load balancing [6]. For scheduling disk accesses, these techniques have been used for multimedia applications [29, 30, 19, 24, 8, 28]. These papers use shortest queue, do not specify the scheduling algorithm, or schedule large batches in a synchronous fashion. Some RAID arrays use ....
....formulation has the problem that some requests are executed on both disks. All these theoretical difficulties led us to adopt a mostly experimental approach in this paper. Several scheduling algorithms are known which reduce the maximal delay for scheduling a batch of $|R|$ requests to $O(|R|/D)$ [17], i.e. independent of D. Korst [19] explained how an optimal schedule for batches of requests can be computed using maximum flow computations, and in [27] it is shown that the maximal delay is bounded by $\lceil |R|/D \rceil + 1$ for optimal schedules. Batched scheduling algorithms can be converted into ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In 24th ACM Symp. on Theory of Computing, pages 318--326, May 1992.
....have the advantage to find schedules in linear time. Our result yields an optimal offline strategy and shows that the gap between the online algorithm and an optimal offline strategy is Theta(log log D). For PRAM simulation, fast parallel scheduling algorithms have been developed even earlier [22]. PRAM simulation using a 3 collision protocol achieves maximum load O(1) for N = D using O(log log D) iterations [27, Section 3]. This already works for O(log D)-universal classes of hash functions. Similar results hold for allocation strategies with lower redundancy such as the ones we ....
Karp, R. M., Luby, M., and Meyer auf der Heide, F. Efficient PRAM simulation on a distributed memory machine. In 24th ACM Symp. on Theory of Computing (May 1992), pp. 318--326.
....log D) whp. They also state that load O(1) can be achieved using offline scheduling. In Section 4.2 we review how such offline algorithms can be used to get approximately optimal schedules in linear time. For PRAM simulation, fast parallel scheduling algorithms have been developed even earlier [15]. PRAM simulation using a 3 collision protocol achieves maximum load 3 for N = D using O(log log D) iterations [20, Section 3]. This already works for O 3 Delta universal classes of hash functions. Similar results hold for allocation strategies with lower redundancy such as the ones we ....
Karp, R. M., Luby, M., and Meyer auf der Heide, F. Efficient PRAM simulation on a distributed memory machine. In 24th ACM Symp. on Theory of Computing (May 1992), pp. 318--326.
....any bin here, is just O( ln ln n) ln d) Theta(1). Note the significant improvement over ordinary dart throwing, even for the case of d = 2. Such a result may naturally be expected to be algorithmically significant: applications to dynamic load balancing and hashing are shown in [7]. Also see [37, 52] for related resource allocation and hashing processes. In light of this, a natural question may be whether there is a variant of random initial delays that leads to an improvement in the approximation bound for job shop scheduling. However, by a random construction, it has been shown in [60] ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proc. ACM Symposium on Theory of Computing, pages 318--326, 1992.
....as a network of processor memory pairs. One formal model that corresponds reasonably closely to the GPPC would be a distributed memory multicomputer with full connectivity, i.e. a multicomputer with an interconnection structure corresponding to the complete graph [60]. Another is the Bulk Synchronous Parallel (BSP) computer [95] which, along with some of its variations, will be described in the next subsections. A sharper distinction between special and general purpose parallel computing is expected. Those primarily concerned with achieving the maximum ....
KARP, R. M., LUBY, M., AND MEYER AUF DER HEIDE, F. Efficient PRAM simulation on a Distributed Memory Machine. In 24th Annual ACM Symposium on Theory of Computing (1992).
....idea, that is, the concept of first selecting a small subset of alternatives at random and then placing the ball into one of these alternatives, was discovered independently by several groups of researchers for different applications. For example, Karp, Luby, and Meyer auf der Heide [11] introduced this kind of algorithm for PRAM simulations. Azar, Broder, Karlin, and Upfal [3] presented multiple choice algorithms for online load balancing. Following these seminal studies, the multiple choice method has been investigated extensively in several variations, e.g. parallel ....
R. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proc. of the 24th ACM Symp. on Theory of Computing (STOC), pages 318--326, 1992.
.... only two hash functions, it is easy to parallelize, and it is on-line (i.e. it does not involve rehashing of data). Furthermore, it is not necessary to have perfectly random hash functions: similar results hold by choosing our hash functions randomly from smaller families of hash functions; see [KLM96]. 1.1.2 Shared memory emulations on DMMs One of the earliest applications of the two choice paradigm is in the study of algorithms to emulate shared memory machines (as, for example, PRAMs) on distributed memory machines (DMMs) [CMS95, KLM96, MSS96]. In such emulations, the processors and the ....
....randomly from smaller families of hash functions; see [KLM96]. 1.1.2 Shared memory emulations on DMMs One of the earliest applications of the two choice paradigm is in the study of algorithms to emulate shared memory machines (as, for example, PRAMs) on distributed memory machines (DMMs) [CMS95, KLM96, MSS96]. In such emulations, the processors and the memory cells of the shared memory machine are distributed to the processors and memory modules of the DMM using appropriately chosen (universal) hash functions. Typically, the goal of the emulation algorithm is to minimize slowdown, or delay, of ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. Algorithmica, 16:517--542, 1996.
....[VDK96] and Mitzenmacher [Mit96b, Mit96a] has led to an enduring technique for analysis of these load balancing systems based on fluid limit models, as described in Section 4. The first rigorous analytical demonstration of the power of two choices is due to Karp, Luby, and Meyer auf der Heide [KLM92, KLM96] who considered the possibility of using two hash functions in the context of PRAM emulation by DMMs. Subsequent work on shared memory emulations on DMMs [CMS95, MSS96] has given rise to a powerful technique for analysis called the witness tree method. See Section 3 for more details on ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the Twenty-Fourth Annual ACM Symposium on the Theory of Computing, pages 318--326, May 1992.
....even if the target architecture is a distributed memory machine (DMM). This is implemented on a DMM by sending messages to remote processes requesting reads or writes. Since variables in Opal are placed by the programmer, there is no scope to hash the address space as in [Abolhassan et al. 1991, Karp et al. 1992]. In Opal the language guarantees that if a variable is declared in a process specification then it will be placed in the address space of that process. In [Valiant, 1982] it has been shown that randomised routing can reduce hot spots and improve general performance. Although this has not ....
R M Karp, M Luby, and F Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proc. 24th Annual ACM Symposium on Theory of Computing, 1992.
....on an n processor c collision crossbar. Under their protocol, a read or write operation of memory location x by EREW PRAM processor i is emulated by having processor i of the c collision crossbar access 2 out of the 3 copies corresponding to memory location x. A similar protocol was presented in [13]; the idea that accessing 2 out of 3 copies is sufficient for the purposes of such an emulation was first used by Upfal and Wigderson [19]. The analysis presented in [6] requires some slack in the constants; in particular, they require c 3, and are only able to analyze the protocol when it is ....
R. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, pages 318--326, May 1992.
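The "2 out of 3 copies" idea in the context above can be sketched as follows. The excerpt only says that each operation accesses 2 of the 3 copies; the use of timestamps to pick the freshest copy on a read is an assumption of this sketch (it is the usual way such majority schemes are described), and the class and method names are hypothetical. The sketch ignores contention, the collision rule, and how the 2 copies are chosen.

```python
class ReplicatedCell:
    """One shared-memory cell stored as 3 timestamped copies (hypothetical layout)."""

    def __init__(self, initial=None):
        # Each copy holds (timestamp, value); all copies start in sync.
        self.copies = [(0, initial) for _ in range(3)]

    def write(self, value, timestamp, copy_ids):
        """Update exactly 2 distinct copies with a new timestamped value."""
        if len(set(copy_ids)) != 2:
            raise ValueError("a write must touch exactly 2 distinct copies")
        for i in copy_ids:
            self.copies[i] = (timestamp, value)

    def read(self, copy_ids):
        """Read exactly 2 distinct copies and return the value with the newest timestamp.

        Any 2 copies overlap the 2 copies of the latest write in at least
        one position, so the newest timestamp seen is the latest write.
        """
        if len(set(copy_ids)) != 2:
            raise ValueError("a read must touch exactly 2 distinct copies")
        newest = max((self.copies[i] for i in copy_ids), key=lambda c: c[0])
        return newest[1]


if __name__ == "__main__":
    cell = ReplicatedCell()
    cell.write("a", timestamp=1, copy_ids=[0, 1])
    cell.write("b", timestamp=2, copy_ids=[1, 2])
    print(cell.read([0, 2]))  # copy 0 is stale, copy 2 carries the latest write
```

The point of the design is that no operation has to wait for all three copies: two sets of size 2 drawn from 3 copies always intersect, so a read that picks the newest timestamp among its 2 copies sees the most recent write.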
....(where all messages have unit length) we assume that a worst case adversary determines which subset of the messages of total size c 0 log n are successfully received by a given node if the c 0 log n limit on total size would otherwise be exceeded. The related c arbitrary DMM model of Karp et al. [10] does not take into account contention among clients trying to access the same object and hence is not well suited for our study. Message types. Our protocol makes use of a constant number of different types of messages. At times the protocol may result in, say, O(log n) messages of type ff and ....
R. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, pages 318--326, May 1992.
....on an n processor c collision crossbar. Under their protocol, a read or write operation of memory location x by EREW PRAM processor i is emulated by having processor i of the c collision crossbar access 2 out of the 3 copies corresponding to memory location x. A similar protocol was presented in [Karp et al. 1992]; the idea that accessing 2 out of 3 copies is sufficient for the purposes of such an emulation was first used by Upfal and Wigderson [Upfal and Wigderson 1987]. The analysis presented in [Dietzfelbinger and Meyer auf der Heide 1993] requires some slack in the constants; in particular, they ....
.... independent family of functions from $[m]$ to $[n]$. That is, for $\{x_i : i \in [j]\} \subseteq [m]$, $(y_0, \ldots, y_{j-1}) \in [n]^j$, $j \in [k+1]$, it holds that if $h$ is drawn uniformly at random from $F^k_{m,n}$, then $\Pr[h(x_i) = y_i \text{ for all } i \in [j]] = 1/n^j$. If $k \le \sqrt{n}$, $F^k_{m,n}$ can be constructed as in [Karp et al. 1992] using the families $H_{n^d,n}$ and $H^1_{m,n^d}$ defined in [Carter and Wegman 1979] and [Siegel 1989] respectively. Here $d$ is an appropriate constant. A hash function $h$ chosen uniformly at random from $F^k_{m,n}$ is defined as $r \circ s$, where $r$ and $s$ are chosen uniformly at random from $H_{n^d,n}$ and ....
[Article contains additional citation context not shown here]
Karp, R., Luby, M., and Meyer auf der Heide, F. 1992. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing (May 1992), pp. 318--326.
....since it assumes that all processors work synchronously and that interprocessor communication is free. Different variations to the basic PRAM model have been proposed to overcome these limitations in an attempt to obtain a more practical model while preserving great part of its simplicity [1, 2, 7, 9]. Another approach which is being seriously considered as the basis of a general purpose parallel computation is the BSP model (Bulk Synchronous Parallel). It was proposed by Leslie G. Valiant in 1990 [17] as a bridge between theory and practice. The BSP model views a parallel machine as a set of ....
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM Simulation on a Distributed Memory Machine. In Proceedings of the Twenty-Fourth Annual ACM Symposium on the Theory of Computing, pages 318--326, May 1992.
No context found.
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. Algorithmica, 16(4/5):517--542, 1996.
No context found.
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. Algorithmica, 16(4/5): 517--542, 1996.
No context found.
R. Karp, M. Luby, F. Meyer auf der Heide. Efficient PRAM Simulation on a Distributed Memory Machine. In the Proceedings of the 24th ACM Symposium on Theory of Computing, pp. 318--326, 1992.
No context found.
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. Algorithmica, 16(4/5):517--542, 1996.
No context found.
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the 24th ACM Symposium on the Theory of Computing, pages 318--326, 1992.
No context found.
R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM simulation on a distributed memory machine. In Proceedings of the 24th ACM Symposium on the Theory of Computing, pages 318--326, 1992.