Results 11-20 of 293
Fast Crash Recovery in RAMCloud
In Proc. of SOSP’11, 2011
Cited by 83 (3 self)
RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk; this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
Using Multiple Hash Functions to Improve IP Lookups
In Proceedings of IEEE INFOCOM, 2000
Cited by 81 (10 self)
High-performance Internet routers require a mechanism for very efficient IP address lookups. Some techniques used to this end, such as binary search on levels, need to quickly construct a good hash table for the appropriate IP prefixes. In this paper we describe an approach for obtaining good hash tables based on using multiple hashes of each input key (which is an IP address). The methods we describe are fast, simple, scalable, parallelizable, and flexible. In particular, in instances where the goal is to have one hash bucket fit into a cache line, using multiple hashes proves extremely suitable. We provide a general analysis of this hashing technique and specifically discuss its application to binary search on levels.
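The idea of hashing each key several times and storing it in the least-loaded candidate bucket can be sketched as follows. This is a minimal illustration, not the paper's construction: the class name `MultiHashTable`, the salted SHA-256 hashes, and the prefix strings are all assumptions made for the example.

```python
import hashlib


class MultiHashTable:
    """Hash table in which each key has several candidate buckets (hypothetical sketch)."""

    def __init__(self, num_buckets, num_hashes=2):
        self.buckets = [[] for _ in range(num_buckets)]
        self.num_hashes = num_hashes

    def _candidates(self, key):
        # Derive num_hashes bucket indices by salting SHA-256 with the hash index.
        # This is an assumption for illustration; a router would use cheaper hashes.
        n = len(self.buckets)
        indices = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            indices.append(int.from_bytes(digest[:8], "big") % n)
        return indices

    def insert(self, key, value):
        candidates = self._candidates(key)
        # Overwrite in place if the key already sits in one of its candidate buckets.
        for b in candidates:
            for j, (k, _) in enumerate(self.buckets[b]):
                if k == key:
                    self.buckets[b][j] = (key, value)
                    return
        # Otherwise store the key in the least-loaded candidate bucket.
        target = min(candidates, key=lambda b: len(self.buckets[b]))
        self.buckets[target].append((key, value))

    def lookup(self, key):
        # A lookup has to probe every candidate bucket.
        for b in self._candidates(key):
            for k, v in self.buckets[b]:
                if k == key:
                    return v
        return None

    def max_load(self):
        return max(len(bucket) for bucket in self.buckets)


if __name__ == "__main__":
    for d in (1, 2):
        table = MultiHashTable(num_buckets=1024, num_hashes=d)
        for i in range(1024):
            table.insert(f"10.{i // 256}.{i % 256}.0/24", i)
        print(f"hashes={d} max bucket load={table.max_load()}")
```

Running the script prints the largest bucket size with one and with two hashes, which is the quantity that matters when a bucket is meant to fit in a single cache line. The cost is that every lookup probes all candidate buckets, although those probes are independent and could be issued in parallel.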
BALANCED ALLOCATIONS: THE HEAVILY LOADED CASE
2006
Cited by 72 (9 self)
We investigate balls-into-bins processes allocating m balls into n bins based on the multiple-choice paradigm. In the classical single-choice variant each ball is placed into a bin selected uniformly at random. In a multiple-choice process each ball can be placed into one out of d ≥ 2 randomly selected bins. It is known that in many scenarios having more than one choice for each ball can improve the load balance significantly. Formal analyses of this phenomenon prior to this work considered mostly the lightly loaded case, that is, when m ≈ n. In this paper we present the first tight analysis in the heavily loaded case, that is, when m ≫ n rather than m ≈ n. The best previously known results for the multiple-choice processes in the heavily loaded case were obtained using majorization by the single-choice process. This yields an upper bound of the maximum load of bins of m/n + O(√(m ln n / n)) with high probability. We show, however, that the multiple-choice processes are fundamentally different from the single-choice variant in that they have “short memory.” The great consequence of this property is that the deviation of the multiple-choice processes from the optimal allocation (that is, the allocation in which each bin has either ⌊m/n⌋ or ⌈m/n⌉ balls) does not increase with the number of balls as in the case of the single-choice process. In particular, we investigate the allocation obtained by two different multiple-choice allocation schemes,
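The contrast between the single-choice and multiple-choice processes in the heavily loaded case can be observed with a small simulation. This is a hedged sketch: the function names `allocate` and `gap`, the parameter values, and the rule of placing each ball in the least-loaded of d uniformly chosen bins (ties broken by the first choice) are assumptions for illustration, not necessarily the schemes analyzed in the paper.

```python
import random


def allocate(num_balls, num_bins, d, rng):
    """Place each ball into the least-loaded of d bins chosen uniformly at random."""
    loads = [0] * num_bins
    for _ in range(num_balls):
        choices = [rng.randrange(num_bins) for _ in range(d)]
        # min() returns the first minimal element, so ties go to the earlier choice.
        best = min(choices, key=lambda b: loads[b])
        loads[best] += 1
    return loads


def gap(loads):
    """Deviation of the fullest bin from the average load m/n."""
    return max(loads) - sum(loads) / len(loads)


if __name__ == "__main__":
    n = 100
    for m in (n, 100 * n, 1000 * n):
        g1 = gap(allocate(m, n, 1, random.Random(0)))
        g2 = gap(allocate(m, n, 2, random.Random(0)))
        print(f"m={m:>6}  single-choice gap={g1:6.1f}  two-choice gap={g2:4.1f}")
```

With settings like these, the single-choice gap is expected to grow as m increases, while the two-choice gap should stay small, which is the qualitative behavior the abstract describes.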
Comparing Random Data Allocation and Data Striping in Multimedia Servers
In ACM SIGMETRICS, 2000
Cited by 69 (1 self)
We compare the performance of the RIO (Randomized I/O) Multimedia Storage Server, which is based on random data allocation and block replication, with traditional data striping techniques. We compare both approaches in terms of maximum supported data rate and stream cost. Data striping techniques in multimedia servers are often designed for restricted workloads, e.g. sequential access patterns with CBR (constant bit rate) requirements. On the other hand, RIO is designed to support virtually any type of multimedia application, including VBR (variable bit rate) video or audio, and interactive applications with unpredictable access patterns, such as 3D interactive virtual worlds, interactive scientific visualizations, etc. Surprisingly, our results show that system performance with random data allocation is competitive with, and sometimes even better than, data striping techniques for the workloads for which data striping is designed to work best, i.e. streams with sequential access patterns and CBR...
Parallel Randomized Load Balancing
In Symposium on Theory of Computing, ACM, 1995
Cited by 63 (8 self)
It is well known that after placing n balls independently and uniformly at random into n bins, the fullest bin holds Θ(log n / log log n) balls with high probability. Recently, Azar et al. analyzed the following: randomly choose d bins for each ball, and then sequentially place each ball in the least full of its chosen bins [2]. They show that the fullest bin contains only log log n / log d + Θ(1) balls with high probability. We explore extensions of this result to parallel and distributed settings. Our results focus on the tradeoff between the amount of communication and the final load. Given r rounds of communication, we provide lower bounds on the maximum load of Ω((log n / log log n)^(1/r)) for a wide class of strategies. Our results extend to the case where the number of rounds is allowed to grow with n. We then demonstrate parallelizations of the sequential strategy presented in Azar et al. that achieve loads within a constant factor of the lower bound for two ...
Interpreting Stale Load Information
IEEE Transactions on Parallel and Distributed Systems, 1999
Cited by 61 (0 self)
In this paper we examine the problem of balancing load in a large-scale distributed system when information about server loads may be stale. It is well known that sending each request to the machine with the apparent lowest load can behave badly in such systems, yet this technique is common in practice. Other systems use round-robin or random selection algorithms that entirely ignore load information or that only use a small subset of the load information. Rather than risk extremely bad performance on one hand or ignore the chance to use load information to improve performance on the other, we develop strategies that interpret load information based on its age. Through simulation, we examine several simple algorithms that use such load interpretation strategies under a range of workloads. Our experiments suggest that by properly interpreting load information, systems can (1) match the performance of the most aggressive algorithms when load information is fresh relative to the...
On the Analysis of Randomized Load Balancing Schemes
In Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, 1998
Cited by 59 (7 self)
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ < 1, with service requirements exponentially distributed with mean 1. Each customer chooses d servers independently and uniformly at random from the n servers, and is served according to the First In First Out (FIFO) protocol at the choice with the fewest customers. For the supermarket model, it has been shown that using d = 2 choices yields an exponential improvement in the expected time a customer spends in the syst...
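The exponential improvement can be made concrete with the limiting expression for the expected time in system that is commonly quoted from the earlier supermarket-model analysis: as n grows, T_d(λ) = Σ_{i≥1} λ^((d^i − d)/(d − 1)), which reduces to the M/M/1 value 1/(1 − λ) when d = 1. The formula is not stated in the abstract above and is brought in here as an assumption; the function name `expected_time` is hypothetical.

```python
def expected_time(lam, d, tol=1e-15):
    """Limiting expected time in system for the supermarket model (n -> infinity).

    Sums lam ** ((d**i - d) / (d - 1)) over i >= 1. For d == 1 the exponent
    is taken as its limit, i - 1, which gives the geometric series 1 / (1 - lam).
    """
    if not 0 <= lam < 1:
        raise ValueError("arrival rate per server must satisfy 0 <= lam < 1")
    total = 0.0
    i = 1
    while True:
        exponent = (i - 1) if d == 1 else (d ** i - d) // (d - 1)
        term = lam ** exponent
        if term < tol:
            break
        total += term
        i += 1
    return total


if __name__ == "__main__":
    for lam in (0.5, 0.9, 0.99):
        print(f"lambda={lam}: d=1 -> {expected_time(lam, 1):8.2f}, "
              f"d=2 -> {expected_time(lam, 2):6.2f}")
```

Because the exponents grow geometrically in i for d ≥ 2, the series is dominated by its first few terms, so the expected time grows far more slowly as λ approaches 1 than it does for d = 1.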
Load Balancing and Density Dependent Jump Markov Processes (Extended Abstract)
In Proceedings of the 37th IEEE Symposium on Foundations of Computer Science, 1996
Cited by 48 (12 self)
We provide a new approach for analyzing both static and dynamic randomized load balancing strategies. We demonstrate the approach by providing the first analysis of the following model: customers arrive as a Poisson stream of rate λn, λ < 1, at a collection of n servers. Each customer chooses some constant d servers independently and uniformly at random from the n servers, and waits for service at the one with the fewest customers. Customers are served according to the first-in first-out (FIFO) protocol, and the service time for a customer is exponentially distributed with mean 1. We call this problem the supermarket model. We wish to know how the system behaves, and in particular we are interested in the expected time a customer spends in the system in equilibrium. The model provides a good abstraction of a simple, efficient load balancing scheme in the setting where jobs...
Performance Analysis of the RIO Multimedia Storage System with Heterogeneous Disk Configurations
In ACM Multimedia Conference, 1998
Cited by 48 (0 self)
RIO is a multimedia object server which manages a set of parallel disks and supports real-time data delivery with statistical delay guarantees. RIO uses random data allocation on disks combined with partial replication to achieve load balance and high performance. In this paper we analyze the performance of RIO when the set of disks used to store data blocks is not homogeneous, having both different bandwidths and different storage capacities. The basic problem to be addressed for heterogeneous configurations is that, on average, the fraction of the load directed to each disk is proportional to the amount of data stored on it, which may not be proportional to the disk bandwidth. This may cause some disks to be overloaded, with long queues and delays, even though bandwidth is available on other disks. This reduces the system throughput or increases the delay bound that can be guaranteed. This problem arises whenever the bandwidth-to-space ratio (BSR) is not uniform across all disks. In ...
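The bottleneck effect of a non-uniform bandwidth-to-space ratio can be illustrated with a short calculation. This is a simplified sketch under stated assumptions: every disk is filled in proportion to its capacity, blocks are requested uniformly, and replication is ignored. The function names and the example disk sizes are hypothetical and are not taken from the paper.

```python
def load_fractions(capacities_gb):
    """Fraction of requests sent to each disk, assumed proportional to data stored."""
    total = sum(capacities_gb)
    return [c / total for c in capacities_gb]


def max_throughput(capacities_gb, bandwidths_mb_s):
    """Largest aggregate rate (MB/s) at which no disk exceeds its own bandwidth.

    Disk k receives fraction f_k of the total rate R, so R * f_k <= bandwidth_k
    must hold for every disk, giving R <= min_k(bandwidth_k / f_k).
    """
    fractions = load_fractions(capacities_gb)
    return min(bw / f for bw, f in zip(bandwidths_mb_s, fractions) if f > 0)


if __name__ == "__main__":
    capacities = [100, 200]   # GB, hypothetical disks
    bandwidths = [50, 50]     # MB/s, so the two BSRs differ (0.5 vs 0.25)
    print("aggregate bandwidth:", sum(bandwidths), "MB/s")
    print("sustainable rate   :", max_throughput(capacities, bandwidths), "MB/s")
```

In this example the larger disk receives two thirds of the requests but has only half of the bandwidth, so it saturates once the total rate reaches about 75 MB/s while the smaller disk still has spare bandwidth.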
Towards Simple, High-performance Schedulers for High-aggregate Bandwidth Switches
In Proceedings of IEEE Infocom, 2002
Cited by 43 (7 self)
High-aggregate bandwidth switches are those whose port count multiplied by the operating line rate is very high; for example, a 30-port switch operating at 40 Gbps or a 1000-port switch operating at 1 Gbps. Designing high-performance schedulers for such switches is a challenging problem for the following reasons: (i) high performance requires finding good matchings, (ii) good matchings take time to find, and (iii) in high-aggregate bandwidth switches there is either too little time (due to high line rates) or there is too much work to do (due to a high port count).