Results 11 - 20 of 59
Multifaceted Simultaneous Load Balancing in DHT-based P2P systems: A new game with old balls and bins
- Self-* Properties in Complex Information Systems, “Hot Topics” series, LNCS
, 2004
Cited by 15 (3 self)
In this paper we present and evaluate uncoordinated on-line algorithms for simultaneous storage and replication load-balancing in DHT-based peer-to-peer systems. We compare our approach with the classical balls into bins model, and point out the similarities but also the differences which call for new load-balancing mechanisms specifically targeted at P2P systems. Some of the peculiarities of P2P systems that make our problem even more challenging are that both the network membership and the data indexed in the network are dynamic, there is neither global coordination nor global information to rely on, and the load-balancing mechanism ideally should not compromise the structural properties and thus the search efficiency of the DHT, while preserving the semantic information of the data (e.g., lexicographic ordering to enable range searches).
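As a point of reference for the classical balls into bins model mentioned in this abstract, the following is a minimal sketch of that baseline only, not of the paper's algorithms. The function name `place_balls` and the `choices` parameter (the standard "best of d random bins" variant) are assumptions introduced here for illustration.

```python
import random


def place_balls(num_bins, num_balls, choices=1, seed=0):
    """Throw balls into bins; each ball samples `choices` bins uniformly
    at random and goes to the least loaded of them (hypothetical helper)."""
    rng = random.Random(seed)
    loads = [0] * num_bins
    for _ in range(num_balls):
        # Sample candidate bins, then pick the one with the smallest load.
        candidates = [rng.randrange(num_bins) for _ in range(choices)]
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return loads


if __name__ == "__main__":
    for d in (1, 2):
        loads = place_balls(num_bins=1000, num_balls=1000, choices=d)
        print(f"choices={d}: max load = {max(loads)}")
```

With `choices=1` this is the plain model; larger values typically flatten the load, but the sketch ignores the membership changes and key-order constraints that the abstract identifies as the hard part in P2P systems.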
An efficient data location protocol for self-organizing storage clusters
- In Proc. of ACM/IEEE SC’03
, 2003
Cited by 13 (1 self)
Component additions and failures are common for large-scale storage clusters in production environments. To improve availability and manageability, we investigate and compare data location schemes for a large self-organizing storage cluster that can quickly adapt to the additions or departures of storage nodes. We further present an efficient location scheme that differentiates between small and large file blocks for reduced management overhead compared to uniform strategies. In our protocol, small blocks, which are typically in large quantities, are placed through consistent hashing. Large blocks, much fewer in practice, are placed through a usage-based policy, and their locations are tracked by Bloom filters. The proposed scheme results in improved storage utilization even with non-uniform cluster nodes. To achieve high scalability and fault resilience, this protocol is fully distributed, relies only on soft states, and supports data replication. We demonstrate the effectiveness and efficiency of this protocol through trace-driven simulation.
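The consistent-hashing placement that this abstract describes for small blocks can be illustrated with a generic sketch. This is not the paper's protocol: the class name `HashRing`, the use of SHA-1, and the virtual-node count `vnodes` are assumptions made here, and the Bloom-filter tracking of large blocks is not covered.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit point on the ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hashing ring with virtual nodes (hypothetical)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (ring position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several points so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def locate(self, block_id):
        """Return the node owning the first ring point at or after the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._points, (_hash(block_id), ""))
        return self._points[idx % len(self._points)][1]
```

The property that matters for a self-organizing cluster is that a node's departure only relocates the blocks that node held; every other block keeps its location.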
LH* Schemes with Scalable Availability
, 1998
Cited by 10 (0 self)
Modern applications increasingly require scalable, highly available and distributed storage systems. High-availability schemes typically deliver data despite up to n ≥ 1 simultaneous unavailabilities of the storage nodes (disks, processors with storage, or entire computers), where n is fixed. Such schemes are insufficient for scalable files, since the probability of more than n failures increases arbitrarily with file size. We propose a new scheme termed LH*sa withstanding up to n simultaneous unavailabilities with n scaling with the file. We present LH*sa file manipulation and recovery algorithms. We discuss the access and storage performance, and variants tuning selected features. We show that LH*sa files may scale to any number of nodes, keeping the probability of data unavailability arbitrarily small.
A serverless 3D world
- In Proceedings of the 12th International Symposium of ACM GIS
, 2004
Cited by 8 (5 self)
Online multi-participant virtual-world systems have attracted significant interest from the Internet community but are hindered by their inability to efficiently support interactivity for a large number of participants. Current solutions divide a large virtual-world into a few mutually exclusive zones, with each zone controlled by a different server, and/or limit the number of participants per server or per virtual-world. Peer-to-Peer (P2P) systems are known to provide excellent scalability in a networked environment (one peer is introduced to the system by each participant); however, current P2P applications can only provide file sharing and other forms of relatively simple data communications. In this paper, we present a generic 3D virtual-world application that runs on a P2P network with no central administration or server. Two issues are addressed by this paper to enable such a spatial application on a P2P network. First, we demonstrate how to index and query a 3D space on a dynamic distributed network. Second, we show how to build such a complex application from the ground level of a P2P routing algorithm. Our work leads to new directions for the development of online virtual-worlds that we believe can be used for many government, industry, and public domain applications.
LH*RS: A Highly Available Distributed Data Storage System
- In VLDB
, 2004
Cited by 8 (3 self)
The ideal storage system is always available and incrementally expandable. Existing storage systems fall far from this ideal. Affordable computers and high-speed networks allow us to investigate storage architectures closer to the ideal. Our demo presents a prototype implementation of LH*RS: a highly available, scalable, and distributed data structure.
On the practical use of LDPC erasure codes for distributed storage applications
, 2003
Cited by 8 (1 self)
This paper has been submitted for publication. Please see the above URL for current publication status. As peer-to-peer and widely distributed storage systems proliferate, the need to perform efficient erasure coding, instead of replication, is crucial to performance and efficiency. Low-Density Parity-Check (LDPC) codes have arisen as alternatives to standard erasure codes, such as Reed-Solomon codes, trading off vastly improved decoding performance for inefficiencies in the amount of data that must be acquired to perform decoding. The scores of papers written on LDPC codes typically analyze their collective and asymptotic behavior. Unfortunately, their practical application requires the generation and analysis of individual codes for finite systems. This paper attempts to illuminate the practical considerations of LDPC codes for peer-to-peer and distributed storage systems. The three main types of LDPC codes are detailed, and a huge variety of codes are generated, then analyzed using simulation. This analysis focuses on the performance of individual codes for finite systems, and addresses several important heretofore unanswered questions about employing LDPC codes in real-world systems.
High-performance GRID Database Manager for Scientific Data
- in Proceedings of 4-th Workshop on Distributed Data & Structures (WDAS-2002), Carleton Scientific (Publ
, 2002
Cited by 8 (2 self)
The GRID initiative provides an infrastructure for distributed computations among widely distributed high-performance computers. This will allow for exchanging and processing very large amounts of data. The LOFAR project (www.nfra.nl/lofar) is an international initiative to build a versatile, geographically distributed, multi-point radio facility for astrophysics, space physics, atmospheric physics, and radio research, utilizing very high performance GRID computing. LOIS is a proposed Swedish outrigger to LOFAR providing a software radar. As the volume of data processed by LOFAR/LOIS is very large and dynamic, there will be a need for very high-performing data management systems. For this, a high-performance stream-oriented distributed data manager and query processor is being developed that allows very efficient execution of database queries over streamed data involving numerical and other data. Very high performance is attained by utilizing many object-relational main-memory database engines running on ... This project has been supported by VINNOVA under contract #2001-06074.
Range Queries to Scalable Distributed Data Structure RP*
, 2003
Cited by 6 (0 self)
A Scalable Distributed Data Structure (SDDS) is a data structure of a new type specifically designed for multicomputers, P2P and grid computing systems. SDDS RP* provides a range partitioned file scaling up dynamically over distributed nodes. Our concern is the efficient execution of RP* range queries. The query may address an a priori unknown number of data servers.
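To make the idea of a range query over a dynamically growing, range-partitioned file concrete, here is a single-process sketch. It is not the RP* protocol: the class `RangePartitionedFile`, the in-memory list of bucket bounds, and the median split rule are assumptions for illustration. In the setting the abstract describes, the client does not know in advance how many servers hold part of the answer.

```python
import bisect


class RangePartitionedFile:
    """Toy range-partitioned file; each bucket stands in for one data server."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.uppers = [float("inf")]  # inclusive upper bound of each bucket
        self.buckets = [[]]           # sorted, distinct keys per bucket

    def insert(self, key):
        i = bisect.bisect_left(self.uppers, key)  # bucket whose range covers key
        bucket = self.buckets[i]
        if key in bucket:
            return
        bisect.insort(bucket, key)
        if len(bucket) > self.capacity:
            # Overflow: split at the median so a new bucket takes the upper half.
            mid = len(bucket) // 2
            left, right = bucket[:mid], bucket[mid:]
            self.buckets[i:i + 1] = [left, right]
            self.uppers.insert(i, left[-1])

    def range_query(self, lo, hi):
        """Return (matching keys in order, number of buckets consulted)."""
        if lo > hi:
            return [], 0
        first = bisect.bisect_left(self.uppers, lo)
        last = bisect.bisect_left(self.uppers, hi)
        hits = []
        for i in range(first, last + 1):
            hits.extend(k for k in self.buckets[i] if lo <= k <= hi)
        return hits, last - first + 1
```

The second return value shows that the number of buckets a query touches depends on how the file has split so far, which is why it cannot be fixed ahead of time.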
Implementation and Performance Measurements of the RP* Scalable Distributed Data Structure for Windows Multicomputers
- Intl. Workshop on Performance-Oriented Program Development for Distributed Architectures (PADDA
, 2001
Cited by 5 (3 self)
The RP* scheme generates a scalable range partitioning. The intervals at the data servers adjust dynamically so that new servers accommodate the file growth transparently for the application. We have implemented variants of RP* on a Windows 2000 multicomputer. We have measured the performance of the system. The experiments demonstrate the high efficiency of our implementation. RP* should be of importance to future main-memory parallel DBMSs.
Indexing Distributed Complex Data for Complex Queries
, 2004
Cited by 5 (0 self)
Peer-to-peer networks are becoming a common form of online data exchange. Querying data, mostly files, using keywords on peer-to-peer networks is well known. However, on such networks users can perform few types of queries on complex data, or on many of the attributes of the data, other than mostly exact-match queries. We introduce a distributed hashing-based index for enabling more powerful accesses on complex data over peer-to-peer networks that we expect to be commonly deployed for digital government applications. Preliminary experiments show that our index scales well and we believe that it can be extended to obtain similar indices for many other data types for performing various complex queries, such as range queries.