Results 1 - 10 of 234
OceanStore: An architecture for global-scale persistent storage
2000
"... OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cac ..."
Abstract
-
Cited by 1149 (32 self)
OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development.
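As a rough illustration of the redundancy-plus-cryptography idea in this abstract, the sketch below names each block by its content hash, replicates it on several servers, and rejects any fetched copy whose hash does not match. The server objects and their put/get methods are hypothetical; the real system's mechanisms are considerably richer.

```python
import hashlib

def store_block(servers, data: bytes) -> str:
    """Replicate a block on several possibly untrusted servers; its
    content hash serves as the block name and later integrity check."""
    block_id = hashlib.sha256(data).hexdigest()
    for server in servers:           # redundancy: every server holds a copy
        server.put(block_id, data)   # hypothetical server API
    return block_id

def fetch_block(servers, block_id: str) -> bytes:
    """Try replicas in turn; a corrupted or tampered copy fails the hash
    check and is skipped, so a single honest server suffices."""
    for server in servers:
        data = server.get(block_id)  # hypothetical server API
        if data is not None and hashlib.sha256(data).hexdigest() == block_id:
            return data
    raise IOError(f"no valid replica of {block_id}")
```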
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
2001
"... This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internetbased overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache a ..."
Abstract
-
Cited by 803 (23 self)
This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache additional copies of popular files. In the PAST system, storage nodes and files are each assigned uniformly distributed identifiers, and replicas of a file are stored at nodes whose identifier matches most closely the file’s identifier. This statistical assignment of files to storage nodes approximately balances the number of files stored on each node. However, non-uniform storage node capacities and file sizes require more explicit storage load balancing to permit graceful behavior under high global storage utilization; likewise, non-uniform popularity of files requires caching to minimize fetch distance and to balance the query load. We present and evaluate PAST, with an emphasis on its storage management and caching system. Extensive trace-driven experiments show that the system minimizes fetch distance, that it balances the query load for popular files, and that it displays graceful degradation of performance as the global storage utilization increases beyond 95%.
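A minimal sketch of the placement rule described above, assuming hashed identifiers and plain numeric distance in place of the overlay's circular identifier space; the storage load balancing and caching machinery the paper actually evaluates is omitted.

```python
import hashlib

def _id(text: str) -> int:
    """Uniformly distributed identifier obtained by hashing a name."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")

def replica_nodes(nodes: list[str], filename: str, k: int = 3) -> list[str]:
    """Store replicas on the k nodes whose identifiers are closest to the
    file identifier (numeric distance stands in for the overlay's
    circular identifier-space distance)."""
    fid = _id(filename)
    return sorted(nodes, key=lambda n: abs(_id(n) - fid))[:k]

# Example: pick 3 replica holders out of 16 nodes for one file.
nodes = ["node-%d" % i for i in range(16)]
print(replica_nodes(nodes, "budget-2001.xls"))
```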
Diskless Checkpointing
1997
"... Diskless Checkpointing is a technique for checkpointing the state of a long-running computation on a distributed system without relying on stable storage. As such, it eliminates the performance bottleneck of traditional checkpointing on distributed systems. In this paper, we motivate diskless checkp ..."
Abstract
-
Cited by 158 (3 self)
Diskless Checkpointing is a technique for checkpointing the state of a long-running computation on a distributed system without relying on stable storage. As such, it eliminates the performance bottleneck of traditional checkpointing on distributed systems. In this paper, we motivate diskless checkpointing and present the basic diskless checkpointing scheme along with several variants for improved performance. The performance of the basic scheme and its variants is evaluated on a high-performance network of workstations and compared to traditional disk-based checkpointing. We conclude that diskless checkpointing is a desirable alternative to disk-based checkpointing that can improve the performance of distributed applications in the face of failures.
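One way to checkpoint without stable storage, along the lines of the parity-based scheme this line of work describes, is to keep each process's checkpoint in its own memory and store the XOR of all checkpoints on a spare processor. The toy sketch below (not necessarily the paper's exact algorithm) recovers a single lost checkpoint from the survivors and the parity.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def take_parity_checkpoint(checkpoints: list[bytes]) -> bytes:
    """The parity processor stores the XOR of all in-memory checkpoints
    (checkpoints are assumed padded to the same length)."""
    return reduce(xor_bytes, checkpoints)

def recover_checkpoint(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single failed processor's checkpoint from the parity
    and the surviving processors' in-memory checkpoints."""
    return reduce(xor_bytes, surviving, parity)

# Example: three processes checkpoint, process 1 fails, its state is recovered.
ckpts = [b"state-0 ", b"state-1 ", b"state-2 "]
parity = take_parity_checkpoint(ckpts)
assert recover_checkpoint([ckpts[0], ckpts[2]], parity) == ckpts[1]
```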
FAB: Building Distributed Enterprise Disk Arrays from Commodity Components
2004
"... This paper describes the design, implementation, and evaluation of a Federated Array of Bricks (FAB), a distributed disk array that provides the reliability of traditional enterprise arrays with lower cost and better scalability. FAB is built from a collection of bricks, small storage appliances con ..."
Abstract
-
Cited by 123 (7 self)
This paper describes the design, implementation, and evaluation of a Federated Array of Bricks (FAB), a distributed disk array that provides the reliability of traditional enterprise arrays with lower cost and better scalability. FAB is built from a collection of bricks, small storage appliances containing commodity disks, CPU, NVRAM, and network interface cards. FAB deploys a new majority-voting-based algorithm to replicate or erasure-code logical blocks across bricks and a reconfiguration algorithm to move data in the background when bricks are added or decommissioned. We argue that voting is practical and necessary for reliable, high-throughput storage systems such as FAB. We have implemented a FAB prototype on a 22-node Linux cluster. This prototype sustains 85 MB/second of throughput for a database workload, and 270 MB/second for a bulk-read workload. In addition, it can outperform traditional master-slave replication through performance decoupling and can handle brick failures and recoveries smoothly without disturbing client requests.
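To make the majority-voting idea concrete, here is a heavily simplified sketch of timestamped quorum writes and reads over toy in-memory bricks; the actual FAB algorithm also covers erasure-coded layouts and background reconfiguration, and chooses quorums dynamically rather than using a fixed prefix of the brick list.

```python
import time

class Brick:
    """Toy storage brick: block number -> (timestamp, data)."""
    def __init__(self):
        self.blocks = {}

def majority(bricks):
    return bricks[: len(bricks) // 2 + 1]

def write_block(bricks, blockno: int, data: bytes):
    """A write completes once a majority of bricks hold the new
    timestamped value; any later read majority must overlap it."""
    ts = time.monotonic()
    for brick in majority(bricks):
        brick.blocks[blockno] = (ts, data)

def read_block(bricks, blockno: int) -> bytes:
    """Collect replies from a majority and return the newest value."""
    replies = [b.blocks.get(blockno, (0.0, b"")) for b in majority(bricks)]
    return max(replies, key=lambda r: r[0])[1]

bricks = [Brick() for _ in range(5)]
write_block(bricks, 7, b"new contents")
assert read_block(bricks, 7) == b"new contents"
```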
An end-to-end approach to globally scalable network storage
In ACM SIGCOMM ’02, 2002
"... This paper discusses the application of end-to-end design principles, which are characteristic of the architecture of the Internet, to network storage. While putting storage into the network fabric may seem to contradict end-to-end arguments, we try to show not only that there is no contradiction, b ..."
Abstract
-
Cited by 106 (33 self)
This paper discusses the application of end-to-end design principles, which are characteristic of the architecture of the Internet, to network storage. While putting storage into the network fabric may seem to contradict end-to-end arguments, we try to show not only that there is no contradiction, but also that adherence to such an approach is the key to achieving true scalability of shared network storage. After discussing end-to-end arguments with respect to several properties of network storage, we describe the Internet Backplane Protocol and the exNode, which are tools that have been designed to create a network storage substrate that adheres to these principles. The name for this approach is Logistical Networking, and we believe its use is fundamental to the future of truly scalable communication.
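The exNode mentioned above is, roughly, an inode-like descriptor that aggregates allocations on network storage depots into a single logical file. The sketch below is a hypothetical, much-simplified rendering of that idea; all field names are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Allocation:
    """One piece of the file stored on some network depot (hypothetical fields)."""
    depot: str        # e.g. "depot.example.org:6714"
    capability: str   # opaque token used to read the allocation
    length: int

@dataclass
class Extent:
    """A byte range of the logical file, possibly stored in several places."""
    offset: int
    length: int
    copies: list[Allocation] = field(default_factory=list)  # redundant allocations

@dataclass
class ExNode:
    """Simplified exNode-like structure: the logical file is just an ordered
    list of extents; the data itself lives on the depots."""
    name: str
    extents: list[Extent] = field(default_factory=list)
```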
Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems
In Proceedings of the IEEE International Symposium on Network Computing and Applications. IEEE, Los Alamitos
"... We design flexible schemes to explore the tradeoffs between storage space and access efficiency in reliable data storage systems. Aiming at this goal, two new classes of erasure-resilient codes are introduced – Basic Pyramid Codes (BPC) and Generalized Pyramid Codes (GPC). Both schemes require sligh ..."
Abstract
-
Cited by 81 (9 self)
We design flexible schemes to explore the tradeoffs between storage space and access efficiency in reliable data storage systems. Aiming at this goal, two new classes of erasure-resilient codes are introduced: Basic Pyramid Codes (BPC) and Generalized Pyramid Codes (GPC). Both schemes require slightly more storage space than conventional schemes, but significantly improve the critical read performance during failures and unavailability. As a by-product, we establish a necessary matching condition to characterize the limit of failure recovery, that is, unless the matching condition is satisfied, a failure case is impossible to recover. In addition, we define a maximally recoverable (MR) property. For all ERC schemes holding the MR property, the matching condition becomes sufficient, that is, all failure cases satisfying the matching condition are indeed recoverable. We show that GPC is the first class of non-MDS schemes holding the MR property.
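The sketch below is not the BPC/GPC construction itself (which starts from an MDS code); it is only a toy XOR illustration of the space-for-access trade the abstract describes: extra local group parities cost space but let a single lost block be rebuilt by reading its small group rather than a full stripe.

```python
from functools import reduce

def xor(blocks):
    """Bytewise XOR of equally sized blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# Eight equally sized data blocks split into two local groups.
data = [bytes([i]) * 4 for i in range(8)]
group0, group1 = data[:4], data[4:]

local_p0 = xor(group0)   # local parity for group 0
local_p1 = xor(group1)   # local parity for group 1
global_p = xor(data)     # a global parity kept for multi-failure cases

# A single failure inside group 0 is repaired from only the 3 surviving
# data blocks of that group plus its local parity.
lost = data[2]
repaired = xor([data[0], data[1], data[3], local_p0])
assert repaired == lost
```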
Reliability mechanisms for very large storage systems
In Proceedings of the 20th IEEE / 11th NASA Goddard Conference on Mass Storage Systems and Technologies, 2003
"... Reliability and availability are increasingly important in large-scale storage systems built from thousands of individual storage devices. Large systems must survive the failure of individual components; in systems with thousands of disks, even infrequent failures are likely in some device. We focus ..."
Abstract
-
Cited by 77 (21 self)
Reliability and availability are increasingly important in large-scale storage systems built from thousands of individual storage devices. Large systems must survive the failure of individual components; in systems with thousands of disks, even infrequent failures are likely in some device. We focus on two types of errors: nonrecoverable read errors and drive failures. We discuss mechanisms for detecting and recovering from such errors, introducing improved techniques for detecting errors in disk reads and fast recovery from disk failure. We show that simple RAID cannot guarantee sufficient reliability; our analysis examines the tradeoffs between system availability and storage efficiency among other schemes. Based on our data, we believe that two-way mirroring should be sufficient for most large storage systems. For those that need very high reliability, we recommend either three-way mirroring or mirroring combined with RAID.
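A back-of-the-envelope comparison of two-way and three-way mirroring, using the standard independent-failure approximation for mean time to data loss rather than the paper's own model; the MTTF and MTTR numbers below are purely illustrative.

```python
from math import factorial

def mttdl_mirror(copies: int, mttf_hours: float, mttr_hours: float) -> float:
    """Standard approximation for an n-way mirror under independent failures:
    MTTDL ~ MTTF**n / (n! * MTTR**(n-1)).  Not the paper's exact model."""
    return mttf_hours ** copies / (factorial(copies) * mttr_hours ** (copies - 1))

MTTF = 1_000_000   # hours per drive (illustrative)
MTTR = 24          # hours to re-mirror a failed drive (illustrative)

for n in (2, 3):
    years = mttdl_mirror(n, MTTF, MTTR) / (24 * 365)
    print(f"{n}-way mirror: ~{years:.3g} years to data loss per mirror group")
```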
Disk scrubbing in large archival storage systems
In Proceedings of the 12th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS ’04), 2004
"... Large archival storage systems experience long periods of idleness broken up by rare data accesses. In such systems, disks may remain powered off for long periods of time. These systems can lose data for a variety of reasons, including failures at both the device level and the block level. To deal w ..."
Abstract
-
Cited by 70 (15 self)
Large archival storage systems experience long periods of idleness broken up by rare data accesses. In such systems, disks may remain powered off for long periods of time. These systems can lose data for a variety of reasons, including failures at both the device level and the block level. To deal with these failures, we must detect them early enough to be able to use the redundancy built into the storage system. We propose a process called “disk scrubbing” in a system in which drives are periodically accessed to detect drive failure. By scrubbing all of the data stored on all of the disks, we can detect block failures and compensate for them by rebuilding the affected blocks. Our research shows how the scheduling of disk scrubbing affects overall system reliability, and that “opportunistic” scrubbing, in which the system scrubs disks only when they are powered on for other reasons, performs very well without the need to power on disks solely to check them.
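A minimal sketch of the opportunistic policy described above: scrub a disk only when it is already powered on, verifying each block against a stored checksum and rebuilding mismatches from redundancy elsewhere in the system. The disk and rebuild interfaces are hypothetical placeholders.

```python
import hashlib

def scrub_disk(disk, checksums, rebuild_block):
    """Verify every block against its recorded checksum and repair
    mismatches from the system's redundancy."""
    bad = []
    for blockno, expected in checksums.items():
        data = disk.read_block(blockno)          # hypothetical disk API
        if hashlib.sha1(data).hexdigest() != expected:
            rebuild_block(disk, blockno)         # hypothetical repair hook
            bad.append(blockno)
    return bad

def on_power_up(disk, checksums, rebuild_block):
    """Opportunistic policy: scrub only when the disk is already spinning
    for some other reason, so no disk is powered on just to be checked."""
    return scrub_disk(disk, checksums, rebuild_block)
```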
RACS: A Case for Cloud Storage Diversity
"... The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud. However, success for cloud storage providers can present a significant risk to customers; namely, it becomes very expensive to switch storage providers. In th ..."
Abstract
-
Cited by 64 (1 self)
The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud. However, success for cloud storage providers can present a significant risk to customers; namely, it becomes very expensive to switch storage providers. In this paper, we make a case for applying RAID-like techniques used by disks and file systems, but at the cloud storage level. We argue that striping user data across multiple providers can allow customers to avoid vendor lock-in, reduce the cost of switching providers, and better tolerate provider outages or failures. We introduce RACS, a proxy that transparently spreads the storage load over many providers. We evaluate a prototype of our system and estimate the costs incurred and benefits reaped. Finally, we use trace-driven simulations to demonstrate how RACS can reduce the cost of switching storage vendors for a large organization such as the Internet Archive by seven-fold or more, depending on the erasure-coding parameters chosen.
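A toy sketch of the striping idea, assuming hypothetical provider objects with put/get methods: split an object into k chunks plus one XOR parity chunk, one chunk per provider, so a single provider outage can be tolerated and only one chunk per provider needs re-uploading when switching vendors. RACS itself is a proxy that supports general erasure-coding parameters rather than a single XOR parity.

```python
from functools import reduce

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _split(data: bytes, k: int) -> list[bytes]:
    """Split data into k equally sized, zero-padded chunks."""
    size = -(-len(data) // k)                     # ceiling division
    data = data.ljust(k * size, b"\0")
    return [data[i * size:(i + 1) * size] for i in range(k)]

def put_object(providers, key: str, data: bytes):
    """Stripe one object across the providers: k data chunks plus a single
    XOR parity chunk, one chunk per provider."""
    chunks = _split(data, len(providers) - 1)
    chunks.append(reduce(_xor, chunks))           # parity survives one provider loss
    for provider, chunk in zip(providers, chunks):
        provider.put(key, chunk)                  # hypothetical provider API

def get_object(providers, key: str, size: int) -> bytes:
    """Fetch the chunks, rebuilding at most one missing chunk from parity."""
    chunks = [p.get(key) for p in providers]      # an unavailable provider returns None
    missing = [i for i, c in enumerate(chunks) if c is None]
    if missing:
        chunks[missing[0]] = reduce(_xor, [c for c in chunks if c is not None])
    return b"".join(chunks[:-1])[:size]
```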
Computer Immunology
1998
"... Present day computer systems are fragile and unreliable. Human beings are involved in the care and repair of computer systems at every stage in their operation. This level of human involvement will be impossible to maintain in future. Biological and social systems of comparable and greater comple ..."
Abstract
-
Cited by 61 (15 self)
Present day computer systems are fragile and unreliable. Human beings are involved in the care and repair of computer systems at every stage in their operation. This level of human involvement will be impossible to maintain in future. Biological and social systems of comparable and greater complexity have self-healing processes which are crucial to their survival. It will be necessary to mimic such systems if our future computer systems are to prosper in a complex and hostile environment. This paper describes strategies for future research and summarizes concrete measures for the present, building upon existing software systems.