Results 1 - 10 of 64
FAWN: A Fast Array of Wimpy Nodes
, 2008
"... This paper introduces the FAWN—Fast Array of Wimpy Nodes—cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2–16GB) of flash memory into an ensemble capable of ..."
Abstract
-
Cited by 212 (26 self)
This paper introduces the FAWN—Fast Array of Wimpy Nodes—cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2–16GB) of flash memory into an ensemble capable of handling 700 queries per second per node, while consuming fewer than 6 watts of power per node. We have designed and implemented a clustered key-value storage system, FAWN-DHT, that runs atop these nodes. Nodes in FAWN-DHT use a specialized log-like back-end hash-based database to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system’s ability to service queries at its guaranteed rate. Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggests that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming 3-10x less power.
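The core data-structure idea in the abstract (an append-only, log-structured store with an in-memory hash index, which turns random writes into flash-friendly sequential appends) can be illustrated with a minimal, hypothetical sketch. This is not the FAWN-DS/FAWN-DHT code; class and method names are invented for illustration.

```python
# Minimal sketch: append-only log on disk/flash plus an in-memory hash index.
import os
import struct
import hashlib

class LogStore:
    def __init__(self, path):
        self.f = open(path, "a+b")
        self.index = {}                      # key hash -> (offset, record length)

    def put(self, key: bytes, value: bytes):
        offset = self.f.seek(0, os.SEEK_END)             # appends are sequential
        record = struct.pack(">II", len(key), len(value)) + key + value
        self.f.write(record)
        self.f.flush()
        self.index[hashlib.sha1(key).digest()] = (offset, len(record))

    def get(self, key: bytes):
        loc = self.index.get(hashlib.sha1(key).digest())
        if loc is None:
            return None
        self.f.seek(loc[0])
        klen, vlen = struct.unpack(">II", self.f.read(8))
        self.f.seek(loc[0] + 8 + klen)
        return self.f.read(vlen)             # one seek + read per lookup
```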
A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage
"... Over the past five years, large-scale storage installations have required fault-protection beyond RAID-5, leading to a flurry of research on and development of erasure codes for multiple disk failures. Numerous open-source implementations of various coding techniques are available to the general pub ..."
Abstract
-
Cited by 43 (8 self)
Over the past five years, large-scale storage installations have required fault-protection beyond RAID-5, leading to a flurry of research on and development of erasure codes for multiple disk failures. Numerous open-source implementations of various coding techniques are available to the general public. In this paper, we perform a head-to-head comparison of these implementations in encoding and decoding scenarios. Our goals are to compare codes and implementations, to discern whether theory matches practice, and to demonstrate how parameter selection, especially as it concerns memory, has a significant impact on a code’s performance. Additional benefits are to give storage system designers an idea of what to expect in terms of coding performance when designing their storage systems, and to identify the places where further erasure coding research can have the most impact.
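The kind of measurement the abstract describes, and in particular how parameter selection (block/buffer size) changes coding throughput, can be sketched with a hypothetical micro-benchmark. The paper evaluates real open-source Reed-Solomon and RAID-6 libraries; the XOR parity below is only a stand-in to make the benchmark shape concrete.

```python
# Hypothetical harness: time a RAID-5-style XOR parity encode at several block sizes.
import os
import time

def xor_encode(data_blocks):
    """Compute one parity block as the XOR of k data blocks."""
    parity = bytearray(len(data_blocks[0]))
    for block in data_blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

k = 6                                                    # data blocks per stripe
for block_size in (4 * 1024, 64 * 1024, 1024 * 1024):   # bytes per block
    blocks = [os.urandom(block_size) for _ in range(k)]
    start = time.perf_counter()
    xor_encode(blocks)
    elapsed = time.perf_counter() - start
    mb = k * block_size / 1e6
    print(f"block={block_size:>8} B  encode throughput={mb / elapsed:8.2f} MB/s")
```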
SRCMap: Energy Proportional Storage using Dynamic Consolidation
"... We investigate the problem of creating an energy proportional storage system through power-aware dynamic storage consolidation. Our proposal, Sample-Replicate-Consolidate Mapping (SRCMap), is a storage virtualization layer optimization that enables energy proportionality for dynamic I/O workloads by ..."
Abstract
-
Cited by 41 (9 self)
We investigate the problem of creating an energy proportional storage system through power-aware dynamic storage consolidation. Our proposal, Sample-Replicate-Consolidate Mapping (SRCMap), is a storage virtualization layer optimization that enables energy proportionality for dynamic I/O workloads by consolidating the cumulative workload on a subset of physical volumes proportional to the I/O workload intensity. Instead of migrating data across physical volumes dynamically or replicating entire volumes, both of which are prohibitively expensive, SRCMap samples a subset of blocks from each data volume that constitutes its working set and replicates these on other physical volumes. During a given consolidation interval, SRCMap activates a minimal set of physical volumes to serve the workload and spins down the remaining volumes, redirecting their workload to replicas on active volumes.
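The consolidation step (activate a minimal set of volumes whose capacity covers the current workload intensity, spin the rest down) can be sketched as a simple greedy selection. This is a hypothetical illustration; SRCMap's actual policy also reasons about working-set replicas, placement, and migration cost.

```python
# Illustrative greedy consolidation: keep the fewest volumes that cover the load.
def pick_active_volumes(volume_iops_capacity, total_workload_iops):
    """Select largest-capacity volumes first until the workload is covered."""
    ordered = sorted(volume_iops_capacity.items(), key=lambda kv: kv[1], reverse=True)
    active, covered = [], 0
    for vol, capacity in ordered:
        if covered >= total_workload_iops:
            break
        active.append(vol)
        covered += capacity
    return active                      # the remaining volumes can be spun down

# Example: only two of four volumes stay active during a light interval.
capacities = {"vol0": 300, "vol1": 250, "vol2": 200, "vol3": 150}
print(pick_active_volumes(capacities, total_workload_iops=400))   # ['vol0', 'vol1']
```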
Generating Realistic Impressions for File-System Benchmarking
- In Proceedings of the 7th Conference on File and Storage Technologies (FAST ’09)
, 2009
"... ..."
(Show Context)
In search of I/O-optimal recovery from disk failures
- In Proceedings of the USENIX Workshop on Hot Topics in Storage and File Systems
, 2011
"... We address the problem of minimizing the I/O needed to recover from disk failures in erasure-coded storage systems. The principal result is an algorithm that finds the optimal I/O recovery from an arbitrary number of disk failures for any XOR-based erasure code. We also describe a family of codes wi ..."
Abstract
-
Cited by 21 (2 self)
We address the problem of minimizing the I/O needed to recover from disk failures in erasure-coded storage systems. The principal result is an algorithm that finds the optimal I/O recovery from an arbitrary number of disk failures for any XOR-based erasure code. We also describe a family of codes with high fault tolerance and low recovery I/O, e.g., one instance tolerates up to 11 failures and recovers a lost block in 4 I/Os. While we have determined I/O-optimal recovery for any given code, it remains an open problem to identify codes with the best recovery properties. We describe our ongoing efforts toward characterizing space overhead versus recovery I/O tradeoffs and generating codes that realize these bounds.
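A toy illustration of the underlying question: in an XOR-based code, a lost symbol can be rebuilt from any parity equation that contains it, and different equations require reading different numbers of surviving symbols. The sketch below only picks the cheapest single equation for one lost symbol; the paper's algorithm solves the general problem for arbitrary failure patterns. The example code and symbol names are invented for illustration.

```python
# Each equation is a set of symbols whose XOR is zero.
equations = [
    {"d0", "d1", "p0"},                      # p0 = d0 ^ d1
    {"d2", "d3", "p1"},                      # p1 = d2 ^ d3
    {"d0", "d1", "d2", "d3", "p0", "p1"},    # XOR of the two equations above
]

def cheapest_recovery(lost, equations):
    """Surviving symbols to read for the cheapest single-equation rebuild of `lost`."""
    candidates = [eq - {lost} for eq in equations if lost in eq]
    return min(candidates, key=len) if candidates else None

print(cheapest_recovery("d1", equations))    # {'d0', 'p0'}: two reads suffice
```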
Energy-efficient cluster computing with FAWN: Workloads and implications
- In Proc. e-Energy 2010
, 2010
"... This paper presents the architecture and motivation for a clusterbased, many-core computing architecture for energy-efficient, dataintensive computing. FAWN, a Fast Array of Wimpy Nodes, consists of a large number of slower but efficient nodes coupled with low-power storage. We present the computing ..."
Abstract
-
Cited by 18 (2 self)
This paper presents the architecture and motivation for a cluster-based, many-core computing architecture for energy-efficient, data-intensive computing. FAWN, a Fast Array of Wimpy Nodes, consists of a large number of slower but efficient nodes coupled with low-power storage. We present the computing trends that motivate a FAWN-like approach, for CPU, memory, and storage. We follow with a set of microbenchmarks to explore under what workloads these “wimpy nodes” perform well (or perform poorly). We conclude with an outline of the longer-term implications of FAWN that lead us to select a tightly integrated stacked chip-and-memory architecture for future FAWN development.
A new minimum density RAID-6 code with a word size of eight
- In NCA-08: 7th IEEE International Symposium on Network Computing Applications
, 2008
"... RAID-6 storage systems protect k disks of data with two parity disks so that the system of k + 2 disks may tolerate the failure of any two disks. Coding techniques for RAID-6 systems are varied, but an important class of techniques are those with minimum density, featuring an optimal combination of ..."
Abstract
-
Cited by 16 (3 self)
RAID-6 storage systems protect k disks of data with two parity disks so that the system of k + 2 disks may tolerate the failure of any two disks. Coding techniques for RAID-6 systems are varied, but an important class of techniques is those with minimum density, featuring an optimal combination of encoding, decoding and modification complexity. The word size of a code impacts both how the code is laid out on each disk’s sectors and how large k can be. Word sizes that are powers of two are especially important, since they fit precisely into file system blocks. Minimum density codes exist for many word sizes with the notable exception of eight. This paper fills that gap by describing new codes for this important word size. The description includes performance properties as well as details of the discovery process.
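To make the "two parity disks, any two failures" setting concrete, here is a sketch of the classic Reed-Solomon-style RAID-6 P/Q parity computation over GF(2^8). This is only background context, not the minimum-density bit-matrix codes the paper constructs; all function names are illustrative.

```python
# Conceptual RAID-6 P/Q parity for one byte position across a stripe of k disks.
def gf_mul(a, b):
    """Multiply in GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def gf_pow(g, i):
    """g raised to the i-th power in GF(2^8)."""
    result = 1
    for _ in range(i):
        result = gf_mul(result, g)
    return result

def raid6_parity(stripe, g=2):
    """stripe: list of k data bytes (one byte per data disk at the same offset)."""
    p, q = 0, 0
    for i, d in enumerate(stripe):
        p ^= d                            # P: plain XOR parity
        q ^= gf_mul(gf_pow(g, i), d)      # Q: data weighted by generator powers g^i
    return p, q

print(raid6_parity([0x12, 0x34, 0x56, 0x78]))
```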
AONT-RS: Blending security and performance in dispersed storage systems
- In FAST ’11
, 2011
"... Dispersing files across multiple sites yields a variety of obvious benefits, such as availability, proximity and reliability. Less obviously, it enables security to be achieved without relying on encryption keys. Standard approaches to dispersal either achieve very high security with correspondingly ..."
Abstract
-
Cited by 16 (4 self)
Dispersing files across multiple sites yields a variety of obvious benefits, such as availability, proximity and reliability. Less obviously, it enables security to be achieved without relying on encryption keys. Standard approaches to dispersal either achieve very high security with correspondingly high computational and storage costs, or low security with lower costs. In this paper, we describe a new dispersal scheme, called AONT-RS, which blends an All-Or-Nothing Transform with Reed-Solomon coding to achieve high security with low computational and storage costs. We evaluate this scheme both theoretically and as implemented with standard open source tools. AONT-RS forms the backbone of a commercial dispersed storage system, which we briefly describe and then use as a further experimental testbed. We conclude with details of actual deployments.
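The All-Or-Nothing Transform idea (no part of the data is recoverable unless essentially all of the encoded package is present) can be sketched as follows. This is a toy illustration, not AONT-RS itself: it uses a hash-derived keystream in place of a real block cipher, and the resulting package would then be split across sites with Reed-Solomon or another information dispersal algorithm rather than stored whole.

```python
# Toy AONT packaging: encrypt under a random key, then hide the key behind
# a hash of the ciphertext, so the key is only recoverable from the whole package.
import hashlib
import os

def xor_stream(data, key):
    """XOR data with a keystream derived from the key (stand-in for a cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, out))

def aont_package(data):
    key = os.urandom(32)                                   # random per-object key
    ciphertext = xor_stream(data, key)
    digest = hashlib.sha256(ciphertext).digest()
    canary = bytes(a ^ b for a, b in zip(key, digest))     # key XOR hash(ciphertext)
    return ciphertext + canary      # this package would then be dispersed with RS coding

def aont_unpackage(package):
    ciphertext, canary = package[:-32], package[-32:]
    digest = hashlib.sha256(ciphertext).digest()            # needs all of the ciphertext
    key = bytes(a ^ b for a, b in zip(canary, digest))
    return xor_stream(ciphertext, key)

assert aont_unpackage(aont_package(b"secret payload")) == b"secret payload"
```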
Mean time to meaningless: MTTDL, Markov models, and storage system reliability
"... Mean Time To Data Loss (MTTDL) has been the standard reliability metric in storage systems for more than 20 years. MTTDL represents a simple formula that can be used to compare the reliability of small disk arrays and to perform comparative trending analyses. The MTTDL metric is often misused, with ..."
Abstract
-
Cited by 15 (0 self)
Mean Time To Data Loss (MTTDL) has been the standard reliability metric in storage systems for more than 20 years. MTTDL represents a simple formula that can be used to compare the reliability of small disk arrays and to perform comparative trending analyses. The MTTDL metric is often misused, with egregious examples relying on the MTTDL to generate reliability estimates that span centuries or millennia. Moving forward, the storage community needs to replace MTTDL with a metric that can be used to accurately compare the reliability of systems in a way that reflects the impact of data loss in the real world.
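A worked example of the classic Markov-model formula for a single-parity (RAID-5) group, MTTDL = MTTF^2 / (N * (N-1) * MTTR), shows how it produces the millennia-scale estimates the abstract criticizes. The parameter values below are illustrative, not taken from the paper.

```python
# Classic RAID-5 MTTDL estimate with illustrative parameters.
disk_mttf_hours = 1_000_000      # a typical datasheet MTTF claim
mttr_hours = 24                  # assumed time to rebuild a failed disk
n_disks = 8                      # disks in the parity group

mttdl_hours = disk_mttf_hours ** 2 / (n_disks * (n_disks - 1) * mttr_hours)
print(f"MTTDL = {mttdl_hours / 8760:,.0f} years")   # roughly 85,000 years
```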
Building Flexible, Fault-Tolerant Flash-based Storage Systems
"... Adding flash memory to the storage hierarchy has recently gained a great deal of attention in both industry and academia. Decreasing cost, low power utilization and improved performance has sparked this interest. Flash reliability is a weakness that must be overcome in order for the storage industry ..."
Abstract
-
Cited by 11 (1 self)
Adding flash memory to the storage hierarchy has recently gained a great deal of attention in both industry and academia. Decreasing cost, low power utilization and improved performance have sparked this interest. Flash reliability is a weakness that must be overcome in order for the storage industry to fully adopt flash for persistent storage in mission-critical systems such as high-end storage controllers and low-power storage systems. We consider the unique reliability properties of NAND flash and present a high-level architecture for a reliable NAND flash memory storage system. The architecture manages erasure-coded stripes to increase reliability and operational lifetime of a flash memory-based storage system, while providing excellent write performance. Our analysis details the tradeoffs such a system can make, enabling the construction of highly reliable flash-based storage systems.
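The stripe-management idea the abstract describes (buffer incoming pages, then write a full stripe plus parity across different flash packages so writes stay sequential and any one package failure is recoverable) can be sketched as follows. This is a simplified, hypothetical illustration, not the paper's actual architecture; names and layout are invented.

```python
# Illustrative stripe writer: k data pages plus one XOR parity page per stripe,
# each page of a stripe written to a different flash package.
class StripedFlashWriter:
    def __init__(self, k, page_size, packages):
        self.k = k                      # data pages per stripe
        self.page_size = page_size
        self.packages = packages        # list of per-package write callbacks
        self.pending = []

    def write_page(self, page: bytes):
        assert len(page) == self.page_size
        self.pending.append(page)
        if len(self.pending) == self.k:
            self._flush_stripe()

    def _flush_stripe(self):
        parity = bytearray(self.page_size)
        for page in self.pending:
            for i, b in enumerate(page):
                parity[i] ^= b
        for pkg, page in zip(self.packages, self.pending + [bytes(parity)]):
            pkg(page)                   # each page lands on a different package
        self.pending.clear()

# Example with k=2 data pages and 3 packages (2 data + 1 parity).
log = []
writer = StripedFlashWriter(k=2, page_size=4,
                            packages=[lambda p, i=i: log.append((i, p)) for i in range(3)])
writer.write_page(b"\x01\x02\x03\x04")
writer.write_page(b"\x10\x20\x30\x40")
print(log)   # the third entry is the XOR parity of the two data pages
```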