Results 1 - 10 of 47
Ubiquitous access to distributed data in large-scale sensor networks through decentralized erasure codes
, 2005
"... Consider a large-scale wireless sensor network of n nodes, where a fraction k out of n generate data packets of global interest. Assuming that the individual nodes have limited storage and computational capabilities, we address the problem of how to enable ubiquitous access to the distributed data p ..."
Abstract
-
Cited by 90 (7 self)
- Add to MetaCart
(Show Context)
Consider a large-scale wireless sensor network of n nodes, where a fraction k out of n generate data packets of global interest. Assuming that the individual nodes have limited storage and computational capabilities, we address the problem of how to enable ubiquitous access to the distributed data packets. Specifically, we assume that each node can store at most one data packet, and study the problem of diffusing the data so that by querying any k nodes, it is possible to retrieve all the k data packets of interest (with high probability). We introduce a class of erasure codes and show how to solve this problem efficiently in a completely distributed and robust way. Specifically we show that we can efficiently diffuse the data by “prerouting” only O(ln n) packets per data node to randomly selected storage nodes. By using the proposed scheme, the distributed data becomes available “at the fingertips” of a potential data collector located anywhere in the network.
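The flavor of such a scheme can be sketched in a few lines of Python: each data node pre-routes its packet to O(ln n) randomly chosen storage nodes, each storage node keeps only the XOR of what it receives, and a collector solves the resulting linear system by Gaussian elimination over GF(2). This is an illustrative simplification, not the authors' construction; in particular, the paper's codes operate over a larger field so that querying exactly k nodes suffices, whereas this GF(2) toy needs a few extra queries to reach full rank with high probability. All names and parameters below are illustrative.

    import math
    import random

    def disseminate(k, n, packets, c=3):
        # Each of the k data nodes pre-routes its packet to roughly c*ln(n)
        # randomly chosen storage nodes; each storage node keeps only the XOR
        # of the payloads it receives, plus a bitmask of which sources went in.
        stored_val = [0] * n
        stored_mask = [0] * n
        fanout = max(1, int(c * math.log(n)))
        for src in range(k):
            for node in random.sample(range(n), fanout):
                stored_val[node] ^= packets[src]
                stored_mask[node] ^= 1 << src
        return stored_val, stored_mask

    def collect(k, queried):
        # Gaussian elimination over GF(2) on (mask, value) pairs read from the
        # queried storage nodes; returns the k packets, or None if rank < k.
        pivots = {}                       # leading bit -> (mask, value)
        for m, v in queried:
            while m:
                top = m.bit_length() - 1
                if top in pivots:
                    pm, pv = pivots[top]
                    m ^= pm
                    v ^= pv
                else:
                    pivots[top] = (m, v)
                    break
        if len(pivots) < k:
            return None
        out = [0] * k
        for b in sorted(pivots):          # solve from the lowest pivot upward
            m, v = pivots[b]
            for low in range(b):
                if m >> low & 1:
                    v ^= out[low]
            out[b] = v
        return out

    # toy run: 8 data nodes, 100 storage nodes, 32-bit packets
    k, n = 8, 100
    packets = [random.getrandbits(32) for _ in range(k)]
    vals, masks = disseminate(k, n, packets)
    queried = random.sample(range(n), int(1.5 * k))   # a few more than k over GF(2)
    recovered = collect(k, [(masks[i], vals[i]) for i in queried])
    assert recovered is None or recovered == packets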
Optimizing Cauchy Reed-Solomon codes for fault-tolerant network storage applications
- In NCA-06: 5th IEEE International Symposium on Network Computing and Applications, 2006
"... NOTE: NCA’s page limit is rather severe: 8 pages. As a result, the final paper is pretty much a hatchet job of the original submission. I would recommend reading the technical report version of this paper, because it presents the material with some accompanying tutorial material, and is easier to re ..."
Abstract
-
Cited by 49 (12 self)
- Add to MetaCart
(Show Context)
NOTE: NCA’s page limit is rather severe: 8 pages. As a result, the final paper is pretty much a hatchet job of the original submission. I would recommend reading the technical report version of this paper, because it presents the material with some accompanying tutorial material, and is easier to read. The technical report is available at:
Data Persistence in Large-scale Sensor Networks with Decentralized Fountain Codes
"... It may not be feasible for sensor networks monitoring nature and inaccessible geographical regions to include powered sinks with Internet connections. We consider the scenario where sinks are not present in large-scale sensor networks, and unreliable sensors have to collectively resort to storing s ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
It may not be feasible for sensor networks monitoring nature and inaccessible geographical regions to include powered sinks with Internet connections. We consider the scenario where sinks are not present in large-scale sensor networks, and unreliable sensors have to collectively resort to storing sensed data over time on themselves. At a time of convenience, such cached data from a small subset of live sensors may be collected by a centralized (possibly mobile) collector. In this paper, we propose a decentralized algorithm using fountain codes to guarantee the persistence and reliability of cached data on unreliable sensors. With fountain codes, the collector is able to recover all data as long as a sufficient number of sensors are alive. We use random walks to disseminate data from a sensor to a random subset of sensors in the network. Our algorithms take advantage of the low decoding complexity of fountain codes, as well as the scalability of the dissemination process via random walks. We propose two algorithms based on random walks. Our theoretical analysis and simulation-based studies show that the first algorithm maintains the same level of fault tolerance as the original centralized fountain code, while introducing lower overhead than a naive random-walk-based implementation in the dissemination process. Our second algorithm has a lower level of fault tolerance than the original centralized fountain code, but incurs a much lower dissemination cost.
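The fountain-code machinery the paper builds on can be illustrated with a toy LT-style encoder and the low-complexity peeling decoder it relies on. The sketch below encodes centrally with the ideal soliton degree distribution purely for illustration; the paper's actual contribution, the decentralized random-walk dissemination, is not reproduced here, and the degree distribution and overhead factor are illustrative choices, not the paper's.

    import random

    def lt_encode(packets, n_out):
        # Each encoded symbol is the XOR of a random subset of the k source
        # packets, with the subset size drawn from the ideal soliton distribution.
        k = len(packets)
        weights = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
        encoded = []
        for _ in range(n_out):
            d = random.choices(range(1, k + 1), weights=weights)[0]
            neighbors = set(random.sample(range(k), d))
            value = 0
            for i in neighbors:
                value ^= packets[i]
            encoded.append((neighbors, value))
        return encoded

    def lt_decode(k, encoded):
        # Peeling decoder: repeatedly find symbols whose unresolved neighbor set
        # has shrunk to one source, recover that source, and substitute it out.
        symbols = [[set(nbrs), val] for nbrs, val in encoded]
        recovered = {}
        progress = True
        while progress and len(recovered) < k:
            progress = False
            for sym in symbols:
                nbrs, val = sym
                for i in list(nbrs):
                    if i in recovered:
                        val ^= recovered[i]
                        nbrs.discard(i)
                sym[1] = val
                if len(nbrs) == 1:
                    i = next(iter(nbrs))
                    if i not in recovered:
                        recovered[i] = val
                        progress = True
        return [recovered[i] for i in range(k)] if len(recovered) == k else None

    # toy run: 16 source packets, 2x overhead (the toy soliton needs generous slack)
    k = 16
    packets = [random.getrandbits(32) for _ in range(k)]
    coded = lt_encode(packets, 2 * k)
    decoded = lt_decode(k, coded)
    assert decoded is None or decoded == packets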
A Hybrid Routing Approach for Opportunistic Networks
- In Proc. of ACM SIGCOMM Workshop on Challenged Networks, 2006
"... With wireless networking technologies extending into the fabrics of our working and operating environments, proper handling of intermittent wireless connectivity and network disruptions is of significance. As the sheer number of potential opportunistic application continues to surge (i.e. wireless s ..."
Abstract
-
Cited by 32 (7 self)
- Add to MetaCart
(Show Context)
With wireless networking technologies extending into the fabric of our working and operating environments, proper handling of intermittent wireless connectivity and network disruptions is of significance. As the sheer number of potential opportunistic applications continues to surge (e.g., wireless sensor networks, underwater sensor networks, pocket switched networks, and transportation networks), the design of an effective routing scheme that considers and accommodates the various intricate behaviors observed in an opportunistic network remains desirable. While previous solutions use either replication or coding techniques to address the challenges in opportunistic networks, the tradeoffs of these two techniques make them ideal only under certain network scenarios. In this paper, we propose a hybrid scheme, named H-EC, to deal with a wide variety of opportunistic network cases. H-EC is designed to fully combine the robustness of erasure-coding-based routing techniques while preserving the performance advantages of replication techniques. We evaluate H-EC against other similar strategies in terms of delivery ratio and latency, and find that H-EC offers robustness in worst-case delay scenarios while achieving good performance in small-delay scenarios. We also discuss the traffic overhead issues associated with H-EC as compared to other schemes, and present several strategies that can potentially alleviate the traffic overhead of H-EC.
Determining fault tolerance of XOR-based erasure codes efficiently
- In Proceedings of the 2007 International Conference on Dependable Systems and Networks (DSN), 2007
"... We propose a new fault tolerance metric for XOR-based erasure codes: the minimal erasures list (MEL). A minimal erasure is a set of erasures that leads to irrecoverable data loss and in which every erasure is necessary and sufficient for this to be so. The MEL is the enumeration of all minimal erasu ..."
Abstract
-
Cited by 28 (2 self)
- Add to MetaCart
(Show Context)
We propose a new fault tolerance metric for XOR-based erasure codes: the minimal erasures list (MEL). A minimal erasure is a set of erasures that leads to irrecoverable data loss and in which every erasure is necessary and sufficient for this to be so. The MEL is the enumeration of all minimal erasures. An XOR-based erasure code has an irregular structure that may permit it to tolerate faults at and beyond its Hamming distance. The MEL completely describes the fault tolerance of an XOR-based erasure code at and beyond its Hamming distance; it is therefore a useful metric for comparing the fault tolerance of such codes. We also propose an algorithm that efficiently determines the MEL of an erasure code. This algorithm uses the structure of the erasure code to efficiently determine the MEL. We show that, in practice, the number of minimal erasures for a given code is much less than the total number of sets of erasures that lead to data loss: in our empirical results for one corpus of codes, there were over 80 times fewer minimal erasures. We use the proposed algorithm to identify the most fault tolerant XOR-based erasure code for all possible systematic erasure codes with up to seven data symbols and up to seven parity symbols.
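For a very small code, the MEL can also be found by brute force over all erasure patterns, which makes the definition concrete even though it is exactly the exponential enumeration the paper's structure-aware algorithm is designed to avoid. The sketch below assumes a systematic flat XOR code described by its parity sets; the representation and helper names are illustrative, not the paper's.

    from itertools import combinations

    def gf2_rank(masks):
        # Rank over GF(2) of symbols represented as bitmasks of data-symbol members.
        pivots = {}
        for v in masks:
            while v:
                top = v.bit_length() - 1
                if top in pivots:
                    v ^= pivots[top]
                else:
                    pivots[top] = v
                    break
        return len(pivots)

    def minimal_erasures_list(k, parities):
        # Brute-force MEL for a systematic flat XOR code: k data symbols plus one
        # parity symbol per entry of `parities`, each listing its data members.
        def mask(indices):
            m = 0
            for i in indices:
                m ^= 1 << i
            return m

        symbols = [1 << i for i in range(k)] + [mask(p) for p in parities]
        n = len(symbols)

        def recoverable(surviving):
            return gf2_rank([symbols[i] for i in surviving]) == k

        mel = []
        for size in range(1, n + 1):
            for erased in combinations(range(n), size):
                remaining = [i for i in range(n) if i not in erased]
                if recoverable(remaining):
                    continue                      # this erasure pattern is tolerated
                # minimal iff restoring any single erased symbol makes data recoverable
                if all(recoverable(remaining + [e]) for e in erased):
                    mel.append(erased)
        return mel

    # toy code: 3 data symbols, parities d0^d1, d1^d2, d0^d2
    print(minimal_erasures_list(3, [[0, 1], [1, 2], [0, 2]]))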
Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs
"... Abstract—Large scale storage systems require multi-disk fault tolerant erasure codes. Replication and RAID extensions that protect against two- and three-disk failures offer a stark tradeoff between how much data must be stored, and how much data must be read to recover a failed disk. Flat XOR-codes ..."
Abstract
-
Cited by 22 (2 self)
- Add to MetaCart
(Show Context)
Large-scale storage systems require multi-disk fault tolerant erasure codes. Replication and RAID extensions that protect against two- and three-disk failures offer a stark tradeoff between how much data must be stored and how much data must be read to recover a failed disk. Flat XOR-codes, erasure codes in which parity disks are calculated as the XOR of some subset of data disks, offer a tradeoff between these extremes. In this paper, we describe constructions of two novel flat XOR-codes, Stepped Combination and HD-Combination codes. We describe an algorithm for flat XOR-codes that enumerates recovery equations, i.e., sets of disks that can recover a failed disk. We also describe two algorithms for flat XOR-codes that generate recovery schedules, i.e., sets of recovery equations that can be used in concert to achieve efficient recovery. Finally, we analyze the key storage properties of many flat XOR-codes and of MDS codes such as replication and RAID 6 to show the cost-benefit tradeoff gap that flat XOR-codes can fill.
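The notion of a recovery equation can be made concrete with a small brute-force sketch: every XOR combination of the parity definitions that involves the failed disk expresses that disk as the XOR of a set of surviving disks. The code below enumerates these combinations exhaustively for a toy code; the paper's enumeration and scheduling algorithms are more refined and are not reproduced here, and the symbol numbering is an illustrative convention.

    from itertools import combinations

    def recovery_equations(k, parities, failed):
        # Symbols 0..k-1 are data disks; symbol k+j is the j-th parity disk.
        # Each parity definition is a GF(2) check; any XOR combination of checks
        # that involves `failed` expresses it as the XOR of the remaining disks.
        m = len(parities)
        checks = []
        for j, members in enumerate(parities):
            eq = 1 << (k + j)
            for i in members:
                eq ^= 1 << i
            checks.append(eq)
        equations = set()
        for r in range(1, m + 1):
            for combo in combinations(range(m), r):
                eq = 0
                for j in combo:
                    eq ^= checks[j]
                if eq >> failed & 1:
                    equations.add(frozenset(i for i in range(k + m)
                                            if i != failed and eq >> i & 1))
        return equations

    # toy code: 4 data disks, parities p0 = d0^d1^d2^d3 and p1 = d0^d1
    for eq in recovery_equations(4, [[0, 1, 2, 3], [0, 1]], failed=0):
        print(sorted(eq))   # each printed set of disks XORs to the failed disk d0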
Reliability for networked storage nodes
- Research Report RJ-10358, IBM Almaden Research, 2006
"... High-end enterprise storage has traditionally consisted of monolithic systems with customized hardware, multiple redundant components and paths, and no single point of failure. Distributed storage systems realized through networked storage nodes offer several advantages over monolithic systems such ..."
Abstract
-
Cited by 19 (2 self)
- Add to MetaCart
(Show Context)
High-end enterprise storage has traditionally consisted of monolithic systems with customized hardware, multiple redundant components and paths, and no single point of failure. Distributed storage systems realized through networked storage nodes offer several advantages over monolithic systems, such as lower cost and increased scalability. In order to achieve reliability goals associated with enterprise-class storage systems, redundancy will have to be distributed across the collection of nodes to tolerate both node and drive failures. In this paper, we present alternatives for distributing this redundancy, and models to determine the reliability of such systems. We specify a reliability target and determine the configurations that meet this target. Further, we perform sensitivity analyses where selected parameters are varied to observe their effect on reliability.
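The kind of question such reliability models answer can be illustrated with a crude Monte Carlo estimate: given node failure and rebuild times and a k-of-n redundancy scheme, how likely is it that more than n - k nodes are ever down at once during a mission time? The rates, times, redundancy scheme, and independence assumptions below are all illustrative and are not taken from the report, which develops analytical models.

    import math
    import random

    def prob_data_loss(n, k, mttf_hours, repair_hours, mission_hours, trials=2000):
        # Crude Monte Carlo: nodes fail independently with exponential lifetimes
        # and take a fixed time to rebuild; data is lost if more than n - k nodes
        # are down at the same time before the mission ends.
        losses = 0
        for _ in range(trials):
            fail_at = [random.expovariate(1.0 / mttf_hours) for _ in range(n)]
            repair_at = [math.inf] * n
            lost = False
            while True:
                i_f = min(range(n), key=lambda i: fail_at[i])
                i_r = min(range(n), key=lambda i: repair_at[i])
                if min(fail_at[i_f], repair_at[i_r]) >= mission_hours:
                    break
                if fail_at[i_f] <= repair_at[i_r]:
                    t = fail_at[i_f]
                    fail_at[i_f] = math.inf
                    repair_at[i_f] = t + repair_hours
                else:
                    t = repair_at[i_r]
                    repair_at[i_r] = math.inf
                    fail_at[i_r] = t + random.expovariate(1.0 / mttf_hours)
                if sum(1 for r in repair_at if r < math.inf) > n - k:
                    lost = True
                    break
            losses += lost
        return losses / trials

    # e.g. an 8-node group holding a 6-of-8 erasure-coded volume, 5-year mission
    print(prob_data_loss(n=8, k=6, mttf_hours=100_000, repair_hours=24,
                         mission_hours=5 * 8760))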
Small parity-check erasure codes - exploration and observations
- In DSN-05: International Conference on Dependable Systems and Networks, 2005
"... Erasure codes have profound uses in wide- and mediumarea storage applications. While infinite-size codes have been developed with optimal properties, there remains a need to develop small codes with optimal properties. In this paper, we provide a framework for exploring very small codes, and we use ..."
Abstract
-
Cited by 18 (1 self)
- Add to MetaCart
(Show Context)
Erasure codes have profound uses in wide- and medium-area storage applications. While infinite-size codes have been developed with optimal properties, there remains a need to develop small codes with optimal properties. In this paper, we provide a framework for exploring very small codes, and we use this framework to derive optimal and near-optimal ones for discrete numbers of data bits and coding bits. These codes have heretofore been unknown and unpublished, and should be useful in practice. We also use our exploration to make observations about upper bounds for these codes, in order to gain a better understanding of them and to spur future derivations of larger, optimal and near-optimal codes.
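The spirit of such an exploration can be sketched by exhaustively enumerating every systematic XOR code for a tiny number of data and coding bits and scoring each by how many erasure patterns it can decode. The sizes and the scoring metric below are illustrative stand-ins, not the framework or optimality criteria used in the paper.

    from itertools import combinations

    def gf2_rank(masks):
        # Rank over GF(2) of symbols represented as bitmasks of data-bit members.
        pivots = {}
        for v in masks:
            while v:
                top = v.bit_length() - 1
                if top in pivots:
                    v ^= pivots[top]
                else:
                    pivots[top] = v
                    break
        return len(pivots)

    def decodable_patterns(k, parity_masks):
        # Count erasure patterns (of any size) from which all k data bits can
        # still be decoded, i.e. the surviving symbols have full rank.
        symbols = [1 << i for i in range(k)] + list(parity_masks)
        n = len(symbols)
        count = 0
        for size in range(n + 1):
            for erased in combinations(range(n), size):
                surviving = [symbols[i] for i in range(n) if i not in erased]
                if gf2_rank(surviving) == k:
                    count += 1
        return count

    def best_small_code(k, m):
        # Exhaustive search: every choice of m distinct non-empty XOR parities
        # over k data bits, scored by the number of decodable erasure patterns.
        candidates = range(1, 1 << k)
        return max(combinations(candidates, m),
                   key=lambda ps: decodable_patterns(k, ps))

    # tiny example: best code with 3 data bits and 2 parity bits under this metric
    print([bin(p) for p in best_small_code(3, 2)])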
Partial network coding: Theory and application for continuous sensor data collection
- In 14th IEEE International Workshop on Quality of Service, 2006
"... Abstract — Wireless sensor networks have been widely used for surveillance in harsh environments. In many such applications, the environmental data are continuously sensed, and data collection by a server is only performed occasionally. Hence, the sensor nodes have to temporarily store the data, and ..."
Abstract
-
Cited by 18 (3 self)
- Add to MetaCart
(Show Context)
Wireless sensor networks have been widely used for surveillance in harsh environments. In many such applications, the environmental data are continuously sensed, and data collection by a server is only performed occasionally. Hence, the sensor nodes have to temporarily store the data, and provide easy, on-hand access to the most up-to-date data when the server approaches. Given the expensive server-to-sensor communications, the large number of sensors, and the limited storage space at each tiny sensor, continuous data collection becomes a challenging problem. In this paper, we present partial network coding (PNC) as a generic tool for the above applications. PNC generalizes the existing network coding (NC) paradigm, an elegant solution for ubiquitous data distribution and collection. Yet, PNC enables efficient storage replacement for continuous data, which is a major deficiency of the conventional NC. We prove that the performance of PNC is quite close to that of NC, except for a sublinear overhead on storage and communications. We then address a set of practical concerns toward PNC-based continuous data collection in sensor networks. Its feasibility and superiority are further demonstrated through simulation results.
Downloading Replicated, Wide-Area Files - A Framework and Empirical Evaluation
- Proceedings of the 3rd IEEE International Symposium on Network Computing and Applications (NCA 2004), 2004
"... ..."
(Show Context)