Network Coding for Distributed Storage Systems
, 2008
Cited by 338 (13 self)
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a node failure is for a new node to download subsets of data stored at a number of surviving nodes, reconstruct a lost coded block using the downloaded data, and store it at the new node. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to download functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
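The repair-bandwidth gap described above can be illustrated with a little arithmetic. This is a minimal sketch, not the paper's construction: the file size and code parameters are hypothetical, and the closed form used for the minimum-storage regenerating point, M*d / (k*(d - k + 1)), is recalled from the regenerating-codes literature rather than stated in this abstract.

```python
from fractions import Fraction

def naive_repair_download(file_size, k):
    """Conventional repair: fetch k whole fragments (each file_size/k) and re-encode."""
    return k * Fraction(file_size, k)

def msr_repair_download(file_size, k, d):
    """Minimum-storage regenerating point (assumed formula):
    d helper nodes each send file_size / (k * (d - k + 1))."""
    if d < k:
        raise ValueError("need at least k helper nodes")
    return Fraction(file_size * d, k * (d - k + 1))

# Hypothetical parameters: a 1000 MB file stored with an (n, k) = (10, 5) code.
M, n, k = 1000, 10, 5
d = n - 1  # assume every surviving node helps with the repair

fragment = Fraction(M, k)  # what the replacement node must end up storing
print("fragment size:", fragment)
print("naive repair download:", naive_repair_download(M, k))
print("regenerating (MSR) download:", msr_repair_download(M, k, d))
```

With these numbers the conventional procedure moves the whole file across the network to rebuild a fragment one fifth its size, which is the inefficiency the abstract calls sub-optimal.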
PORs: Proofs of Retrievability for Large Files
- In CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security
, 2007
Cited by 254 (8 self)
In this paper, we define and explore proofs of retrievability (PORs). A POR scheme enables an archive or back-up service (prover) to produce a concise proof that a user (verifier) can retrieve a target file F, that is, that the archive retains and reliably transmits file data sufficient for the user to recover F in its entirety. A POR may be viewed as a kind of cryptographic proof of knowledge (POK), but one specially designed to handle a large file (or bitstring) F. We explore POR protocols here in which the communication costs, number of memory accesses for the prover, and storage requirements of the user (verifier) are small parameters essentially independent of the length of F. In addition to proposing new, practical POR constructions, we explore implementation considerations and optimizations that bear on previously explored, related schemes. In a POR, unlike a POK, neither the prover nor the verifier need actually have knowledge of F. PORs give rise to a new and unusual security definition whose formulation is another contribution of our work. We view PORs as an important tool for semi-trusted online archives. Existing cryptographic techniques help users ensure the privacy and integrity of files they retrieve. It is also natural, however, for users to want to verify that archives do not delete or modify files prior to retrieval. The goal of a POR is to accomplish these checks without users having to download the files themselves. A POR can also provide quality-of-service guarantees, i.e., show that a file is retrievable within a certain time bound. Key words: storage systems, storage security, proofs of retrievability, proofs of knowledge
On coding for reliable communication over packet networks
, 2008
Cited by 217 (37 self)
We consider the use of random linear network coding in lossy packet networks. In particular, we consider the following simple strategy: nodes store the packets that they receive and, whenever they have a transmission opportunity, they send out coded packets formed from random linear combinations of stored packets. In such a strategy, intermediate nodes perform additional coding yet do not decode nor wait for a block of packets before sending out coded packets. Moreover, all coding and decoding operations have polynomial complexity. We show that, provided packet headers can be used to carry an amount of side-information that grows arbitrarily large (but independently of payload size), random linear network coding achieves packet-level capacity for both single unicast and single multicast connections and for both wireline and wireless networks. This result holds as long as packets received on links arrive according to processes that have average rates. Thus packet losses on links may exhibit correlations in time or with losses on other links. In the special case of Poisson traffic with i.i.d. losses, we give error exponents that quantify the rate of decay of the probability of error with coding delay. Our analysis of random linear network coding shows not only that it achieves packet-level capacity, but also that the propagation of packets carrying “innovative” information follows the propagation of jobs through a queueing network, thus implying that fluid flow models yield good approximations.
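The store-and-recode strategy described above can be sketched on a two-hop lossy path. This is an illustrative toy under stated assumptions, not the paper's model: coding is over GF(2) (XOR) rather than a larger field, payloads are small integers, the coefficient vector stands in for the packet header, and the names `combine`, `decode`, and `simulate` are hypothetical.

```python
import random

def combine(stored, rng):
    """Random GF(2) combination of stored (coeffs, payload) pairs; payloads are ints."""
    coeffs, payload = [0] * len(stored[0][0]), 0
    for c, p in stored:
        if rng.randint(0, 1):
            coeffs = [a ^ b for a, b in zip(coeffs, c)]
            payload ^= p
    return coeffs, payload

def decode(received, k):
    """Gaussian elimination over GF(2); returns the k source payloads, or None if rank < k."""
    rows = [(list(c), p) for c, p in received]
    for col in range(k):
        pivot = next((i for i in range(col, len(rows)) if rows[i][0][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for i in range(len(rows)):
            if i != col and rows[i][0][col]:
                rows[i] = ([a ^ b for a, b in zip(rows[i][0], rows[col][0])],
                           rows[i][1] ^ rows[col][1])
    return [rows[i][1] for i in range(k)]

def simulate(packets, loss, seed, max_slots=10_000):
    """Source -> relay -> sink; each link drops a packet independently with probability `loss`."""
    rng = random.Random(seed)
    k = len(packets)
    # The source "stores" its own packets tagged with unit coefficient vectors.
    source = [([int(i == j) for j in range(k)], p) for i, p in enumerate(packets)]
    relay, sink = [], []
    for slot in range(1, max_slots + 1):
        # The relay recodes whatever it has stored so far, without decoding it.
        if relay and rng.random() >= loss:
            sink.append(combine(relay, rng))
        if rng.random() >= loss:
            relay.append(combine(source, rng))
        decoded = decode(sink, k)
        if decoded is not None:
            return decoded, slot
    return None, max_slots

decoded, slots = simulate([0x11, 0x22, 0x33, 0x44], loss=0.3, seed=1)
print(decoded, "recovered after", slots, "slots")
```

The relay never waits for a full block and never decodes; the sink recovers the data as soon as the coefficient vectors it has collected reach full rank.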
Minimum-Cost Multicast over Coded Packet Networks
- IEEE TRANS. ON INF. THE
, 2006
Cited by 164 (28 self)
We consider the problem of establishing minimum-cost multicast connections over coded packet networks, i.e., packet networks where the contents of outgoing packets are arbitrary, causal functions of the contents of received packets. We consider both wireline and wireless packet networks as well as both static multicast (where membership of the multicast group remains constant for the duration of the connection) and dynamic multicast (where membership of the multicast group changes in time, with nodes joining and leaving the group). For static multicast, we reduce the problem to a polynomial-time solvable optimization problem, ... and we present decentralized algorithms for solving it. These algorithms, when coupled with existing decentralized schemes for constructing network codes, yield a fully decentralized approach for achieving minimum-cost multicast. By contrast, establishing minimum-cost static multicast connections over routed packet networks is a very difficult problem even using centralized computation, except in the special cases of unicast and broadcast connections. For dynamic multicast, we reduce the problem to a dynamic programming problem and apply the theory of dynamic programming to suggest how it may be solved.
On-the-fly verification of rateless erasure codes for efficient content distribution
- In Proceedings of the IEEE Symposium on Security and Privacy
, 2004
Cited by 137 (4 self)
The quality of peer-to-peer content distribution can suffer when malicious participants intentionally corrupt content. Some systems using simple block-by-block downloading can verify blocks with traditional cryptographic signatures and hashes, but these techniques do not apply well to more elegant systems that use rateless erasure codes for efficient multicast transfers. This paper presents a practical scheme, based on homomorphic hashing, that enables a downloader to perform on-the-fly verification of erasure-encoded blocks.
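The idea of checking an encoded block against hashes of the original blocks can be sketched with a toy homomorphic hash. The abstract does not give the construction, so everything here is an assumption: a discrete-log-style hash h(b) = prod g_i^(b_i) mod P, blocks combined by addition modulo Q instead of XOR, and parameters far too small to be secure.

```python
# Toy parameters (assumed, insecure): Q is a prime dividing P - 1,
# and every entry of G has multiplicative order Q modulo P.
P, Q = 23, 11
G = [2, 3, 4]  # one generator per entry of a block

def block_hash(block):
    """h(b) = prod_i G[i] ** b[i] mod P."""
    h = 1
    for g, b in zip(G, block):
        h = h * pow(g, b % Q, P) % P
    return h

def combine_blocks(blocks, coeffs):
    """Encoded block: coefficient-weighted sum of source blocks, entrywise modulo Q."""
    return [sum(c * blk[i] for c, blk in zip(coeffs, blocks)) % Q
            for i in range(len(G))]

def expected_hash(hashes, coeffs):
    """Hash the encoded block should have, computed from the source-block hashes alone."""
    h = 1
    for hv, c in zip(hashes, coeffs):
        h = h * pow(hv, c, P) % P
    return h

source_blocks = [[1, 2, 3], [4, 5, 6]]
published = [block_hash(b) for b in source_blocks]  # distributed ahead of time

coeffs = [2, 3]
encoded = combine_blocks(source_blocks, coeffs)
print(block_hash(encoded) == expected_hash(published, coeffs))   # genuine block

tampered = list(encoded)
tampered[0] = (tampered[0] + 1) % Q
print(block_hash(tampered) == expected_hash(published, coeffs))  # corrupted block
```

Because the hash of a combination equals the matching combination of hashes, a downloader holding only the published hashes can reject a corrupted encoded block as it arrives, before using it in decoding.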
Network Coding: An Introduction
, 2008
Cited by 69 (3 self)
The basic idea behind network coding is extraordinarily simple. As it is defined in this book, network coding amounts to no more than performing coding operations on the contents of packets: performing arbitrary mappings on the contents of packets rather than the restricted functions of replication and forwarding that are typically allowed in conventional, store-and-forward architectures. But, although simple, network coding has had little place in the history of networking. This is for good reason: in the traditional wireline technologies that have dominated networking history, network coding is not very practical or advantageous.
Hence the motivation for this book: we feel that network coding may have a great deal to offer to the future design of packet networks, and we would like to help this potential be realized. We would like also to encourage more research in this burgeoning field. Thus, we have aimed the book at two (not necessarily distinct) audiences: first, the practitioner, whose main interest is applications; and, second, the theoretician, whose main interest is developing further understanding of the properties of network coding. Of these two audiences, we have tended to favor the first, though the content of the book is nevertheless theoretical. We have aimed to expound the theory in such a way that it is accessible to those who would like to implement network coding, serving an important purpose that was, in our opinion, inadequately served. The theoretician, in contrast to the practitioner, is spoiled. Besides this book, a survey of important theoretical results in network coding is provided in Yeung et al.'s excellent review, Network Coding Theory [149, 150]. Because of our inclination toward applications, however, our presentation differs substantially from that of Yeung et al.
RACS: A Case for Cloud Storage Diversity
Cited by 64 (1 self)
The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud. However, success for cloud storage providers can present a significant risk to customers; namely, it becomes very expensive to switch storage providers. In this paper, we make a case for applying RAID-like techniques used by disks and file systems, but at the cloud storage level. We argue that striping user data across multiple providers can allow customers to avoid vendor lock-in, reduce the cost of switching providers, and better tolerate provider outages or failures. We introduce RACS, a proxy that transparently spreads the storage load over many providers. We evaluate a prototype of our system and estimate the costs incurred and benefits reaped. Finally, we use trace-driven simulations to demonstrate how RACS can reduce the cost of switching storage vendors for a large organization such as the Internet Archive by seven-fold or more by varying erasure-coding parameters.
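The RAID-like striping described above can be sketched with a single XOR parity share, in the spirit of RAID 5. This is a simplified illustration and not the RACS proxy itself: the function names are hypothetical, each list slot stands in for one storage provider, and only one unavailable provider is tolerated.

```python
def xor_bytes(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))

def stripe(data, k):
    """Split data into k equal data shares plus one XOR parity share (k + 1 providers)."""
    size = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(size * k, b"\0")       # zero-pad so the shares line up
    shares = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for share in shares:
        parity = xor_bytes(parity, share)
    return shares + [parity]

def recover(shares, length):
    """Rebuild the object when at most one share is None (provider unavailable)."""
    missing = [i for i, s in enumerate(shares) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one missing provider")
    shares = list(shares)
    if missing:
        size = len(next(s for s in shares if s is not None))
        rebuilt = bytes(size)
        for s in shares:
            if s is not None:
                rebuilt = xor_bytes(rebuilt, s)
        shares[missing[0]] = rebuilt
    return b"".join(shares[:-1])[:length]      # drop parity, then strip padding

obj = b"hello cloud storage"
providers = stripe(obj, k=3)
providers[1] = None                            # simulate one provider outage
print(recover(providers, len(obj)))
```

Each provider holds only a fraction of the object, so leaving one vendor means re-creating a single share from the others rather than re-downloading everything from that vendor.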
Capacity-Achieving Ensembles for the Binary Erasure Channel with Bounded Complexity
- IEEE TRANS. INFORMATION THEORY
, 2004
Cited by 63 (17 self)
We present two sequences of ensembles of non-systematic irregular repeat-accumulate codes which asymptotically (as their block length tends to infinity) achieve capacity on the binary erasure channel (BEC) with bounded complexity. This is in contrast to all previous constructions of capacity-achieving sequences of ensembles whose complexity grows at least like the log of the inverse of the gap to capacity. The new bounded complexity result is achieved by allowing a sufficient number of state nodes in the Tanner graph representing the codes.
ARQ for Network Coding
Cited by 62 (12 self)
A new coding and queue management algorithm is proposed for communication networks that employ linear network coding. The algorithm has the feature that the encoding process is truly online, as opposed to a block-by-block approach. The setup assumes a packet erasure broadcast channel with stochastic arrivals and full feedback, but the proposed scheme is potentially applicable to more general lossy networks with link-by-link feedback. The algorithm guarantees that the physical queue size at the sender tracks the backlog in degrees of freedom (also called the virtual queue size). The new notion of a node “seeing” a packet is introduced. In terms of this idea, our algorithm may be viewed as a natural extension of ARQ schemes to coded networks. Our approach, known as the drop-when-seen algorithm, is compared with a baseline queuing approach called drop-when-decoded. It is shown that the expected queue size for our approach is O ( ) (