Results 1 - 8 of 8
Source coding, large deviations, and approximate pattern matching
- IEEE Trans. Inform. Theory, 2002
Cited by 17 (8 self)
Dedicated to the memory of Aaron Wyner, a valued friend and colleague. Abstract: In this review paper, we present a development of parts of rate-distortion theory and pattern-matching algorithms for lossy data compression, centered around a lossy version of the asymptotic equipartition property (AEP). This treatment closely parallels the corresponding development in lossless compression, a point of view that was advanced in an important paper of Wyner and Ziv in 1989. In the lossless case, we review how the AEP underlies the analysis of the Lempel–Ziv algorithm by viewing it as a random code and reducing it to the idealized Shannon code. This also provides information about the redundancy of the Lempel–Ziv algorithm and about the asymptotic behavior of several relevant quantities. In the lossy case, we give various versions of the statement of the generalized AEP and we outline the general methodology of its proof via large deviations. Its relationship with Barron and Orey’s generalized AEP is also discussed. The lossy AEP is applied to i) prove strengthened versions of Shannon’s direct source-coding theorem and universal coding theorems; ii) characterize the performance of “mismatched” codebooks in lossy data compression; iii) analyze the performance of pattern-matching algorithms for lossy compression (including Lempel–Ziv schemes); and iv) determine the first-order asymptotics of waiting times between stationary processes. A refinement to the lossy AEP is then presented, and it is used to i) prove second-order (direct and converse) lossy source-coding theorems, including universal coding theorems; ii) characterize which sources are quantitatively easier to compress; iii) determine the second-order asymptotics of waiting times between stationary processes; and iv) determine the precise asymptotic behavior of longest match-lengths between stationary processes. Finally, we discuss extensions of the above framework and results to random fields.
Index Terms: Data compression, large deviations, pattern matching, rate-distortion theory.
Pointwise Redundancy in Lossy Data Compression and Universal Lossy Data Compression
- IEEE Trans. Inform. Theory, 1999
Cited by 15 (10 self)
We characterize the achievable pointwise redundancy rates for lossy data compression at a fixed distortion level. "Pointwise redundancy" refers to the difference between the description length achieved by an $n$th-order block code and the optimal $nR(D)$ bits. For memoryless sources, we show that the best achievable redundancy rate is of order $O(\sqrt{n})$ in probability. This follows from a second-order refinement to the classical source coding theorem, in the form of a "one-sided central limit theorem." Moreover, we show that, along (almost) any source realization, the description lengths of any sequence of block codes operating at distortion level $D$ exceed $nR(D)$ by at least as much as $C\sqrt{n \log\log n}$, infinitely often. Corresponding direct coding theorems are also given, showing that these rates are essentially achievable. The above rates are in sharp contrast with the expected redundancy rates of order $O(\log n)$ recently reported by various authors. Our approach is based on showing that...
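The quantity defined in words in this abstract can be written compactly. The notation below ($\ell_n$ for the description length, $\rho_n$ for the pointwise redundancy, $x_1^n$ for a source string of length $n$) is assumed here for illustration and is not necessarily the notation of the paper itself.

```latex
\[
  \rho_n(x_1^n) \;=\; \ell_n(x_1^n) \;-\; n R(D)
\]
```

In this notation, the abstract's two statements say that $\rho_n(X_1^n)$ is at best of order $O(\sqrt{n})$ in probability, and that $\rho_n(X_1^n) \ge C\sqrt{n \log\log n}$ infinitely often along almost every source realization.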
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
"... entropy ..."
Critical Behavior in Lossy Source Coding
- 2000
Cited by 1 (1 self)
The following critical phenomenon was recently discovered. When a memoryless source is compressed using a variable-length fixed-distortion code, the fastest convergence rate of the (pointwise) compression ratio to $R(D)$ is either $O(\sqrt{n})$ or $O(\log n)$. We show it is always $O(\sqrt{n})$, except for discrete, uniformly distributed sources. Keywords: Redundancy, rate-distortion theory, lossy data compression.
Critical Redundancy in Lossy Source Coding
- 1999
Cited by 1 (1 self)
The following critical phenomenon was recently discovered. When a memoryless source is compressed using a variable-length fixed-distortion code, the fastest convergence rate of the (pointwise) compression ratio to $R(D)$ is either $O(\sqrt{n})$ or $O(\log n)$. We show it is always $O(\sqrt{n})$, except for discrete, uniformly distributed sources.
Keywords: Redundancy, rate-distortion theory, lossy data compression.
Department of Mathematics and Department of Statistics, Stanford University, Stanford, CA 94305. Email: amir@stat.stanford.edu Web: www-stat.stanford.edu/amir
Department of Statistics, Purdue University, 1399 Mathematical Sciences Building, W. Lafayette, IN 47907-1399. Email: yiannis@stat.purdue.edu Web: www.stat.purdue.edu/yiannis
A.D.'s research was supported in part by NSF grant # NSF-DMS 9704552. I.K.'s research was supported in part by a grant from the Purdue Research Foundation.
1 Introduction. Suppose that data is produced by a stationary memoryless source $\{X_n ; n$ ...
Critical Behavior in Data Compression
- 1999
Let $x_1^n = (x_1, x_2, \ldots, x_n)$ be a realization of the independent and identically distributed random variables $(X_1, X_2, \ldots, X_n)$. A compression algorithm operating at distortion level $D$ consists of an encoder that takes strings $x_1^n$ to binary strings of variable length, and a decoder that maps these binary strings to new strings $y_1^n = (y_1, y_2, \ldots, y_n)$, so that the decoded $y_1^n$ is always within distortion $D$ of the encoded $x_1^n$. Distortion is measured by some single-letter distortion measure such as mean-squared error. The description length $\ell_n(x_1^n)$ is the length of the binary description of $x_1^n$. For long realizations, the best compression ratio $\ell_n(X_1^n)/n$ that can be achieved by any sequence of algorithms operating at distortion level $D$ is given by Shannon's rate-distortion function $R(D)$. The following critical phenomenon was recently discovered in [10]. Depending on the distribution of the random variables $X_i$ and the distort...
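To make the rate-distortion function R(D) in this abstract concrete, here is a minimal sketch for one special case only: a Bernoulli(p) source under Hamming distortion, where the standard closed form is R(D) = h(p) - h(D) for D < min(p, 1-p) and 0 otherwise (h is the binary entropy in bits). This case is chosen for illustration and is not the mean-squared-error setting the abstract mentions; the function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_distortion_bernoulli(p: float, d: float) -> float:
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion.

    Assumption: this closed form covers only the binary/Hamming case,
    not a general source or distortion measure.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if d < 0.0:
        raise ValueError("distortion level d must be non-negative")
    # Once D reaches min(p, 1 - p), a constant reproduction string suffices.
    if d >= min(p, 1.0 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(d)


if __name__ == "__main__":
    n = 10_000   # hypothetical block length
    p = 0.3      # hypothetical source parameter
    for d in (0.0, 0.05, 0.1, 0.2):
        rate = rate_distortion_bernoulli(p, d)
        # n * R(D) is the first-order description length in bits.
        print(f"D = {d:.2f}: R(D) = {rate:.4f} bits/symbol, n*R(D) = {n * rate:.1f} bits")
```

Multiplying R(D) by the block length n gives the first-order description length nR(D) against which the description length of an actual code is compared.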

