Bunton, S. (1996) On-Line Stochastic Processes in Data Compression. Ph.D. Thesis, University of Washington, Seattle, WA.
....in transcription confuse voiced consonants with their unvoiced equivalents: f and v, b and p, d and t, and so on. The basic idea of CP is to weight alternative contexts rather than choose the most likely one. A similar strategy is also used in compression schemes such as context tree weighting (Bunton, 1996), and in recently developed theories of distance measurement (Cleary and Trigg, 1995). The idea of combining competing solutions, weighted by probability, instead of seeking the single best one, seems to be gaining momentum, not just in compression but in the bagging and boosting techniques that ....
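The contrast between weighting alternative contexts and choosing the single most likely one can be sketched in a few lines. This is a generic weighted mixture under stated assumptions, not the CP or context tree weighting procedure itself; the function name `mix_predictions` and the example weights are hypothetical.

```python
def mix_predictions(predictions, weights):
    """Blend several next-symbol distributions into one.

    predictions: list of dicts mapping symbol -> probability, one per context.
    weights: list of non-negative weights, one per context (assumed given).
    """
    total_weight = sum(weights)
    mixed = {}
    for dist, weight in zip(predictions, weights):
        for symbol, p in dist.items():
            # Each context contributes in proportion to its normalized weight.
            mixed[symbol] = mixed.get(symbol, 0.0) + (weight / total_weight) * p
    return mixed


# Two competing contexts: one is undecided, the other is certain of "a".
blended = mix_predictions([{"a": 0.5, "b": 0.5}, {"a": 1.0}], [1, 1])
```

Picking only the second context would assign "b" zero probability; the mixture keeps some probability mass on it.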
Bunton, S. (1996) On-line stochastic processes in data compression. Ph.D. Thesis, University of Washington, Seattle, WA.
....the probability and then a coding step that uses the probability to optimally encode the next symbol. The encoding is usually done with some variant of arithmetic coding (Moffat, Neal, & Witten, 1995). A large number of different estimation and modelling algorithms have been proposed. As shown by Bunton (1996; 1997), most of these can be fitted into the framework of FSMX models. In such models, the predictions of the next symbol are conditioned on the basis of a preceding finite sequence of text. In some algorithms such as PPM (Cleary & Witten, 1984; Moffat, 1990) there is a fixed upper bound to the ....
....FSMX models. In such models, the predictions of the next symbol are conditioned on the basis of a preceding finite sequence of text. In some algorithms such as PPM (Cleary & Witten, 1984; Moffat, 1990) there is a fixed upper bound to the lengths of these contexts; in others (Cleary & Teahan, 1997; Bunton, 1996; Horspool & Cormack, 1986) there is no such prior bound and the contexts can be of unbounded length. Even algorithms such as the Burrows-Wheeler compressor (Burrows & Wheeler, 1994), which superficially appear very different, have been shown to fit into this framework (Cleary & Teahan, 1997). While ....
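As a rough illustration of conditioning predictions on a preceding finite context with a fixed upper bound on its length, the sketch below counts next-symbol frequencies for every context up to `max_order` and predicts from the longest context already seen. It is a simplified model under stated assumptions: it omits PPM's escape mechanism and blending, and the names `train_contexts` and `longest_context_prediction` are hypothetical.

```python
from collections import Counter, defaultdict


def train_contexts(text, max_order):
    """Count next-symbol frequencies for each context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(text):
        for order in range(max_order + 1):
            if i - order < 0:
                break  # not enough preceding text for this order
            context = text[i - order:i]
            counts[context][symbol] += 1
    return counts


def longest_context_prediction(counts, history, max_order):
    """Return (context, distribution) for the longest previously seen context."""
    for order in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - order:]
        if context in counts:
            total = sum(counts[context].values())
            return context, {a: n / total for a, n in counts[context].items()}
    return "", {}


counts = train_contexts("abracadabra", max_order=2)
context, distribution = longest_context_prediction(counts, "xa", max_order=2)
```

Here the order-2 context "xa" was never seen, so the prediction falls back to the order-1 context "a". Removing the `max_order` cap would correspond to the unbounded-context case, at the cost of storing every suffix.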
[Article contains additional citation context not shown here]
Bunton, S. 1996. On-line stochastic processes in data compression. Ph.D. thesis, University of Washington.
....adaptive: the counts for each context are updated progressively throughout the text. In this way, the models adapt to the specific statistical properties of the text being compressed. Experiments with English text show that order-5 character-based PPM models perform well [19], although results in [4] suggest that higher orders may perform better using improved blending algorithms. Cleary et al. [6] report on another approach where the length of the context models is unbounded. Performance of these models on English text can be substantially improved by training them on large amounts of text ....
....best. Further experiments using the context-trie-based methods described in section 3.2 show that performance degrades with higher-order models. We have not yet investigated this trade-off with much larger training texts. Also, it is unclear how the addition of the blending mechanisms described in [4] will affect these results. In principle, the character-based models, particularly the unbounded-context PPM models, should be able to perform at least as well as the word models, as they have the same or more information available to them. Clearly more work is needed on computing predictions from ....
Bunton, S. (1996) On-line stochastic processes in data compression. PhD thesis, University of Washington.
....state selection could be meaningfully combined to improve the state of the art in universal on-line modelling. To prove these hypotheses, we set about unifying these and several other suffix-tree-based model families and applying the same optimizations to all of them in the larger work [3] from which much of this paper is excerpted. There were three obstacles to introducing information-theoretic state selection to PPM variants. First, PPM's model structure seemed to be quite distinct from the suffix tree models to which state selection can be applied. Second, the models that use ....
....understanding that is essential for correctly applying these techniques to any universal suffix tree model. Finally, in Section 7, all improvements are measured empirically as compression performance on the Calgary Corpus [6] using our executable taxonomy of on-line sequence modelling algorithms [3], which completely controls all model features in each experiment. 2. SUFFIX TREE MODELS 2.1. Notation and terminology. Broadly speaking, the suffix trees used in on-line string modelling are finite state machines, where the current state has an associated conditioning context string that is a ....
[Article contains additional citation context not shown here]
Bunton, S. (1996) On-Line Stochastic Processes in Data Compression. Ph.D. Thesis, University of Washington, Seattle, WA.
....and thus these techniques, along with other variants of mixtures, are interchangeable. 1 Recursive Mixtures. We are concerned with estimating a probability $P_e(a_i \mid a_1 a_2 \cdots a_{i-1})$ using the frequencies stored at the excited states of a suffix tree FSM (see Chapter 2 of [Bun96]), where the excited states are those states of the FSM whose associated conditioning context partitions contain the sequence $a_1 a_2 \cdots a_{i-1} \in A^*$. At any time, the excited states of a suffix tree FSM are linked, at least abstractly, by an unbroken chain of suffix ....
....$= \mathrm{count}[a, s, u(s)] + k$, where $k$ is the initial frequency value (ideally, $k = 0$) and $k$ is a global constant that remains fixed for the lifetime of the model. Lastly, let the node count function $\mathrm{count} : S \to \mathbb{R}$ be defined as follows: $\mathrm{count}(s) = \sum_{a \,:\, \mathrm{count}[a, s, u(s)] > 0} \mathrm{count}(a, s)$. Chapter 5 of [Bun96] explains that the ability to dynamically select update-excluded frequencies or full-update frequencies on a per-state basis is required for correctly combining mixtures, update exclusion (introduced in [Mof90]), and state selection. Given the above definitions, a simple bottom-up procedure for ....
[Article contains additional citation context not shown here]
S. Bunton. On-Line Stochastic Processes in Data Compression. PhD thesis, University of Washington, December 1996.
....of PPM's transition function is possible only because PPM's model satisfies the Markov property. This flexibility is not a feature of all Markov models with underlying suffix tree structure. We have proved that there exist useful finite-context Markov models that are not FSMX [Bun96, Chapter 4]. Those models allow arbitrarily long extensions to state contexts and include Dynamic Markov Compression (DMC) [CH87] models as a special case. They require explicit destination pointers because their conditioning contexts cannot be described by a single string. 2.6.3 Model Semantics I: ....
....thus these techniques, along with other variants of mixtures, are interchangeable. 6.1 Recursive Mixtures. We are concerned with estimating a probability $P_e(a_i \mid a_1 a_2 \cdots a_{i-1})$ using the frequencies stored at the excited states of a suffix tree FSM (see Chapter 2 of [Bun96]), where the excited states are those states of the FSM whose associated conditioning context partitions contain the sequence $a_1 a_2 \cdots a_{i-1} \in A^*$. At any time, the excited states of a suffix tree FSM are linked, at least abstractly, by an unbroken chain of suffix ....
[Article contains additional citation context not shown here]
S. Bunton. On-Line Stochastic Processes in Data Compression. PhD thesis, University of Washington, December 1996.
....and defines novel techniques, and a taxonomy for describing on-line sequence modeling algorithms precisely and completely in a way that enables meaningful comparison. Full descriptions of the algorithms, and definitions of the technical terms used in this overview, are given in Chapters 2-6 of [Bun96], excerpts of which appear as the companion papers [Bun97b, Bun97a]. 1 Design Philosophy. The executable cross-product and nomenclature described here are used to produce and describe the experimental results of [Bun97b, Bun97a], which evaluate different state selection and probability estimation ....
....precisely describing on-line statistical algorithms. • The symbols accompanying the English names give a terse labeling system that completely describes on-line algorithms. • It explains our software's command line options (see Figure 1). • It synopsizes the modeling concepts presented in [Bun96, Bun97b, Bun97a, Bun97c]. Usage: runDMC [ B(atch) U(decode) b e s v o r P y z s w i m K x G D M c] file : AlphabetBits: the log of the input alphabet size) b 1, 16 (Default = 8) neglogEPSILON: EPSILON = minimum frequency. ....
[Article contains additional citation context not shown here]
S. Bunton. On-Line Stochastic Processes in Data Compression. PhD thesis, University of Washington, December 1996.
....(b) experiments that evaluate the general effectiveness of specific model features by controlling the confounding factors induced by the myriad implementation decisions underlying any empirical evaluation. Indeed, this work is preliminary to such a taxonomy and controlled empirical study [4]. Our analysis reveals that the DMC model, with or without its counterproductive portions, offers abstract structural features not found in other models. Ironically, the space-hungry DMC algorithm actually has a greater capacity for economical model representation than its counterparts have. Once ....
....feasible with a binary alphabet, since $|A|$ out-transitions are created for every new state. Other authors [14, 15, 19] have independently generalized DMC to larger alphabets using variations of what we call lazy cloning, which copies out-transitions only as needed. However, only the solutions in [4, 14] successfully reduced DMC's memory requirements without eliminating DMC's advantages. The others all had a similar approach: when the required out-transition was absent from the current state, the needed transition was copied from a state that was essentially a zero- or first-order state. Our ....
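The idea of copying out-transitions only as needed can be sketched as follows. This is an illustrative toy under stated assumptions, not the scheme of any cited paper: the `State` class, its `origin` field, and the rule of copying from the state a clone was made from are all hypothetical, and frequency counts are left out entirely.

```python
class State:
    """A model state whose out-transitions are filled in on demand."""

    def __init__(self, name, origin=None):
        self.name = name
        self.origin = origin  # state this one was cloned from, if any
        self.out = {}         # symbol -> destination State

    def transition(self, symbol):
        """Return the destination for symbol, copying it from origin if absent."""
        if symbol not in self.out and self.origin is not None:
            target = self.origin.transition(symbol)
            if target is not None:
                # Lazy step: only this one transition is copied, not all |A|.
                self.out[symbol] = target
        return self.out.get(symbol)


root = State("root")
root.out = {"0": root, "1": root}
clone = State("clone", origin=root)   # created with no out-transitions
destination = clone.transition("0")   # copies just the "0" transition
```

An eager clone would allocate one transition per alphabet symbol at creation time; here the clone stores only the transitions that are actually followed.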
[Article contains additional citation context not shown here]
S. Bunton. On-Line Stochastic Processes in Data Compression. PhD thesis, University of Washington, Aug. 1995. (in preparation).
....at the selected state, since the act of selecting a state assumes that the descendants of the selected state do not exist. Thus, disabling update exclusion at the currently selected state is required when it is enabled globally for the modeling algorithm. The dual update mechanism introduced in [Bun96], Chapter 5, provides this ability by maintaining both update-excluded and full-update frequencies at every state. 5 A Percolating State Selection Mechanism. Here we present a dynamic programming solution to the problem of finding the best-performing model frontier without resorting to ....
.... by $s$'s children is incomplete because $L(s) - \bigcup_{t \,:\, \mathrm{suffix}(t) = s} L(t) \neq \{\}$. To complete $s$'s context partition element, we maintain a shadow child of $s$ that maintains a next-symbol frequency distribution that is conditioned by $L(s) - \bigcup_{t \,:\, \mathrm{suffix}(t) = s} L(t)$. As explained in [Bun96], Chapter 5, this is the same frequency distribution that is formed by the update-excluded frequency counts $\mathrm{count}[-, s, 1]$ if maximum-order updates are globally enabled for the modeling algorithm. Alternatively, it is approximated very closely by the update-excluded frequency counts ....
[Article contains additional citation context not shown here]
S. Bunton. On-Line Stochastic Processes in Data Compression. PhD thesis, University of Washington, December 1996.