| B. Ryabko, Prediction of random sequences and universal coding. Problems of Information Transmission, Vol. 24, pp. 87--96, 1988. |
....squared loss. For each $X \in E(\{0,1\})$ the corresponding Bayes scheme is of the form $B(x^{t-1}) = P(X_t = 1 \mid X^{t-1})$. If $F$ is strongly optimal for $E(\{0,1\})$ then it follows from Proposition 2 that for every binary ergodic process $X$, $|F(X^{t-1}) - P(X_t = 1 \mid X^{t-1})| \to 0$ wp1. However, it is known [6, 37, 23] that no such on-line estimation scheme exists. Therefore no prediction scheme is strongly optimal for $E(\{0,1\})$, and by the same reasoning, no prediction scheme is strongly optimal under the squared loss for $E(A)$ if $A$ has cardinality greater than one. A similar negative conclusion holds for ....
B.Y. Ryabko, Prediction of random sequences and universal coding, IEEE Trans. Info. Theory, vol.44, pp.2124-2147, 1988.
....Bayes scheme is of the form $G(x^{t-1}) = P(X_t = 1 \mid X^{t-1} = x^{t-1})$. It follows from Propositions 3 and 4 that if $F$ is a strongly optimal prediction scheme for $E(\{0,1\})$ then $|F(X^{t-1}) - P(X_t = 1 \mid X^{t-1})| \to 0$ wp1 for every binary stationary ergodic process $X$. However, it is known [3, 26, 18] that no such on-line estimation scheme exists. Therefore no prediction scheme is strongly optimal for $E(\{0,1\})$, and more generally, no prediction scheme is strongly optimal under the squared loss for $E(X)$ if $X$ contains more than one element. Concerning Cesàro optimality, more can be said. ....
B.Y. Ryabko. Prediction of random sequences and universal coding, IEEE Trans. Info. Theory, vol.44, pp.2124-2147, 1998.
.... [8], Krichevsky and Trofimov [24], Rissanen and Langdon [25], and others [26-30]. The connections between universal source coding and sequential decision problems with other loss functions can be traced back to the work of Cover [31], Rissanen [10], Rissanen and Langdon [25], Ryabko and others [15, 32, 33]. Following the development of universal source coding algorithms based on Bayesian mixtures and sequential probability assignment [9, 11, 13], we recently developed a related algorithm for universal linear prediction of real-valued data with respect to both finite and continuous classes of ....
.... developed a related algorithm for universal linear prediction of real-valued data with respect to both finite and continuous classes of experts [34-36]. Our approach is based on some of the well-known properties of Bayesian mixture probability models for sequential encoding of binary data [10, 22, 24, 27, 32, 33]. The essential idea behind universal sequential probability assignment [13] is to obtain a probability model for a sequence of data that is almost as good a fit as the best out of a large class of models. Of course this could be achieved by collecting all of the data, testing each of the ....
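The idea of a mixture model that fits almost as well as the best model in a class can be illustrated with a minimal sketch. It is not the algorithm of the citing paper: it assumes a hypothetical finite class of Bernoulli models with parameters `thetas`, a uniform prior, and binary data, and the function names are chosen here for illustration only. The mixture's code length exceeds that of the best model in the class by at most log2 of the class size.

```python
import math

def mixture_code_length(bits, thetas):
    """Code length (in bits) of the uniform-prior Bayesian mixture over Bernoulli models."""
    weights = [1.0 / len(thetas)] * len(thetas)  # posterior weights, uniform at the start
    total = 0.0
    for b in bits:
        # Probability each model assigns to the observed symbol.
        preds = [th if b == 1 else 1.0 - th for th in thetas]
        # Mixture (predictive) probability of the observed symbol.
        p = sum(w * q for w, q in zip(weights, preds))
        total += -math.log2(p)
        # Bayes update of the weights after seeing the symbol.
        weights = [w * q / p for w, q in zip(weights, preds)]
    return total

def best_code_length(bits, thetas):
    """Code length (in bits) of the single best model in the class, chosen in hindsight."""
    def length(th):
        return sum(-math.log2(th if b == 1 else 1.0 - th) for b in bits)
    return min(length(th) for th in thetas)

bits = [1, 1, 0, 1, 1, 1, 0, 1]
thetas = [0.1 * k for k in range(1, 10)]  # hypothetical model class
mix = mixture_code_length(bits, thetas)
best = best_code_length(bits, thetas)
print(mix - best, "<=", math.log2(len(thetas)))
```

The mixture is computed sequentially, so no second pass over the data is needed to pick the best model.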
B. Y. Ryabko, "Prediction of random sequences and universal coding," Prob. Inf. Transmission, vol. 24, pp. 87--96, Apr-June 1988.
....class of stationary data models, there exists at least one universal optimal predictor. Among them, one should look for a universal predictor whose convergence is as fast as possible (cf. [9]). A large body of useful research on universal prediction was done in the last fifty years (cf. [1, 4, 8, 9, 10, 11, 12, 14, 15, 16, 19]). There exist predictors based on arithmetic coding (cf. [14, 15]), Rissanen MDL (cf. [11, 12]), nonparametric universal predictors (cf. [4]), context weighting, and so forth. In this paper, we consider a modified prediction algorithm based on pattern matching that was described in Ehrenfeucht and ....
.... one should look for a universal predictor whose convergence is as fast as possible (cf. [9]). A large body of useful research on universal prediction was done in the last fifty years (cf. [1, 4, 8, 9, 10, 11, 12, 14, 15, 16, 19]). There exist predictors based on arithmetic coding (cf. [14, 15]), Rissanen MDL (cf. [11, 12]), nonparametric universal predictors (cf. [4]), context weighting, and so forth. In this paper, we consider a modified prediction algorithm based on pattern matching that was described in Ehrenfeucht and Mycielski [3]. This predictor seems to be performing well in ....
B. Ryabko, Prediction of Random Sequences and Universal Coding, Problems of Information Transmission, 24, 3--14, 1988.
....We then show that a second mixture over all model orders provides a predictor which is universal with respect to both model orders and parameters. Each of these steps is contained in the proofs of Theorems 2 and 1, in Sections 3 and 4, respectively. The result is a twice universal [1], [2] linear predictor which implements a double mixture over model orders and parameters. This resembles the context tree weighting procedure in [3], which implements a double mixture over the parameters and model orders of context trees used in data compression. Key to the development of such ....
....in the stochastic context, it is not optimal for individual sequences. For this reason, rather than selecting a single set of parameters to use for prediction, we use the mixture approach of universal coding to obtain the universal predictor coefficients. This idea has already been applied in [2] for prediction in a probabilistic context. By transforming the problem into one of probability assignment, we can sequentially assign a probability to the sequence which is almost as good as that assigned by the best linear predictor. As such, we consider a means of estimating the parameters of ....
B. Y. Ryabko, "Prediction of random sequences and universal coding," Prob. Inf. Transmission, vol. 24, pp. 87--96, Apr-June 1988.
....algorithm, that is not tuned to the data in advance, yet for any bounded sequence $x_1^n$ it asymptotically attains the average square error achieved by the best linear predictor up to some order $M$. The universal predictor we present is a twice universal [2], [3] linear predictor, i.e., it is universal with respect to the choice of the parameters $a_1, \ldots, a_p$ and the choice of $p$. As in many prediction problems, we transform the prediction problem into one of probability assignment, where in the case of linear prediction with square error loss, each ....
.... of these algorithms can be made to have $O(M)$ operations per time sample, which results in a total complexity of $O(Mn)$. An example lattice prediction algorithm is given in [4]. 5 Concluding Remarks. The main result of this paper, stated in Theorem 1, is an algorithm which is twice universal [2], [3] for linear prediction with respect to model orders and parameters. The universal predictor presented in this paper will perform as well as the best linear predictor of any order up to some maximum order, uniformly, for every individual sequence. With this algorithm, the problems of model order ....
B. Y. Ryabko, "Prediction of random sequences and universal coding," Prob. Inf. Transmission, vol. 24, pp. 87--96, Apr-June 1988.
....function $m_\infty$ does not have a finite memory. We say that the (same) estimator $m_N$ is consistent if it converges to $m_\infty$ in the sense of integrated mean-squared error. Our notion of memory universality is inspired by a similar notion in the theory of universal coding; see, for example, Ryabko [43, 44]. Roughly speaking, memory universal estimators implicitly discover the true unknown memory $q$. As an important aside, we point out that our notion of memory universality is distinct from the notion of universal consistency traditionally considered in the nonparametric estimation literature ....
.... the same as estimating the corresponding conditional distribution of $X_0$ given the entire infinite history $X_{(-\infty, -1)}$. The latter problem, owing to its applications in data compression, has received wide attention; for example, see Algoet [2], Cover [18], Rissanen [37, 38], and Ryabko [43, 44]. Our work fundamentally differs from the existing body of work for binary-valued processes, in that for binary-valued processes each element of the sequence $\{m_p\}_{p \ge 1}$ is finitely parameterized, while for real-valued processes considered here the elements of the sequence are not finitely ....
B. Y. Ryabko, "Prediction of random sequences and universal coding," Problems in Information Transmission, vol. 24, pp. 87-96, Apr.-June 1988.
....of model weighting techniques. An advantage of weighting procedures is that they perform well not only on the average but for each individual sequence. Model weighting (twice universal coding) is not new. It was first suggested by Ryabko [13] for the class of finite order Markov sources (see also [14] for a similar approach to prediction). The known literature on model weighting resulted, however, in probability assignments that require complicated sequential updating procedures. Instead of finding implementable coding methods, one concentrated on achieving low redundancies. In what follows we ....
B.Ya. Ryabko, "Prediction of Random Sequences and Universal Coding," Problems Inform. Transm., vol. 24, No. 2, pp. 3-14, April-June 1988.
.... $\hat{q}_i(a \mid x_1, \ldots, x_{i-1}) = (i-1)^{-1} \sum_{j=1}^{i-1} q_j(a \mid x_{i-j}, \ldots, x_{i-1})$. Consider the predictor defined by $\hat{X}_i \triangleq \arg\max_{a \in A} \hat{q}_i(a \mid X_1, \ldots, X_{i-1})$. Then this predictor is a universal predictor for the class M. EXAMPLE. Ryabko [14], [15] discovered a universal predictor for the class of stationary data models which he constructed from an arithmetic encoder. We shall not present Ryabko's predictor here, because it is too complicated. However, we do present the results of Ryabko's case study, in which he compared the ....
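The averaging-then-arg-max construction in the excerpt above can be sketched as follows. The excerpt does not say how the order-j estimates q_j are defined, so this sketch assumes a hypothetical choice: the add-one (Laplace) smoothed frequency of a symbol following the most recent j symbols. All function names are illustrative.

```python
from collections import Counter

def q(j, a, past, alphabet):
    """Hypothetical order-j estimate: smoothed frequency of `a` after the last j symbols."""
    ctx = tuple(past[-j:])
    counts = Counter()
    # Count which symbol followed each earlier occurrence of the current context.
    for t in range(j, len(past)):
        if tuple(past[t - j:t]) == ctx:
            counts[past[t]] += 1
    total = sum(counts.values())
    return (counts[a] + 1) / (total + len(alphabet))

def averaged_estimate(a, past, alphabet):
    """Average of the order-j estimates over j = 1, ..., len(past)."""
    n = len(past)
    if n == 0:
        return 1 / len(alphabet)  # no history yet: uniform guess
    return sum(q(j, a, past, alphabet) for j in range(1, n + 1)) / n

def predict(past, alphabet):
    """Predict the symbol that maximizes the averaged estimate."""
    return max(alphabet, key=lambda a: averaged_estimate(a, past, alphabet))

print(predict([0, 1, 0, 1, 0, 1, 0, 1], [0, 1]))
```

Each order-j estimate is a probability distribution over the alphabet, so their average is one as well; the arg max then picks its mode.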
B. Ryabko, "Prediction of random sequences and universal coding," Problemy Peredachi Informatsii, vol. 24, pp. 3-14, 1988.
....such a measure exists for the difference and M = SE, and Bailey [3] solved the problem in 2 log x denotes the natural logarithm of x. the negative. Instead, it was proved that (1=n) P n01 t=0 ae( x n 1 jx n 1 ) x n 1 jx n 1 ) 0 a.s. as n 1 for the difference [3] and for the ratio [18], respectively. On the other hand, Ornstein [15] solved the less demanding problem which Cover raised as another open problem [5] for any x 0 2 f0; 1g, x 0 jx 01 0n ) 0 (x 0 jx 01 0n ) 0 a.s. as n 1 w.r.t. any 2 SE. For a more general case, see Algoet [1] First, we show that universal ....
....2, based on the cutting and stacking technique [20], Bailey [3] constructed a counterexample for binary sequences and derived why the measure violates criterion 4 for the difference. Recently, for the same problem, Györfi, Morvai, and Yakowitz [10] gave a simplified derivation. Ryabko [18], on the other hand, constructed another counterexample for ternary sequences, and gave an intuitive explanation of why the measure violates criterion 4 for the ratio, without presenting a mathematical derivation of the violation. Algoet [2] completed Ryabko's proof, and also showed ....
B. Y. Ryabko, "Prediction of random sequence and universal coding", Problemy Peredachi Informatsii, Vol. 24, pp. 3-14, April-June, 1988.
B. Ryabko. Prediction of random sequences and universal coding. Problems of Information Transmission, 24:87--96, 1988.
B. Ryabko, "Prediction of random sequences and universal coding," Problems of Information Transmission, vol. 24, pp. 87--96, April/June 1988.
B. Ya. Ryabko, "Prediction of random sequences and universal coding," Problems of Inform. Trans., vol. 24, no. 2, pp. 87--96, Apr-June 1988.