| Rissanen, J., "Hypothesis Selection and Testing by the MDL Principle", The Computer Journal, vol. 42, pp. 260-269, 1999. |
....conditional parametric complexity. The structure of the first part varies in how the region for Y is specified. For example, in [1] the region is defined as Y with the prefix encoding R or perhaps log R. Alternatively, one might constrain Y on a standardized scale as Y r n as in [9]. Taking a different approach, one can follow the logic leading to the NML density and perform a further normalization over the parameter space [10]. Rather than consider various means of incorporating information about the parameter space # [a,b] directly into the code, we instead consider a ....
Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle. The Computer Journal, 42, 260--269.
....the data. These are conceptually two very different things. The log likelihood function will be evaluated for different values of theta (e.g. numerical integration of parameters; simulation; optimisation: maximum likelihood, maximum posterior, minimum message length [7], minimum description length [6]), whereas we will generally evaluate the log-probability function for varied values of the data (e.g. numerical integration; optimisation: mode; moments; numerical entropy). The log likelihood function is used for inference, and if time complexity is of any importance then it needs to evaluate ....
J. J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4):260--269, 1999.
....are computationally intensive. Other methods, which sacrifice some accuracy for efficiency, are the penalized likelihood approaches, where the log likelihood term is penalized by subtraction of a complexity term. We use such a method that tries to find a model with Minimum Description Length (MDL) [15]. Assuming all the models are equally likely a priori we can write: $\log P(M \mid O) \approx \log P(O \mid M, \hat{\theta}) - \log L$, (16) where $P(M \mid O)$ is the approximate posterior distribution on M, $P(O \mid M, \hat{\theta})$ is the data likelihood term (Eq. 7 for k-means and Eq. 8 for EM) given the ML estimates $\hat{\theta}$, and log L ....
J. Rissanen. Hypothesis selection and testing by the MDL principle. The Computer Journal, 42(4):260--269, 1999.
....more accurate than the other criteria, but has improved a seemingly disproportionate amount over its results for S 0 and S 1. Table 3 shows the average number of inferred cuts for data set S 2. None of the criteria appear to be excessively overfitting. We also note that MDL has been refined [14] since the 1978 MDL paper [12]. For a general comparison between MDL and MML, see, e.g. [14, 19, 20] and other articles in that special issue of the Computer Journal. Table 4 shows the average Kullback-Leibler (KL) distances and standard deviations for data set S 2. The KL distance means and ....
....its results for S 0 and S 1. Table 3 shows the average number of inferred cuts for data set S 2. None of the criteria appear to be excessively overfitting. We also note that MDL has been refined [14] since the 1978 MDL paper [12]. For a general comparison between MDL and MML, see, e.g. [14, 19, 20] and other articles in that special issue of the Computer Journal. Table 4 shows the average Kullback-Leibler (KL) distances and standard deviations for data set S 2. The KL distance means and standard deviations for MML I are consistent for all sample sizes and are overall best, performing ....
J. J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Jrnl., 42(4):260--269, 1999.
.... length is a quantification of the trade-off between model complexity and goodness of fit that was first described by [10]. Since their seminal paper various approximations and derivations have appeared under the veil of Minimum Message Length (MML) [12, 11] or Minimum Description Length (MDL) [6]. MML87 [12] has recently been applied to the problem of model selection in univariate polynomial regression with normally distributed noise by [8]. It was compared against Generalised Cross Validation (GCV), Finite Prediction Error (FPE), Schwartz's Criterion (SCH) and VC Dimension and ....
J. J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4):260--269, 1999.
....that x is a function of y. Suppose we want to choose the q(y) which minimizes the worst-case regret: $\min_q \max_y \ln \frac{p_l(y \mid x(y))}{q(y)}$ (A.10). As cited in [169], Shtarkov [170] showed that the solution to this minimax problem is $q(y) = \frac{p_l(y \mid x(y))}{\int_Y p_l(\tilde{y} \mid x(\tilde{y}))\, d\tilde{y}}$ (A.11). Rissanen [171, 169] calls this the normalized maximum likelihood density, and the length of the resulting code, $-L(y \mid x(y)) + \ln \int_Y p_l(\tilde{y} \mid x(\tilde{y}))\, d\tilde{y}$ (A.12), the stochastic complexity of the data y under the parametric model $p_l$. The second term is referred to as the parametric complexity, since it indicates ....
J. Rissanen, "Hypothesis selection and testing by the MDL principle," The Computer Journal, 1998, invited to the special issue devoted to Kolmogorov complexity.
....is a function of y. Suppose we want to choose the q(y) which minimizes the worst-case regret: $\min_q \max_y \ln \frac{p_l(y \mid x(y))}{q(y)}$ (62). As cited in (Barron et al. 1998), Shtarkov (1997) showed that the solution to this minimax problem is $q(y) = \frac{p_l(y \mid x(y))}{\int_Y p_l(\tilde{y} \mid x(\tilde{y}))\, d\tilde{y}}$ (63). Rissanen (1998) calls this the normalized maximum likelihood density, and the length of the resulting code, $-L(y \mid x(y)) + \ln \int_Y p_l(\tilde{y} \mid x(\tilde{y}))\, d\tilde{y}$ (64), the stochastic complexity of the data y under the parametric model $p_l$. The second term is referred to as the parametric complexity, since it indicates ....
Rissanen, J. (1998). Hypothesis selection and testing by the MDL principle. The Computer Journal. To appear in the special issue devoted to Kolmogorov complexity.
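The normalized maximum likelihood construction quoted in the two contexts above can be made concrete with a small sketch. The Python code below works it out for a Bernoulli model over binary sequences, where the integral over Y becomes a finite sum. The Bernoulli model and every function name here are assumptions chosen for illustration; they are not taken from Rissanen's paper or from the citing articles, whose setting is a conditional density over a continuous region Y.

```python
import math


def max_likelihood(k: int, n: int) -> float:
    """Maximized Bernoulli likelihood of a binary sequence with k ones out of n.

    The ML estimate of the success probability is k / n.
    (Illustrative helper; not from the cited paper.)
    """
    if k == 0 or k == n:
        return 1.0  # limit of p**k * (1 - p)**(n - k) at p = 0 or p = 1
    p = k / n
    return p ** k * (1 - p) ** (n - k)


def parametric_complexity(n: int) -> float:
    """Log of the normalizer: maximized likelihood summed over all 2**n sequences.

    Sequences with the same count k share the same maximized likelihood,
    so the sum is grouped by k with binomial multiplicities.
    """
    total = sum(math.comb(n, k) * max_likelihood(k, n) for k in range(n + 1))
    return math.log(total)


def stochastic_complexity(k: int, n: int) -> float:
    """NML code length in nats: negative maximized log likelihood plus complexity."""
    return -math.log(max_likelihood(k, n)) + parametric_complexity(n)


def nml_probability(k: int, n: int) -> float:
    """NML probability assigned to one particular sequence with k ones."""
    return math.exp(-stochastic_complexity(k, n))


if __name__ == "__main__":
    n = 10
    for k in (0, 3, 5):
        print(k, stochastic_complexity(k, n))
```

Summing `nml_probability` over all sequences of a given length gives one, which is what the normalization in the quoted formula is meant to guarantee; the second term of `stochastic_complexity` depends only on n and the model class, not on the data.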
No context found.
Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle.
No context found.
Rissanen, J. (1999) Hypothesis selection and testing by the MDL principle. Comput. J., 42, 260--269.
....Even for such games we can introduce some variant of predictive complexity (it will not, however, be defined to within an additive constant any more; in the case of the absolute loss game we will be able to get away with to within a logarithm) and fruitfully apply CAP. Analogously to Rissanen [18], we can generalize CAP to choosing a pool of strategies rather than one strategy. An important task now is to compare, theoretically and experimentally, CAP with other model selection principles (see [19]). CAP is an active field of research at the Computer Learning Research Centre and the ....
Rissanen, J. (1999) Hypothesis selection and testing by the MDL principle. Comput. J., 42, 260--269.
....of the normalization procedure done by Dom [3], which, in turn, sharpens the asymptotic formula in [9] that is applicable to more general parametric model classes. The resulting decomposition is similar to Kolmogorov's sufficient statistics in the algorithmic theory of information [2], [1], [10], [11], and it will also be seen to extend the usual sufficient statistics decomposition of parametric likelihood functions of exponential type. Because the NML criterion involves the sum of the squares of both the residuals, which define the ML estimate of the noise variance, and the constructed ....
....value being 0. This means that we should not compare f (x; fl; 0 ; R) as obtained from (15) for k = 1, with (16) but we should recompute it for k = 1 for a small value of R, which will increase it and make it more competitive against (16) Much as in hypothesis testing with the NML criterion, [11], an appropriate value for the range is obtained with R = c 2 = n Gamma 1) where c is a positive constant to be determined presently. Then writing the NML density function for k = 1 as f (x; 1) we get with the same technique as above the simple result f (x; 0) f (x; 1) 2cSn Gamma1 ....
[Article contains additional citation context not shown here]
Rissanen, J. (1998), `Hypothesis Selection and Testing by the MDL Principle', invited paper to The Computer Journal, Vol. 42, Nr 4, pp 260-269
....of the normalization procedure done by Dom [3], which, in turn, sharpens the asymptotic formula in [9] applicable to more general parametric model classes. The resulting decomposition is similar to Kolmogorov's sufficient statistics in the algorithmic theory of information [2], [1], [10], [11], and it will also be seen to extend the usual sufficient statistics decomposition of parametric likelihood functions of exponential type. Because the NML criterion involves the sum of the squares of both the residuals, which define the ML estimate of the noise variance, and the constructed ....
.... Gamma( n Gamma k 2 ) Gamma ln Gamma( k 2 ) ln 4 k 2 n 2 ln(n) 15) We wish to get rid of the two parameters R and 0 , which clearly affect the criterion in an essential manner, or rather we replace them with other parameters which do not influence the relevant criterion. In [11] and [6] this was done simply by setting the two parameters to the values that minimize (15) R = R, and 0 = where R = n Gamma1 fi 0 (x) Sigma fi(x) However, the resulting f(x; fl; x) R(x) is not a density function. We can of course correct this by multiplying it by a ....
[Article contains additional citation context not shown here]
Rissanen, J. (1998), `Hypothesis Selection and Testing by the MDL Principle', invited paper to The Computer Journal (to appear)
No context found.
Rissanen, J., "Hypothesis Selection and Testing by the MDL Principle", The Computer Journal, vol. 42, pp. 260-269, 1999.
No context found.
J. Rissanen. Hypothesis selection and testing by the MDL principle. The Computer Journal, 42:260--269, 1999.
No context found.
J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4):260--269, 1999.
No context found.
J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4):260--269, 1999.
No context found.
Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle.
No context found.
J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4):260--269, 1999.
No context found.
Jorma Rissanen. Hypothesis selection and testing by the MDL principle. The Computer Journal, 42:260--269, 1999.
No context found.
Rissanen, J., "Hypothesis Selection and Testing by the MDL Principle", The Computer Journal, vol. 42, pp. 260-269, 1999.
No context found.
Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4), 260--269.
No context found.
Rissanen, J. (1999), "Hypothesis selection and testing by the MDL principle", Computer Journal, 42 (4), 260-269.
No context found.
Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle. Computer Journal, 42(4), 260--269.
No context found.
J. J. Rissanen. Hypothesis selection and testing by the MDL principle. Computer Jrnl., 42(4):260-269, 1999.