55 citations found.
J. Rissanen. Modelling by shortest data description. Automatica, volume 14, pp. 465--471, 1978.

This paper is cited in the following contexts:

Discriminative, Generative and Imitative Learning - Jebara (2002)

....so forth. Section 2.4 will discuss techniques for learning from data techniques in discriminative approaches. 2.2 Generative Learning There are many variations for learning generative models from data. These many approaches, priors and model selection criteria include minimum description length [163], Bayesian information criterion, Akaike information criterion, entropic priors, and so on, and a survey is beyond the scope of this thesis. We will instead quickly discuss the popular classical approaches that include Bayesian inference, maximum a posteriori and maximum likelihood estimation. ....

J Rissanen. Modelling by the shortest data description. Automatica, 14, 1978.


AND/OR Trees for the Learning of Functional Logic.. - Ferri-Ramírez.. (2001)

....for the induction of functional logic languages defined in [6], making it conditional and capable of inducing functions with high arity more efficiently. The proposed decision tree algorithm follows a short-to-long search. The split criterion is based on the Minimum Description Length (MDL) principle [18]. It differs from other approaches whose quality criteria are based on discrimination, as in [1, 17]. It also differs in the way the search space (an AND/OR tree) is traversed, producing an increasing number of solutions for increasing provided time. The paper is organised as follows. In Section 2, ....

J. Rissanen. Modelling by the shortest data description. Automatica-J.IFAC, 14:465-471, 1978.


Bayesian Inference for Reliable Biomedical Signal Processing - Sykacek (2000)   (1 citation)

....The Bayesian methodology, however, is the only one that makes the prior explicit. An example is model selection: whatever method we look at, there is always a best-fit versus complexity trade-off. That is, Akaike's information criterion (AIC) (see [Aka74]), minimum description length (MDL) (see [Ris78]), minimum message length (MML) (see [WF87]) or likelihood ratio tests as used in frequentist statistics (see [SS71]) all use a prior that prefers simple models. The same is true for statistical learning theory, summarized in [Vap95], where model complexity is constrained by minimizing the VC ....

J. Rissanen. Modelling by shortest data description. Automatica, 14:465-471, 1978.


Induction of decision multi-trees using Levin search - Ferri-Ramírez..

....equations of the form $f(X_1, \ldots, X_n) = Y$, we consider the function result Y as another attribute to be tested. All this extends the kind of tests performed in classical decision tree induction approaches. Our method uses a heuristic based on the Minimum Description Length (MDL) principle [21]. Hence, the decision tree is built in a short-to-long way. The MDL principle has been previously used in the induction of decision trees, but only within the post-pruning phase [11, 20]. Also, the MDL principle has been used as a stopping criterion (pre-pruning) [17], as a measure for globally ....

J. Rissanen. Modelling by shortest data description. Automatica, 14:465-471, 1978.


A Formal Definition of Intelligence Based on an.. - Hernandez-Orallo, al. (1998)

....of the proposals was the following one [Chaitin 1982]: f) Develop formal definitions of intelligence and measures of its various components; apply information theory and complexity theory to AI. The second part of this claim began some time ago [Wallace and Boulton 1968], but it was Rissanen's MDL [Rissanen 1978] which promoted its current use. Later, another parent of algorithmic information theory [Solomonoff 1986] proposed explicitly to address AI directly using algorithmic probability. We have not found (to our limited bibliographical knowledge) any work related to the first part of the claim (except ....

....sentences as t_2 and has fewer false observational sentences (exceptions). But algorithmic complexity K(x) is an objective criterion for simplicity. This is precisely what R. J. Solomonoff proposed as a perfect theory of induction, in the words of [Li and Vitányi 1997]. Algorithmic complexity inspired J. Rissanen in 1978 to use it as a general modelling method, giving the popular MDL principle [Rissanen 1978]. Minimum Description Length (MDL) principle: The best theory to explain a set of data is the one which minimises the sum of: the length, in bits, of the description of the theory; and the length, in bits, ....
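The two-part sum in the MDL statement quoted above can be illustrated with a minimal sketch. It is not taken from any of the cited papers: the "theory" is assumed to be a Bernoulli parameter stated to a chosen number of bits, and the function names `two_part_code_length` and `best_precision` are hypothetical. The data must be a non-empty list of 0/1 values.

```python
import math

def two_part_code_length(data, precision_bits):
    """Bits to state the theory plus bits to encode the data given the theory.

    Assumed theory: a Bernoulli parameter p quantized to `precision_bits` bits.
    With 0 bits the only available theory is p = 0.5 (one bit per symbol).
    """
    n = len(data)
    ones = sum(data)
    levels = 2 ** precision_bits
    # Quantize the maximum-likelihood estimate to a cell midpoint,
    # so p is never exactly 0 or 1 and the logarithms stay finite.
    cell = min(int(ones / n * levels), levels - 1)
    p = (cell + 0.5) / levels
    theory_bits = precision_bits
    data_bits = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
    return theory_bits + data_bits

def best_precision(data, max_bits=16):
    """Pick the parameter precision that minimises the total description length."""
    return min(range(max_bits + 1), key=lambda b: two_part_code_length(data, b))

skewed = [1] * 90 + [0] * 10
b = best_precision(skewed)
print(b, two_part_code_length(skewed, b))
```

For the skewed sequence, spending a few bits on the theory shortens the total description; for a balanced sequence the zero-bit theory is already the shortest, so no extra precision is selected.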

[Article contains additional citation context not shown here]

Rissanen, J. "Modelling by the shortest data description" Automatica-J.IFAC, 14:465-471, 1978.


Data Driven Gesture Model Acquisition Using Minimum.. - Walter, Psarrou, Gong

....such methods is that they require a validation set, which is often not available. Alternative approaches to determine the number of clusters are based on information criteria, such as An Information Criterion (AIC) [1], the Bayesian Information Criterion (BIC) [14] and Minimum Description Length (MDL) [11]. In the following sections we show how MDL can be used to automatically segment the components within gesture space into clusters that correspond to atomic gestures, without any a priori knowledge of the number of atomic components present. We extract high level knowledge on gestures based on the ....

....clusters, known as the problem of model order selection, and the estimation of the model parameters v. The problem of model order selection has been widely studied in the literature (see [4] for a review). Heuristic methods have been proposed by Akaike [1], Schwarz [14] and Rissanen [11], who respectively proposed An Information Criterion (AIC), the Bayesian Information Criterion (BIC) and Minimum Description Length (MDL). These methods are heuristic in the sense that they do not minimise an error function between the estimated and the true model order. Instead these methods define ....

[Article contains additional citation context not shown here]

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


Minimum Message Length and Kolmogorov Complexity - Wallace, Dowe (1999)   (18 citations)

....UTM, but universality guarantees that different choices of UTM will affect the odds between future events only by a factor with bounds independent of S, E1 and E2. The third stream was introduced by Wallace and Boulton [7, 8, 9, 10, 11, 12] with a similar but independent development by Rissanen [3]. Unlike the other streams, its basis is Shannon's theory of information rather than the theory of Turing machines. Like the second stream, it regards the given string S as being a representation, in some code, of data about the real world. We now seek a string H A where the first part H ....

....not (see Section 7). For computable h(·) and f(·), exactly the same SMML construction can be derived from the aim of producing Bayesian point estimates of high posterior probability, without any reference to information or AC theory [8]. 6.2. Minimum description length. The MDL development [3, 16] differs in some respects from the MML approach, although in practical applications, similar results are usually obtained. First, in MDL the stated aim is most usually to select not a single fully specified hypothesis, but rather to select a parametrized family (called a model class). For ....

Rissanen, J. J. (1978) Modelling by shortest data description. Automatica, 14, 465--471.


Converting A Trained Neural Network To A Decision Tree - Dectext -.. - Boz (2000)

....C4.5 a continuous valued feature may be used again down the tree. To find multiple cut points, the method explained above is applied recursively to each of the subsets found. For stopping, Fayyad and Irani use the Minimum Description Length Principle [Quinlan and Rivest, 1989, Rissanen, 1986, Rissanen, 1978]. A new cut point will be created for a subset if:

$\mathrm{Gain}(S) > \frac{\log_2(N - 1)}{N} + \frac{\Delta(S)}{N}$ (2.24)
$\mathrm{Gain}(S) = E(S) - E_p(S)$ (2.25)
$\Delta(S) = \log_2(3^k - 2) - [k E(S) - k_1 E(S_{\mathrm{left}}) - k_2 E(S_{\mathrm{right}})]$ (2.26)

2.2.7 Softening Thresholds for Continuous Valued Features. For continuous feature values the threshold ....
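The stopping rule quoted above can be sketched in a few lines. This follows the commonly stated form of the Fayyad and Irani criterion, assuming E(S) is the class entropy in bits, E_p(S) is the size-weighted entropy of the two subsets, and k, k_1, k_2 count the distinct classes in S and in each side. The function names are hypothetical, and both subsets are assumed non-empty.

```python
import math
from collections import Counter

def entropy(labels):
    """Class entropy of a list of labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mdlp_accepts_cut(labels, left, right):
    """Return True if the MDL criterion accepts splitting `labels` into left/right."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    e, e1, e2 = entropy(labels), entropy(left), entropy(right)
    # Information gain of the cut: E(S) minus the weighted entropy of the partition.
    gain = e - (len(left) / n) * e1 - (len(right) / n) * e2
    # Correction term accounting for the cost of describing the class sets.
    delta = math.log2(3 ** k - 2) - (k * e - k1 * e1 - k2 * e2)
    return gain > (math.log2(n - 1) + delta) / n

pure_left, pure_right = ["a"] * 10, ["b"] * 10
print(mdlp_accepts_cut(pure_left + pure_right, pure_left, pure_right))
```

A cut that separates the classes cleanly is accepted, while a cut that leaves both sides with the same class mix has zero gain and is rejected.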

Rissanen, J. (1978). Modelling by shortest data description. Automatica, 14:465-471.


Estimating the number of layers in a distribution.. - Thomas El-Maraghi..

....methods of model selection have been proposed in the computer vision literature (see [3] for a survey). Here, we will investigate two techniques, which we shall show to be closely related. They are the Bayesian evidence framework [2] [11] and the minimum description length principle (MDL) [2] [6] [15]. The primary criteria for evaluating these techniques will be how well they estimate the number of layers in a distribution. Before continuing, it should be noted that it is not sufficient to simply select the model that yields the highest likelihood, because it is always possible to increase the ....

J. Rissanen, Modelling by shortest data description, Automatica, 14, pp. 465-471, 1978.


An Information-Theoretic External Cluster-Validity Measure - Dom (2001)   (10 citations)

.... (11) To actually use the encoding scheme implicit in this we would need an additional term of log n bits to encode |C|, but we omit it because it is a constant for a given ground truth set. This (11) can be seen as an application of the minimum description length principle (MDL) [15, 16]. Note that this measure is not symmetric with respect to C and K, which is reasonable. The status of these two sets of labels is not equivalent in this context. 4.2.2. Extreme Cases. The following list enumerates several extreme cases of our measure. 1. Perfect clustering: The minimum possible ....

....Information theory has, since its inception in 1948 [17], clearly demonstrated the viability of code length as a measure of information content. The subsequent development of the theory of algorithmic complexity [12] extended these ideas and ultimately led to the minimum description length principle [15, 16], which distilled the essence of these and extended them further. For this reason we feel that our measure, which embodies these principles, is superior to the other measures discussed here, which we consider to be more heuristic in nature. This is, of course, a philosophical argument. In support ....

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


NewsWeeder: Learning to Filter Netnews - Lang (1995)   (221 citations)

....measure to provide the rating prediction, we felt this would not fully take advantage of the gradations of relevance feedback obtained from the user. 2.4 MINIMUM DESCRIPTION LENGTH (MDL). The alternative machine learning technique we compare to tf-idf weighting is based on the MDL principle [Rissanen, 78]. The MDL principle provides an information theoretic framework for balancing the tradeoff between model complexity and training error. In NewsWeeder's domain, this tradeoff involves how to weight each token's importance and how to decide which tokens should be left out of the model for not ....

J. Rissanen. Modelling by Shortest Data Description, Automatica, 14:465-471, 1978


Simplicity, Psychological Plausibility and Connectionism in.. - Ellison

....size of the data expressed in the terms of that grammar. This statement draws a connection between simplicity and the twin measures of hypothesis suitability, Minimum Message Length (MML) and Minimum Description Length (MDL). MML (Wallace and Boulton 1968, Wallace and Freeman 1987) and MDL (Rissanen 1978, 1987) are similar techniques for evaluating hypotheses in the face of data. Although they are definitionally distinct, adding a stipulation to MDL makes it equivalent in effect to MML. Li and Vitanyi (1987, 1993) have shown these techniques to have close formal ties with the philosophical ....

Rissanen, J. (1978), `Modelling by shortest data description', Automatica 14, 465--471.


A Kolmogorov Complexity-based Genetic Programming.. - De Falco.. (2000)   (1 citation)

.... by Kolmogorov with important later developments by Martin-Löf [Lof66] and Chaitin [Cha66]. The second stream (chronologically the first) springs from the work of Solomonoff, while the third was introduced by Wallace and Boulton [Wal68] with a similar but independent development by Rissanen [Ris78]. The motivations behind their work were completely different: for example Solomonoff, on the one hand, was interested in inductive inference and artificial intelligence, with reference to the problem of sequential prediction and to unordered data prediction. Kolmogorov, on the other hand, was ....

J.J. Rissanen (1978). Modelling by shortest data description. Automatica 14:465-471.


A Review of the Parameter Estimation Problem of Fitting.. - Petersson, Holmström (1997)

....to a minimized variance (and a weighted least squares with ML weights). The term $U_n$ should measure the complexity of the model. How to choose $U_n$ is subjective. Ljung [52, page 421] has two suggestions, either the Akaike criterion [3, 4]

$U_n(M) = \frac{\dim\theta}{n}$ (18)

or the Rissanen criterion [74]

$U_n(M) = \frac{\log n}{n} \dim\theta$ (19)

The goal of the Akaike criterion is to find a system description that gives the smallest mean square error. The goal of the Rissanen criterion is to achieve the shortest possible ....

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


Action-Reaction Learning: Analysis and Synthesis of Human Behaviour - Jebara (1998)   (4 citations)

....well to a desired solution. To summarize, the integrated system still has some subtleties that need to be addressed and is not a black box. There are some parameters that influence its efficiency, complexity, effectiveness, etc. and there are some principled ways to address these (refer to [5] [50]). Chapter 9: Interaction and Results. Having discussed all the components and their integration into a complete system, we illustrate a practical application of ARL. The system is used to acquire interactive behaviour in a constrained scenario and results are presented for evaluation purposes. ....

J. Rissanen. Modelling by the shortest data description. Automatica, 1978.


Analysis of Different Model Selection Criteria - Kverh, Pajdla (1999)

....of the model m. By applying a natural logarithm to Eq. (2) and then multiplying it by -2 we obtain the following formula for the BIC information criterion: $\mathrm{BIC}(\theta_m) = -2 \log L(\theta_m) + d_m \log n$ (3). 1.2 Model selection based on MDL principle. Model selection criteria based on the MDL principle [6] try to minimize the number of bits needed to encode the observed data using model m: $\mathrm{len}_m = \mathrm{len}(e) + \mathrm{len}(\theta_m)$ (4). Knowing the model parameters and the residuals of data points, we are able to reconstruct the original data. Function len calculates the length of the code needed to encode ....
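As an illustration of the BIC formula in this excerpt, the sketch below scores two Gaussian models of the same sample. The models, the data and the helper names `gaussian_log_likelihood` and `bic` are assumptions made for the example, not taken from the cited paper; the lower score is preferred.

```python
import math

def gaussian_log_likelihood(xs, mean, var):
    """Natural-log likelihood of xs under a Gaussian with the given mean and variance."""
    n = len(xs)
    squared_error = sum((x - mean) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)

def bic(log_likelihood, num_params, n):
    """BIC = -2 log L + d log n, where d is the number of free parameters."""
    return -2.0 * log_likelihood + num_params * math.log(n)

xs = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0]
n = len(xs)

# Model A (assumed): mean fixed at 0, only the variance is fitted (d = 1).
var_a = sum(x ** 2 for x in xs) / n
bic_a = bic(gaussian_log_likelihood(xs, 0.0, var_a), 1, n)

# Model B (assumed): mean and variance both fitted (d = 2).
mean_b = sum(xs) / n
var_b = sum((x - mean_b) ** 2 for x in xs) / n
bic_b = bic(gaussian_log_likelihood(xs, mean_b, var_b), 2, n)

print(bic_a, bic_b)
```

Here the extra parameter of model B costs log n, but the gain in likelihood is far larger, so model B receives the lower BIC.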

J. Rissanen. Modelling by shortest data description. Automatica, 14:465-471, 1978.


MDL Summarization with Holes - Bu, Lakshmanan, Ng (2005)

No context found.

J. Rissanen. Modelling by shortest data description. Automatica, volume 14, pp. 465--471, 1978.


The Evolution Of Genetic Representations And Modular Adaptation - Toussaint (2003)   (4 citations)

No context found.

Rissanen, J. (1978). Modelling by shortest data description. Automatica 14, 465--471.


How Does Lexical Acquisition Begin? A Cognitive Perspective - Kit (2003)

No context found.

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


Data Driven Gesture Model Acquisition Using Minimum.. - Walter, Psarrou, Gong (2001)

No context found.

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


Message Length Estimators, Probabilistic Sampling and Optimal.. - Davidson, Yin

No context found.

Rissanen, J.J, Modelling by Shortest Data Description, Automatica, 14, pp. 465-471, 1978.


Appendix A - Experimental Details This

No context found.

Rissanen, J. Modelling by shortest data description. Automatica 14 (1978), 465-471.


Oiling the Wheels of Change: The Role of Adaptive.. - Abbass, Sastry, Goldberg (2004)

No context found.

J. J. Rissanen, "Modelling by shortest data description," Automatica, vol. 14, pp. 465--471, 1978.



Identifying Parameters and Model Order for Two Classes of.. - Petersson, Holmström (1998)

No context found.

J. Rissanen. Modelling by shortest data description. Automatica, 14:465--471, 1978.


CiteSeer.IST - Copyright Penn State and NEC