5 citations found.
David J.C. MacKay. Bayesian Modeling and Neural Networks. PhD thesis, Dept. of Computation and Neural Systems, CalTech, 1992.


This paper is cited in the following contexts:
Bayesian Approaches to Segmenting a Simple Time Series - Oliver, Forbes (1997)   (3 citations)

....C nits. Hence the message length is approximately MessLen(y v) ≈ -log Prob(y v) + (d/2) C (11). 3.5 Why is the MML Different from the Bayes Factor Approach? A number of authors who advocate the Bayes factors (or Evidence) approach have suggested that MDL is equivalent to Bayes factors [5, 12], and that MML is an approximation to Bayes factors [13]. [Footnote 1: By coding theory, we can encode an event of probability P in a message of length -log P nits. A nit is the unit of message length when we take logarithms to the base e. Hence 1 nit ≈ 1.44 bits.] The effectiveness of MML and MDL ....
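The coding-theory remark in the context above (an event of probability P can be encoded in -log P nits, and 1 nit is about 1.44 bits) can be checked with a minimal sketch. The function names `message_length_nits` and `nits_to_bits` are illustrative assumptions, not taken from the cited paper.

```python
import math


def message_length_nits(p: float) -> float:
    """Length, in nits, of an ideal code word for an event of probability p.

    Uses the natural logarithm, so the unit is the nit.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p)


def nits_to_bits(nits: float) -> float:
    """Convert a length in nits (base e) to bits (base 2)."""
    return nits / math.log(2)


if __name__ == "__main__":
    # One nit expressed in bits: 1 / ln 2, roughly 1.44.
    print(round(nits_to_bits(1.0), 2))
    # An event of probability 1/4 costs about 1.39 nits, i.e. 2 bits.
    length = message_length_nits(0.25)
    print(round(length, 2), round(nits_to_bits(length), 2))
```

The conversion factor is simply 1 / ln 2, which is why a length measured with natural logarithms is about 44% larger when restated in bits.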


MDL and MML: Similarities and Differences - Baxter, Oliver (1995)   (24 citations)

....inference are not often distinguished in the literature. A typical example is the following passage from MacKay: "Although some of the earliest work on complex model comparison involved the MDL framework [PW82], MDL has no apparent advantages, and in my work I approximate the evidence directly" [Mac92, page 17]. Here MacKay has classified the MML results of Patrick and Wallace as MDL. The confounding of the two approaches is understandable because both, at least originally, involved coding the parameters of models and then choosing the model with the shortest message containing a description of the ....

.... (SC) of data D, relative to the model class of Equation (2) is [Ris89, page 59] SC = -log Prob(D) (3). Prob(D) is also called the evidence [Goo85] and the marginal likelihood [KR93]. Approximations to Equation (2) are the basis of the Bayesian inference methods of Schwarz [Sch78] and MacKay [Mac92]. These approaches are discussed in Oliver and Baxter [OB94]. If a single model within a model class needs to be chosen, then a non-coding principle, such as the maximum likelihood estimator, is used. Prior distributions on parameters are used only insofar that they are needed to evaluate the ....
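As a worked instance of SC = -log Prob(D), the sketch below evaluates the stochastic complexity for one simple model class. The model class is an assumption chosen for illustration and does not come from the cited paper: a Bernoulli model with a uniform prior on its parameter, for which the marginal likelihood of a particular sequence with k ones in n trials has the closed form k! (n-k)! / (n+1)!. The function name is hypothetical.

```python
import math


def stochastic_complexity_bernoulli(k: int, n: int) -> float:
    """SC = -log Prob(D), in nits, for a binary sequence with k ones in n trials.

    Assumes a Bernoulli model class with a uniform prior on the success
    probability, so Prob(D) = k! (n - k)! / (n + 1)!.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    # lgamma(m + 1) equals log(m!), which avoids overflow for large counts.
    log_evidence = (
        math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)
    )
    return -log_evidence


if __name__ == "__main__":
    # A single observation has Prob(D) = 1/2, so SC = log 2 nits.
    print(stochastic_complexity_bernoulli(1, 1))
    # Ten trials: an all-ones sequence is cheaper to describe than a balanced one.
    print(stochastic_complexity_bernoulli(10, 10))
    print(stochastic_complexity_bernoulli(5, 10))
```

Under this assumed model class a highly regular sequence receives a larger marginal likelihood, and therefore a smaller SC, than a balanced sequence of the same length.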


MML and Bayesianism: Similarities and Differences (Introduction .. - Oliver, al. (1994)   (18 citations)

....number of parameters. For example, the set of neural nets with 1 hidden layer, and 3 input nodes, 6 hidden nodes, and 2 output nodes would constitute a model class. 6.2 Levels of Inference. In some Bayesian literature, model class selection is considered a distinct task from parameter estimation [16, 13, 22]. Using MacKay's terminology [16], parameter estimation is level one inference and model class selection is level two inference. For example, deciding on the structure of a neural net (i.e. deciding on the number of hidden layers, and the number of units in the hidden layers) is a level two ....

....of neural nets with 1 hidden layer, and 3 input nodes, 6 hidden nodes, and 2 output nodes would constitute a model class. 6.2 Levels of Inference. In some Bayesian literature, model class selection is considered a distinct task from parameter estimation [16, 13, 22]. Using MacKay's terminology [16], parameter estimation is level one inference and model class selection is level two inference. For example, deciding on the structure of a neural net (i.e. deciding on the number of hidden layers, and the number of units in the hidden layers) is a level two inference, assigning weights to a ....

[Article contains additional citation context not shown here]


Minimum Message Length Inference: Theory and Applications - Baxter (1996)   (2 citations)  Self-citation (Thesis)

....of parameters. For example, the set of neural nets with 1 hidden layer, and 3 input nodes, 6 hidden nodes, and 2 output nodes would constitute a model class. 3.5.1.2 Levels of Inference. In some Bayesian literature, model class selection is considered a distinct task from parameter estimation [89, 71, 118]. Using MacKay's terminology [89], parameter estimation is level one inference and model class selection is level two inference. For example, deciding on the structure of a neural net (i.e. deciding on the number of hidden layers, and the number of units in the hidden layers) is a level two ....

....neural nets with 1 hidden layer, and 3 input nodes, 6 hidden nodes, and 2 output nodes would constitute a model class. 3.5.1.2 Levels of Inference. In some Bayesian literature, model class selection is considered a distinct task from parameter estimation [89, 71, 118]. Using MacKay's terminology [89], parameter estimation is level one inference and model class selection is level two inference. For example, deciding on the structure of a neural net (i.e. deciding on the number of hidden layers, and the number of units in the hidden layers) is a level two inference, assigning weights to a ....

[Article contains additional citation context not shown here]


MML and Bayesianism: Similarities and Differences.. - Oliver, Baxter (1994)   (18 citations)

No context found.


CiteSeer.IST - Copyright Penn State and NEC