D. Heckerman and D. Geiger, "Likelihoods and parameter priors for Bayesian networks," Tech. Rep. MSR-TR-95-54, Microsoft Research, 1995.


This paper is cited in the following contexts:
Averaging, Maximum Penalized Likelihood and Bayesian.. - Ormoneit, Tresp   (Correct)

....degree of regularization is now determined by simply varying the equivalent sample sizes , and Sigma . In our experiments we will set = 0 and only vary Sigma , i.e. we only put a prior weight on Sigma. 2 The term equivalent sample size is taken from Heckerman and Geiger [29], who apply a similar approach to learning in Bayesian networks. 3 i.e. assuming an (improper) noninformative uniform (hyper)prior distribution. We employ the uniform prior in place of the more commonly used Jeffreys prior in order to obtain the EM update rules for maximum likelihood ....

D. Heckerman and D. Geiger, "Likelihoods and parameter priors for Bayesian networks," Tech. Rep. MSR-TR-95-54, Microsoft Research, 1995.


Dimensionality Reduction in Unsupervised Learning of.. - Peña, Lozano.. (2001)   (2 citations)  (Correct)

....Val(C) represents the marginal likelihood of a domain containing only continuous variables under the assumption that the continuous data is sampled from a multivariate normal distribution. Then, these terms can be evaluated in factorable closed form under some reasonable assumptions according to [15, 16, 19]. 3 Automatic Dimensionality Reduction in Unsupervised Learning of Conditional Gaussian Networks. This section is devoted to the detailed presentation of a new automatic dimensionality reduction scheme applied to unsupervised learning of CGNs. The section starts with an introductory revision on ....

D. Heckerman and D. Geiger, "Likelihoods and Parameter Priors for Bayesian Networks," Technical Report MSR-TR-95-54, Microsoft Research, Redmond, WA, 1995.


Performance Evaluation of Compromise Conditional Gaussian .. - Peña, Lozano, Larrañaga (2001)   (Correct)

....j c; s h ) represents the marginal likelihood of a domain containing only continuous random variables under the assumption that the data is sampled from a multivariate normal distribution. Then, these terms can be evaluated in factorable closed form under some reasonable assumptions according to [13, 14, 17]. 3 Compromise Conditional Gaussian Networks for Data Clustering After introducing unrestricted CGNs for data clustering in the previous section, we propose two classes of compromise CGNs for data clustering. These compromise models represent an appealing balance between the cost of the ....

Heckerman, D. and Geiger, D., Likelihoods and Parameter Priors for Bayesian Networks, Technical Report MSR-TR-95-54, Microsoft Research, Redmond, WA, 1995.


Comparing Prequential Model Selection Criteria in.. - Kontkanen, Myllymäki.. (2001)   (Correct)

....that the marginal likelihood measure depends on the prior distribution P(θ | M) defined on the model parameters. This prior can either be regarded as a formalization of our prior domain knowledge, in which case we are faced with the question of compatibility and consistency between different priors [12, 3], or only as a technical parameter representing no such information. In the latter case, it can be shown that a certain prior known as Jeffreys prior [14, 1] can be given strong theoretical justification from the predictive performance point of view with respect to the so-called minimax loss ....

D. Heckerman and D. Geiger. Likelihoods and parameter priors for Bayesian networks. Technical Report MSR-TR-95-54, Microsoft Research, 1995.


Urban Legends in Bayesian Network Research I.. - Kontkanen..   (Correct)

....most commonly used model selection criterion in the Bayesian network literature is the (unsupervised) marginal likelihood, sometimes also called the evidence measure. By making certain technical assumptions, this criterion can be computed efficiently, as described in [Cooper and Herskovits, 1992; Heckerman et al., 1995]. Although this score can be shown to possess some desirable theoretical properties (see [Bernardo and Smith, 1994; Merhav and Feder, 1998]), the results hold only under very specific assumptions. Regardless of this, marginal likelihood is typically used also in model selection tasks where ....

....(θ | M) dθ. (1) We see that the marginal likelihood measure depends on the prior distribution P(θ | M) defined on the model parameters. This prior can either be regarded as a formalization of our prior domain knowledge, which leads to interesting questions about the compatibility of different priors [Heckerman and Geiger, 1995; Cowell, 1992], or as a technical parameter representing no such information. In the latter case, it can be shown that a certain prior known as Jeffreys prior [Jeffreys, 1946; Berger, 1985] can be given strong theoretical justification from the predictive performance point of view with ....

D. Heckerman and D. Geiger. Likelihoods and parameter priors for Bayesian networks. Technical Report MSR-TR-95-54, Microsoft Research, 1995.


Rule Mining with Prior Knowledge - A Belief Networks Approach - Zhou, Liu, Li, Chua (2001)   (Correct)

....is obtained. Assumption 4 is that for the structure B_s, the prior distribution density for the conditional probabilities is uniform. The four assumptions are restrictive. There has been a great deal of recent work building on general results from Bayesian statistics to relax these assumptions [10, 13, 12, 17]. The K2 algorithm requires an ordering on all the nodes prior to learning. Such an ordering prevents the formation of causal cycles. Moreover, it restricts the set of all possible parents of a node to the set of all nodes preceding it. Learning will then determine the genuine parents of a node only from ....

D. Heckerman and D. Geiger. Likelihoods and parameter priors for bayesian networks. Technical Report MSR-TR-95-54, Microsoft Research, 1995.
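
The ordering constraint described in the excerpt above can be illustrated with a minimal sketch. It is not taken from the cited paper: the function name candidate_parents and the node names are hypothetical, and only the restriction "possible parents of a node are the nodes preceding it" is shown, not the K2 scoring step.

```python
def candidate_parents(ordering):
    """Map each node to the nodes that precede it in the given ordering.

    Under a fixed ordering, every allowed edge points from an earlier
    node to a later one, so no directed cycle can be formed.
    """
    return {node: list(ordering[:i]) for i, node in enumerate(ordering)}


# Hypothetical node names, chosen only for illustration.
ordering = ["A", "B", "C", "D"]
parents = candidate_parents(ordering)

for node in ordering:
    print(node, "may have parents from", parents[node])
```

The first node in the ordering has no candidate parents, and the last node may draw parents from every other node.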


Inexact graph matching using learning and.. - Bengoetxea.. (2000)   (Correct)

.... in expert systems [32]. Only probabilistic graphical models whose structural part is a directed acyclic graph will be considered, as these adapt properly to EDAs (the only documented EDA that makes use of not directed graphical models is FDA [47]). The following is an adaptation of the paper [31], and will be used to introduce Bayesian networks as a probabilistic graphical model suitable for its application in EDAs. Let X = (X_1, ..., X_n) be a set of random variables, and let x_i be a value of X_i, the i-th component of X. Let y = (x_i)_{X_i ∈ Y} be a value of Y ⊆ X. Then, a ....

D. Heckerman and D. Geiger. Likelihoods and parameter priors for Bayesian networks. Technical Report MSR-TR-95-54, 1995.


Incremental Methods for Bayesian Network Learning - Roure, Sangüesa (1999)   (Correct)

....several forms. Now it seems clear that when one expresses some prior knowledge each of the terms of this expression for the evaluation functions may be affected. We must note that in the literature usually only the first term, -log P(B_s), is modified when some prior knowledge is given [16, 5]. Clearly P(B_s) is affected. If a prior is defined there is an influence of the distribution P(B_s). However, this influence is not restricted to this term. In effect, the formulation of H(B_s, D) is in fact an expression of how to estimate the local probability distributions factorized by ....

D. Heckerman and D. Geiger. Likelihoods and parameter priors for Bayesian networks. Technical Report MSR-TR-95-54, Microsoft Research, Advanced Technology Division, 1995.


Estimating The Parameters Of Mixed Bayesian Networks From.. - McMichael, Liu, Pan (1999)   (1 citation)  (Correct)

....easily be incorporated, by summing the appropriate sufficient statistics corresponding to the equated parameters. When there are hidden nodes or a high percentage of data is missing, the likelihood function can be expected to have multiple maxima. Regularisation with appropriate smoothing priors [14] may become necessary. [Figure 3: A more complex LNCBN (see below)] It is possible to design an algorithm for estimating the parameters of LNCBNs with independence structure within the leaf node CG distributions, in which some of the continuous leaf node distributions are independent of the parent ....

D. Heckerman and D. Geiger, "Likelihoods and parameter priors for Bayesian networks," Tech. Rep. MSR-TR-95-54, Microsoft Research, Redmond, WA, November 1995.


On Supervised Selection of Bayesian Networks - Kontkanen, Myllymäki.. (1999)   (Correct)

....both the model structure and the model parameters. However, this difference is not crucial in the Bayesian network modeling framework where setting the parameter values to their expected values produces a predictive distribution which is equal to that obtained by integrating over the parameters (Heckerman et al., 1995). Second, although both studies deal with extensions of the Naive Bayes model, (Friedman et al., 1997) try to relax the unrealistic Naive Bayes assumptions by adding arcs between the leaf nodes of the network (the TAN model) while in this paper we prune the Naive Bayes network by removing arcs ....

....(θ | M) dθ. (1) We see that the marginal likelihood measure depends on the prior distribution P(θ | M) defined on the model parameters. This prior can either be regarded as a formalization of our prior domain knowledge, which leads to interesting questions about the compatibility of different priors (Heckerman & Geiger, 1995; Cowell, 1992) or as a technical parameter representing no such information. In the latter case, it can be shown that a certain prior known as Jeffreys prior (Jeffreys, 1946; Berger, 1985) can be given strong theoretical justification from the predictive performance point of view with respect ....

Heckerman, D., & Geiger, D. (1995). Likelihoods and parameter priors for Bayesian networks (Tech.
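
The first excerpt of this entry states that plugging in expected parameter values gives the same predictive distribution as integrating over the parameters. A minimal sketch of that identity follows, under assumptions that are not in the excerpt: a single discrete variable with a Dirichlet prior, with hypothetical counts and hyperparameters. The integrated predictive is computed exactly as a ratio of two marginal likelihoods, P(D, x = k) / P(D).

```python
from math import exp, isclose, lgamma


def log_marginal_likelihood(counts, alphas):
    """Log marginal likelihood of one observed sequence with the given
    per-value counts, under a Dirichlet(alphas) prior (an assumption)."""
    total = lgamma(sum(alphas)) - lgamma(sum(alphas) + sum(counts))
    for c, a in zip(counts, alphas):
        total += lgamma(a + c) - lgamma(a)
    return total


def predictive_by_integration(counts, alphas, k):
    """P(next = k | data), integrating the parameters out:
    the ratio P(data, next = k) / P(data)."""
    bumped = list(counts)
    bumped[k] += 1
    return exp(log_marginal_likelihood(bumped, alphas)
               - log_marginal_likelihood(counts, alphas))


def predictive_by_expected_parameter(counts, alphas, k):
    """P(next = k | data), plugging in the posterior mean of the parameter."""
    return (alphas[k] + counts[k]) / (sum(alphas) + sum(counts))


# Hypothetical data and prior, chosen only for illustration.
counts = [3, 1, 6]
alphas = [1.0, 1.0, 1.0]

for k in range(len(counts)):
    integrated = predictive_by_integration(counts, alphas, k)
    plugged_in = predictive_by_expected_parameter(counts, alphas, k)
    print(k, integrated, plugged_in)
```

The two columns agree up to floating-point error. The sketch covers only this single-variable conjugate case, not a full network.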




CiteSeer.IST - Copyright Penn State and NEC