Heckerman, D., Geiger, D., and Chickering, D. (1995). Learning discrete Bayesian networks. Machine Learning. To appear.
....the states correspond to individual Bayesian network structures. 1 INTRODUCTION Recently, many researchers have developed methods for learning Bayesian networks from data. The available techniques include Bayesian methods [Cooper and Herskovits, 1991, Buntine, 1991, Spiegelhalter et al., 1993, Heckerman et al., 1995], quasi-Bayesian methods [Lam and Bacchus, 1993, Bouckaert, 1993], and non-Bayesian methods [Pearl and Verma, 1991, Spirtes et al., 1993]. Much of the work in learning Bayesian networks has been devoted to the derivation of a scoring metric. Given a candidate Bayesian network structure, the ....
....grant from Rockwell International. knowledge. Once the scoring metric has been defined, learning Bayesian networks reduces to a search for one or more structures that have a high metric score. Chickering (1995a) shows that this search problem is NP-hard when the Bayesian scoring metric derived by Heckerman et al. (1995) is used. Consequently, it is appropriate to apply heuristic search algorithms in this domain. Before any search algorithm can be applied to a problem, we must define the three components of a search space. First, we need to identify the states of the search, or equivalently, the set of all ....
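The reduction of learning to heuristic search can be sketched as follows. This is a minimal illustration, not the method of any paper cited here: it assumes states are DAGs stored as sets of directed edges, assumes the common single-edge operators (add, delete, reverse), and takes the scoring metric as an arbitrary caller-supplied function. The names `is_acyclic`, `neighbors`, and `hill_climb` are hypothetical.

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        node = frontier.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return visited == len(nodes)


def neighbors(nodes, edges):
    """Yield every DAG reachable by one edge deletion, reversal, or addition."""
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            without = edges - {(u, v)}
            yield without                      # deletion never creates a cycle
            reversed_edge = without | {(v, u)}
            if is_acyclic(nodes, reversed_edge):
                yield reversed_edge
        elif (v, u) not in edges:
            added = edges | {(u, v)}
            if is_acyclic(nodes, added):
                yield added


def hill_climb(nodes, score, start=frozenset()):
    """Greedy ascent: move to the best-scoring neighbor until none improves."""
    current = frozenset(start)
    best = score(current)
    while True:
        scored = [(score(c), c) for c in neighbors(nodes, current)]
        if not scored:
            return current, best
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            return current, best               # local maximum reached
        current, best = top, top_score
```

Because the search is greedy, it stops at a local maximum of whatever metric is supplied; with a toy score such as `lambda e: -len(e ^ target)` for a fixed target edge set, it recovers the target from the empty graph.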
Heckerman, D., Geiger, D., and Chickering, D. (1995). Learning discrete Bayesian networks. Machine Learning. to appear.
....according to G.[2] It follows by the definition of G^h that the hypotheses corresponding to equivalent structures are identical. We call this property hypothesis equivalence. The maximum likelihood metric of a structure G is defined as $M_{ML}(G, C) = \max_{\Theta_G} L(G, \Theta_G, C)$. It follows almost immediately from ....
[1] Researchers typically make this assumption for computational efficiency. Methods exist for scoring structures when there is missing data.
[2] This is an acausal interpretation of a network structure. Heckerman et al. (1994, 1995) investigate a causal interpretation of a network structure as well.
....N ; c Delta Dim(G) Theorem 7: M_MDL2 is score equivalent. Proof: The first term is score equivalent by definition of equivalent structures. The second term is score equivalent by Theorem 1. The third term is score equivalent by Theorem 3. □ The last metric we consider is a Bayesian metric discussed by Heckerman et al. (1994, 1995), known as the BDe metric. A Bayesian metric is any metric that expresses the relative posterior probability of the structure hypothesis, given the observed cases and prior knowledge. Specifically, for any Bayesian metric M we have $M(G, C, \xi) = \log p(G^h \mid \xi) + \log p(C \mid G^h, \xi) + c$ (5) The first ....
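A Bayesian metric of this form, a log structure prior plus a log marginal likelihood of the cases, can be illustrated with a short sketch. The code below assumes complete discrete data and Dirichlet parameter priors spread uniformly from an equivalent sample size (the variant usually called BDeu); that choice, the function names, and the data layout (a list of dicts) are assumptions made for illustration and are not taken from the excerpt.

```python
from collections import Counter
from math import lgamma, prod


def family_score(data, child, parents, card, ess=1.0):
    """Log marginal likelihood of one node given its parents (BDeu prior)."""
    q = prod(card[p] for p in parents)    # number of parent configurations
    r = card[child]                       # number of child states
    alpha_ijk = ess / (r * q)             # prior count per (config, state)
    alpha_ij = ess / q                    # prior count per parent config
    n_ijk = Counter(
        (tuple(row[p] for p in parents), row[child]) for row in data
    )
    n_ij = Counter()
    for (config, _), count in n_ijk.items():
        n_ij[config] += count
    # Unobserved configurations contribute zero, so only observed ones are summed.
    score = sum(lgamma(alpha_ij) - lgamma(alpha_ij + n) for n in n_ij.values())
    score += sum(lgamma(alpha_ijk + n) - lgamma(alpha_ijk) for n in n_ijk.values())
    return score


def bayesian_score(data, structure, card, log_structure_prior=0.0, ess=1.0):
    """Log structure prior plus log marginal likelihood, up to a constant."""
    return log_structure_prior + sum(
        family_score(data, child, parents, card, ess)
        for child, parents in structure.items()
    )


# Usage: two equivalent structures over binary X and Y.
card = {"X": 2, "Y": 2}
data = [{"X": 0, "Y": 0}, {"X": 0, "Y": 1}, {"X": 1, "Y": 1},
        {"X": 1, "Y": 1}, {"X": 0, "Y": 0}]
x_to_y = bayesian_score(data, {"X": (), "Y": ("X",)}, card)
y_to_x = bayesian_score(data, {"Y": (), "X": ("Y",)}, card)
```

With equal structure priors, `x_to_y` and `y_to_x` agree up to floating-point error, which is the score-equivalence property the excerpt discusses: the data cannot discriminate between equivalent structures.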
Heckerman, D., Geiger, D., and Chickering, D. (1995). Learning discrete Bayesian networks. Machine Learning. To appear.
....ABSTRACT Algorithms for learning Bayesian networks from data have two components: a scoring metric and a search procedure. The scoring metric computes a score reflecting the goodness of fit of the structure to the data. The search procedure tries to identify network structures with high scores. Heckerman et al. (1995) introduce a Bayesian metric, called the BDe metric, that computes the relative posterior probability of a network structure given data. In this paper, we show that the search problem of identifying a Bayesian network among those where each node has at most K parents that has a relative ....
....that can be used to predict future events or infer causal relationships. Cooper and Herskovits (1992), herein referred to as CH, derive a Bayesian metric, which we call the BD metric, from a set of reasonable assumptions about learning Bayesian networks containing only discrete variables. Heckerman et al. (1995), herein referred to as HGC, expand upon the work of CH to derive a new metric, which we call the BDe metric, which has the desirable property of likelihood equivalence. Likelihood equivalence says that the data cannot help to discriminate equivalent structures. We now present the BD metric ....
Heckerman, D., Geiger, D., and Chickering, D. (1995). Learning discrete Bayesian networks. Machine Learning, 20:197-243.
....For over a decade, AI researchers have used Bayesian networks to encode expert knowledge. More recently, AI researchers and statisticians have begun to develop Bayesian methods for learning these networks from data [Cooper and Herskovits, 1991, Buntine, 1991, Spiegelhalter et al. 1993, Heckerman et al. 1994, 1995]. These methods take prior knowledge and combine it with data to produce one or more Bayesian networks. The resulting networks can be used for prediction or, in special cases, to infer causal relationships among variables. Essentially all of these learning methods use search to find network ....
....probabilities. One of the most challenging tasks in deriving a scoring metric is identifying classes of expressive and easy-to-assess priors for network structures and their parameters. The assessment of prior probabilities of network structure is treated elsewhere (e.g., Buntine, 1991, and Heckerman et al., 1995). In this paper we concentrate on the assessment of priors for network parameters. Many authors discuss the assessment of these priors for discrete networks, i.e., networks containing only discrete variables (e.g., Cooper and Herskovits, 1991, 1992, ....
Heckerman, D., Geiger, D., and Chickering, D. (1995). Learning discrete Bayesian networks. Machine Learning, to appear.