Spiegelhalter, D.J., Dawid, A.P., Lauritzen, S.L., Cowell, R.G.: Bayesian Analysis in Expert Systems, Statistical Science 8(3) (1993) 219-283
....reasonable criterion would be local in the sense that it ignores dependencies among findings and is sensitive only to the dependencies among the ailment and findings. This observation applies to all classification and regression problems with complete data. One such local criterion, suggested by Spiegelhalter et al. (1993), is a variation on the sequential log marginal likelihood criterion: $LC(S^h, D) = \sum_{l} \log p(a_l \mid F_l, D_l, S^h)$ (38), where $a_l$ and $F_l$ denote the observation of the ailment $A$ and findings $F$ in the $l$th case, respectively. In other words, to compute the $l$th term in the product, we train our model S ....
....under certain assumptions, we can derive the structure and parameter priors for many network structures from a manageable number of direct assessments. Several authors have discussed such assumptions and corresponding methods for deriving priors (Cooper and Herskovits, 1991, 1992; Buntine, 1991; Spiegelhalter et al. 1993; Heckerman et al. 1995b; Heckerman and Geiger, 1996). In this section, we examine some of these approaches. 10.1 Priors on Network Parameters First, let us consider the assessment of priors for the parameters of network structures. We consider the approach of Heckerman et al. (1995b) who address ....
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8:219--282.
....know beforehand the actual relation of the variables to each other as shown in figure 5.1, i.e. which variables are independent of others. Thus the Bayesian network is given more knowledge than the other algorithms. We could attempt to compute the structure of the network directly from the data [30, 61, 75, 76], but there doesn't exist enough data in many cases to find it correctly. The parameters, i.e. the conditional probabilities for each node, are computed from the data. If a certain combination of evidence variables is found to be impossible given the network's computed parameters, then the ....
D. Spiegelhalter, P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219--282, 1993.
....of times as a collection of observed variables in our belief network. For instance, the accompaniment times are shown with darkened circles in Figure 3. Given an initial assignment of means and covariances to the trainable variables, we use the message passing algorithm of Bayesian networks [Spiegelhalter et al. 1993], [Cowell et al. 1999] to compute the conditional distributions (given the observed performance) of the trainable variables. We then perform analogous computations with the solo performances leading to several sets of conditional distributions for the trainable variables. These are used to ....
D. Spiegelhalter, A. P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8(3):219--283, 1993.
....reasonable criterion would be local in the sense that it ignores dependencies among findings and is sensitive only to the dependencies among the ailment and findings. This observation applies to all classification and regression problems with complete data. One such local criterion, suggested by Spiegelhalter et al. (1993), is a variation on the sequential log marginal likelihood criterion: $LC(S^h, D) = \sum_{l} \log p(a_l \mid F_l, D_l, S^h)$ (38), where $a_l$ and $F_l$ denote the observation of the ailment $A$ and findings $F$ in the $l$th case, respectively. In other words, to compute the $l$th term in the product, we train our model S with the ....
....under certain assumptions, we can derive the structure and parameter priors for many network structures from a manageable number of direct assessments. Several authors have discussed such assumptions and corresponding methods for deriving priors (Cooper and Herskovits, 1991, 1992; Buntine, 1991; Spiegelhalter et al. 1993; Heckerman et al. 1995b; Heckerman and Geiger, 1996). In this section, we examine some of these approaches. 10.1 Priors on Network Parameters First, let us consider the assessment of priors for the parameters of network structures. We consider the approach of Heckerman et al. (1995b) who ....
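The local criterion quoted in the contexts above can be illustrated with a minimal Python sketch. This is not the cited authors' implementation: it assumes a naive Bayes structure for the model S, discrete findings, and symmetric Dirichlet smoothing. The function name `local_criterion` and the parameters `n_classes`, `n_values`, and `alpha` are hypothetical names introduced only for this example.

```python
import math
from collections import defaultdict


def local_criterion(cases, n_classes=2, n_values=2, alpha=1.0):
    """Sum over cases of log p(a_l | F_l, D_l, S).

    cases: list of (ailment, findings) pairs, where ailment is an int in
    range(n_classes) and findings is a tuple of ints in range(n_values).
    Assumption: S is a naive Bayes model with Dirichlet(alpha) smoothing.
    """
    class_counts = [0.0] * n_classes
    finding_counts = defaultdict(float)  # key: (finding index, class, value)
    seen = 0
    score = 0.0
    for ailment, findings in cases:
        # Predict case l using only the cases seen before it (D_l).
        log_joint = []
        for c in range(n_classes):
            lp = math.log((class_counts[c] + alpha) / (seen + n_classes * alpha))
            for j, f in enumerate(findings):
                num = finding_counts[(j, c, f)] + alpha
                den = class_counts[c] + n_values * alpha
                lp += math.log(num / den)
            log_joint.append(lp)
        # Normalise over classes to get log p(a | F, D_l, S).
        top = max(log_joint)
        log_norm = top + math.log(sum(math.exp(x - top) for x in log_joint))
        score += log_joint[ailment] - log_norm
        # Only now add case l to the training data.
        class_counts[ailment] += 1
        for j, f in enumerate(findings):
            finding_counts[(j, ailment, f)] += 1
        seen += 1
    return score
```

Because each term conditions on the findings of the current case, the score ignores how well the model predicts the findings themselves, which is the sense in which the criterion is local.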
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8:219-282.
....note event. In light of the above, the description of our playing algorithm will be complete once we have described a means of computing posterior distributions on model variables, given the observation of other model variables. The literature on Bayesian Belief Networks, for example [8], [7], [9], addresses this problem thoroughly and we utilize that theory here. In particular we incorporate Lauritzen's specialization of the BBN theory to the case of Gaussian distributions [10]. A brief account of the evidence propagation algorithm is given here, but we assume the reader is ....
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219--283.
....note event. In light of the above, the description of our playing algorithm will be complete once we have described a means of computing posterior distributions on model variables, given the observation of other model variables. The literature on Bayesian Belief Networks, for example [Spiegelhalter et al. 93], [Lauritzen 96], [Jensen 96], addresses this problem thoroughly and we utilize that theory here. In particular we incorporate Lauritzen's specialization of the BBN theory to the case of Gaussian distributions [Lauritzen 92]. We do not discuss this computation here, except to mention that it is ....
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219--283.
....the dependency structure of the variables and can be interpreted as follows. The conditional distribution of a variable given all ancestors ("upstream" variables in the graph) depends only on the immediate parents of the variable. Thus the model is a particular example of a Bayesian network [17], [18], [19]. Exploiting the connectivity structure of the graph is the key to successful computing in such models. Our particular model is composed of both discrete and Gaussian variables with the property that, for every configuration of discrete variables, the continuous variables have ....
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219-283.
.... the computation of marginal posterior distributions, identification of MAP configurations, and training of model parameters through a series of local operations known as message passing or flow calculations, as in the work of Spiegelhalter, Lauritzen, Cowell, Dawid, Jensen, and others [1], [2], [3], [4], [5]. The computational aspects of such models are well understood for variables with finite state space, however the methodology for evidence propagation in networks of continuous variables has received less attention. We focus here on extending the well known evidence propagation ....
....see that for any $v \in V$, the family of $v$, $\mathrm{fa}(v) = \{v\} \cup \mathrm{pa}(v)$, can be associated with a clique $C(v) \in \mathcal{C}$ such that $\mathrm{fa}(v) \subseteq C(v)$. Thus v:C(v) C and hence the representation of Eqn. 1. It is well known that the cliques of a triangulated graph can be arranged into a junction tree [12], [4], [9], [2], [3]. Once the junction tree is constructed, for each pair of neighboring subsets in the tree, $C_1, C_2$, we define the associated separator $S = C_1 \cap C_2$ and let $\mathcal{S}$ be the collection of $|\mathcal{C}| - 1$ separators (not necessarily distinct). By defining $\phi_S(x_S) \equiv 1$ for each $S \in \mathcal{S}$ we can extend ....
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219--283.
....from a database of cases can help shorten this build and test cycle by suggesting an initial network. Learning algorithms for probabilistic networks developed so far can be divided into algorithms based on non Bayesian approaches [4, 17, 22, 23, 24] and algorithms based on a Bayesian approach [5, 10, 14, 21]. The non Bayesian approaches employ statistical tests on databases for deciding on the existence of arcs in the probabilistic network under construction. The Bayesian approach assumes a prior probability distribution over all possible networks and updates this distribution after observing the ....
D.J. Spiegelhalter, A.P. Dawid, S.L. Lauritzen, and R.G. Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219--283, 1993.
....distributions, based on the notion of conditional independence, which we elaborate below. Many model selection algorithms (see [22 23, 38] for good reviews) have been proposed to construct Bayesian networks from data. Often these algorithms are based on assumptions similar to the following [8, 22 24, 31, 3]: 1. Each variable is discrete, having a finite number of states. We use i x and i # to denote the kth state of x i and the jth possible joint configuration of # i , respectively. We use r i and q i to denote the number of possible states of x i and the number of possible joint configurations ....
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R., "Bayesian analysis in expert systems," Statistical Science, vol.8, pp.219-282, 1993.
....input . In the mixtures of experts, is modelled by the gate while is modelled by the experts. probability statements. Recently, graphical models [126] and specifically Bayesian networks [106] have become a popular method in some applications of machine learning such as medical diagnosis [215]. One appealing property of graphical models is the ability to represent different models within a unified scheme. Buntine [27] summarises a number of such representations and describes general methods for learning with graphical models. Many graphical models, including mixture models [224] and ....
....knowledge is presented to all learning methods in a consistent manner so that those methods which can perform best with this extra information can be judged. Obviously this raises the question of how to encode prior knowledge which is a subject of current research in fields such as belief networks [215]. One final problem with any comparison of learning methods is that they are always in some sense artificial comparisons. This is because in many applications the researcher will have some idea as to which method will work well for their problem which they have worked out either through trial and ....
Spiegelhalter, D. J., Dawid, A. P., Lauritzen, S. L. and Cowell, R. G. [1993], `Bayesian analysis in expert systems', Statistical Science 8(3), 219--283.
....4 Models and Results We employed Bayesian model structure learning to infer predictive models from data and to identify key variables from the larger set of observations we collected. Over the last decade, there has been steady progress on methods for inferring Bayesian networks from data [6, 27, 12, 13]. Given a dataset, the methods typically perform heuristic search over a space of dependency models and employ a Bayesian score to identify models with the greatest ability to predict the data. The Bayesian score estimates p(model | data) by approximating p(data | model) p(model). Chickering et al. ....
D. Spiegelhalter, A. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, (8):219-282, 1993.
....the dependency structure of the variables and can be interpreted as follows. The conditional distribution of a variable given all ancestors ("upstream" variables in the graph) depends only on the immediate parents of the variable. Thus the model is a particular example of a Bayesian network [16], [17], [18], [19]. Exploiting the connectivity structure of the graph is the key to successful computing in such models. Our particular model is composed of both discrete and Gaussian variables with the property that, for every configuration of discrete variables, the continuous variables have ....
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219--283.
....to work with context specific conditional (in)dependencies [21, 49] which differ from decision path to decision path. Our heuristic approach to the learning of RBMNs requires the learning of its component BNs from incomplete data. In the last few years, several methods for learning BNs have arisen [5, 12, 23, 37, 48], some of which learn from incomplete data [9, 17, 33, 39, 40, 49]. We describe how the Bayesian heuristic algorithm for the learning of BNs for data clustering developed by Peña et al. [39] is extended to learn RBMNs. A key step in the Bayesian approach to learning graphical models in ....
Spiegelhalter, D., Dawid, A., Lauritzen, S. L., & Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8, 219-282.
....Approach to Constructing Bayesian Networks 9 Bayesian networks have their roots in attempts to represent expert knowledge in domains where expert knowledge is uncertain, ambiguous, and/or incomplete. Bayesian networks are based on probability theory. A primer on Bayesian networks is found in [20]. A Bayesian network model is represented at two levels, qualitative and quantitative. At the qualitative level, we have a directed acyclic graph in which nodes represent variables, and directed arcs describe the conditional independence relations embedded in the model. Figure 4 shows a Bayesian ....
....inference) in a Bayesian network is based on the notion of evidence propagation. Evidence propagation refers to an efficient computation of marginal probabilities of variables of interest, conditional on arbitrary configurations of other variables, which constitute the observed evidence [20]. Once a Bayesian network is constructed, it can be used to make inferences about the variables in the model. The conditionals given in a Bayesian network representation specify the prior joint distribution of the variables. If we observe (or learn about) the values of some variables, then such ....
Spiegelhalter, D. J., A. P. Dawid, S. L. Lauritzen and R. G. Cowell (1993) "Bayesian analysis in expert systems," Statistical Science, 8(3), 219-283.
....yet the Bayes network implementation used on Nomad can only handle discrete variables. Therefore the continuous variables need to be suitably quantized. Network training The network conditional probabilities are learned from example data using the robust Bayesian learning algorithm in [10]. Because the continuous feature vectors are quantized, there is a trade off between resolution and having enough example data to populate the conditional probability histograms. To solve this an error model was developed to perturb sample spectra and generate multiple training examples from each ....
D.J. Spiegelhalter, A.P. Dawid, S.L. Lauritzen and R.G. Cowell, "Bayesian analysis in expert systems" in Statistical Science, 8(3), p219-283, 1993.
....analyses. 4 Models and Results We employed Bayesian structure learning to infer predictive models from data and to identify key variables from the larger set of observations we collected. Over the last decade, there has been steady progress on methods for inferring Bayesian networks from data [6, 27, 12, 13]. Given a dataset, the methods typically perform heuristic search over a space of dependency models and employ a Bayesian score to identify models with the greatest ability to predict the data. The Bayesian score estimates p(model | data) by approximating p(data | model) p(model). Chickering, ....
D. Spiegelhalter, A. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, (8):219-282, 1993.
....in a search space for which the states correspond to individual Bayesian network structures. 1 INTRODUCTION Recently, many researchers have developed methods for learning Bayesian networks from data. The available techniques include Bayesian methods [Cooper and Herskovits, 1991, Buntine, 1991, Spiegelhalter et al. 1993, Heckerman et al. 1995], quasi Bayesian methods [Lam and Bacchus, 1993, Bouckaert, 1993], and non Bayesian methods [Pearl and Verma, 1991, Spirtes et al. 1993]. Much of the work in learning Bayesian networks has been devoted to the derivation of a scoring metric. Given a candidate Bayesian ....
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8:219--282.
....all the remaining variables. These independence relationships can often be represented in terms of a graph, where the variables are associated with the nodes, and a missing edge represents a particular independence relationship (precise definitions can be found in the Appendix). See, for instance, [34, 29, 44, 12, 40, 11] for general reviews, treatments, or pointers to the large literature on this topic. The independence relationships result in the fundamental fact that the global high dimensional probability distribution $P(x_1, \ldots, x_n)$ over all variables can be factored into a product of simpler local ....
D. J. Spiegelhalter, A. P. Dawid, S. L. Lauritzen, and R. G. Cowell. Bayesian analysis in expert systems. Stat. Sci., 8:219--283, 1993.
....analyses. 4 Models and Results We employed Bayesian structure learning to infer predictive models from data and to identify key variables from the larger set of observations we collected. Over the last decade, there has been steady progress on methods for inferring Bayesian networks from data [6, 27, 12, 13]. Given a dataset, the methods typically perform heuristic search over a space of dependency models and employ a Bayesian score to identify models with the greatest ability to predict the data. The Bayesian score estimates p(model | data) by approximating p(data | model) p(model). Chickering, ....
D. Spiegelhalter, A. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, (8):219--282, 1993.
.... N | M) = $\prod_{i=1}^{N} P(v_i, u_i \mid v^{i-1}, u^{i-1}, M) = \prod_{i=1}^{N} P(v_i \mid v^{i-1}, u^{i}, M) \prod_{i=1}^{N} P(u_i \mid v^{i-1}, u^{i-1}, M)$ (3). Of these two products, the first one was called the partial (marginal) likelihood in [4] and conditional node monitor in [22]. We now see that if we use the partial marginal likelihood as a basis for a prequential scoring function, this results in a sequential process where at time i, the classification predictive distribution $P(V_i \mid v^{i-1}, u^{i}, M) = P(V_i \mid v^{i-1}, u^{i-1}, u_i, M)$ (4) is ....
D. Spiegelhalter, P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8(3):219--283, 1993.
....network. The strength of Bayesian networks is that they provide a theoretical framework for combining statistical data with prior knowledge about the problem domain. Therefore, they are particularly useful in practical applications. Bayesian networks have been widely used for medical diagnosis [22], [9], troubleshooting [10], and in the communication network field, they have been proposed to diagnose faults in Linear Lightwave Networks [6]. In [6], other methods have been used for detection and the Bayesian networks are used for diagnosis only. In this work we propose using a Bayesian network as ....
D.J. Spiegelhalter, A.P. Dawid, S.L. Lauritzen, R.G. Cowell, "Bayesian analysis in expert systems," Statistical Science, vol. 8, no. 3, 1993, pp. 219-283.
....to parameterize the prior network (especially because of the counterintuitive conditioning on $G_c$: see [HGC95] for a discussion). In addition, computing the parameter priors for an arbitrary graph structure from such a prior network requires running an inference algorithm, which can be slow. [SDLC93] suggest a similar way of computing Dirichlet priors from a prior network. A much simpler alternative is to use a non informative prior. A natural choice is $\alpha_{ijk} = 0$, which corresponds to maximum likelihood. In the binary case, this is called Haldane's prior. However, this is an improper ....
David J. Spiegelhalter, A. Philip Dawid, Steen L. Lauritzen, and Robert G. Cowell. Bayesian analysis in expert systems. Statistical Science, 8(3):219-283, 1993.
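The effect of the Dirichlet hyperparameters mentioned in this context can be sketched in a few lines of Python. This is only an illustrative example under stated assumptions: the function name `dirichlet_posterior_mean` and the argument names `counts` and `alphas` are hypothetical, and the code handles a single conditional distribution (one node, one parent configuration) rather than a whole network.

```python
def dirichlet_posterior_mean(counts, alphas):
    """Posterior mean of a multinomial parameter under a Dirichlet prior.

    counts: observed counts for each state of the node.
    alphas: Dirichlet hyperparameters, one per state (hypothetical name).
    With all alphas equal to zero the result is the maximum likelihood
    estimate, which is undefined when no data have been observed.
    """
    total = sum(counts) + sum(alphas)
    if total == 0:
        raise ValueError("zero hyperparameters and no data: estimate undefined")
    return [(n + a) / total for n, a in zip(counts, alphas)]


# Zero hyperparameters reproduce the relative frequencies.
ml_estimate = dirichlet_posterior_mean([3, 1], [0, 0])
# A uniform prior with one pseudo-count per state pulls the estimate
# towards equal probabilities.
smoothed = dirichlet_posterior_mean([3, 1], [1, 1])
```

The error raised for empty counts with zero hyperparameters reflects why such a prior is called improper: it yields no usable estimate before data arrive.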
....to possess certain elegant asymptotic properties, but for some reason this method has been rarely used in practice. For our purposes, we modify the prequential score for classification domains by using Cox's partial marginal likelihood principle [Cox, 1975] as suggested in [Dawid, 1991]. In [Spiegelhalter et al. 1993] the resulting criterion was called the conditional node monitor. The criteria discussed above were empirically evaluated in different supervised model selection domains by using 18 public domain real world classification data sets. In order to eliminate the effect of model search, we wanted ....
.... $P(v_i, u_i \mid v^{i-1}, u^{i-1}, M) = \prod_{i=1}^{N} P(v_i \mid v^{i-1}, u^{i}, M) \cdot \prod_{i=1}^{N} P(u_i \mid v^{i-1}, u^{i-1}, M)$ (6). Of these two products, the first one was called the partial (marginal) likelihood in [Cox, 1975] and conditional node monitor in [Spiegelhalter et al. 1993]. We see that if we use the partial marginal likelihood as a basis for a prequential scoring function, this results in a sequential process where at time i, the classification predictive distribution $P(V_i \mid v^{i-1}, u^{i}, M) = P(V_i \mid v^{i-1}, u^{i-1}, u_i, M)$ is ....
D. Spiegelhalter, P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8(3):219--283, 1993.
....HP has established a research laboratory at Aalborg University and invested in the Aalborg company, HUGIN Expert A/S. Bayesian network will be introduced using pregnancy testing as an example. Interested readers are referred to articles such as Lauritzen & Spiegelhalter (1988) (the key paper) and Spiegelhalter et al. (1993). An introductory book is Jensen (1996). A web search will produce many links but A Brief Introduction to Graphical Models and Bayesian Networks 2 by Kevin Murphy may serve as a starting point. Examples of the potential within pig production can be found at the homepage Potential Application ....
Spiegelhalter, D.J., A.P. Dawid, S.L. Lauritzen, & R.G. Cowell (1993). Bayesian Analysis in Expert Systems. Statistical Science, 8(3) pp. 219--283.
....to really express BK. This poses serious limitations to the application of Bayesian approaches in typical IDA settings. Some solutions to this problem have been proposed by the Bayesian community: priors are elicited through ranges or intervals, that are updated on the basis of data collection [19, 17]. Other solutions refuse the Bayesian statements, and first elicit and then update BK using different approaches, like fuzzy logic, Inductive Logic Programming and hierarchical structuring. We believe that IDA researcher may select the proper method to be applied for a certain application also in ....
Spiegelhalter D., Dawid A., Lauritzen S., Cowell R., Bayesian Analysis in Expert Systems. Statistical Science, 8 (1993) 219-283.
.... Kristensen, 1995) and may be combined with recent advances within Bayesian statistics 1. 3 The framework for specification The specification of the prior distribution is similar to the specification need within Bayesian approaches to statistical analysis and learning in expert systems (Spiegelhalter et al. 1993, 1996). One widely used program is the so-called WinBUGS program (Spiegelhalter et al. 1999). The WinBUGS program is intended for inference in graphical models using the Markov Chain Monte Carlo approach. The original intention in the Dina pig model was to use the WinBUGS language for the ....
Spiegelhalter, D.J., A.P. Dawid, S.L. Lauritzen, & R.G. Cowell (1993). Bayesian Analysis in Expert Systems. Statistical Science, 8(3) pp. 219--283.
....Stochastic EM algorithm. 1 Introduction Graphical models have been a very useful tool to characterize and model dependences in large and complex models (Whittaker, 1990; Sanmartín, 1997). In Bayesian analysis, they became notorious when applied to knowledge propagation in large expert systems (Spiegelhalter et al. 1993). Their study was boosted with the spectacular development in the 90's of Markov Chain Monte Carlo methods in general, and of Gibbs sampler in particular, for numerical high dimensional Bayesian integrations (Gelfand and Smith, 1990; Gilks et al. 1996; Spiegelhalter et al. 1996). The development ....
Spiegelhalter D.J., Dawid, A.P., Lauritzen, S.L. and Cowell, R.G. (1993). Bayesian analysis in expert systems (with discussion). Statist. Sci., 8, 219-283.
....statistical data with prior knowledge about the problem domain. Therefore, they are particularly useful in practical applications. [Figure 1: Example of independence assumptions] Bayesian networks have been widely used for medical diagnosis [15], [4], troubleshooting [5], and in the communication network field, they have been proposed to diagnose faults in Linear Lightwave Networks [6]. In [6], other methods have been used for detection and the Bayesian networks are used for diagnosis only. In this work, we propose using a Bayesian network ....
D.J. Spiegelhalter, A.P. Dawid, S.L. Lauritzen, R.G. Cowell, "Bayesian Analysis in Expert Systems," Statistical Science, vol. 8, no. 3, 1993, pp. 219-283.
....of each aspect of the mission from experience. One method of learning probability values is through the use of adaptive probabilistic networks, a subset of belief nets that can learn individual probability values and distributions using gradient descent [Pearl, 1988; Russell and Norvig, 1995; Spiegelhalter et al. 1993]. Another important extension of this project is to add temporal reasoning to the sensor planning algorithm. In general, the probability of a given object existing at a specified location should increase if the object has been captured with a sensor, but should decrease as time passes since the ....
D Spiegelhalter, P Dawid, S Lauritzen, and R Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219-282, 1993.
....detail in Section 2. For over a decade, AI researchers have used Bayesian networks to encode expert knowledge. More recently, AI researchers and statisticians have begun to investigate methods for learning Bayesian networks, including Bayesian methods [Cooper and Herskovits, 1991, Buntine, 1991, Spiegelhalter et al. 1993, Dawid and Lauritzen, 1993, Heckerman et al. 1994], quasi Bayesian methods [Lam and Bacchus, 1993, Suzuki, 1993], and non Bayesian methods [Pearl and Verma, 1991, Spirtes et al. 1993]. In this paper, we concentrate on the Bayesian approach, which takes prior knowledge and combines it with data ....
....of the paper (Sections 3 through 6) we explicate a set of assumptions for discrete networks (networks containing only discrete variables) that leads to such a class of informative priors. Our assumptions are based on those made by Cooper and Herskovits (1991, 1992) (herein referred to as CH), Spiegelhalter et al. (1993) and Dawid and Lauritzen (1993) (herein referred to as SDLC), and Buntine (1991). These researchers assumed parameter independence, which says that the parameters associated with each node in a Bayesian network are independent, parameter modularity, which says that if a node has the same parents ....
[Article contains additional citation context not shown here]
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8:219--282.
.... of experts (e.g. Howard & Matheson, 1981; Pearl, 1982; Heckerman & Wellman, 1995). More recently, statisticians and computer scientists have used these models for statistical inference or learning from data (e.g. Pearl & Verma, 1991; Cooper & Herskovits, 1992; Spirtes, Glymour, & Scheines, 1993; Spiegelhalter, Dawid, Lauritzen, & Cowell, 1993; Buntine, 1994; and Heckerman, Geiger, & Chickering, 1995). In particular, these researchers have applied model selection and model averaging techniques to the class of DAG models for the purposes of prediction and identifying cause and effect from observational data. The basic idea behind these ....
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems. Statistical Science, 8:219--282.
.... the professed motivation for investigating such models lies primarily in the second category [Wright, 1921, Blalock, 1971, Simon, 1954, Pearl 1988], causal inferences have been treated very cautiously in the statistical literature [Lauritzen & Spiegelhalter 1988, Cox 1992, Cox & Wermuth 1993, Spiegelhalter et al. 1993], as well as in the literature on influence diagrams [Howard, 1987, Shachter, 1986] and expert systems applications [Heckerman, 1990, Neapolitan, 1990]. The causal interpretation of the directed arcs has been de-emphasized in favor of the safer interpretation in terms of relevance and ....
Spiegelhalter, D.J., S.L. Lauritzen, A.P. Dawid and R.G. Cowell (1993), Bayesian analysis in expert systems. Statistical Science, 8 (3), 219-247.
....and the UAI community have developed an impressive body of theory and algorithmic machinery for learning Bayesian networks from data. Learned Bayesian networks can be used for pattern discovery, prediction, diagnosis, and density estimation tasks. Early pioneering work in this area includes [5, 9, 10, 13]. The algorithm that has emerged as the current most popular approach is a simple greedy hill climbing algorithm that searches the space of candidate structures, guided by a network scoring function (either Bayesian or Minimum Description Length (MDL) based) The search begins with an initial ....
D. J. Spiegelhalter, A. P. Dawid, S. L. Lauritzen, and R. G. Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219-283, 1993.
....maximum penalized likelihood, or fully Bayesian approaches, which involve different computational techniques of probabilistic inference such as the expectation maximization (EM) algorithm, Gibbs sampling, Laplace approximation, and Monte Carlo methods. For an overview, see [Buntine, 1994; Spiegelhalter et al. 1993]. Qualitative network induction consists in learning a network structure from a database of sample cases. In principle one could use the factorization property of a probabilistic network to evaluate its quality by comparing for each 2 the probability computed from the network with the ....
D. Spiegelhalter, A. Dawid, S. Lauritzen and R. Cowell. Bayesian Analysis in Expert Systems. Statistical Science 8(3):219--283, 1993.
.... and X is multivariate gaussian (although some of our formulation and analysis applies much more generally) Our research is related to work in the area of Bayesian model determination for directed graphical models and probabilistic expert systems, see for instance Geiger and Heckerman (1994) and Spiegelhalter et al. (1993). For undirected graphical gaussian models the main reference is Dawid and Lauritzen (1993) who introduced hyper Markov priors allowing local computations in Bayesian model determination. Applications of such priors include those of Madigan and Raftery (1994) and Madigan and York (1995) who ....
....R T = ffg, but ffg is not a separator. Remark. Theorems 1 and 2 can be employed to characterise completely the legitimate incremental changes to the edge set of a decomposable graph. An alternative possibility is to reject such moves by running maximum cardinality search (MCS, see for instance Spiegelhalter et al. 1993) after each graphical update proposal, to check if the proposed graph g 0 is decomposable. However, while MCS tests for decomposability by means of a global search through the whole of the junction forest (without building the new clique organisation) our method only requires searching through ....
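The decomposability check by maximum cardinality search mentioned in this context can be sketched as follows. This is a minimal Python illustration, not the cited authors' code. It assumes an undirected graph given as a dictionary mapping each vertex to the set of its neighbours, and it uses a simple quadratic clique check rather than the linear-time bookkeeping of the standard algorithm. The function names `mcs_order` and `is_decomposable` are hypothetical.

```python
def mcs_order(adj):
    """Return the vertices in maximum cardinality search visit order.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    At each step the unvisited vertex with the most visited neighbours
    is chosen; ties are broken arbitrarily.
    """
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    order = []
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])
        order.append(v)
        unvisited.remove(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    return order


def is_decomposable(adj):
    """Test whether an undirected graph is decomposable (chordal).

    The graph is chordal exactly when, for every vertex, the neighbours
    visited earlier in the MCS order are pairwise adjacent.
    """
    order = mcs_order(adj)
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        earlier = [w for w in adj[v] if position[w] < position[v]]
        for i, a in enumerate(earlier):
            for b in earlier[i + 1:]:
                if b not in adj[a]:
                    return False
    return True


# A four-cycle has no chord, so it is not decomposable.
four_cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
# Adding the chord 1-3 makes the graph decomposable.
with_chord = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
```

As the context notes, this is a global test: every call revisits the whole graph, which is the cost that a local characterisation of legitimate edge changes avoids.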
Spiegelhalter, D.J., Dawid, A.P., Lauritzen, S.L., and Cowell, R.G. (1993). Bayesian analysis in expert systems. Statistical Science, 8, 219--283.
....involving Loughborough, Foulum and Aalborg. During a previous visit to Aalborg University, some common ground had already been established between the mathematical genetics approach to calculating probabilities on pedigrees [1] and the expert systems approach to handling general Bayesian networks [8, 11]. In the case of a large complex pedigree, exact calculations are no longer possible due to the enormous storage requirements involved and genetic analyses, using full pedigree information, provide a serious computational challenge. Either some structural information must be discarded and ....
D. J. Spiegelhalter, A. P. Dawid, S. L. Lauritzen, and R. G. Cowell. Bayesian analysis in expert systems. Statistical Science (1993) 8(3):219--247.
No context found.
Spiegelhalter, D.J., Dawid, A.P., Lauritzen, S.L., Cowell, R.G.: Bayesian Analysis in Expert Systems, Statistical Science 8(3) (1993) 219-283
No context found.
D. Spiegelhalter, P. Dawid, S. L. Lauritzen, and R. Cowell, "Bayesian analysis in expert systems," Statistical Science, vol. 8, pp. 219-282, 1993.
No context found.
D. J. Spiegelhalter, A. P. Dawid, S. L. Lauritzen, and R. G. Cowell, "Bayesian analysis in expert systems," Statist. Sci., vol. 8, pp. 219--247, 1993.
No context found.
Spiegelhalter DJ, Dawid A, Lauritzen S, Cowell R. Bayesian analysis in expert systems. Statist Sci 1993;8(3):219--83.
No context found.
D. J. Spiegelhalter et al., Bayesian analysis in expert systems, Statistical Science 8 (1993), no. 3, 219-283.
No context found.
Spiegelhalter, D., Dawid, P., Lauritzen, S., and Cowell, R., Bayesian analysis in expert systems, Statistical Science, 8:219-282, 1993.
No context found.
D.J. Spiegelhalter, A.P. Dawid, S.L. Lauritzen, R.G. Cowell, Bayesian analysis in expert systems (with discussion), Statist. Sci. 8 (1993) 219-283.
No context found.
Spiegelhalter, D.J.,Dawid, A.P.,Lauritzen, S.L. and Cowell, R.G. (1993), "Bayesian analysis in expert systems," Statistical Science, 8(3) 219--283.
No context found.
Spiegelhalter D., Dawid A. P., Lauritzen S., Cowell R. (1993), "Bayesian Analysis in Expert Systems," Statistical Science, Vol. 8, No. 3, pp. 219--283.
No context found.
D. J. Spiegelhalter, A. P. Dawid, S. L. Lauritzen, and R. G. Cowell (1993), Bayesian analysis in expert systems, Statistical Science, Vol. 8, No. 3, 219-283.
No context found.
Spiegelhalter, D.J., A.P. Dawid, S.L. Lauritzen, and R.G. Cowell. "Bayesian Analysis in Expert Systems." Statistical Science, 8, 3, 219--283, 1993.
No context found.
Spiegelhalter, D. J., Dawid, A. P., Lauritzen, S. L., & Cowell, R. G. (1993). Bayesian Analysis in Expert Systems, Statistical Science, 8, 219-283.