
## Operations for Learning with Graphical Models (1994)


### Download Links

- [www-cad.eecs.berkeley.edu]
- [ftp.gmd.de]
- [arxiv.org]
- CiteULike
- DBLP

### Other Repositories/Bibliography

Venue: Journal of Artificial Intelligence Research

Citations: 274 (13 self)

### Citations

12152 | Elements of information theory - Cover, Thomas - 2012 |

11684 | Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird, et al. - 1977

Citation Context ...ation of model M_1 applied either at the model level or the parameter level: Gibbs sampling, first described in Section 7.1, other more general Markov chain Monte Carlo algorithms, EM style algorithms (Dempster, Laird, & Rubin, 1977), and various closed form approximations such as the mean-field approximation, and the Laplace approximation (Berger, 1985; Azevedo-Filho & Shachter, 1994). This section summarizes the main families of...
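The EM algorithm this context cites alternates an expectation step over hidden quantities with a maximization step over parameters. A minimal illustrative sketch for a two-component Gaussian mixture (the data, initial values, and iteration count are hypothetical, not from the cited paper):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two well-separated Gaussian clusters (illustrative only)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

pi, mu, sigma = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibility of component 1 for each point
    # (the 1/sqrt(2*pi) constants cancel in the ratio)
    p0 = (1 - pi) * np.exp(-0.5 * ((x - mu[0]) / sigma[0]) ** 2) / sigma[0]
    p1 = pi * np.exp(-0.5 * ((x - mu[1]) / sigma[1]) ** 2) / sigma[1]
    r = p1 / (p0 + p1)
    # M-step: weighted maximum-likelihood updates
    pi = r.mean()
    mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
    sigma = np.sqrt(np.array([
        np.average((x - mu[0]) ** 2, weights=1 - r),
        np.average((x - mu[1]) ** 2, weights=r),
    ]))
```

After 50 iterations the fitted means land near the generating values of -2 and 3.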

8725 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference - Pearl - 1988

Citation Context ...tion is the interpretation of a Bayesian network used in this paper. 2.2 Undirected graphical models Another popular form of graphical model is an undirected graph, sometimes called a Markov network (Pearl, 1988). This is a graphical model for a Markov random field. Markov random fields became used in statistics with the advent of the Hammersley-Clifford theorem (Besag, York, & Mollié, 1991). A variant of the the...

6460 | C4.5: Programs for Machine Learning - Quinlan - 1993

Citation Context ... and Equation (1). This conditional probability models how the disease should vary for given values of age, occupation, and climate. Class probability trees (Breiman, Friedman, Olshen, & Stone, 1984; Quinlan, 1992), graphs and rules (Rivest, 1987; Oliver, 1993; Kohavi, 1994), and feed-forward networks are representations devised to express conditional models in different ways. In statistics, the conditional dis...

5782 | Classification and Regression Trees - Breiman, Olshen, et al. - 1984

Citation Context ...n the simple medical problem from Figure 2 and Equation (1). This conditional probability models how the disease should vary for given values of age, occupation, and climate. Class probability trees (Breiman, Friedman, Olshen, & Stone, 1984; Quinlan, 1992), graphs and rules (Rivest, 1987; Oliver, 1993; Kohavi, 1994), and feed-forward networks are representations devised to express conditional models in different ways. In statistics, the ...

5031 | Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images - Geman, Geman - 1984

Citation Context ...nt of the Hammersley-Clifford theorem (Besag, York, & Mollié, 1991). A variant of the theorem is given later in Theorem 2.1. Markov random fields are used in imaging and spatial reasoning (Ripley, 1981; Geman & Geman, 1984; Besag et al., 1991) and various stochastic models in neural networks (Hertz, Krogh, & Palmer, 1991). Undirected graphs are also important because they simplify the theory of Bayesian networks (Lauri...

4769 | Pattern Classification and Scene Analysis - Duda, Hart - 1973 |

3040 | Generalized Linear Models, - McCullagh, Nelder - 1989 |

2332 | Judgement under Uncertainty: Heuristics and Biases - Tversky, Kahneman - 1974 |

2190 | Introduction to the Theory of Neural Computation - Hertz, Krogh, et al. - 1991

Citation Context ...m is given later in Theorem 2.1. Markov random fields are used in imaging and spatial reasoning (Ripley, 1981; Geman & Geman, 1984; Besag et al., 1991) and various stochastic models in neural networks (Hertz, Krogh, & Palmer, 1991). Undirected graphs are also important because they simplify the theory of Bayesian networks (Lauritzen et al., 1990). Figure 3 shows a simple 4×4 image and an undirected model for the image. This mo...

1939 | Practical Optimization - Gill, Murray, et al. - 1981

Citation Context ...ppose a graph is used to compile a function that searches for the MAP values of parameters in the graph conditioned on the known data. In general, this requires use of numerical optimization methods (Gill, Murray, & Wright, 1981). To use a gradient descent, conjugate gradient or Levenberg-Marquardt approach requires calculation of first derivatives. To use a Newton-Raphson approach requires calculation of second derivatives, a...
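The distinction this context draws, that gradient descent needs only first derivatives while Newton-Raphson also needs second derivatives, can be shown on a toy one-dimensional objective (the function, step size, and iteration counts are illustrative assumptions):

```python
# Toy objective: f(w) = (w - 2)^2 + 1, minimized at w = 2
def f(w):   return (w - 2.0) ** 2 + 1.0
def df(w):  return 2.0 * (w - 2.0)   # first derivative
def d2f(w): return 2.0               # second derivative

# Gradient descent: uses first derivatives only
w = 0.0
for _ in range(100):
    w -= 0.1 * df(w)

# Newton-Raphson: divides by the second derivative; on a quadratic
# it lands on the minimum in a single step
v = 0.0
for _ in range(5):
    v -= df(v) / d2f(v)
```

Both arrive at the minimizer w = 2; Newton-Raphson does so immediately here because the objective is exactly quadratic.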

1830 | Statistical Decision Theory and Bayesian Analysis (2nd ed.) - Berger - 1985

Citation Context ...al Markov chain Monte Carlo algorithms, EM style algorithms (Dempster, Laird, & Rubin, 1977), and various closed form approximations such as the mean-field approximation, and the Laplace approximation (Berger, 1985; Azevedo-Filho & Shachter, 1994). This section summarizes the main families of these approximate methods. 7.1 Gibbs sampling Gibbs sampling is the basic tool of simulation and can be applied to most ...

1505 | Local computations with probabilities on graphical structures and their application to expert systems (with discussion). - Lauritzen, Spiegelhalter - 1988 |

1366 | A Bayesian method for the induction of probabilistic networks from data - Cooper, Herskovits - 1992

Citation Context ...e evidence for each subgraph: evidence(M) = ∏_i evidence(M_{S_i}) (24). This holds in general if the original graph G is a Bayesian network, as used in learning Bayesian networks (Buntine, 1991c; Cooper & Herskovits, 1992). Corollary 6.1.2 Equation (24) holds if the parent graph G is a Bayesian network with plates. In general, we might consider searching through a family of graphical models. To do this local search (J...

1127 | Optimization in constraint networks - Dechter, Dechter, et al. - 1990 |

1118 | Optimal Statistical Decisions - DeGroot |

1106 | An Introduction to Hidden Markov Models - Rabiner, Juang - 1986 |

948 | Estimation of Dependences Based on Empirical Data - Vapnik - 1982 |

884 | Statistical Analysis of Finite Mixture Distributions - Titterington, Smith, et al. - 1985

Citation Context ...e, if some hidden variables are introduced into the data, the problem becomes exponential family if the hidden values are known. This is represented by the mixture model in Figure 22. Mixture models (Titterington, Smith, & Makov, 1985; Poland, 1994) are used to model unsupervised learning, incomplete data in the classification problems, robust regression, and general density estimation. Mixture models extend the exponential family ...

843 | Theory of probability. - Jeffreys - 1961 |

840 | UCI Repository of machine learning databases [Machine-readable data repository - Murphy, Aha - 1994 |

724 | A new method for solving hard satisfiability problems - Selman, Levesque, et al. - 1992 |

722 | Neural networks and the bias/variance dilemma - Geman, Bienenstock, et al. - 1992 |

720 | Bayesian Inference in Statistical Analysis - Box, Tiao - 1973

Citation Context ...ives their matching posteriors. More extensive summaries of this are given by DeGroot (1970) and Bernardo and Smith (1994). The parameters for these priors can be set using standard reference priors (Box & Tiao, 1973; Bernardo & Smith, 1994) or elicited from a domain expert. There are several other important consequences of the Pitman-Koopman Theorem or recursive arc reversal that should not go unnoticed. Comment...

626 | Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis - Morgan, Henrion - 1990 |

556 | Stochastic Simulation - Ripley - 1987

Citation Context ...amilies of these approximate methods. 7.1 Gibbs sampling Gibbs sampling is the basic tool of simulation and can be applied to most probability distributions (Geman & Geman, 1984; Gilks et al., 1993a; Ripley, 1987) as long as the full joint has no zeros (all variable instantiations are possible). It is a special case of the general Markov chain Monte Carlo methods for approximate inference (Ripley, 1987; Neal,...
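Gibbs sampling as described in this context resamples each variable in turn from its full conditional given the rest. A hedged sketch for a standard bivariate normal with correlation 0.8, where both full conditionals are known in closed form (the target distribution, burn-in length, and sample count are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.8
x, y = 0.0, 0.0
samples = []
for t in range(20000):
    # Full conditionals of a standard bivariate normal with correlation rho:
    # x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x
    x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))
    y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
    if t >= 1000:            # discard burn-in
        samples.append((x, y))
xs, ys = np.array(samples).T
```

The retained draws reproduce the target's moments: marginal means near 0 and a sample correlation near 0.8.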

489 | A practical Bayesian framework for backpropagation networks - MacKay - 1992

Citation Context ...falling in the exponential family. A rough fallback method is to calculate a MAP value for the weight parameters. This would be the method used for the Laplace approximation (Buntine & Weigend, 1991; MacKay, 1992) covered in (Tanner, 1993; Tierney & Kadane, 1986). The setting of priors for feed-forward networks is difficult (MacKay, 1993; Nowlan & Hinton, 1992; Wolpert, 1994), and it will not be considered here...

479 | Simulated Annealing: Theory and Applications. - Laarhoven, Aarts - 1987 |

451 | Evaluating influence diagrams - Shachter - 1986 |

448 | Bayesian computations via the Gibbs sampler and related Markov chain Monte Carlo methods - Smith, Roberts - 1993 |

429 | An Analysis of Bayesian Classifiers - Langley, Iba, et al. - 1992 |

424 | Learning decision lists - Rivest - 1987

Citation Context ...l probability models how the disease should vary for given values of age, occupation, and climate. Class probability trees (Breiman, Friedman, Olshen, & Stone, 1984; Quinlan, 1992), graphs and rules (Rivest, 1987; Oliver, 1993; Kohavi, 1994), and feed-forward networks are representations devised to express conditional models in different ways. In statistics, the conditional distributions are also represented a...

420 | Decision theoretic generalizations of the PAC model for neural nets and other learning applications - Haussler - 1992 |

412 | Decision analysis and behavioral research - Winterfeldt, Edwards - 1986 |

361 | Model selection and accounting for model uncertainty in graphical models using Occam's window - Madigan, Raftery - 1994

Citation Context ...kind of computation is done for class probability trees where representative sets of trees are found using a heuristic branch and bound algorithm (Buntine, 1991b), and for learning Bayesian networks (Madigan & Raftery, 1994). A sampling scheme for Bayesian networks is presented in Section 8.3. ...

334 | Asymptotic Expansions of Integrals - Bleistein, Handelsman - 1975 |

334 | Planning and Control - Dean, Wellman - 1991

Citation Context ...om fields) used to represent correlation for images and hidden causes. Graphical models are used in domains such as diagnosis, probabilistic expert systems, and, more recently, in planning and control (Dean & Wellman, 1991; Chan & Shachter, 1992), dynamic systems and time-series (Kjærulff, 1992; Dagum, Galper, Horvitz, & Seiver, 1994), and general data analysis (Gilks et al., 1993a) and statistics (Whittaker, 1990). This...

328 | Accurate approximations for posterior moments and marginal densities - Tierney, Kadane - 1986

Citation Context ...gh fallback method is to calculate a MAP value for the weight parameters. This would be the method used for the Laplace approximation (Buntine & Weigend, 1991; MacKay, 1992) covered in (Tanner, 1993; Tierney & Kadane, 1986). The setting of priors for feed-forward networks is difficult (MacKay, 1993; Nowlan & Hinton, 1992; Wolpert, 1994), and it will not be considered here other than assuming a prior is used, p(w). The gr...

247 | Theory refinement on Bayesian networks. - Buntine - 1991 |

221 | How easy is local search - Johnson, Papadimitriou, et al. - 1988

Citation Context ...2). Corollary 6.1.2 Equation (24) holds if the parent graph G is a Bayesian network with plates. In general, we might consider searching through a family of graphical models. To do this local search (Johnson, Papadimitriou, & Yannakakis, 1985) or numerical optimization can be used to find high posterior models, or Markov chain Monte Carlo methods to select a sample of representative models, as discussed in Section 7.2. To do this, how to re...

217 | Bayesian analysis in expert systems - Spiegelhalter, Dawid, et al. - 1993

Citation Context ... problem is composed of a product of multinomials or Gaussians. This is the basis of various Bayesian algorithms developed for these problems (Buntine, 1991b; Madigan & Raftery, 1994; Buntine, 1991c; Spiegelhalter, Dawid, Lauritzen, & Cowell, 1993; Heckerman, Geiger, & Chickering, 1994). Strictly speaking, decision trees and Bayesian networks over multinomial or Gaussian variables are in the exponential family (see Comment 4.1). However, it is...

215 | Sequential updating of conditional probabilities on directed graphical structures - Spiegelhalter, Lauritzen - 1990 |

214 | Connectionist learning of belief networks - Neal - 1992 |

208 | A Course in Density Estimation - Devroye - 1987 |

200 | Reference posterior distributions for Bayesian inference (with discussion - Bernardo - 1979 |

181 | Hyper-Markov laws in the statistical analysis of decomposable graphical models - Dawid, Lauritzen - 1993

Citation Context ...his incremental modification of evidence, Bayes factors, and finest decompositions is also general, and follows directly from the independence test. A similar property for undirected graphs is given in (Dawid & Lauritzen, 1993). This is developed below for the case of directed arcs and non-deterministic variables. Handling deterministic variables will require repeated application of these results, because several non-deter...

175 | Independence properties of directed Markov fields, Network 20 - Lauritzen, Dawid, et al. - 1990 |

164 | A language and program for complex Bayesian modelling. The Statistician, 43 - Gilks, Thomas, et al. - 1994

Citation Context ...ore recently, in planning and control (Dean & Wellman, 1991; Chan & Shachter, 1992), dynamic systems and time-series (Kjærulff, 1992; Dagum, Galper, Horvitz, & Seiver, 1994), and general data analysis (Gilks et al., 1993a) and statistics (Whittaker, 1990). This paper shows the task of learning can also be modeled with graphical models. This metalevel use of graphical models was first suggested by Spiegelhalter and Laur...

158 | Solving Large-Scale Constraint Satisfaction and Scheduling Problems using a Heuristic Repair Method - Minton, Johnston, et al. - 1990

Citation Context ...ascent in real valued problems corresponds to simple methods from function optimization (Gill et al., 1981) and in discrete problems corresponds to local repair or local search (Johnson et al., 1985; Minton, Johnston, Philips, & Laird, 1990; Selman, Levesque, & Mitchell, 1992). Gibbs sampling varies gradient ascent by introducing a random component. The algorithm usually tries to ascend, but will sometimes descend, as a strategy for exp...

149 | Probabilistic similarity networks - Heckerman - 1991 |

147 | Subjective Bayesian methods for rule-based inference systems - Duda, Hart, et al. - 1990 |

144 | Learning classification trees - Buntine - 1992 |

141 | Simplifying neural networks by soft weight-sharing. Neural Computation, 4:473-493 - Nowlan, Hinton - 1992

Citation Context ...used for the Laplace approximation (Buntine & Weigend, 1991; MacKay, 1992) covered in (Tanner, 1993; Tierney & Kadane, 1986). The setting of priors for feed-forward networks is difficult (MacKay, 1993; Nowlan & Hinton, 1992; Wolpert, 1994), and it will not be considered here other than assuming a prior is used, p(w). ...

137 | Bayesian back-propagation - Buntine, Weigend - 1991

Citation Context ... no reasonable component falling in the exponential family. A rough fallback method is to calculate a MAP value for the weight parameters. This would be the method used for the Laplace approximation (Buntine & Weigend, 1991; MacKay, 1992) covered in (Tanner, 1993; Tierney & Kadane, 1986). The setting of priors for feed-forward networks is difficult (MacKay, 1993; Nowlan & Hinton, 1992; Wolpert, 1994), and it will not be c...

136 | Overfitting avoidance as bias - Schaffer - 1993 |

134 | The chain graph Markov property - Frydenberg - 1990

Citation Context ...ent probability model to the graph G. 2.4 Mixed graphical models Undirected and directed graphs can also be mixed in a sequence. These mixed graphs are called chain graphs (Wermuth & Lauritzen, 1989; Frydenberg, 1990). These chain graphs are sometimes used here; however, a precise understanding of them is not required for this paper. A simple chain graph is given in Figure 6. In this case, the single disease node...

121 | Statistical Data Analysis in the Computer Age - Efron, Tibshirani - 1991 |

121 | Bayes factors and model uncertainty - Kass, Raftery - 1993

Citation Context ...few simple algorithmic criteria. The basic computational techniques of probabilistic (Bayesian) inference used in this computational theory of learning are widely reviewed (Tanner, 1993; Press, 1989; Kass & Raftery, 1993; Neal, 1993; Bretthorst, 1994). These include various exact methods, Markov chain Monte Carlo methods such as Gibbs sampling, the expectation maximization (EM) algorithm, and the Laplace approximatio...

120 | Unknown attribute values in induction - Quinlan - 1989

Citation Context ...Gibbs sampling applies whenever there are variables associated with the data that are not given. Hidden or latent variables are an example. Incomplete data (or missing values) (Quinlan, 1989), robust methods and modeling of outliers, and various density estimation and non-parametric methods all fall in this family of models (Titterington et al., 1985). Gibbs sampling generalizes to virtu...

97 | Making Hard Decisions - Clemen - 1996 |

88 | A Theory of Learning Classification Rules - Buntine - 1990 |

87 | Decision Analysis and Expert Systems - Henrion, Breese, et al. - 1991 |

79 | Bayes factors and choice criteria for linear models - Smith, Spiegelhalter - 1980

Citation Context ...d the evidence for model M, or model likelihood, and is the basis for most Bayesian model selection, model averaging methods, and Bayesian hypothesis testing methods using Bayes factors (Smith & Spiegelhalter, 1980; Kass & Raftery, 1993). The Bayes factor is a relative quantity used to compare one model M_1 with another M_2: Bayes-factor(M_2, M_1) = p(sample|M_2) / p(sample|M_1). Kass and Raftery (1993) revi...
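The Bayes factor defined in this context, p(sample|M_2) / p(sample|M_1), can be computed exactly for toy models. An illustrative sketch comparing a coin fixed at fair against a coin with a uniform prior on its bias (the data counts are hypothetical, not from the paper):

```python
from math import factorial

heads, tails = 8, 2          # hypothetical coin-flip data
n = heads + tails

# M1: theta fixed at 0.5 (fair coin); likelihood of this particular sequence
m1 = 0.5 ** n

# M2: theta ~ Uniform(0, 1); the marginal likelihood integrates theta out:
#     p(sample|M2) = integral of theta^h * (1-theta)^t dtheta = h! t! / (n+1)!
m2 = factorial(heads) * factorial(tails) / factorial(n + 1)

bayes_factor = m2 / m1       # = 1024/495, about 2.07
```

The sequence probabilities (rather than binomial probabilities) suffice because the binomial coefficient cancels in the ratio; a factor of about 2 is only mild evidence for the biased-coin model.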

76 | A program to perform Bayesian inference using Gibbs sampling. In - Thomas, Spiegelhalter, et al. - 1992 |

64 | Reconciling Bayesian and frequentist evidence in the one-sided testing problem - Casella, Berger - 1987 |

59 | Random fields and inverse problems in imaging - Geman |

50 | Statistical mechanics of learning from examples - Seung, Sompolinsky, et al. - 1992 |

46 | Statistical analysis and the illusion of objectivity - Berry - 1988 |

42 | Automatic pattern recognition: A study of the probability of error - Devroye - 1988 |

40 | Compiling Prior Knowledge into an Explicit Bias - Cohen - 1992 |

40 | Bayesian non-linear modelling for the energy prediction competition - MacKay - 1994

Citation Context ...be the method used for the Laplace approximation (Buntine & Weigend, 1991; MacKay, 1992) covered in (Tanner, 1993; Tierney & Kadane, 1986). The setting of priors for feed-forward networks is difficult (MacKay, 1993; Nowlan & Hinton, 1992; Wolpert, 1994), and it will not be considered here other than assuming a prior is used, p(w). ...

39 | Decision graphs – an extension of decision trees - Oliver - 1993

Citation Context ...models how the disease should vary for given values of age, occupation, and climate. Class probability trees (Breiman, Friedman, Olshen, & Stone, 1984; Quinlan, 1992), graphs and rules (Rivest, 1987; Oliver, 1993; Kohavi, 1994), and feed-forward networks are representations devised to express conditional models in different ways. In statistics, the conditional distributions are also represented as regression m...

35 | Computing second derivatives in feedforward networks: A review - Buntine, Weigend - 1993 |

34 | Bayesian Classification with Correlation and Inheritance - Cheeseman, Hanson, et al. - 1991 |

32 | Introduction to Stochastic Processes - Çinlar, E - 1975 |

31 | Normal/independent distributions and their applications in robust regression - Lange, Sinsheimer - 1993

Citation Context ... such as Student's t distribution, or an L_q norm for 1 < q < 2. By introducing a convolution, these robust regression models can be handled by combining the EM algorithm with standard least squares (Lange & Sinsheimer, 1993). 8.2 Feed-forward networks with a linear output layer A similar example is the standard feed-forward network where the final output layer is linear. This situation is given by Figure 23 if we change t...
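Combining EM with standard least squares, as this context describes for Student's t errors, reduces to iteratively reweighted least squares: the E-step downweights points with large residuals. A minimal sketch under assumed synthetic data and a fixed degrees-of-freedom of 4 (all data and constants are illustrative, not from the cited paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
beta_true = np.array([1.0, 3.0])
y = X @ beta_true + rng.normal(0, 0.1, n)
y[:5] += 8.0                 # a few gross outliers

nu, sigma = 4.0, 1.0         # fixed t degrees of freedom; initial scale
beta = np.zeros(2)
for _ in range(50):
    r = y - X @ beta
    # E-step: expected precision weight per point under t-distributed errors;
    # large residuals receive small weights
    w = (nu + 1) / (nu + (r / sigma) ** 2)
    # M-step: weighted least squares for beta, then the scale update
    W = w[:, None] * X
    beta = np.linalg.solve(X.T @ W, W.T @ y)
    sigma = np.sqrt(np.mean(w * r ** 2))
```

The fit lands near the generating coefficients despite the outliers, which ordinary least squares would not.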

26 | Classifiers: a theoretical and empirical study - Buntine - 1991 |

25 | Thinking Backward for Knowledge Acquisition - Shachter, Heckerman - 1987

Citation Context ...ov chain. A Bayesian network is a graphical model that uses directed arcs exclusively to form a directed acyclic graph (DAG), (i.e., a directed graph without directed cycles). Figure 2, adapted from (Shachter & Heckerman, 1987), shows a simple Bayesian network (over Age, Occupation, Climate, Disease, and Symptoms) for a simplified medical problem. The graphical model represents a conditional decompositio...

25 | An ordered examination of influence diagrams - Shachter - 1990 |

24 | Gibbs sampling in Bayesian networks - Hrycej - 1990

Citation Context ...a reordering of the variables. The second approach to performing inference is approximate and corresponds to approximate algorithms such as Gibbs sampling, and other Markov chain Monte Carlo methods (Hrycej, 1990; Hertz et al., 1991; Neal, 1993). In some cases, the complexity of the first approach is inherently exponential in the number of variables, so the second can be more efficient. The two approaches can be ...

23 | Uncertain reasoning and forecasting - Dagum, Galper, et al. - 1995

Citation Context ...d in domains such as diagnosis, probabilistic expert systems, and, more recently, in planning and control (Dean & Wellman, 1991; Chan & Shachter, 1992), dynamic systems and time-series (Kjærulff, 1992; Dagum, Galper, Horvitz, & Seiver, 1994), and general data analysis (Gilks et al., 1993a) and statistics (Whittaker, 1990). This paper shows the task of learning can also be modeled with graphical mo...

23 | Decision analysis: Perspectives on inference, decision, and experimentation - Howard - 1970 |

21 | An Experimental Comparison of Knowledge Engineering for Expert Systems and for Decision Analysis - HENRION, COOLEY - 1987 |

19 | Statistical models for financial volatility - Engle - 1993 |

14 | Applications of Expert Systems - Quinlan - 1987 |

14 | Detecting novel classes with applications to fault diagnosis - Smyth, Mellstrom - 1992 |

13 | Estimation and inference by compact encoding - Wallace, Freeman - 1987 |

12 | Supervised learning and divide-and-conquer: A statistical approach - Jordan, Jacobs - 1993 |

12 | Hierarchical Bayesian analysis using Monte Carlo integration: computing posterior distributions when there are many possible models - Stewart - 1987

Citation Context ... similarity networks (1991). Exact Bayes factors: Model selection and averaging methods are used to deal with multiple models (Kass & Raftery, 1993; Buntine, 1991b; Stewart, 1987; Madigan & Raftery, 1994). These require the computation of Bayes factors for models constructed during search. Exact methods for computing Bayes factors are considered in Section 6.3. Derivatives: V...

11 | Automatic Differentiation of Algorithms: Theory, Implementation and Application - Griewank, Corliss (editors) - 1991 |

10 | Probabilistic prediction of protein secondary structure using causal networks - Delcher, Kasif, et al. - 1993 |

7 | A Bayesian perspective on confidence - Heckerman, Jimison - 1989 |

7 | Evidential Reasoning Using Likelihood Weighting. Paper presented at - Shachter, Peot - 1989 |

5 | Towards efficient inference in multiply connected belief networks - Henrion - 1990 |

5 | Initial exploration of the ASRS database - Kraft, Buntine - 1993 |

5 | Maximum entropy connections: neural networks - MacKay - 1990 |

4 | Reformulating inference problems through selective conditioning - Dagum, Horvitz - 1992

Citation Context ...first approach is inherently exponential in the number of variables, so the second can be more efficient. The two approaches can be combined in some cases after appropriate reformulation of the problem (Dagum & Horvitz, 1992). 4.1 Exact inference without plates The exact inference approach has been highly refined for the case where all variables are discrete. It is not surprising that available algorithms have strong simi...

3 | Structural controllability and observability in influence diagrams - Chan, Shachter - 1992 |

3 | Probabilistic inference for artificial intelligence using Monte Carlo methods based on Markov chains - Neal - 1992 |

2 | A computational scheme for reasoning in dynamic probabilistic networks - Kjærulff - 1992 |