Results 1 - 10 of 163
Learning in graphical models
Statistical Science, 2004
Cited by 800 (10 self)
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in large-scale data analysis problems. We also present examples of graphical models in bioinformatics, error-control coding and language processing.
Dynamic Bayesian Networks: Representation, Inference and Learning
2002
Cited by 758 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
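For orientation, the simplest model this abstract mentions, an HMM with a single discrete state variable, can be filtered in one pass over the sequence, so the cost grows linearly in T. The sketch below illustrates only that baseline and is not the thesis's algorithm for hierarchical HMMs; the function name `forward_filter` and the two-state parameters are hypothetical, chosen just to exercise the code.

```python
def forward_filter(prior, trans, emit, observations):
    """Return filtered distributions P(x_t | y_1..t) for a discrete HMM.

    prior[i]    : P(x_1 = i)
    trans[i][j] : P(x_{t+1} = j | x_t = i)
    emit[i][k]  : P(y_t = k | x_t = i)
    """
    n = len(prior)
    beliefs = []
    predicted = list(prior)
    for y in observations:
        # Condition the predicted state distribution on the observation.
        unnorm = [predicted[i] * emit[i][y] for i in range(n)]
        z = sum(unnorm)
        belief = [u / z for u in unnorm]
        beliefs.append(belief)
        # Propagate one step through the transition model.
        predicted = [sum(belief[i] * trans[i][j] for i in range(n))
                     for j in range(n)]
    return beliefs


# Hypothetical two-state model, used only as an illustration.
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]
beliefs = forward_filter(prior, trans, emit, [0, 0, 1])
```

Each loop iteration does a fixed amount of work for a fixed number of states, which is where the linear dependence on the sequence length comes from.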
Propagation of Probabilities, Means and Variances in Mixed Graphical Association Models
Journal of the American Statistical Association, 1992
Cited by 178 (2 self)
A scheme is presented for modelling and local computation of exact probabilities, means and variances for mixed qualitative and quantitative variables. The models assume that the conditional distribution of the quantitative variables, given the qualitative, is multivariate Gaussian. The computational architecture is set up by forming a tree of belief universes, and the calculations are then performed by local message passing between universes. The asymmetry between the quantitative and qualitative variables sets some additional limitations for the specification and propagation structure. Approximate methods when these are not appropriately fulfilled are sketched. Lauritzen and Spiegelhalter (1988) showed how to exploit the local structure in the specification of a discrete probability model for fast and efficient computation, thereby paving the way for exploiting probability based models as parts of realistic systems for planning and decision support. The technique was subsequently imp...
Optimal Junction Trees
In UAI, 1994
Cited by 93 (0 self)
The paper deals with optimality issues in connection with updating beliefs in networks. We address two processes: triangulation and construction of junction trees. In the first part, we give a simple algorithm for constructing an optimal junction tree from a triangulated network. In the second part, we argue that any exact method based on local calculations must either be less efficient than the junction tree method, or it has an optimality problem equivalent to that of triangulation. 1 INTRODUCTION The junction tree propagation method (Jensen et al., 1990
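A common textbook way to build a junction tree from the cliques of a triangulated graph is to take a maximum-weight spanning tree over the cliques, weighting each candidate edge by the size of the shared separator. The sketch below shows that construction with Kruskal's algorithm. It is an assumption that this is close to what the abstract calls "a simple algorithm", and the function name `junction_tree` and the example cliques are hypothetical. The input is assumed to be the maximal cliques of an already triangulated graph.

```python
from itertools import combinations


def junction_tree(cliques):
    """Link cliques by a maximum-weight spanning tree on separator size."""
    cliques = [frozenset(c) for c in cliques]
    # Candidate edges, weighted by how many variables two cliques share.
    edges = [(len(cliques[i] & cliques[j]), i, j)
             for i, j in combinations(range(len(cliques)), 2)
             if cliques[i] & cliques[j]]
    edges.sort(reverse=True)

    parent = list(range(len(cliques)))  # union-find forest for Kruskal

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j, sorted(cliques[i] & cliques[j])))
    return tree


# Hypothetical cliques of a small triangulated graph.
cliques = [{"A", "B", "C"}, {"B", "C", "D"}, {"D", "E"}]
tree = junction_tree(cliques)
```

Each returned triple holds the indices of two linked cliques and their separator. The sketch does not address the paper's notion of optimality among junction trees.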
A General Algorithm for Approximate Inference and its Application to Hybrid Bayes Nets
In Uncertainty in Artificial Intelligence (UAI'98), 1998
Cited by 92 (2 self)
The clique tree algorithm is the standard method for doing inference in Bayesian networks. It works by manipulating clique potentials, which are distributions over the variables in a clique. While this approach works well for many networks, it is limited by the need to maintain an exact representation of the clique potentials. This paper presents a new unified approach that combines approximate inference and the clique tree algorithm, thereby circumventing this limitation. Many known approximate inference algorithms can be viewed as instances of this approach. The algorithm essentially does clique tree propagation, using approximate inference to estimate the densities in each clique. In many settings, the computation of the approximate clique potential can be done easily using statistical importance sampling. Iterations are used to gradually improve the quality of the estimation.
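The importance sampling step mentioned in this abstract can be illustrated in isolation. The sketch below is a generic self-normalized importance sampling estimator for a one-dimensional expectation, not the paper's clique-potential procedure; the names `importance_estimate` and `normal_pdf`, the Gaussian target and proposal, and the sample size are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def importance_estimate(f, target_pdf, proposal_pdf, proposal_sample, n, rng):
    """Self-normalized importance sampling estimate of E_target[f(X)]."""
    total_weight = 0.0
    weighted_sum = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        w = target_pdf(x) / proposal_pdf(x)  # importance weight
        total_weight += w
        weighted_sum += w * f(x)
    return weighted_sum / total_weight


# Hypothetical setup: estimate E[X^2] under N(0, 1) using a wider
# N(0, 2^2) proposal; the true value is 1.
rng = random.Random(0)
estimate = importance_estimate(
    f=lambda x: x * x,
    target_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    proposal_sample=lambda r: r.gauss(0.0, 2.0),
    n=20000,
    rng=rng,
)
```

A proposal wider than the target keeps the weights bounded here, which is one reason the estimate is stable in this toy case.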
Causal Inference from Graphical Models
2001
Cited by 78 (6 self)
The introduction of Bayesian networks (Pearl 1986b) and associated local computation algorithms (Lauritzen and Spiegelhalter 1988, Shenoy and Shafer 1990, Jensen, Lauritzen and Olesen 1990) has initiated a renewed interest for understanding causal concepts in connection with modelling complex stochastic systems. It has become clear that graphical models, in particular those based upon directed acyclic graphs, have natural causal interpretations and thus form a base for a language in which causal concepts can be discussed and analysed in precise terms. As a consequence there has been an explosion of writings, not primarily within mainstream statistical literature, concerned with the exploitation of this language to clarify and extend causal concepts. Among these we mention in particular books by Spirtes, Glymour and Scheines (1993), Shafer (1996), and Pearl (2000) as well as the collection of papers in Glymour and Cooper (1999). Very briefly, but fundamentally,
Hybrid Bayesian Networks for Reasoning about Complex Systems
2002
Cited by 71 (0 self)
Many real-world systems are naturally modeled as hybrid stochastic processes, i.e., stochastic processes that contain both discrete and continuous variables. Examples include speech recognition, target tracking, and monitoring of physical systems. The task is usually to perform probabilistic inference, i.e., infer the hidden state of the system given some noisy observations. For example, we can ask what is the probability that a certain word was pronounced given the readings of our microphone, what is the probability that a submarine is trying to surface given our sonar data, and what is the probability of a valve being open given our pressure and flow readings. Bayesian networks are
Representing and Solving Decision Problems with Limited Information
Management Science, 2001
Cited by 62 (3 self)
We introduce the notion of LImited Memory Influence Diagram (LIMID) to describe multistage decision problems where the traditional assumption of no forgetting is relaxed. This can be relevant in situations with multiple decision makers or when decisions must be prescribed under memory constraints, e.g., in partially observed Markov decision processes (POMDPs). We give an algorithm for improving any given strategy by local computation of single policy updates and investigate conditions for the resulting strategy to be optimal. Key words: local computation; message passing; optimal strategies; partially observed Markov decision process; single policy updating. To appear in Management Science. Department of Mathematical Sciences, Aalborg University, Fredrik Bajers Vej 7G, DK-9220 Aalborg, Denmark.
Approximation Algorithms and Decision Making in the Dempster-Shafer Theory of Evidence: An Empirical Study
International Journal of Approximate Reasoning, 1996
Cited by 59 (0 self)
The computational complexity of reasoning within the Dempster-Shafer theory of evidence is one of the major points of criticism this formalism has to face. To overcome this difficulty various approximation algorithms have been suggested that aim at reducing the number of focal elements in the belief functions involved. This article reviews a number of algorithms based on this method and introduces a new one, the D1 algorithm, that was designed to bring about minimal deviations in those values that are relevant to decision making. It describes an empirical study that examines the appropriateness of these approximation procedures in decision-making situations. It presents and interprets the empirical findings along several dimensions and discusses the various trade-offs that have to be taken into account when actually applying one of these methods. 1. Introduction The complexity of the computations that have to be carried out in the Dempster-Shafer theory of evidence (DST) [3, 10] i...
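To make "reducing the number of focal elements" concrete, the sketch below keeps only the k focal elements with the largest mass and renormalizes. This is a deliberately naive truncation written for illustration; it is not the D1 algorithm and is not claimed to match any of the methods the article reviews. The function names and the example mass function are hypothetical.

```python
def truncate_focal_elements(mass, k):
    """Keep the k focal elements with the largest mass and renormalize.

    mass maps a frozenset (a focal element) to its basic probability mass.
    """
    kept = sorted(mass.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(m for _, m in kept)
    return {focal: m / total for focal, m in kept}


def belief(mass, hypothesis):
    """Bel(A): total mass of the focal elements contained in A."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


# Hypothetical mass function over the frame {a, b, c}.
mass = {
    frozenset("a"): 0.5,
    frozenset("ab"): 0.3,
    frozenset("abc"): 0.15,
    frozenset("c"): 0.05,
}
approx = truncate_focal_elements(mass, 2)
```

Comparing `belief(mass, frozenset("ab"))` with `belief(approx, frozenset("ab"))` shows how truncation shifts the belief values that a decision would rely on, which is the kind of deviation the abstract is concerned with.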
Binary Join Trees For Computing Marginals In The Shenoy-Shafer Architecture
1997
Cited by 51 (9 self)
The main goal of this paper is to describe a data structure called binary join trees that are useful in computing multiple marginals efficiently in the Shenoy-Shafer architecture. We define binary join trees, describe their utility, and describe a procedure for constructing them.