Results 1-10 of 51
Learning Bayesian networks: The combination of knowledge and statistical data
Machine Learning, 1995
Cited by 1142 (36 self)
We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simplify the encoding of a user's prior knowledge. In particular, a user can express his knowledge, for the most part, as a single prior Bayesian network for the domain.
A Tutorial on Learning Bayesian Networks
Communications of the ACM, 1995
Cited by 363 (13 self)
We examine a graphical representation of uncertain knowledge called a Bayesian network. The representation is easy to construct and interpret, yet has formal probabilistic semantics making it suitable for statistical manipulation. We show how we can use the representation to learn new knowledge by combining domain knowledge with statistical data.
1 Introduction
Many techniques for learning rely heavily on data. In contrast, the knowledge encoded in expert systems usually comes solely from an expert. In this paper, we examine a knowledge representation, called a Bayesian network, that lets us have the best of both worlds. Namely, the representation allows us to learn new knowledge by combining expert domain knowledge and statistical data. A Bayesian network is a graphical representation of uncertain knowledge that most people find easy to construct and interpret. In addition, the representation has formal probabilistic semantics, making it suitable for statistical manipulation (Howard,...
Axiomatizing causal reasoning
Uncertainty in Artificial Intelligence, 1998
Cited by 80 (7 self)
Causal models defined in terms of a collection of equations, as defined by Pearl, are axiomatized here. Axiomatizations are provided for three successively more general classes of causal models: (1) the class of recursive theories (those without feedback), (2) the class of theories where the solutions to the equations are unique, (3) arbitrary theories (where the equations may not have solutions and, if they do, they are not necessarily unique). It is shown that to reason about causality in the most general third class, we must extend the language used by Galles and Pearl (1997, 1998). In addition, the complexity of the decision procedures is characterized for all the languages and classes of models considered.
A new look at causal independence
In Proc. of the Tenth Conference on Uncertainty in Artificial Intelligence, 1994
Cited by 79 (4 self)
Heckerman (1993) defined causal independence in terms of a set of temporal conditional independence statements. These statements formalized certain types of causal interaction where (1) the effect is independent of the order in which causes are introduced and (2) the impact of a single cause on the effect does not depend on what other causes have previously been applied. In this paper, we introduce an equivalent atemporal characterization of causal independence based on a functional representation of the relationship between causes and the effect. In this representation, the interaction between causes and effect can be written as a nested decomposition of functions. Causal independence can be exploited by representing this decomposition in the belief network, resulting in representations that are more efficient for inference than general causal models. We present empirical results showing the benefits of a causal-independence representation for belief-network inference.
Counterfactual Probabilities: Computational Methods, Bounds and Applications
Uncertainty in Artificial Intelligence, 1994
Cited by 63 (23 self)
Evaluation of counterfactual queries (e.g., "If A were true, would C have been true?") is important to fault diagnosis, planning, and determination of liability. In this paper we present methods for computing the probabilities of such queries using the formulation proposed in [Balke and Pearl, 1994], where the antecedent of the query is interpreted as an external action that forces the proposition A to be true. When a prior probability is available on the causal mechanisms governing the domain, counterfactual probabilities can be evaluated precisely. However, when causal knowledge is specified as conditional probabilities on the observables, only bounds can be computed. This paper develops techniques for evaluating these bounds, and demonstrates their use in two applications: (1) the determination of treatment efficacy from studies in which subjects may choose their own treatment, and (2) the determination of liability in product-safety litigation.
Probabilistic Evaluation of Counterfactual Queries
In Proceedings AAAI-94, 1994
Cited by 57 (20 self)
Evaluation of counterfactual queries (e.g., "If A were true, would C have been true?") is important to fault diagnosis, planning, and determination of liability. We present a formalism that uses probabilistic causal networks to evaluate one's belief that the counterfactual consequent, C, would have been true if the antecedent, A, were true. The antecedent of the query is interpreted as an external action that forces the proposition A to be true, which is consistent with Lewis' Miraculous Analysis. This formalism offers a concrete embodiment of the "closest world" approach which (1) properly reflects common understanding of causal influences, (2) deals with the uncertainties inherent in the world, and (3) is amenable to machine representation.
Canonical probabilistic models for knowledge engineering
2000
Cited by 37 (14 self)
The hardest task in knowledge engineering for probabilistic graphical models, such as Bayesian networks and influence diagrams, is obtaining their numerical parameters. Models based on acyclic directed graphs and composed of discrete variables, currently most common in practice, require for every variable a number of parameters that is exponential in the number of its parents in the graph, which makes elicitation from experts or learning from databases a daunting task. In this paper, we review the so-called canonical models, whose main advantage is that they require far fewer parameters. We propose a general framework for them, based on three categories: deterministic models, ICI models, and simple canonical models. ICI models rely on the concept of independence of causal influence and can be subdivided into noisy and leaky. We then analyze the most common families of canonical models (the OR/MAX, the AND/MIN, and the noisy XOR), generalizing them and offering criteria for applying them in practice. We also briefly review temporal
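The parameter saving described in this abstract can be illustrated with a minimal sketch of a leaky noisy-OR gate over binary variables. This is not code from the paper; the function names (`full_cpt_size`, `noisy_or`, `expand_cpt`) and the example link probabilities are hypothetical, chosen only to show that n link probabilities plus an optional leak term can stand in for a table with 2^n rows.

```python
from itertools import product


def full_cpt_size(n_parents: int) -> int:
    """Rows in an unrestricted CPT for a binary child with n binary parents."""
    return 2 ** n_parents


def noisy_or(link_probs, parent_states, leak=0.0):
    """P(effect = true | parent_states) under a leaky noisy-OR.

    link_probs[i] is the (assumed) probability that parent i alone
    produces the effect; leak is the probability that the effect
    occurs with no parent active.
    """
    p_effect_absent = 1.0 - leak
    for p, active in zip(link_probs, parent_states):
        if active:
            # Each active cause independently fails with probability 1 - p.
            p_effect_absent *= 1.0 - p
    return 1.0 - p_effect_absent


def expand_cpt(link_probs, leak=0.0):
    """Expand the compact parameters into the full conditional table."""
    return {
        states: noisy_or(link_probs, states, leak)
        for states in product((False, True), repeat=len(link_probs))
    }
```

With ten parents, an unrestricted table needs `full_cpt_size(10)` rows, while the noisy-OR above needs only ten link probabilities and one leak value.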
Elicitation of Probabilities for Belief Networks: Combining Qualitative and . . .
In Uncertainty in Artificial Intelligence (95): Proceedings of the 11th Conference, Los Altos CA, 1995
Cited by 34 (3 self)
Although the usefulness of belief networks for reasoning under uncertainty is widely accepted, obtaining the numerical probabilities that they require is still perceived as a major obstacle. Often not enough statistical data is available to allow for reliable probability estimation. Available information may not be directly amenable for encoding in the network. Finally, domain experts may be reluctant to provide numerical probabilities. In this paper, we propose a method for elicitation of probabilities from a domain expert that is noninvasive and accommodates whatever probabilistic information the expert is willing to state. We express all available information, whether qualitative or quantitative in nature, in a canonical form consisting of (in)equalities expressing constraints on the hyperspace of possible joint probability distributions. We then use this canonical form to derive second-order probability distributions over the desired probabilities.
Some Properties of Joint Probability Distributions
In Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence (UAI-94), 1994
Cited by 30 (7 self)
Several Artificial Intelligence schemes for reasoning under uncertainty explore either explicitly or implicitly asymmetries among probabilities of various states of their uncertain domain models. Even though the correct working of these schemes is practically contingent upon the existence of a small number of probable states, no formal justification has been proposed of why this should be the case. This paper attempts to fill this apparent gap by studying asymmetries among probabilities of various states of uncertain models. By rewriting the joint probability distribution over a model's variables into a product of individual variables' prior and conditional probability distributions and applying the central limit theorem to this product, we can demonstrate that the probabilities of individual states of the model can be expected to be drawn from highly skewed lognormal distributions. With sufficient asymmetry in individual prior and conditional probability distributions, a small fraction of states can be expected to cover a large portion of the total probability space with the remaining states having practically negligible probability. Theoretical discussion is supplemented by simulation results and an illustrative real-world example.
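The concentration effect described in this abstract can be illustrated with a small sketch. It is not the paper's simulation: it assumes 12 independent binary variables (so the joint is a product of priors only, with no conditional distributions), and the skew range 0.8 to 0.95, the seed, and all function names are hypothetical choices. Because the log of each state's probability is a sum of per-variable log factors, the central limit theorem argument applies to it, and the sketch measures how few states are needed to cover 90% of the probability mass.

```python
import itertools
import random


def state_probabilities(p_likely):
    """Joint probability of every state of independent binary variables.

    p_likely[i] is the probability that variable i takes its more
    likely value (encoded as 0); value 1 has probability 1 - p_likely[i].
    """
    probs = []
    for state in itertools.product((0, 1), repeat=len(p_likely)):
        prob = 1.0
        for p, value in zip(p_likely, state):
            prob *= p if value == 0 else 1.0 - p
        probs.append(prob)
    return probs


def fraction_of_states_for_mass(probs, mass=0.9):
    """Smallest fraction of states, taken most probable first, covering `mass`."""
    covered = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        covered += p
        if covered >= mass:
            return count / len(probs)
    return 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
p_likely = [rng.uniform(0.8, 0.95) for _ in range(12)]
probs = state_probabilities(p_likely)
frac = fraction_of_states_for_mass(probs)
print(f"{len(probs)} states; {frac:.1%} of them cover 90% of the mass")
```

Making the per-variable distributions less skewed (values of `p_likely` closer to 0.5) spreads the mass over many more states, which matches the abstract's condition of "sufficient asymmetry".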
Defining Explanation in Probabilistic Systems
In Proc. UAI-97, 1997
Cited by 29 (3 self)
As probabilistic systems gain popularity and come into wider use, the need for a mechanism that explains the system's findings and recommendations becomes more critical. The system will also need a mechanism for ordering competing explanations. We examine two representative approaches to explanation in the literature, one due to Gärdenfors and one due to Pearl, and show that both suffer from significant problems. We propose an approach to defining a notion of "better explanation" that combines some of the features of both together with more recent work by Pearl and others on causality.
1 INTRODUCTION
Probabilistic inference is often hard for humans to understand. Even a simple inference in a small domain may seem counterintuitive and surprising; the situation only gets worse for large and complex domains. Thus, a system doing probabilistic inference must be able to explain its findings and recommendations to evoke confidence on the part of the user. Indeed, in experiments wi...