Results 11  20
of
254
Learning Bayesian network classifiers by maximizing conditional likelihood
 In ICML2004
, 2004
"... Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function ..."
Abstract

Cited by 85 (0 self)
 Add to MetaCart
Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function thereof) and the goal of classification (maximizing accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing structure and parameters for conditional likelihood is prohibitive. In this paper we show that a simple approximation— choosing structures by maximizing conditional likelihood while setting parameters by maximum likelihood—yields good results. On a large suite of benchmark datasets, this approach produces better class probability estimates than naive Bayes, TAN, and generativelytrained Bayesian networks. 1.
Towards Combining Inductive Logic Programming with Bayesian Networks
, 2001
"... Recently, new representation languages that integrate first order logic with Bayesian networks have been developed. Bayesian logic programs are one of these languages. In this paper, we present results on combining Inductive Logic Programming (ILP) with Bayesian networks to learn both the qualitativ ..."
Abstract

Cited by 83 (12 self)
 Add to MetaCart
(Show Context)
Recently, new representation languages that integrate first order logic with Bayesian networks have been developed. Bayesian logic programs are one of these languages. In this paper, we present results on combining Inductive Logic Programming (ILP) with Bayesian networks to learn both the qualitative and the quantitative components of Bayesian logic programs. More precisely, we show how to combine the ILP setting learning from interpretations with scorebased techniques for learning Bayesian networks. Thus, the paper positively answers Koller and Pfeffer's question, whether techniques from ILP could help to learn the logical component of first order probabilistic models.
Discretizing Continuous Attributes While Learning Bayesian Networks
 In Proc. ICML
, 1996
"... We introduce a method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process. The main ingredient in this method is a new metric based on the Minimal Description Length principle for choosing the threshold values for the dis ..."
Abstract

Cited by 78 (4 self)
 Add to MetaCart
(Show Context)
We introduce a method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process. The main ingredient in this method is a new metric based on the Minimal Description Length principle for choosing the threshold values for the discretization while learning the Bayesian network structure. This score balances the complexity of the learned discretization and the learned network structure against how well they model the training data. This ensures that the discretization of each variable introduces just enough intervals to capture its interaction with adjacent variables in the network. We formally derive the new metric, study its main properties, and propose an iterative algorithm for learning a discretization policy. Finally, we illustrate its behavior in applications to supervised learning. 1 INTRODUCTION Bayesian networks provide efficient and effective representation of the joint probability distribution over a set ...
Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: Basic Properties
, 1996
"... This paper was partially presented at the 9th conference on Uncertainty in Artificial Intelligence, July 1993. ..."
Abstract

Cited by 69 (0 self)
 Add to MetaCart
This paper was partially presented at the 9th conference on Uncertainty in Artificial Intelligence, July 1993.
Learning Bayesian Network Structures by Searching For the Best Ordering With Genetic Algorithms
 IEEE Transactions on Systems, Man and Cybernetics
, 1996
"... In this paper we present a ne_(l n [!ii ' with respect to Bayesian networks con ogy for inducing Bayesian network structures frop3 titute the roblem of the evidence propagation and a database of cases. The methodology is based oap&lll searching for the best ordering of the system vari th ..."
Abstract

Cited by 68 (9 self)
 Add to MetaCart
In this paper we present a ne_(l n [!ii ' with respect to Bayesian networks con ogy for inducing Bayesian network structures frop3 titute the roblem of the evidence propagation and a database of cases. The methodology is based oap&lll searching for the best ordering of the system vari the problem of the model search. The problem of shies by means of genetic algorithl{. Since his th_vidence propagation consists of once the vMproblem of finding an optimal ordea. teeuarue}rables are known, the assignment of resembles the traveling salesman p'FolUleh)ve use .... IW. ....... probablhles to the values of the rest of the van genetic operators that were developed for the latter  problem. The quality of a variable ordering is eval ables. Cooper [4] demonstrated that this problem Mated with the algorithm K2. We present empirical results that were obtained with a simulation of the ALARM network.
On the Sample Complexity of Learning Bayesian Networks
, 1996
"... In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: w ..."
Abstract

Cited by 54 (2 self)
 Add to MetaCart
(Show Context)
In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution, given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL based learning procedures for Bayesian networks. We show that the number of samples needed to learn an fflclose approximation (in terms of entropy distance) with confidence ffi is O i ( 1 ffl ) 4 3 log 1 ffl log 1 ffi log log 1 ffi j . This means that the sample complexity is a loworder polynomial in the error threshold and sublinear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. F...
Sequential Update of Bayesian Network Structure
 In Proc. 13th Conference on Uncertainty in Artificial Intelligence (UAI’97
, 1997
"... There is an obvious need for improving the performance and accuracy of a Bayesian network as new data is observed. Because of errors in model construction and changes in the dynamics of the domains, we cannot afford to ignore the information in new data. While sequential update of parameters for a f ..."
Abstract

Cited by 51 (3 self)
 Add to MetaCart
(Show Context)
There is an obvious need for improving the performance and accuracy of a Bayesian network as new data is observed. Because of errors in model construction and changes in the dynamics of the domains, we cannot afford to ignore the information in new data. While sequential update of parameters for a fixed structure can be accomplished using standard techniques, sequential update of network structure is still an open problem. In this paper, we investigate sequential update of Bayesian networks were both parameters and structure are expected to change. We introduce a new approach that allows for the flexible manipulation of the tradeoff between the quality of the learned networks and the amount of information that is maintained about past observations. We formally describe our approach including the necessary modifications to the scoring functions for learning Bayesian networks, evaluate its effectiveness through and empirical study, and extend it to the case of missing data. 1 Introductio...
Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory
, 1997
"... This paper addresses the problem of learning Bayesian network structures from data by using an information theoretic dependency analysis approach. Based on our threephase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the ..."
Abstract

Cited by 49 (0 self)
 Add to MetaCart
This paper addresses the problem of learning Bayesian network structures from data by using an information theoretic dependency analysis approach. Based on our threephase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the node ordering is given, the algorithm only require ) ( 2 N O CI tests and is correct given that the underlying model is DAGFaithful [Spirtes et. al., 1996]. The other algorithm deals with the general case and requires ) ( 4 N O conditional independence (CI) tests. It is correct given that the underlying model is monotone DAGFaithful (see Section 4.4). A system based on these algorithms has been developed and distributed through the Internet. The empirical results show that our approach is efficient and reliable. 1 Introduction The Bayesian network is a powerful knowledge representation and reasoning tool under conditions of uncertainty. A Bayesian network is a directed acyclic graph ...
Learning Bayesian Nets that Perform Well
 In UAI97
, 1997
"... A Bayesian net (BN) is more than a succinct way to encode a probabilistic distribution; it also corresponds to a function used to answer queries. A BN can therefore be evaluated by the accuracy of the answers it returns. Many algorithms for learning BNs, however, attempt to optimize another criterio ..."
Abstract

Cited by 43 (15 self)
 Add to MetaCart
(Show Context)
A Bayesian net (BN) is more than a succinct way to encode a probabilistic distribution; it also corresponds to a function used to answer queries. A BN can therefore be evaluated by the accuracy of the answers it returns. Many algorithms for learning BNs, however, attempt to optimize another criterion (usually likelihood, possibly augmented with a regularizing term), which is independent of the distribution of queries that are posed. This paper takes the "performance criteria" seriously, and considers the challenge of computing the BN whose performance  read "accuracy over the distribution of queries"  is optimal. We show that many aspects of this learning task are more difficult than the corresponding subtasks in the standard model. To appear in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI97), Providence, RI, August 1997. 1 INTRODUCTION Many tasks require answering questions; this model applies, for example, to both expert systems th...