Results 1-10 of 46
Learning Optimal Bayesian Networks: A Shortest Path Perspective
, 2013
"... In this paper, learning a Bayesian network structure that optimizes a scoring function for a given dataset is viewed as a shortest path problem in an implicit statespace search graph. This perspective highlights the importance of two research issues: the development of search strategies for solving ..."
Abstract

Cited by 15 (5 self)
In this paper, learning a Bayesian network structure that optimizes a scoring function for a given dataset is viewed as a shortest path problem in an implicit state-space search graph. This perspective highlights the importance of two research issues: the development of search strategies for solving the shortest path problem, and the design of heuristic functions for guiding the search. This paper introduces several techniques for addressing these issues. One is an A* search algorithm that learns an optimal Bayesian network structure by searching only the most promising part of the solution space. The others are two heuristic functions. The first heuristic function represents a simple relaxation of the acyclicity constraint of a Bayesian network. Although admissible and consistent, it may introduce too much relaxation and result in a loose bound. The second heuristic function reduces the amount of relaxation by avoiding directed cycles within some groups of variables. Empirical results show that these methods constitute a promising approach to learning optimal Bayesian network structures.
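The shortest-path view above can be sketched concretely: a state is the subset of variables that already have parents assigned, and an edge adds one more variable whose parents are chosen optimally from the current subset. The following minimal sketch uses uniform-cost search (a zero heuristic) instead of the paper's informed heuristics, and the local-score table is a toy placeholder, not real data:

```python
# Sketch of the order-graph / shortest-path formulation (illustrative only).
import heapq
from itertools import combinations

def best_local_score(v, allowed, scores):
    # scores maps (variable, frozenset_of_parents) -> cost;
    # pick the cheapest parent set drawn from `allowed`.
    return min(cost for (var, parents), cost in scores.items()
               if var == v and parents <= allowed)

def shortest_path_bn(variables, scores):
    """Uniform-cost search over subsets of variables; returns the
    optimal total network score (cost of the cheapest path)."""
    start, goal = frozenset(), frozenset(variables)
    frontier = [(0.0, start)]
    g = {start: 0.0}
    while frontier:
        cost, state = heapq.heappop(frontier)
        if state == goal:
            return cost
        for v in variables:
            if v in state:
                continue
            nxt = state | {v}
            c = cost + best_local_score(v, state, scores)
            if c < g.get(nxt, float("inf")):
                g[nxt] = c
                heapq.heappush(frontier, (c, nxt))

# Toy score table: parents are mildly helpful, with a complexity penalty.
variables = ["a", "b", "c"]
scores = {}
for v in variables:
    others = [u for u in variables if u != v]
    for r in range(len(others) + 1):
        for ps in combinations(others, r):
            scores[(v, frozenset(ps))] = 3.0 - len(ps) + 0.6 * len(ps) ** 2
print(shortest_path_bn(variables, scores))
```

A real learner would precompute the local-score table from data and replace the zero heuristic with an admissible one such as those this paper develops.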
Learning Optimal Bounded Treewidth Bayesian Networks via Maximum Satisfiability
, 2014
"... Bayesian network structure learning is the wellknown computationally hard problem of finding a directed acyclic graph structure that optimally describes given data. A learned structure can then be used for probabilistic inference. While exact inference in Bayesian networks is in general NPhard, ..."
Abstract

Cited by 11 (5 self)
Bayesian network structure learning is the well-known computationally hard problem of finding a directed acyclic graph structure that optimally describes given data. A learned structure can then be used for probabilistic inference. While exact inference in Bayesian networks is in general NP-hard, it is tractable in networks with low treewidth. This provides strong motivation for developing algorithms for the NP-hard problem of learning optimal bounded treewidth Bayesian networks (BTW-BNSL). In this work, we develop a novel score-based approach to BTW-BNSL, based on casting BTW-BNSL as weighted partial maximum satisfiability. We demonstrate empirically that the approach scales notably better than a recent exact dynamic programming algorithm for BTW-BNSL.
An improved admissible heuristic for learning optimal Bayesian networks
 In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-12)
, 2012
"... Recently two search algorithms, A* and breadthfirst branch and bound (BFBnB), were developed based on a simple admissible heuristic for learning Bayesian network structures that optimize a scoring function. The heuristic represents a relaxation of the learning problem such that each variable chooses ..."
Abstract

Cited by 9 (3 self)
Recently two search algorithms, A* and breadth-first branch and bound (BFBnB), were developed based on a simple admissible heuristic for learning Bayesian network structures that optimize a scoring function. The heuristic represents a relaxation of the learning problem in which each variable chooses optimal parents independently. As a result, the relaxed solution may contain many directed cycles and yield a loose bound. This paper introduces an improved admissible heuristic that tries to avoid directed cycles within small groups of variables. A sparse representation is also introduced to store only the unique optimal parent choices. Empirical results show that the new techniques significantly improved the efficiency and scalability of A* and BFBnB on most of the datasets tested in this paper.
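The simple heuristic this abstract starts from fits in a few lines: each remaining variable claims its best parent set from all other variables, ignoring acyclicity, so the sum can only underestimate the true remaining cost. The score table below is hypothetical, purely for illustration:

```python
def simple_heuristic(remaining, variables, scores):
    """Admissible lower bound: each remaining variable independently
    picks its cheapest parent set from all other variables, with the
    acyclicity constraint dropped entirely."""
    bound = 0.0
    for v in remaining:
        others = frozenset(variables) - {v}
        bound += min(c for (var, ps), c in scores.items()
                     if var == v and ps <= others)
    return bound

# Hypothetical two-variable score table for illustration.
scores = {("x", frozenset()): 5.0, ("x", frozenset({"y"})): 2.0,
          ("y", frozenset()): 5.0, ("y", frozenset({"x"})): 2.0}
print(simple_heuristic({"x", "y"}, ["x", "y"], scores))  # 4.0
```

Here the bound is 4.0, yet any acyclic network must pay at least 2.0 + 5.0 = 7.0: both variables claim each other as a parent (a 2-cycle), which is exactly the slack the improved heuristic in this paper tries to remove.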
Learning Bounded Treewidth Bayesian Networks using Integer Linear Programming
, 2014
"... In many applications one wants to compute conditional probabilities given a Bayesian network. This inference problem is NPhard in general but becomes tractable when the network has low treewidth. Since the inference problem is common in many application areas, we provide a practical algorithm for ..."
Abstract

Cited by 9 (1 self)
In many applications one wants to compute conditional probabilities given a Bayesian network. This inference problem is NP-hard in general but becomes tractable when the network has low treewidth. Since the inference problem is common in many application areas, we provide a practical algorithm for learning bounded treewidth Bayesian networks. We cast this problem as an integer linear program (ILP). The program can be solved by an anytime algorithm which provides upper bounds to assess the quality of the found solutions. A key component of our program is a novel integer linear formulation for bounding the treewidth of a graph. Our tests clearly indicate that our approach works in practice, as our implementation was able to find an optimal or nearly optimal network for most of the data sets.
Evaluating Anytime Algorithms for Learning Optimal Bayesian Networks
 In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13)
, 2013
"... Exact algorithms for learning Bayesian networks guarantee to find provably optimal networks. However, they may fail in difficult learning tasks due to limited time or memory. In this research we adapt several anytime heuristic searchbased algorithms to learn Bayesian networks. These algorithms find ..."
Abstract

Cited by 9 (4 self)
Exact algorithms for learning Bayesian networks guarantee to find provably optimal networks. However, they may fail in difficult learning tasks due to limited time or memory. In this research we adapt several anytime heuristic search-based algorithms to learn Bayesian networks. These algorithms find high-quality solutions quickly, and continually improve the incumbent solution or prove its optimality before resources are exhausted. Empirical results show that the anytime window A* algorithm usually finds higher-quality, often optimal, networks more quickly than other approaches. The results also show that, surprisingly, while generating networks with few parents per variable are structurally simpler, they are harder to learn than complex generating networks with more parents per variable.
Answer Set Programming as SAT modulo Acyclicity
"... Abstract. Answer set programming (ASP) is a declarative programming paradigm for solving search problems arising in knowledgeintensive domains. One viable way to implement the computation of answer sets corresponding to problem solutions is to recast a logic program as a Boolean satisfiability (SAT ..."
Abstract

Cited by 8 (8 self)
Answer set programming (ASP) is a declarative programming paradigm for solving search problems arising in knowledge-intensive domains. One viable way to implement the computation of answer sets corresponding to problem solutions is to recast a logic program as a Boolean satisfiability (SAT) problem and to use existing SAT solver technology for the actual search. Such mappings can be obtained by augmenting Clark’s completion with constraints guaranteeing the strong justifiability of answer sets. To this end, we consider an extension of SAT by graphs subject to an acyclicity constraint, called SAT modulo acyclicity. We devise a linear embedding of logic programs and study the performance of answer set computation with SAT modulo acyclicity solvers.
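The acyclicity constraint at the heart of this approach reduces to an ordinary graph check: whenever the search commits to a set of arcs, the solver verifies they still form a DAG. A minimal sketch of such a check (illustrative only, not taken from any actual solver) using Kahn's topological sort:

```python
from collections import defaultdict, deque

def is_acyclic(nodes, edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be
    placed in a topological order by repeatedly removing in-degree-0
    nodes."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    placed = 0
    while queue:
        u = queue.popleft()
        placed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return placed == len(nodes)

print(is_acyclic("abc", [("a", "b"), ("b", "c")]))              # True
print(is_acyclic("abc", [("a", "b"), ("b", "c"), ("c", "a")]))  # False
```

In a SAT-modulo-acyclicity solver this check runs incrementally as a propagator during search, rather than from scratch as sketched here.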
Characteristic imsets for learning Bayesian network structure
 Int. J. of Approx. Reasoning
"... The motivation for the paper is the geometric approach to learning Bayesian network (BN) structure. The basic idea of our approach is to represent every BN structure by a certain uniquely determined vector so that usual scores for learning BN structure become affine functions of the vector represen ..."
Abstract

Cited by 5 (2 self)
The motivation for the paper is the geometric approach to learning Bayesian network (BN) structure. The basic idea of our approach is to represent every BN structure by a certain uniquely determined vector so that usual scores for learning BN structure become affine functions of the vector representative. Characteristic imsets are shown to be zero-one vectors and have many elegant properties, suitable for the intended application of linear/integer programming methods to learning BN structure. They are much closer to the graphical description; we describe a simple transition between the characteristic imset and the essential graph, the traditional unique graphical representative of the BN structure. In the end, we relate our proposal to other recent approaches which apply linear programming methods in probabilistic reasoning.
Predicting the Hardness of Learning Bayesian Networks
, 2014
"... There are various algorithms for finding a Bayesian network structure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and, given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve ..."
Abstract

Cited by 5 (2 self)
There are various algorithms for finding a Bayesian network structure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and, given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve the problem. Estimating the runtimes directly is extremely difficult as they are complicated functions of the instance. The main contribution of this paper is a characterization of the empirical hardness of an instance for a given algorithm based on a novel collection of non-trivial, yet efficiently computable features. Our empirical results, based on the largest evaluation of state-of-the-art BNS learning algorithms to date, demonstrate that we can predict the runtimes to a reasonable degree of accuracy, and effectively select algorithms that perform well on a particular instance. Moreover, we also show how the results can be utilized in building a portfolio algorithm that combines several individual algorithms in an almost optimal manner.
Advances in Bayesian Network Learning using Integer Programming
"... We consider the problem of learning Bayesian networks (BNs) from complete discrete data. This problem of discrete optimisation is formulated as an integer program (IP). We describe the various steps we have taken to allow efficient solving of this IP. These are (i) efficient search for cutting plane ..."
Abstract

Cited by 4 (1 self)
We consider the problem of learning Bayesian networks (BNs) from complete discrete data. This problem of discrete optimisation is formulated as an integer program (IP). We describe the various steps we have taken to allow efficient solving of this IP. These are (i) efficient search for cutting planes, (ii) a fast greedy algorithm to find high-scoring (perhaps not optimal) BNs and (iii) tightening the linear relaxation of the IP. After relating this BN learning problem to set covering and the multidimensional 0-1 knapsack problem, we present our empirical results. These show improvements, sometimes dramatic, over earlier results.
Learning Sparse Causal Models is Not NP-hard
 In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence
, 2013
"... Abstract This paper shows that causal model discovery is not an NPhard problem, in the sense that for sparse graphs bounded by node degree k the sound and complete causal model can be obtained in worst case order N 2(k+2) independence tests, even when latent variables and selection bias may be pre ..."
Abstract

Cited by 4 (0 self)
This paper shows that causal model discovery is not an NP-hard problem, in the sense that for sparse graphs bounded by node degree k the sound and complete causal model can be obtained in worst case order N^{2(k+2)} independence tests, even when latent variables and selection bias may be present. We present a modification of the well-known FCI algorithm that implements the method for an independence oracle, and suggest improvements for sample/real-world data versions. It does not contradict any known hardness results, and does not solve an NP-hard problem: it just proves that sparse causal discovery is perhaps more complicated, but not as hard as learning minimal Bayesian networks.
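To see what the N^{2(k+2)} bound buys, a quick comparison (with an assumed example value k = 2 and a few N, not taken from the paper) against the 2^N growth typical of unconstrained structure search:

```python
# Polynomial test bound for degree-k-sparse graphs vs. exponential search.
# k = 2 is an assumed example value; N is the number of variables.
k = 2
for N in (10, 50, 100):
    poly = N ** (2 * (k + 2))   # worst-case independence tests (paper's bound)
    expo = 2 ** N               # rough scale of unconstrained structure search
    print(N, poly, expo)
```

For small N the exponential is still cheaper; the polynomial bound wins as N grows, which is the sense in which sparse causal discovery remains tractable.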