Bayesian network learning with cutting planes.
 In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI-11), 2011
Cited by 46 (7 self)

Abstract
The problem of learning the structure of Bayesian networks from complete discrete data with a limit on parent set size is considered. Learning is cast explicitly as an optimisation problem where the goal is to find a BN structure which maximises log marginal likelihood (BDe score). Integer programming, specifically the SCIP framework, is used to solve this optimisation problem. Acyclicity constraints are added to the integer program (IP) during solving in the form of cutting planes. Finding good cutting planes is the key to the success of the approach; the search for such cutting planes is effected using a sub-IP. Results show that this is a particularly fast method for exact BN learning.
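As background for this entry, a minimal sketch (not the paper's SCIP implementation) of the standard "cluster" acyclicity cut used in such IP formulations: for every set C of variables, an acyclic structure must contain at least one variable in C whose chosen parent set lies entirely outside C, and the separation problem is to find a cluster for which the fractional LP solution violates this.

```python
def cluster_cut_violated(x, cluster):
    """Check one cluster inequality:
        sum over v in cluster, over parent sets P disjoint from cluster,
        of x[v, P]  >=  1.
    `x` maps (variable, frozenset_of_parents) -> LP value in [0, 1];
    `cluster` is a frozenset of variables."""
    lhs = sum(val for (v, parents), val in x.items()
              if v in cluster and not (parents & cluster))
    return lhs < 1.0  # violated => the cut should be added to the IP

# Toy example: two variables that "choose each other" as parents form a
# violated cluster {0, 1}, while letting one of them be a root does not.
x_cyclic = {(0, frozenset({1})): 1.0, (1, frozenset({0})): 1.0}
x_acyclic = {(0, frozenset()): 1.0, (1, frozenset({0})): 1.0}
print(cluster_cut_violated(x_cyclic, frozenset({0, 1})))   # True
print(cluster_cut_violated(x_acyclic, frozenset({0, 1})))  # False
```

The paper searches for such violated clusters with a sub-IP rather than by enumeration; the sketch only shows what "violated" means.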
Efficient structure learning of Bayesian networks using constraints
 Journal of Machine Learning Research
Cited by 30 (7 self)

Abstract
This paper addresses the problem of learning Bayesian network structures from data based on score functions that are decomposable. It describes properties that strongly reduce the time and memory costs of many known methods without losing global optimality guarantees. These properties are derived for different score criteria such as Minimum Description Length (or Bayesian Information Criterion), Akaike Information Criterion and Bayesian Dirichlet Criterion. Then a branch-and-bound algorithm is presented that integrates structural constraints with data in a way that guarantees global optimality. As an example, structural constraints are used to map the problem of structure learning in Dynamic Bayesian networks into a corresponding augmented Bayesian network. Finally, we show empirically the benefits of using the properties with state-of-the-art methods and with the new algorithm, which is able to handle larger data sets than before.
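One family of properties of this kind can be illustrated with a short sketch (toy costs, assuming lower score is better): a candidate parent set P for a variable can be discarded whenever some proper subset already achieves an equal or better score, since any DAG using P could swap in the subset without creating new cycles.

```python
def prune_parent_sets(scores):
    """Keep only parent sets not dominated by a proper subset.
    `scores`: dict mapping frozenset(parents) -> cost (lower is better)."""
    kept = {}
    for P, cost in scores.items():
        dominated = any(Q < P and c <= cost for Q, c in scores.items())
        if not dominated:
            kept[P] = cost
    return kept

# Toy scores for one variable: {2} is dominated by {} and {1, 2} by {1}.
toy = {frozenset(): 10.0, frozenset({1}): 8.0,
       frozenset({2}): 12.0, frozenset({1, 2}): 9.0}
print(prune_parent_sets(toy))  # only frozenset() and frozenset({1}) remain
```

Pruning of this sort shrinks the per-variable candidate lists that every exact method in this listing must enumerate.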
Approximate Inference in Graphical Models using LP Relaxations
, 2010
Cited by 27 (1 self)

Abstract
Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing, to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the variables. We present new approaches to approximate inference based on linear programming (LP) relaxations. Our algorithms optimize over the cycle relaxation of the marginal polytope, which we show to be closely related to the first lifting of the Sherali-Adams hierarchy, and is significantly tighter than the pairwise LP relaxation. We show how to efficiently optimize over the cycle relaxation using a cutting-plane algorithm that iteratively introduces constraints into the relaxation. We provide a criterion to determine which constraints would be most helpful in tightening the relaxation, and give efficient algorithms for solving the search problem of finding the best cycle constraint to add according to this criterion.
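The cycle constraints involved can be sketched concretely (a brute-force toy, not the paper's separation algorithm): around any cycle, an integral assignment must "disagree" on an even number of edges, so for every odd-sized subset F of the cycle's edges the pseudomarginals mu_e (probability that edge e's endpoints differ) must satisfy sum over e not in F of mu_e plus sum over e in F of (1 - mu_e) >= 1.

```python
from itertools import combinations

def most_violated_on_cycle(mu):
    """Smallest left-hand side of any cycle inequality on one cycle.
    `mu`: list of per-edge disagreement pseudomarginals around the cycle.
    A return value below 1 means some cycle inequality is violated."""
    best = None
    for k in range(1, len(mu) + 1, 2):        # odd-sized subsets F only
        for F in combinations(range(len(mu)), k):
            lhs = sum((1 - m) if i in F else m for i, m in enumerate(mu))
            if best is None or lhs < best:
                best = lhs
    return best

# All three edges of a triangle "disagreeing" with probability 0.9 is
# inconsistent (a triangle cannot cut all three edges): inequality violated.
print(most_violated_on_cycle([0.5, 0.5, 0.5]))  # 1.5: satisfied
print(most_violated_on_cycle([0.9, 0.9, 0.9]))  # 0.3: violated
```

The paper's contribution is to search for the most violated such constraint efficiently over all cycles, rather than enumerating subsets as this toy does.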
Improving the scalability of optimal Bayesian network learning with external-memory frontier breadth-first branch and bound search
 In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence
Cited by 17 (9 self)

Abstract
Previous work has shown that the problem of learning the optimal structure of a Bayesian network can be formulated as a shortest-path problem in a graph and solved using A* search. In this paper, we improve the scalability of this approach by developing a memory-efficient heuristic search algorithm for learning the structure of a Bayesian network. Instead of using A*, we propose a frontier breadth-first branch and bound search that leverages the layered structure of the search graph of this problem so that no more than two layers of the graph, plus solution reconstruction information, need to be stored in memory at a time. To further improve scalability, the algorithm stores most of the graph in external memory, such as hard disk, when it does not fit in RAM. Experimental results show that the resulting algorithm solves significantly larger problems than the current state of the art.
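The layered structure being exploited is that of the order graph, whose layer k holds all variable subsets of size k. A minimal in-memory sketch of that layer-by-layer sweep (toy scores, no branch and bound or external memory; `best_score` is an assumed user-supplied oracle):

```python
def best_network_cost(n, best_score):
    """Cost of an optimal network over variables 0..n-1.
    `best_score(v, candidates)` must return the cost of v's cheapest
    parent set drawn from the set `candidates` (an assumed oracle)."""
    current = {frozenset(): 0.0}            # layer 0: the empty subset
    for _ in range(n):
        nxt = {}
        for subset, cost in current.items():
            for v in range(n):
                if v in subset:
                    continue
                child = subset | {v}        # v is placed after `subset`
                c = cost + best_score(v, subset)
                if c < nxt.get(child, float("inf")):
                    nxt[child] = c
        current = nxt                       # previous layer is discarded
    return current[frozenset(range(n))]

# Toy oracle whose cost depends only on how many candidates are available.
toy_score = lambda v, cands: 1.0 / (1 + len(cands))
print(best_network_cost(3, toy_score))      # 1 + 1/2 + 1/3
```

Only two layers (`current` and `nxt`) are ever alive, which is the memory property the paper pushes further by spilling layers to disk.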
Learning Optimal Bounded Treewidth Bayesian Networks via Maximum Satisfiability
, 2014
Cited by 11 (5 self)

Abstract
Bayesian network structure learning is the well-known computationally hard problem of finding a directed acyclic graph structure that optimally describes given data. A learned structure can then be used for probabilistic inference. While exact inference in Bayesian networks is in general NP-hard, it is tractable in networks with low treewidth. This provides good motivation for developing algorithms for the NP-hard problem of learning optimal bounded treewidth Bayesian networks (BTW-BNSL). In this work, we develop a novel score-based approach to BTW-BNSL, based on casting BTW-BNSL as weighted partial maximum satisfiability. We demonstrate empirically that the approach scales notably better than a recent exact dynamic programming algorithm for BTW-BNSL.
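Weighted partial MaxSAT, the target formalism here, can be illustrated with a tiny brute-force toy (not the paper's encoding): hard clauses must all hold, and the objective is to maximise the total weight of satisfied soft clauses.

```python
from itertools import product

def max_sat(n_vars, hard, soft):
    """Brute-force weighted partial MaxSAT for illustration only.
    Clauses are lists of nonzero ints: +i means variable i true, -i false.
    `soft` is a list of (clause, weight) pairs."""
    def sat(clause, a):
        return any(a[abs(lit) - 1] == (lit > 0) for lit in clause)
    best = None
    for a in product([False, True], repeat=n_vars):
        if all(sat(c, a) for c in hard):        # hard clauses: mandatory
            w = sum(wt for c, wt in soft if sat(c, a))
            if best is None or w > best[0]:
                best = (w, a)
    return best

# Hard: x1 OR x2.  Soft: prefer x1 false (weight 3) and x2 true (weight 2).
print(max_sat(2, hard=[[1, 2]], soft=[([-1], 3), ([2], 2)]))
# best assignment satisfies both soft clauses for total weight 5
```

In the paper's setting, hard clauses encode acyclicity and the treewidth bound while soft clauses carry the (negated) local scores; real instances of course use a dedicated MaxSAT solver, not enumeration.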
An improved admissible heuristic for learning optimal Bayesian networks
 In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-12)
, 2012
Cited by 9 (3 self)

Abstract
Recently, two search algorithms, A* and breadth-first branch and bound (BFBnB), were developed based on a simple admissible heuristic for learning Bayesian network structures that optimize a scoring function. The heuristic represents a relaxation of the learning problem in which each variable chooses optimal parents independently. As a result, the relaxed solution may contain many directed cycles, yielding a loose bound. This paper introduces an improved admissible heuristic that tries to avoid directed cycles within small groups of variables. A sparse representation is also introduced to store only the unique optimal parent choices. Empirical results show that the new techniques significantly improved the efficiency and scalability of A* and BFBnB on most of the datasets tested in this paper.
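The simple relaxation being improved on can be sketched in a few lines (toy oracle, assumed names): every variable not yet placed in the ordering picks its cheapest parent set independently, which lower-bounds the true cost but can rely on mutually cyclic choices.

```python
def simple_heuristic(remaining, all_vars, best_score):
    """Admissible lower bound: each remaining variable independently takes
    its cheapest parent set from all other variables, ignoring acyclicity.
    `best_score(v, candidates)` is an assumed oracle returning that cost."""
    return sum(best_score(v, all_vars - {v}) for v in remaining)

# Toy oracle: a variable with at least one candidate parent costs 1.0,
# an orphan costs 2.0.  The relaxation lets 0 and 1 "parent each other",
# so the bound is 2.0 even though any DAG must make one of them a root
# (true optimal cost 3.0) -- exactly the looseness the paper attacks.
toy = lambda v, cands: 1.0 if cands else 2.0
print(simple_heuristic({0, 1}, {0, 1}, toy))  # 2.0
```

The improved heuristic of the paper partitions the remaining variables into small groups and forbids cycles inside each group, tightening this bound while staying admissible.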
Learning Bounded Treewidth Bayesian Networks using Integer Linear Programming
, 2014
Cited by 9 (1 self)

Abstract
In many applications one wants to compute conditional probabilities given a Bayesian network. This inference problem is NP-hard in general but becomes tractable when the network has low treewidth. Since the inference problem is common in many application areas, we provide a practical algorithm for learning bounded treewidth Bayesian networks. We cast this problem as an integer linear program (ILP). The program can be solved by an anytime algorithm which provides upper bounds to assess the quality of the solutions found. A key component of our program is a novel integer linear formulation for bounding the treewidth of a graph. Our tests clearly indicate that our approach works in practice, as our implementation was able to find an optimal or nearly optimal network for most of the data sets.
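Treewidth bounds of this kind are typically built on elimination orderings; a small sketch of that view (toy graph, not the paper's ILP): eliminating vertices in some order, the largest number of still-uneliminated neighbours seen at any step is an upper bound on treewidth, and the minimum over all orders attains it.

```python
def width_of_order(adj, order):
    """Width of one elimination order over an undirected graph.
    `adj`: dict vertex -> set of neighbours (symmetric); `order`: list of
    all vertices.  Eliminating v connects its current neighbours into a
    clique, then removes v."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    width = 0
    for v in order:
        nbrs = adj[v]
        width = max(width, len(nbrs))
        for a in nbrs:                  # fill-in: make neighbours a clique
            for b in nbrs:
                if a != b:
                    adj[a].add(b)
        for a in nbrs:
            adj[a].discard(v)
        del adj[v]
    return width

# A 4-cycle has treewidth 2; this order witnesses width 2.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(width_of_order(cycle4, [0, 1, 2, 3]))  # 2
```

An ILP for bounded-treewidth learning must, in effect, certify that some such order of the learned network's moralized graph stays within the bound.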
Evaluating Anytime Algorithms for Learning Optimal Bayesian Networks
 In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13)
, 2013
Cited by 9 (4 self)

Abstract
Exact algorithms for learning Bayesian networks guarantee to find provably optimal networks. However, they may fail in difficult learning tasks due to limited time or memory. In this research we adapt several anytime heuristic search-based algorithms to learn Bayesian networks. These algorithms find high-quality solutions quickly, and continually improve the incumbent solution or prove its optimality before resources are exhausted. Empirical results show that the anytime window A* algorithm usually finds higher-quality, often optimal, networks more quickly than other approaches. The results also show that, surprisingly, generating networks with few parents per variable, although structurally simpler, are harder to learn than complex generating networks with more parents per variable.
Answer Set Programming as SAT modulo Acyclicity
Cited by 8 (8 self)

Abstract
Answer set programming (ASP) is a declarative programming paradigm for solving search problems arising in knowledge-intensive domains. One viable way to implement the computation of answer sets corresponding to problem solutions is to recast a logic program as a Boolean satisfiability (SAT) problem and to use existing SAT solver technology for the actual search. Such mappings can be obtained by augmenting Clark’s completion with constraints guaranteeing the strong justifiability of answer sets. To this end, we consider an extension of SAT by graphs subject to an acyclicity constraint, called SAT modulo acyclicity. We devise a linear embedding of logic programs and study the performance of answer set computation with SAT modulo acyclicity solvers.
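The core job of the acyclicity theory in such solvers can be sketched simply (a toy check, not any solver's propagator): the edge literals currently assigned true induce a digraph, and the theory component rejects any assignment whose true edges close a directed cycle.

```python
from collections import defaultdict

def has_cycle(n, true_edges):
    """DFS cycle check on the digraph induced by the true edge literals.
    `n`: number of vertices 0..n-1; `true_edges`: list of (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in true_edges:
        adj[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    def dfs(u):
        color[u] = GRAY                     # u is on the current DFS path
        for w in adj[u]:
            if color[w] == GRAY:            # back edge: directed cycle
                return True
            if color[w] == WHITE and dfs(w):
                return True
        color[u] = BLACK
        return False
    return any(color[u] == WHITE and dfs(u) for u in range(n))

print(has_cycle(3, [(0, 1), (1, 2)]))          # False: acyclic
print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # True: 0 -> 1 -> 2 -> 0
```

A real SAT-modulo-acyclicity solver does this incrementally as literals are assigned, and learns a clause from the offending cycle instead of merely reporting it.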
Characteristic imsets for learning Bayesian network structure
 Int. J. of Approx. Reasoning
Cited by 5 (2 self)

Abstract
The motivation for the paper is the geometric approach to learning Bayesian network (BN) structure. The basic idea of our approach is to represent every BN structure by a certain uniquely determined vector, so that usual scores for learning BN structure become affine functions of the vector representative. Characteristic imsets are shown to be zero-one vectors and have many elegant properties, suitable for the intended application of linear/integer programming methods to learning BN structure. They are much closer to the graphical description; we describe a simple transition between the characteristic imset and the essential graph, known as a traditional unique graphical representative of the BN structure. In the end, we relate our proposal to other recent approaches which apply linear programming methods in probabilistic reasoning.