Results 1  10
of
115
A Simple Approach for Finding the Globally Optimal Bayesian Network Structure
"... We study the problem of learning the best Bayesian network structure with respect to a decomposable score such as BDe, BIC or AIC. This problem is known to be NPhard, which means that solving it becomes quickly infeasible as the number of variables increases. Nevertheless, in this paper we show tha ..."
Abstract

Cited by 96 (6 self)
 Add to MetaCart
We study the problem of learning the best Bayesian network structure with respect to a decomposable score such as BDe, BIC or AIC. This problem is known to be NPhard, which means that solving it becomes quickly infeasible as the number of variables increases. Nevertheless, in this paper we show that it is possible to learn the best Bayesian network structure with over 30 variables, which covers many practically interesting cases. Our algorithm is less complicated and more efficient than the techniques presented earlier. It can be easily parallelized, and offers a possibility for efficient exploration of the best networks consistent with different variable orderings. In the experimental part of the paper we compare the performance of the algorithm to the previous stateoftheart algorithm. Free sourcecode and an onlinedemo can be found at http://bcourse.hiit.fi/bene.
Learning Bayesian Network Structure using LP Relaxations
"... We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty aris ..."
Abstract

Cited by 58 (2 self)
 Add to MetaCart
We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty arises from the global constraint that the graph structure has to be acyclic. We cast the structure learning problem as a linear program over the polytope defined by valid acyclic structures. In relaxing this problem, we maintain an outer bound approximation to the polytope and iteratively tighten it by searching over a new class of valid constraints. If an integral solution is found, it is guaranteed to be the optimal Bayesian network. When the relaxation is not tight, the fast dual algorithms we develop remain useful in combination with a branch and bound method. Empirical results suggest that the method is competitive or faster than alternative exact methods based on dynamic programming. 1
Structure learning in random fields for heart motion abnormality detection
 In CVPR
, 2008
"... Coronary Heart Disease can be diagnosed by assessing the regional motion of the heart walls in ultrasound images of the left ventricle. Even for experts, ultrasound images are difficult to interpret leading to high intraobserver variability. Previous work indicates that in order to approach this pr ..."
Abstract

Cited by 56 (8 self)
 Add to MetaCart
(Show Context)
Coronary Heart Disease can be diagnosed by assessing the regional motion of the heart walls in ultrasound images of the left ventricle. Even for experts, ultrasound images are difficult to interpret leading to high intraobserver variability. Previous work indicates that in order to approach this problem, the interactions between the different heart regions and their overall influence on the clinical condition of the heart need to be considered. To do this, we propose a method for jointly learning the structure and parameters of conditional random fields, formulating these tasks as a convex optimization problem. We consider blockL1 regularization for each set of features associated with an edge, and formalize an efficient projection method to find the globally optimal penalized maximum likelihood solution. We perform extensive numerical experiments comparing the presented method with related methods that approach the structure learning problem differently. We verify the robustness of our method on echocardiograms collected in routine clinical practice at one hospital. 1.
Structure Learning of Bayesian Networks using Constraints
"... This paper addresses exact learning of Bayesian network structure from data and expert’s knowledge based on score functions that are decomposable. First, it describes useful properties that strongly reduce the time and memory costs of many known methods such as hillclimbing, dynamic programming and ..."
Abstract

Cited by 51 (6 self)
 Add to MetaCart
(Show Context)
This paper addresses exact learning of Bayesian network structure from data and expert’s knowledge based on score functions that are decomposable. First, it describes useful properties that strongly reduce the time and memory costs of many known methods such as hillclimbing, dynamic programming and sampling variable orderings. Secondly, a branch and bound algorithm is presented that integrates parameter and structural constraints with data in a way to guarantee global optimality with respect to the score function. It is an anytime procedure because, if stopped, it provides the best current solution and an estimation about how far it is from the global solution. We show empirically the advantages of the properties and the constraints, and the applicability of the algorithm to large data sets (up to one hundred variables) that cannot be handled by other current methods (limited to around 30 variables). 1.
Bayesian network learning with cutting planes.
 In Proceedings of the TwentySeventh Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI11),
, 2011
"... Abstract The problem of learning the structure of Bayesian networks from complete discrete data with a limit on parent set size is considered. Learning is cast explicitly as an optimisation problem where the goal is to find a BN structure which maximises log marginal likelihood (BDe score). Integer ..."
Abstract

Cited by 46 (7 self)
 Add to MetaCart
(Show Context)
Abstract The problem of learning the structure of Bayesian networks from complete discrete data with a limit on parent set size is considered. Learning is cast explicitly as an optimisation problem where the goal is to find a BN structure which maximises log marginal likelihood (BDe score). Integer programming, specifically the SCIP framework, is used to solve this optimisation problem. Acyclicity constraints are added to the integer program (IP) during solving in the form of cutting planes. Finding good cutting planes is the key to the success of the approachthe search for such cutting planes is effected using a subIP. Results show that this is a particularly fast method for exact BN learning.
Exact bayesian structure learning from uncertain interventions
 AI & Statistics, In
, 2007
"... We show how to apply the dynamic programming algorithm of Koivisto and Sood [KS04, Koi06], which computes the exact posterior marginal edge probabilities p(Gij = 1D) of a DAG G given data D, to the case where the data is obtained by interventions (experiments). In particular, we consider the case w ..."
Abstract

Cited by 46 (5 self)
 Add to MetaCart
We show how to apply the dynamic programming algorithm of Koivisto and Sood [KS04, Koi06], which computes the exact posterior marginal edge probabilities p(Gij = 1D) of a DAG G given data D, to the case where the data is obtained by interventions (experiments). In particular, we consider the case where the targets of the interventions are a priori unknown. We show that it is possible to learn the targets of intervention at the same time as learning the causal structure. We apply our exact technique to a biological data set that had previously been analyzed using MCMC [SPP + 05, EW06, WGH06]. 1
Finding optimal Bayesian networks by dynamic programming
, 2005
"... Finding the Bayesian network that maximizes a score function is known as structure learning or structure discovery. Most approaches use local search in the space of acyclic digraphs, which is prone to local maxima. Exhaustive enumeration requires superexponential time. In this paper we describe a “ ..."
Abstract

Cited by 35 (1 self)
 Add to MetaCart
(Show Context)
Finding the Bayesian network that maximizes a score function is known as structure learning or structure discovery. Most approaches use local search in the space of acyclic digraphs, which is prone to local maxima. Exhaustive enumeration requires superexponential time. In this paper we describe a “merely ” exponential space/time algorithm for finding a Bayesian network that corresponds to a global maxima of a decomposable scoring function, such as BDeu or BIC. NSF IIS0325581, NSERC PGSB
Bayesian structure learning using dynamic programming and MCMC
 In UAI, 2007b
"... We show how to significantly speed up MCMC sampling of DAG structures by using a powerful nonlocal proposal based on Koivisto’s dynamic programming (DP) algorithm (11; 10), which computes the exact marginal posterior edge probabilities by analytically summing over orders. Furthermore, we show how s ..."
Abstract

Cited by 30 (1 self)
 Add to MetaCart
We show how to significantly speed up MCMC sampling of DAG structures by using a powerful nonlocal proposal based on Koivisto’s dynamic programming (DP) algorithm (11; 10), which computes the exact marginal posterior edge probabilities by analytically summing over orders. Furthermore, we show how sampling in DAG space can avoid subtle biases that are introduced by approaches that work only with orders, such as Koivisto’s DP algorithm and MCMC order samplers (6; 5). 1
Efficient structure learning of Bayesian networks using constraints
 Journal of Machine Learning Research
"... This paper addresses the problem of learning Bayesian network structures from data based on score functions that are decomposable. It describes properties that strongly reduce the time and memory costs of many known methods without losing global optimality guarantees. These properties are derived fo ..."
Abstract

Cited by 30 (7 self)
 Add to MetaCart
(Show Context)
This paper addresses the problem of learning Bayesian network structures from data based on score functions that are decomposable. It describes properties that strongly reduce the time and memory costs of many known methods without losing global optimality guarantees. These properties are derived for different score criteria such as Minimum Description Length (or Bayesian Information Criterion), Akaike Information Criterion and Bayesian Dirichlet Criterion. Then a branchandbound algorithm is presented that integrates structural constraints with data in a way to guarantee global optimality. As an example, structural constraints are used to map the problem of structure learning in Dynamic Bayesian networks into a corresponding augmented Bayesian network. Finally, we show empirically the benefits of using the properties with stateoftheart methods and with the new algorithm, which is able to handle larger data sets than before.
Exact structure discovery in Bayesian networks with less space
 In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI
, 2009
"... The fastest known exact algorithms for scorebased structure discovery in Bayesian networks on n nodes run in time and space 2 n n O(1). The usage of these algorithms is limited to networks on at most around 25 nodes mainly due to the space requirement. Here, we study space–time tradeoffs for finding ..."
Abstract

Cited by 29 (7 self)
 Add to MetaCart
(Show Context)
The fastest known exact algorithms for scorebased structure discovery in Bayesian networks on n nodes run in time and space 2 n n O(1). The usage of these algorithms is limited to networks on at most around 25 nodes mainly due to the space requirement. Here, we study space–time tradeoffs for finding an optimal network structure. When little space is available, we apply the Gurevich– Shelah recurrence—originally proposed for the Hamiltonian path problem—and obtain time 2 2n−s n O(1) in space 2 s n O(1) for any s = n/2,n/4,n/8,...; we assume the indegree of each node is bounded by a constant. For the more practical setting with moderate amounts of space, we present a novel scheme. It yields running time 2 n (3/2) p n O(1) in space 2 n (3/4) p n O(1) for any p = 0,1,...,n/2; these bounds hold as long as the indegrees are at most 0.238n. Furthermore, the latter scheme allows easy and efficient parallelization beyond previous algorithms. We also explore empirically the potential of the presented techniques. 1