Results 1–10 of 30
An improved admissible heuristic for learning optimal Bayesian networks
In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-12), 2012
Abstract

Cited by 9 (3 self)
Recently, two search algorithms, A* and breadth-first branch and bound (BFBnB), were developed based on a simple admissible heuristic for learning Bayesian network structures that optimize a scoring function. The heuristic represents a relaxation of the learning problem in which each variable chooses its optimal parents independently. As a result, the heuristic solution may contain many directed cycles and yield a loose bound. This paper introduces an improved admissible heuristic that tries to avoid directed cycles within small groups of variables. A sparse representation is also introduced to store only the unique optimal parent choices. Empirical results show that the new techniques significantly improve the efficiency and scalability of A* and BFBnB on most of the datasets tested in this paper.
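The simple relaxed heuristic described above can be sketched in a few lines: each unplaced variable independently takes its cheapest recorded parent set, ignoring acyclicity, so the sum is an admissible lower bound on the remaining cost. This is an illustrative sketch, not the paper's implementation; the data layout and names are assumptions.

```python
# Sketch of the simple admissible heuristic (illustrative only): each
# variable still to be ordered picks its cheapest parent set independently,
# ignoring acyclicity, so the sum lower-bounds the true remaining cost.

def relaxed_heuristic(remaining, parent_scores):
    """remaining: set of variables still to be placed in the ordering.
    parent_scores: {var: {frozenset(parents): cost}} with lower = better.
    """
    total = 0.0
    for var in remaining:
        # Independently take the best (minimum-cost) parent set for var.
        total += min(parent_scores[var].values())
    return total

# Hypothetical local scores for two variables.
scores = {
    "A": {frozenset(): 5.0, frozenset({"B"}): 3.0},
    "B": {frozenset(): 4.0, frozenset({"A"}): 2.0},
}
h = relaxed_heuristic({"A", "B"}, scores)  # 3.0 + 2.0 = 5.0
```

Note that the relaxation happily lets A take B as a parent while B takes A, forming a directed cycle; that is exactly why the bound can be loose, which is what the paper's improved heuristic addresses.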
Learning Bounded Treewidth Bayesian Networks using Integer Linear Programming
2014
Abstract

Cited by 9 (1 self)
In many applications one wants to compute conditional probabilities given a Bayesian network. This inference problem is NP-hard in general but becomes tractable when the network has low treewidth. Since the inference problem is common in many application areas, we provide a practical algorithm for learning bounded-treewidth Bayesian networks. We cast this problem as an integer linear program (ILP). The program can be solved by an anytime algorithm which provides upper bounds to assess the quality of the solutions found. A key component of our program is a novel integer linear formulation for bounding the treewidth of a graph. Our tests clearly indicate that our approach works in practice, as our implementation was able to find an optimal or nearly optimal network for most of the data sets.
Characteristic imsets for learning Bayesian network structure
 Int. J. of Approx. Reasoning
Abstract

Cited by 5 (2 self)
The motivation for the paper is the geometric approach to learning Bayesian network (BN) structure. The basic idea of our approach is to represent every BN structure by a certain uniquely determined vector, so that the usual scores for learning BN structure become affine functions of the vector representative. Characteristic imsets are shown to be zero-one vectors with many elegant properties, suitable for the intended application of linear/integer programming methods to learning BN structure. They are much closer to the graphical description; we describe a simple transition between the characteristic imset and the essential graph, known as the traditional unique graphical representative of the BN structure. In the end, we relate our proposal to other recent approaches which apply linear programming methods in probabilistic reasoning.
An Ensemble of Bayesian Networks for Multi-label Classification
Abstract

Cited by 4 (1 self)
We present a novel approach for multi-label classification based on an ensemble of Bayesian networks. The class variables are connected by a tree; each model of the ensemble uses a different class as the root of the tree. We assume the features to be conditionally independent given the classes, thus generalizing the naive Bayes assumption to the multi-class case. This assumption allows us to optimally identify the correlations between classes and features; such correlations are moreover shared across all models of the ensemble. Inferences are drawn from the ensemble via logarithmic opinion pooling. To minimize Hamming loss, we compute the marginal probability of the classes by running standard inference on each Bayesian network in the ensemble and then pooling the inferences. To instead minimize the subset 0/1 loss, we pool the joint distributions of each model and cast the problem as MAP inference in the corresponding graphical model. Experiments show that the approach is competitive with state-of-the-art methods for multi-label classification.
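Logarithmic opinion pooling, which this abstract uses to combine the ensemble members, is a normalized weighted geometric mean of the individual distributions. A minimal sketch, assuming each distribution is a dict over the same outcomes (the paper's models pool full joint distributions over class configurations; the helper name and toy numbers here are hypothetical):

```python
import math

def log_opinion_pool(distributions, weights=None):
    """Logarithmic opinion pooling: a normalized, weighted geometric
    mean of the input distributions (dicts mapping outcome -> prob)."""
    if weights is None:
        weights = [1.0 / len(distributions)] * len(distributions)
    outcomes = distributions[0].keys()
    # Weighted geometric mean computed in log space, then renormalized.
    unnorm = {x: math.exp(sum(w * math.log(d[x])
                              for d, w in zip(distributions, weights)))
              for x in outcomes}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}

# Two hypothetical ensemble members disagreeing about a binary class.
p1 = {"c0": 0.9, "c1": 0.1}
p2 = {"c0": 0.5, "c1": 0.5}
pooled = log_opinion_pool([p1, p2])  # pooled["c0"] == 0.75
```

Unlike linear (arithmetic) pooling, the log pool multiplies opinions, so any member assigning near-zero probability to an outcome effectively vetoes it.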
Trading off speed and accuracy in multi-label classification
In: Proceedings of the 7th European Workshop on Probabilistic Graphical Models, 2014
Abstract

Cited by 4 (2 self)
In previous work, we devised an approach for multi-label classification based on an ensemble of Bayesian networks. It was characterized by efficient structural learning and by high accuracy. Its shortcoming was the high computational complexity of the MAP inference necessary to identify the most probable joint configuration of all classes. In this work, we switch from the ensemble approach to the single-model approach. This allows important computational savings: the reduction of inference times is exponential in the difference between the treewidth of the single model and the number of classes. We moreover adopt a more sophisticated approach for the structural learning of the class subgraph. The proposed single model outperforms alternative approaches for multi-label classification such as binary relevance and ensembles of classifier chains.
Modeling Temporal Interactions with Interval Temporal Bayesian Networks for Complex Activity Recognition
Abstract

Cited by 3 (0 self)
Complex activities typically consist of multiple primitive events happening in parallel or sequentially over a period of time. Understanding such activities requires recognizing not only each individual event but, more importantly, capturing their spatiotemporal dependencies over different time intervals. Most current graphical-model-based approaches have several limitations. First, time-sliced graphical models such as hidden Markov models (HMMs) and dynamic Bayesian networks are typically based on points of time and hence can only capture three temporal relations: precedes, follows, and equals. Second, HMMs are probabilistic finite-state machines that grow exponentially as the number of parallel events increases. Third, other approaches such as syntactic and description-based methods, while rich in modeling temporal relationships, do not have the expressive power to capture uncertainties. To address these issues, we introduce the interval temporal Bayesian network (ITBN), a novel graphical model that combines the Bayesian network with interval algebra to explicitly model the temporal dependencies over time intervals. Advanced machine learning methods are introduced to learn the ITBN model structure and parameters. Experimental results show that by reasoning with spatiotemporal dependencies, the proposed model leads to significantly improved performance when modeling and recognizing complex activities involving both parallel and sequential events. Index Terms: Activity recognition, temporal reasoning, Bayesian networks, interval temporal Bayesian networks.
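The interval algebra the ITBN builds on is Allen's interval algebra, which distinguishes 13 basic relations between two time intervals rather than the three point-based ones. A sketch of classifying the relation between two closed intervals (illustrative only; the paper learns a Bayesian network over these relations rather than computing them this way, and inverse relations are marked here with an "_i" suffix):

```python
def allen_relation(a, b):
    """Classify the Allen interval-algebra relation between intervals
    a = (start, end) and b = (start, end). Returns one of the 13 basic
    relations; inverses carry an '_i' suffix."""
    (a1, a2), (b1, b2) = a, b
    if (a1, a2) == (b1, b2):
        return "equals"
    if a2 < b1:  return "before"      # a ends before b starts
    if b2 < a1:  return "before_i"
    if a2 == b1: return "meets"       # a ends exactly where b starts
    if b2 == a1: return "meets_i"
    if a1 == b1: return "starts" if a2 < b2 else "starts_i"
    if a2 == b2: return "finishes" if a1 > b1 else "finishes_i"
    if b1 < a1 and a2 < b2: return "during"    # a strictly inside b
    if a1 < b1 and b2 < a2: return "during_i"
    return "overlaps" if a1 < b1 else "overlaps_i"
```

Because the relations are mutually exclusive and jointly exhaustive, exactly one of them holds for any pair of intervals, which is what makes them usable as states of a discrete node in a graphical model.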
Learning chordal Markov networks by dynamic programming
In Proceedings of the 28th Annual Conference on Neural Information Processing Systems, 2014
Abstract

Cited by 2 (0 self)
We present an algorithm for finding a chordal Markov network that maximizes any given decomposable scoring function. The algorithm is based on a recursive characterization of clique trees, and it runs in O(4^n) time for n vertices. On an eight-vertex benchmark instance, our implementation turns out to be about ten million times faster than a recently proposed constraint-satisfaction-based algorithm (Corander et al., NIPS 2013). Within a few hours, it is able to solve instances up to 18 vertices, and beyond if we restrict the maximum clique size. We also study the performance of a recent integer linear programming algorithm (Bartlett and Cussens, UAI 2013). Our results suggest that, unless we bound the clique sizes, currently only the dynamic programming algorithm is guaranteed to solve instances with around 15 or more vertices.
Extended Tree Augmented Naive Classifier
Abstract

Cited by 2 (1 self)
This work proposes an extended version of the well-known tree-augmented naive Bayes (TAN) classifier, where the structure learning step is performed without requiring features to be connected to the class. Based on a modification of Edmonds' algorithm, our structure learning procedure explores a superset of the structures considered by TAN, yet achieves global optimality of the learning score function in a very efficient way (quadratic in the number of features, the same complexity as learning TANs). A range of experiments shows that we obtain models with better accuracy than TAN and comparable to the accuracy of the state-of-the-art averaged one-dependence estimator classifier.
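For context, classical TAN structure learning is the Chow-Liu construction: a maximum-weight spanning tree over pairwise conditional mutual information between features (the paper's extension replaces this undirected step with a variant of Edmonds' algorithm for directed graphs). A minimal Prim-style sketch of the undirected baseline, with made-up weights standing in for I(Xi; Xj | class):

```python
def max_spanning_tree(weights):
    """Prim's algorithm for a MAXIMUM-weight spanning tree.
    weights: symmetric dict {(i, j): w} with i < j over integer nodes.
    Returns the chosen edges as (min, max) tuples."""
    nodes = set()
    for i, j in weights:
        nodes.update((i, j))

    def w(i, j):
        return weights.get((min(i, j), max(i, j)), float("-inf"))

    in_tree = {next(iter(nodes))}
    edges = []
    while len(in_tree) < len(nodes):
        # Greedily add the heaviest edge crossing the current cut.
        i, j = max(((u, v) for u in in_tree for v in nodes - in_tree),
                   key=lambda e: w(*e))
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return edges

# Hypothetical conditional mutual information values for 3 features.
cmi = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.5}
tree = max_spanning_tree(cmi)  # edges (0, 1) and (1, 2)
```

In TAN the resulting tree is then rooted at an arbitrary feature to orient the edges, and the class node is added as a parent of every feature; the paper's contribution is precisely to drop that last requirement.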
Efficient and Accurate Learning of Bayesian Networks using Chi-Squared Independence Tests
Abstract

Cited by 1 (1 self)
Bayesian network structure learning is a well-known NP-complete problem whose solution is of importance in machine learning. Two algorithms are proposed, both of which assess dependency between variables using the chi-squared test of independence between pairs of variables and the log-likelihood evaluation criterion for the network. The first determines the effect of adding a potential edge (in both directions) on the log-likelihood. The second uses KL divergence to determine edge direction, and the edges to be included are determined by thresholding normalized chi-squared statistics. Experiments on multinomial data show that the proposed algorithms are more efficient and accurate than an optimized branch and bound algorithm, and than human experts.
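The pairwise test at the heart of both algorithms is Pearson's chi-squared test of independence on a contingency table of observed counts. A self-contained sketch of the statistic (illustrative helper, not the paper's implementation):

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a 2-D contingency table
    (list of rows) of observed counts between two discrete variables.
    To test independence, compare against the chi-squared distribution
    with (rows - 1) * (cols - 1) degrees of freedom."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under the independence hypothesis.
            exp = row_tot[i] * col_tot[j] / n
            chi2 += (obs - exp) ** 2 / exp
    return chi2

stat = chi_squared_statistic([[10, 20], [20, 10]])  # 20/3, about 6.67
```

A large statistic relative to the reference distribution rejects independence, which is the signal the paper thresholds (after normalization) to decide which edges to include.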
Simultaneous Facial Feature Tracking and Facial Expression Recognition
Abstract

Cited by 1 (0 self)
The tracking and recognition of facial activities from images or videos has attracted great attention in the computer vision field. Facial activities are characterized at three levels. First, at the bottom level, facial feature points around each facial component (e.g., eyebrow, mouth) capture the detailed face shape information. Second, at the middle level, facial action units (AUs), defined in the Facial Action Coding System, represent the contraction of a specific set of facial muscles (e.g., lid tightener, eyebrow raiser). Finally, at the top level, six prototypical facial expressions represent the global facial muscle movement and are commonly used to describe the human emotional state. In contrast to mainstream approaches, which usually focus on only one or two levels of facial activities and track (or recognize) them separately, this paper introduces a unified probabilistic framework based on the dynamic Bayesian network (DBN) to simultaneously and coherently represent the facial evolvement at different levels, their interactions, and their observations. Advanced machine learning methods are introduced to learn the model based on both training data and subjective prior knowledge. Given the model and the measurements of facial motions, all three levels of facial activities are simultaneously recognized through probabilistic inference. Extensive experiments are performed to illustrate the feasibility and effectiveness of the proposed model on all three levels of facial activities. Index Terms: Simultaneous tracking and recognition, facial feature tracking, facial action unit recognition, expression recognition, Bayesian network.