Results 1  10
of
78
Bayesian Network Classifiers
, 1997
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with stateoftheart classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Abstract

Cited by 796 (20 self)
 Add to MetaCart
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with stateoftheart classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
Selectivity Estimation using Probabilistic Models
, 2001
"... Estimating the result size of complex queries that involve selection on multiple attributes and the join of several relations is a difficult but fundamental task in database query processing. It arises in costbased query optimization, query profiling, and approximate query answering. In this paper, ..."
Abstract

Cited by 98 (3 self)
 Add to MetaCart
Estimating the result size of complex queries that involve selection on multiple attributes and the join of several relations is a difficult but fundamental task in database query processing. It arises in costbased query optimization, query profiling, and approximate query answering. In this paper, we show how probabilistic graphical models can be effectively used for this task as an accurate and compact approximation of the joint frequency distribution of multiple attributes across multiple relations. Probabilistic Relational Models (PRMs) are a recent development that extends graphical statistical models such as Bayesian Networks to relational domains. They represent the statistical dependencies between attributes within a table, and between attributes across foreignkey joins. We provide an efficient algorithm for constructing a PRM from a database, and show how a PRM can be used to compute selectivity estimates for a broad class of queries. One of the major contributions of this work is a unified framework for the estimation of queries involving both select and foreignkey join operations. Furthermore, our approach is not limited to answering a small set of predetermined queries; a single model can be used to effectively estimate the sizes of a wide collection of potential queries across multiple tables. We present results for our approach on several realworld databases. For both singletable multiattribute queries and a general class of selectjoin queries, our approach produces more accurate estimates than standard approaches to selectivity estimation, using comparable space and time.
Building Classifiers using Bayesian Networks
 In Proceedings of the thirteenth national conference on artificial intelligence
, 1996
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state of the art classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Abstract

Cited by 92 (2 self)
 Add to MetaCart
(Show Context)
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state of the art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we examine and evaluate approaches for inducing classifiers from data, based on recent results in the theory of learning Bayesian networks. Bayesian networks are factored representations of probability distributions that generalize the naive Bayes classifier and explicitly represent statements about independence. Among these approacheswe single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness which are characteristic of naive Bayes. We experimentally tested these approaches using benchmark problems from...
A computational algebra approach to the reverse engineering of gene regulatory networks
 Journal of Theoretical Biology
, 2004
"... This paper proposes a new method to reverse engineer gene regulatory networks from experimental data. The modeling framework used is timediscrete deterministic dynamical systems, with a finite set of states for each of the variables. The simplest examples of such models are Boolean networks, in whi ..."
Abstract

Cited by 64 (10 self)
 Add to MetaCart
This paper proposes a new method to reverse engineer gene regulatory networks from experimental data. The modeling framework used is timediscrete deterministic dynamical systems, with a finite set of states for each of the variables. The simplest examples of such models are Boolean networks, in which variables have only two possible states. The use of a larger number of possible states allows a finer discretization of experimental data and more than one possible mode of action for the variables, depending on threshold values. Furthermore, with a suitable choice of state set, one can employ powerful tools from computational algebra, that underlie the reverseengineering algorithm, avoiding costly enumeration strategies. To perform well, the algorithm requires wildtype together with perturbation time courses. This makes it suitable for small to mesoscale networks rather than networks on a genomewide scale. An analysis of the complexity of the algorithm is performed. The algorithm is validated on a recently published Boolean network model of segment polarity development in Drosophila melanogaster.
A variational approximation for Bayesian networks with discrete and continuous latent variables
 In UAI
, 1999
"... We show how to use a variational approximation to the logistic function to perform approximate inference in Bayesian networks containing discrete nodes with continuous parents. Essentially, we convert the logistic function to a Gaussian, which facilitates exact inference, and then iteratively adjust ..."
Abstract

Cited by 51 (5 self)
 Add to MetaCart
We show how to use a variational approximation to the logistic function to perform approximate inference in Bayesian networks containing discrete nodes with continuous parents. Essentially, we convert the logistic function to a Gaussian, which facilitates exact inference, and then iteratively adjust the variational parameters to improve the quality of the approximation. We demonstrate experimentally that this approximation is much faster than sampling, but comparable in accuracy. We also introduce a simple new technique for handling evidence, which allows us to handle arbitrary distributionson observed nodes, as well as achieving a significant speedup in networks with discrete variables of large cardinality. 1
Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network
 Proc. 1st IEEE Computer Society Bioinformatics Conference
, 2002
"... We propose a new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network. An essential point of Bayesian network construction is in the estimation of the conditional distribution of each random variable. We consider fitting nonparametric ..."
Abstract

Cited by 51 (19 self)
 Add to MetaCart
(Show Context)
We propose a new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network. An essential point of Bayesian network construction is in the estimation of the conditional distribution of each random variable. We consider fitting nonparametric regression models with heterogeneous error variances to the microarray gene expression data to capture the nonlinear structures between genes. A problem still remains to be solved in selecting an optimal graph, which gives the best representation of the system among genes. We theoretically derive a new graph selection criterion from Bayes approach in general situations. The proposed method includes previous methods based on Bayesian networks. We demonstrate the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae gene expression data newly obtained by disrupting 100 genes. 1.
Bayesian Network Classification with Continuous Attributes: Getting the Best of Both Discretization and Parametric Fitting
 In Proceedings of the International Conference on Machine Learning (ICML
, 1998
"... In a recent paper, Friedman, Geiger, and Goldszmidt [8] introduced a classifier based on Bayesian networks, called Tree Augmented Naive Bayes (TAN), that outperforms naive Bayes and performs competitively with C4.5 and other stateoftheart methods. This classifier has several advantages including ..."
Abstract

Cited by 32 (2 self)
 Add to MetaCart
(Show Context)
In a recent paper, Friedman, Geiger, and Goldszmidt [8] introduced a classifier based on Bayesian networks, called Tree Augmented Naive Bayes (TAN), that outperforms naive Bayes and performs competitively with C4.5 and other stateoftheart methods. This classifier has several advantages including robustness and polynomial computational complexity. One limitation of the TAN classifier is that it applies only to discrete attributes, and thus, continuous attributes must be prediscretized. In this paper, we extend TAN to deal with continuous attributes directly via parametric (e.g., Gaussians) and semiparametric (e.g., mixture of Gaussians) conditional probabilities. The result is a classifier that can represent and combine both discrete and continuous attributes. In addition, we propose a new method that takes advantage of the modeling language of Bayesian networks in order to represent attributes both in discrete and continuous form simultaneously, and use both versions in the classifi...
Learning the dimensionality of hidden variables
 In UAI ’01
, 2001
"... A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. Detecting hidden variables poses two problems: determining the relations to other variables in the model and determining the ..."
Abstract

Cited by 28 (3 self)
 Add to MetaCart
(Show Context)
A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. Detecting hidden variables poses two problems: determining the relations to other variables in the model and determining the number of states of the hidden variable. In this paper, we address the latter problem in the context of Bayesian networks. We describe an approach that utilizes a scorebased agglomerative stateclustering. As we show, this approach allows us to efficiently evaluate models with a range of cardinalities for the hidden variable. We show how to extend this procedure to deal with multiple interacting hidden variables. We demonstrate the effectiveness of this approach by evaluating it on synthetic and reallife data. We show that our approach learns models with hidden variables that generalize better and have better structure than previous approaches. 1
Parallel estimation of distribution algorithms
, 2002
"... The thesis deals with the new evolutionary paradigm based on the concept of Estimation of Distribution Algorithms (EDAs) that use probabilistic model of promising solutions found so far to obtain new candidate solutions of optimized problem. There are six primary goals of this thesis: 1. Suggestion ..."
Abstract

Cited by 26 (4 self)
 Add to MetaCart
The thesis deals with the new evolutionary paradigm based on the concept of Estimation of Distribution Algorithms (EDAs) that use probabilistic model of promising solutions found so far to obtain new candidate solutions of optimized problem. There are six primary goals of this thesis: 1. Suggestion of a new formal description of EDA algorithm. This high level concept can be used to compare the generality of various probabilistic models by comparing the properties of underlying mappings. Also, some convergence issues are discussed and theoretical ways for further improvements are proposed. 2. Development of new probabilistic model and methods capable of dealing with continuous parameters. The resulting Mixed Bayesian Optimization Algorithm (MBOA) uses a set of decision trees to express the probability model. Its main advantage against the mostly used IDEA and EGNA approach is its backward compatibility with discrete domains, so it is uniquely capable of learning linkage between mixed continuousdiscrete genes. MBOA handles the discretization of continuous parameters as an integral part of the learning process, which outperforms the histogrambased
Learning to predict the effects of actions: Synergy between rules and landmarks
 In Proceedings of the 6th (IEEE) International Conference on Development and Learning
, 2007
"... Abstract — A developing agent must learn the structure of its world, beginning with its sensorimotor world. It learns rules to predict how its motor signals change the sensory input it receives. It learns the limits to its motion. It learns which effects of its actions are unconditional and which ef ..."
Abstract

Cited by 25 (20 self)
 Add to MetaCart
(Show Context)
Abstract — A developing agent must learn the structure of its world, beginning with its sensorimotor world. It learns rules to predict how its motor signals change the sensory input it receives. It learns the limits to its motion. It learns which effects of its actions are unconditional and which effects are conditional, including what they depend on. We present preliminary results evaluating an implemented computational model of this important kind of foundational developmental learning. Our model demonstrates synergy between the learning of landmarks representing important qualitative distinctions, and the learning of rules that exploit those distinctions to make reliable predictions. These qualitative distinctions make it possible to define discrete events, and then to identify predictive rules describing regularities among events and the values of context variables. The attention of the learning agent is focused by a stratified model that structures the set of variables, and the structure of the stratified model is simultaneously created by the learning process. Index Terms — developmental learning, sensorimotor learning, qualitative abstraction, predictive rules, landmark values I.