Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables (1997)

by D. M. Chickering, D. Heckerman
Citing documents, results 1 - 10 of 194:

Empirical Analysis of Predictive Algorithms for Collaborative Filtering

by John S. Breese, David Heckerman, Carl Kadie - Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, 1998
"... 1 ..."
Cited by 1497 (4 self). Abstract not available.

A tutorial on learning with Bayesian networks

by David Heckerman - Learning in Graphical Models, 1995
"... ..."
Cited by 1069 (3 self). Abstract not available.

Bayesian Network Classifiers

by Nir Friedman, Dan Geiger, Moises Goldszmidt, 1997
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Cited by 796 (20 self)
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
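To make the classifier under discussion concrete, here is a minimal naive Bayes sketch (an illustration only, not the authors' code; the toy weather data, function names, and add-one smoothing are all invented for the example). TAN relaxes the same decision rule by additionally letting each feature depend on one other feature, chosen via a tree over the features.

```python
# Naive Bayes: choose the class c maximizing P(c) * prod_i P(x_i | c),
# i.e. features are assumed independent given the class.
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)            # (feature index, class) -> value counts
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[(i, c)][v] += 1
    return class_counts, feat_counts

def predict(row, class_counts, feat_counts, alpha=1.0):
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, cc in class_counts.items():
        score = cc / total                        # class prior P(c)
        for i, v in enumerate(row):
            counts = feat_counts[(i, c)]          # smoothed estimate of P(x_i = v | c)
            score *= (counts[v] + alpha) / (sum(counts.values()) + alpha * (len(counts) + 1))
        if score > best_score:
            best, best_score = c, score
    return best

rows = [("sunny", "hot"), ("rain", "mild"), ("sunny", "mild"), ("rain", "hot")]
labels = ["no", "yes", "yes", "no"]
model = train_naive_bayes(rows, labels)
print(predict(("sunny", "mild"), *model))         # -> "yes" on this toy data
```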

Citation Context

...en, 1995) or gradient descent (Binder et al., 1997). The problem of selecting the best structure is usually intractable in the presence of missing values. Several recent efforts (Geiger et al., 1996; Chickering & Heckerman, 1996) have examined approximations to the marginal score that can be evaluated efficiently. Additionally, Friedman (1997b) has proposed a variant of EM for selecting the graph structure that ...

Dynamic Bayesian Networks: Representation, Inference and Learning

by Kevin Patrick Murphy, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have bee ..."
Cited by 770 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
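As a concrete reference point for the O(T) claims above, here is the standard forward recursion for a plain HMM (toy numbers invented; a DBN replaces the single discrete state below with several variables whose transition model is factored).

```python
# Forward recursion: p(o_1..o_T) computed in O(T * S^2) time for S states.
import numpy as np

trans = np.array([[0.9, 0.1],        # P(s_t = j | s_{t-1} = i)
                  [0.2, 0.8]])
emit = np.array([[0.7, 0.3],         # P(o_t = k | s_t = i)
                 [0.4, 0.6]])
init = np.array([0.5, 0.5])          # P(s_1 = i)

def likelihood(obs):
    alpha = init * emit[:, obs[0]]   # joint of first state and first observation
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

print(likelihood([0, 1, 1, 0]))
```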

How many clusters? Which clustering method? Answers via model-based cluster analysis

by Chris Fraley, Adrian E. Raftery - The Computer Journal, 1998
"... ..."
Cited by 450 (21 self). Abstract not available.

Trust management for the semantic web

by Matthew Richardson, Rakesh Agrawal, Pedro Domingos - In ISWC, 2003
"... Abstract. Though research on the Semantic Web has progressed at a steady pace, its promise has yet to be realized. One major difficulty is that, by its very nature, the Semantic Web is a large, uncensored system to which anyone may contribute. This raises the question of how much credence to give ea ..."
Cited by 271 (3 self)
Though research on the Semantic Web has progressed at a steady pace, its promise has yet to be realized. One major difficulty is that, by its very nature, the Semantic Web is a large, uncensored system to which anyone may contribute. This raises the question of how much credence to give each source. We cannot expect each user to know the trustworthiness of each source, nor would we want to assign top-down or global credibility values due to the subjective nature of trust. We tackle this problem by employing a web of trust, in which each user provides personal trust values for a small number of other users. We compose these trusts to compute the trust a user should place in any other user in the network. A user is not assigned a single trust rank. Instead, different users may have different trust values for the same user. We define properties for combination functions which merge such trusts, and define a class of functions for which merging may be done locally while maintaining these properties. We give examples of specific functions and apply them to data from Epinions and our BibServ bibliography server. Experiments confirm that the methods are robust to noise, and do not put unreasonable expectations on users. We hope that these methods will help move the Semantic Web closer to fulfilling its promise.
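A toy sketch of the path-composition idea the abstract describes (my illustration; the concatenation and aggregation functions used here, multiplication along a path and maximum over paths, are one admissible choice rather than necessarily the paper's).

```python
# Direct trust statements, each in [0, 1]; all names and values are invented.
direct = {
    "alice": {"bob": 0.9, "carol": 0.4},
    "bob":   {"dave": 0.8},
    "carol": {"dave": 0.3},
}

def personalized_trust(source, target, depth=3):
    """Multiply trust along a path, take the best path, up to a small depth bound."""
    best = direct.get(source, {}).get(target, 0.0)
    if depth > 0:
        for mid, t in direct.get(source, {}).items():
            best = max(best, t * personalized_trust(mid, target, depth - 1))
    return best

print(personalized_trust("alice", "dave"))   # 0.72 via bob: alice's view of dave
print(personalized_trust("carol", "dave"))   # 0.30: a different value for the same target
```

Note how the result depends on the querying user, matching the abstract's point that a user is not assigned a single global trust rank.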

Citation Context

.... Note that for weighted average to make sense, if the user has not specified a belief we need to impute the value. Techniques such as those used in collaborative filtering [30] and Bayesian networks [13] for dealing with missing values may be applicable. If only relative rankings of beliefs are necessary, then it may be sufficient to use 0 for all unspecified beliefs. 5. Similarity of Probabilistic a...

A Variational Bayesian Framework for Graphical Models

by Hagai Attias - In Advances in Neural Information Processing Systems 12, 2000
"... This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models. Our approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner. These posteriors ..."
Cited by 267 (7 self)
This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models. Our approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner. These posteriors fall out of a free-form optimization procedure, which naturally incorporates conjugate priors. Unlike in large sample approximations, the posteriors are generally non-Gaussian and no Hessian needs to be computed. Predictive quantities are obtained analytically. The resulting algorithm generalizes the standard Expectation Maximization algorithm, and its convergence is guaranteed. We demonstrate that this approach can be applied to a large class of models in several domains, including mixture models and source separation.
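For reference, the free-energy bound that variational Bayesian schemes of this kind optimize, in standard notation (mine, not copied from the paper): for observed data $D$, latent variables $Z$, parameters $\theta$, and model structure $m$,

$$\log p(D \mid m) \;\ge\; \mathcal{F}(q) = \mathbb{E}_{q(Z,\theta)}\big[\log p(D, Z, \theta \mid m)\big] - \mathbb{E}_{q(Z,\theta)}\big[\log q(Z, \theta)\big],$$

with the gap equal to $\mathrm{KL}\big(q(Z,\theta)\,\|\,p(Z,\theta \mid D, m)\big)$. Maximizing $\mathcal{F}$ over a factorized $q(Z,\theta) = q(Z)\,q(\theta)$ gives alternating EM-like updates, which is the sense in which such an approach generalizes standard EM.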

The Bayesian Structural EM Algorithm

by Nir Friedman, 1998
"... In recent years there has been a flurry of works on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data---that is, in the presence of missing values or hidden variables. In a recent paper, I in ..."
Cited by 260 (13 self)
In recent years there has been a flurry of works on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data---that is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
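The quantity at the heart of the Structural EM idea, written in standard notation (mine, not the paper's): rather than scoring a candidate structure $G$ directly against the incomplete data $D_o$, each iteration scores it by the expectation of a complete-data score under the current model $(G_n, \theta_n)$,

$$Q(G : G_n, \theta_n) \;=\; \mathbb{E}_{P(H \mid D_o, G_n, \theta_n)}\big[\mathrm{Score}(G : D_o, H)\big],$$

where $H$ denotes the missing values and hidden variables and Score is, for example, the BIC/MDL score or an approximation to the Bayesian score on the completed data. Structure search is run over this tractable expected score, and improving it can be shown to improve the true score, which is the basis of the convergence guarantee mentioned above.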

Citation Context

... cannot be solved in closed form. Current attempts to learn from incomplete data using the Bayesian score use either stochastic simulation or Laplace's approximation to approximate this integral (see [7] and the references within). The former methods tend to be computationally expensive, and the latter methods can be imprecise. In particular, the Laplace approximation assumes that the likelihood func...
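For reference, the integral in question and its Laplace approximation, in standard form (not taken verbatim from the cited papers): the marginal likelihood of a structure $G$ is

$$p(D \mid G) = \int p(D \mid \theta, G)\, p(\theta \mid G)\, d\theta \;\approx\; p(D \mid \tilde{\theta}, G)\, p(\tilde{\theta} \mid G)\, (2\pi)^{d/2}\, |A|^{-1/2},$$

where $\tilde{\theta}$ is the MAP configuration, $d$ its dimension, and $A$ the negative Hessian of $\log\big(p(D \mid \theta, G)\,p(\theta \mid G)\big)$ at $\tilde{\theta}$. Keeping only the leading terms gives the BIC score $\log p(D \mid \hat{\theta}, G) - \tfrac{d}{2}\log N$. The approximation rests on the posterior being sharply peaked and roughly Gaussian, which is exactly the assumption that tends to fail in the presence of hidden variables.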

Consensus clustering -- A resampling-based method for class discovery and visualization of gene expression microarray data

by Stefano Monti, Pablo Tamayo, Jill Mesirov, Todd Golub - Machine Learning 52 (2003) 91–118, Functional Genomics Special Issue, 2003
"... ..."
Cited by 255 (11 self). Abstract not available.

Citation Context

...mptions on which they are based. In particular, most of these methods are based on asymptotic approximations of the marginal likelihood, whose accuracy tends to decrease as the sample size decreases (Chickering & Heckerman, 1997; Kass & Raftery, 1995). This can clearly be a problem in the “large N, small p” paradigm (i.e., high dimension and small sample size) typical of gene expression data (West, 2002; West et al., 2001)....

Learning Belief Networks in the Presence of Missing Values and Hidden Variables

by Nir Friedman - Proceedings of the Fourteenth International Conference on Machine Learning, 1997
"... In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network fr ..."
Cited by 148 (13 self)
In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network from incomplete data---that is, in the presence of missing values or hidden variables. However, no method has yet been demonstrated to effectively learn network structure from incomplete data. In this paper, we propose a new method for learning network structure from incomplete data. This method is based on an extension of the Expectation-Maximization (EM) algorithm for model selection problems that performs search for the best structure inside the EM procedure. We prove the convergence of this algorithm, and adapt it for learning belief networks. We then describe how to learn networks in two scenarios: when the data contains missing values, and in the presence of hidden variables. We provide...

Citation Context

...current candidate. To the best of our knowledge, such methods have been successfully applied only to problems where there are few choices to be made: Clustering methods (e.g., [Cheeseman et al. 1988; Chickering and Heckerman 1996]) select the number of values for a single hidden variable in networks with a fixed structure, and Heckerman [1995] describes an experiment with a single missing value and five observable variables. ...
