Discriminative, generative and imitative learning. (2002)

by T Jebara
Results 1 - 10 of 43 citing documents

Dynamic Bayesian Networks: Representation, Inference and Learning

by Kevin Patrick Murphy, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have bee ..."
Abstract - Cited by 770 (3 self) - Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
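As an illustrative sketch only (not taken from the thesis), the flat-HMM special case of the filtering problem discussed above can be written as the standard forward recursion, which already runs in time linear in T; the function and variable names below (hmm_filter, init, trans, emit_lik) are hypothetical.

    import numpy as np

    def hmm_filter(init, trans, emit_lik):
        """Forward (filtering) recursion for a discrete-state HMM.

        init:     (K,)   prior over the K hidden states
        trans:    (K, K) trans[i, j] = P(z_t = j | z_{t-1} = i)
        emit_lik: (T, K) emit_lik[t, k] = p(x_t | z_t = k), precomputed
        Returns alpha, where alpha[t] = P(z_t | x_1..x_t).
        """
        T, K = emit_lik.shape
        alpha = np.zeros((T, K))
        a = init * emit_lik[0]
        alpha[0] = a / a.sum()
        for t in range(1, T):  # one O(K^2) update per time slice, O(T K^2) overall
            a = (alpha[t - 1] @ trans) * emit_lik[t]
            alpha[t] = a / a.sum()
        return alpha

A DBN replaces the single discrete state z_t with a factored set of variables, which is what makes the general inference and learning questions addressed in the thesis harder than this flat case.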

MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification

by Jun Zhu, Amr Ahmed, Eric P. Xing
"... Supervised topic models utilize document’s side information for discovering predictive low dimensional representations of documents; and existing models apply likelihoodbased estimation. In this paper, we present a max-margin supervised topic model for both continuous and categorical response variab ..."
Abstract - Cited by 93 (27 self) - Add to MetaCart
Supervised topic models utilize documents’ side information for discovering predictive low-dimensional representations of documents; existing models apply likelihood-based estimation. In this paper, we present a max-margin supervised topic model for both continuous and categorical response variables. Our approach, the maximum entropy discrimination latent Dirichlet allocation (MedLDA), utilizes the max-margin principle to train supervised topic models and estimate predictive topic representations that are arguably more suitable for prediction. We develop efficient variational methods for posterior inference and demonstrate qualitatively and quantitatively the advantages of MedLDA over likelihood-based topic models on movie review and 20 Newsgroups data sets.
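One schematic way to write the classification variant of such a max-margin supervised topic model (notation assumed here, not quoted from the paper): the variational distribution q over topics and classifier weights is trained by

    \min_{q,\ \xi \ge 0}\ \mathcal{L}(q) \;+\; C \sum_{d} \xi_d
    \quad \text{s.t.} \quad
    \mathbb{E}_q\!\big[\eta_{y_d}^\top \bar z_d - \eta_{y}^\top \bar z_d\big] \;\ge\; \ell(y_d, y) - \xi_d \quad \forall d,\ \forall y,

where \mathcal{L}(q) is the usual variational bound of the underlying topic model, \bar z_d is the average topic assignment of document d, and \ell is a margin/loss function.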

A discriminative framework for modeling object classes

by Alex Holub, Pietro Perona - In Proc. IEEE CVPR, 2005
"... Here we explore a discriminative learning method on underlying generative models for the purpose of discriminating between object categories. Visual recognition algorithms learn models from a set of training examples. Generative models learn their representations by considering data from a single cl ..."
Abstract - Cited by 45 (5 self) - Add to MetaCart
Here we explore a discriminative learning method on underlying generative models for the purpose of discriminating between object categories. Visual recognition algorithms learn models from a set of training examples. Generative models learn their representations by considering data from a single class. Generative models are popular in computer vision for many reasons, including their ability to elegantly incorporate prior knowledge and to handle correspondences between object parts and detected features. However, generative models are often inferior to discriminative models during classification tasks. We study a discriminative approach to learning object categories which maintains the representational power of generative learning, but trains the generative models in a discriminative manner. The discriminatively trained models perform better during classification tasks as a result of selecting discriminative sets of features. We conclude by proposing a multiclass object recognition system which initially trains object classes in a generative manner, identifies subsets of similar classes with high confusion, and finally trains models for these subsets in a discriminative manner to realize gains in classification performance.

Citation Context

...approach is also known as maximizing the Conditional Likelihood or CL). Utilizing a generative framework in conjunction with a discriminative optimization has been previously proposed by other authors [12, 13, 14]. These studies do not observe substantial gains in using CL over generative approaches on traditional learning systems data-sets such as the UCI data-sets. One of the objectives of this study is to a...
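For reference, the conditional-likelihood (CL) criterion mentioned in this context can be written in generic notation as maximizing the class posterior induced by the generative model,

    \hat\theta_{CL} \;=\; \arg\max_{\theta} \sum_{n} \log \frac{p(x_n \mid c_n; \theta)\, p(c_n; \theta)}{\sum_{c'} p(x_n \mid c'; \theta)\, p(c'; \theta)},

in contrast with ordinary generative training, which maximizes the joint likelihood \sum_n \log p(x_n, c_n; \theta).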

Max-Margin Nonparametric Latent Feature Models for Link Prediction

by Jun Zhu
"... We present a max-margin nonparametric latent feature relational model, which u-nites the ideas of max-margin learning and Bayesian nonparametrics to discover discriminative latent features for link prediction and automatically infer the unknown latent social dimension. By minimizing a hinge-loss usi ..."
Abstract - Cited by 21 (9 self) - Add to MetaCart
We present a max-margin nonparametric latent feature relational model, which unites the ideas of max-margin learning and Bayesian nonparametrics to discover discriminative latent features for link prediction and automatically infer the unknown latent social dimension. By minimizing a hinge-loss using the linear expectation operator, we can perform posterior inference efficiently without dealing with a highly nonlinear link likelihood function; by using a fully-Bayesian formulation, we can avoid tuning regularization constants. Experimental results on real datasets appear to demonstrate the benefits inherited from max-margin learning and fully-Bayesian nonparametric inference.

Citation Context

...propose to directly minimize some objective function (e.g., hinge-loss) that measures the quality of link prediction, under the principle of maximum entropy discrimination (MED) (Jaakkola et al., 1999; Jebara, 2002), which was introduced as an elegant framework to integrate max-margin learning and Bayesian generative modeling. The present work extends MED in several novel ways to solve the challenging link pred...
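A compact sketch of the MED problem referred to here (generic notation, assumed rather than quoted): find the distribution over parameters and margins closest to a prior, subject to expected-margin constraints on every training example,

    \min_{P(\Theta, \gamma)} \ \mathrm{KL}\big(P \,\|\, P_0\big)
    \quad \text{s.t.} \quad
    \int P(\Theta, \gamma)\, \big[ y_t\, \mathcal{L}(X_t; \Theta) - \gamma_t \big]\, d\Theta\, d\gamma \;\ge\; 0 \quad \forall t,

where \mathcal{L}(X_t; \Theta) is a discriminant function (possibly built from generative models) and \gamma_t are margin variables.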

The Latent Maximum Entropy Principle

by Shaojun Wang, Dale Schuurmans, Yunxin Zhao - In Proc. of ISIT, 2002
"... We present an extension to Jaynes' maximum entropy principle that handles latent variables. The principle of latent maximum entropy we propose is di#erent from both Jaynes' maximum entropy principle and maximum likelihood estimation, but often yields better estimates in the presence of h ..."
Abstract - Cited by 19 (5 self) - Add to MetaCart
We present an extension to Jaynes' maximum entropy principle that handles latent variables. The principle of latent maximum entropy we propose is different from both Jaynes' maximum entropy principle and maximum likelihood estimation, but often yields better estimates in the presence of hidden variables and limited training data. We first show that solving for a latent maximum entropy model poses a hard nonlinear constrained optimization problem in general. However, we then show that feasible solutions to this problem can be obtained efficiently for the special case of log-linear models, which forms the basis for an efficient approximation to the latent maximum entropy principle. We derive an algorithm that combines expectation-maximization with iterative scaling to produce feasible log-linear solutions. This algorithm can be interpreted as an alternating minimization algorithm in the information divergence, and reveals an intimate connection between the latent maximum entropy and maximum likelihood principles.
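A sketch of the latent maximum entropy program in generic notation (the symbols below are an assumed paraphrase, not copied from the paper): with observed variables y, latent variables z and features f_i,

    \max_{p}\ H(p)
    \quad \text{s.t.} \quad
    \sum_{y, z} p(y, z)\, f_i(y, z) \;=\; \frac{1}{N} \sum_{j=1}^{N} \sum_{z} p(z \mid y_j)\, f_i(y_j, z) \quad \forall i.

Because the right-hand side of each constraint depends on p itself through p(z | y), the feasible set is nonlinear, which is why the problem is harder than Jaynes' original principle (recovered when there are no latent variables).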

Max-Margin Min-Entropy Models

by Kevin Miller, M. Pawan Kumar, Ben Packer, Danny Goodman, Daphne Koller - AISTATS, 2012
"... We propose a novel family of discriminative lvms, called max-margin min-entropy (m3e) models, that predicts the output by minimizing the Rényi entropy [18] of the corresponding generalized distribuhal-00773602, ..."
Abstract - Cited by 16 (4 self) - Add to MetaCart
We propose a novel family of discriminative lvms, called max-margin min-entropy (m3e) models, that predicts the output by minimizing the Rényi entropy [18] of the corresponding generalized distribuhal-00773602,
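For reference, the Rényi entropy of order \alpha that the prediction rule minimizes is

    H_\alpha(p) \;=\; \frac{1}{1 - \alpha} \log \sum_{i} p_i^{\alpha}, \qquad \alpha > 0,\ \alpha \neq 1,

which recovers the Shannon entropy in the limit \alpha \to 1.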

Bayesian inference with posterior regularization and applications to infinite latent svms

by Jun Zhu, Ning Chen, Eric P. Xing, Tony Jebara - arXiv:1210.1766v2, 2013
"... Existing Bayesian models, especially nonparametric Bayesian methods, rely on specially conceived priors to incorporate domain knowledge for discovering improved latent represen-tations. While priors affect posterior distributions through Bayes ’ rule, imposing posterior regularization is arguably mo ..."
Abstract - Cited by 14 (9 self) - Add to MetaCart
Existing Bayesian models, especially nonparametric Bayesian methods, rely on specially conceived priors to incorporate domain knowledge for discovering improved latent representations. While priors affect posterior distributions through Bayes’ rule, imposing posterior regularization is arguably more direct and in some cases more natural and general. In this paper, we present regularized Bayesian inference (RegBayes), a novel computational framework that performs posterior inference with a regularization term on the desired post-data posterior distribution under an information theoretical formulation. RegBayes is more flexible than the procedure that elicits expert knowledge via priors, and it covers both directed Bayesian networks and undirected Markov networks. When the regularization is induced from a linear operator on the posterior distributions, such as the expectation operator, we present a general convex-analysis theorem to characterize the solution of RegBayes. Furthermore, we present two concrete examples of RegBayes, infinite latent support vector machines (iLSVM) and multi-task infinite latent support vector machines (MT-iLSVM), which explore the large-margin idea in combination with a nonparametric Bayesian model for dis…
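A schematic of the RegBayes program (notation assumed): write Bayesian inference variationally and add a convex regularizer U on the post-data posterior,

    \inf_{q(\mathbf{M})}\ \mathrm{KL}\big(q(\mathbf{M}) \,\|\, \pi(\mathbf{M})\big) \;-\; \mathbb{E}_{q}\big[\log p(\mathcal{D} \mid \mathbf{M})\big] \;+\; U\big(q(\mathbf{M})\big),

so that with U \equiv 0 the optimum is the ordinary Bayesian posterior, while an expected hinge loss for U yields the infinite latent SVM examples mentioned above.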

Gibbs max-margin topic models with data augmentation

by Jun Zhu, Ning Chen, Hugh Perkins, Bo Zhang, David Blei - Journal of Machine Learning Research (JMLR)
"... Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for uns ..."
Abstract - Cited by 13 (5 self) - Add to MetaCart
Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for unseen testing data. However, the resulting learning problems are usually hard to solve because of the non-smoothness of the margin loss. Existing approaches to building max-margin supervised topic models rely on an iterative procedure to solve multiple latent SVM subproblems with additional mean-field assumptions on the desired posterior distributions. This paper presents an alternative approach by defining a new max-margin loss. Namely, we present Gibbs max-margin supervised topic models, a latent variable Gibbs classifier to discover hidden topic representations for various tasks, including classification, regression and multi-task learning. Gibbs max-margin supervised topic models minimize an expected margin loss, which is an upper bound of the existing margin loss derived from an expected prediction rule. By introducing augmented variables and integrating out the Dirichlet variables analytically by conjugacy, we develop simple …
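The distinction drawn above between the two margin losses can be written schematically (generic notation): the Gibbs classifier penalizes the expected hinge loss, which by Jensen's inequality upper-bounds the hinge loss of the expected prediction rule used in earlier models,

    \sum_{d} \mathbb{E}_q\Big[\max\big(0,\ \ell - y_d\, \eta^\top \bar z_d\big)\Big]
    \;\ge\;
    \sum_{d} \max\big(0,\ \ell - y_d\, \mathbb{E}_q\big[\eta^\top \bar z_d\big]\big),

and it is the left-hand quantity that admits the data-augmentation treatment described in the abstract.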

Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes

by Aritz Pérez, Pedro Larrañaga, Iñaki Inza - International Journal of Approximate Reasoning, 2006
"... Most of the Bayesian network-based classifiers are usually only able to handle discrete variables. However, most real-world domains involve continuous variables. A common practice to deal with continuous variables is to discretize them, with a subsequent loss of information. This work shows how disc ..."
Abstract - Cited by 10 (0 self) - Add to MetaCart
Most Bayesian network-based classifiers are only able to handle discrete variables. However, most real-world domains involve continuous variables. A common practice to deal with continuous variables is to discretize them, with a subsequent loss of information. This work shows how discrete classifier induction algorithms can be adapted to the conditional Gaussian network paradigm to deal with continuous variables without discretizing them. In addition, three novel classifier induction algorithms and two new propositions about mutual information are introduced. The classifier induction algorithms presented are ordered and grouped according to their structural complexity: naive Bayes, tree augmented naive Bayes, k-dependence Bayesian classifiers and semi naive Bayes. All the classifier induction algorithms are empirically evaluated using predictive accuracy, and they are compared to linear discriminant analysis, as a continuous classic statistical benchmark classifier. Accuracies for a set of state-of-the-art classifiers are also included in order to justify the use of linear discriminant analysis as the benchmark algorithm. To better understand the behavior of the conditional Gaussian network-based classifiers, the results include a bias-variance decomposition of the expected misclassification rate. The study suggests that semi naive Bayes structure based classifiers and, especially, the novel wrapper condensed semi naive Bayes backward, outperform the rest of the presented classifiers. They also obtain quite competitive results compared to the state-of-the-art algorithms included.
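As a minimal illustration of the simplest structure discussed above (naive Bayes over continuous predictors, i.e. one univariate Gaussian per class and feature), the sketch below is a generic implementation, not the paper's code; all names are hypothetical.

    import numpy as np

    def fit_gaussian_nb(X, y):
        """Fit class priors and per-class, per-feature Gaussians (naive Bayes)."""
        classes = np.unique(y)
        prior = np.array([np.mean(y == c) for c in classes])
        mu = np.array([X[y == c].mean(axis=0) for c in classes])
        var = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
        return classes, prior, mu, var

    def predict_gaussian_nb(model, X):
        """Return the MAP class for each row of X from the log joint probability."""
        classes, prior, mu, var = model
        # log p(c) + sum_j log N(x_j | mu_cj, var_cj), broadcast over classes
        log_joint = np.log(prior)[None, :] + np.sum(
            -0.5 * np.log(2 * np.pi * var)[None, :, :]
            - 0.5 * (X[:, None, :] - mu[None, :, :]) ** 2 / var[None, :, :],
            axis=2)
        return classes[np.argmax(log_joint, axis=1)]

The richer structures evaluated in the paper (tree augmented naive Bayes, k-dependence classifiers, semi naive Bayes) relax the independence assumption encoded here by conditioning each Gaussian on additional parent variables.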

Citation Context

... performed for mixed variables in the work of Friedman et al. [18]. BN-based classifiers can be induced in two ways depending on the distribution to be learned: generative or discriminative learning [29,52]. Generative classifiers learn a model of the joint probability function of the predictor variables and the class. They classify a new instance by using the Bayes rule to compute the posterior probabi...

Statistical Imitative Learning from Perceptual Data

by Tony Jebara, Alex Pentland - Proc. ICDL 02, 2002
"... Imitative learning has recently piqued the interest of various fields including neuroscience, cognitive science and robotics. In computational behavior modeling and development, it promises an accessible framework for rapidly forming behavior models without tedious supervision or reinforcement. Give ..."
Abstract - Cited by 9 (0 self) - Add to MetaCart
Imitative learning has recently piqued the interest of various fields including neuroscience, cognitive science and robotics. In computational behavior modeling and development, it promises an accessible framework for rapidly forming behavior models without tedious supervision or reinforcement. Given the availability of low-cost wearable sensors, the robustness of real-time perception algorithms and the feasibility of archiving large amounts of audio-visual data, it is possible to unobtrusively archive the daily activities of a human teacher and his responses to external stimuli. We combine this data acquisition/representation process with statistical learning machinery (hidden Markov models) as well as discriminative estimation algorithms to form a behavioral model of a human teacher directly from the data set. The resulting system learns audio-visual interactive behavior from the human and his environment to produce an interactive autonomous agent. The agent subsequently exhibits simple audio-visual behaviors that appear coupled to real-world test stimuli.
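A minimal sketch of the generative half of such a pipeline, assuming the perceptual stream has already been reduced to fixed-length feature vectors and using the third-party hmmlearn package; the arrays and parameter choices below are placeholders, not values from the paper.

    import numpy as np
    from hmmlearn import hmm  # third-party package, assumed available

    # teacher_episodes: list of (T_i, D) arrays of audio-visual feature vectors
    teacher_episodes = [np.random.randn(200, 12) for _ in range(5)]  # placeholder data

    X = np.concatenate(teacher_episodes)
    lengths = [len(ep) for ep in teacher_episodes]

    # Fit a continuous-observation HMM to the teacher's recorded behaviour via EM
    model = hmm.GaussianHMM(n_components=8, covariance_type="diag", n_iter=50)
    model.fit(X, lengths)

    # The agent can then synthesise behaviour by sampling from the learned model
    generated_features, hidden_states = model.sample(300)

The discriminative estimation mentioned in the abstract (and developed in the thesis this page cites) would replace this plain EM fit with training that optimizes the model's responses to stimuli rather than its fit to the joint data.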