• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Learning with Labelled and Unlabelled Data (2001)

by M Seeger
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 96
Next 10 →

Learning with local and global consistency

by Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, Bernhard Schölkopf - Advances in Neural Information Processing Systems 16 , 2004
"... We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic stru ..."
Abstract - Cited by 286 (20 self) - Add to MetaCart
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data. 1

Manifold regularization: A geometric framework for learning from examples

by Mikhail Belkin, Partha Niyogi, Vikas Sindhwani, Peter Bartlett - Journal of Machine Learning Research , 2004
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning al ..."
Abstract - Cited by 197 (12 self) - Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework. 1.

Learning from Labeled and Unlabeled Data with Label Propagation

by Xiaojin Zhu, Zoubin Ghahramani , 2002
"... We investigate the use of unlabeled data to help labeled data in classification. We propose a simple iterative algorithm, label propagation, to propagate labels through the dataset along high density areas defined by unlabeled data. We give the analysis of the algorithm, show its solution, and its c ..."
Abstract - Cited by 56 (0 self) - Add to MetaCart
We investigate the use of unlabeled data to help labeled data in classification. We propose a simple iterative algorithm, label propagation, to propagate labels through the dataset along high density areas defined by unlabeled data. We give the analysis of the algorithm, show its solution, and its connection to several other algorithms. We also show how to learn parameters by minimum spanning tree heuristic and entropy minimization, and the algorithm's ability to do feature selection. Experiment results are promising.

On Manifold Regularization

by Mikhail Belkin, Partha Niyogi, Vikas Sindhwani , 2005
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learni ..."
Abstract - Cited by 48 (1 self) - Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph based approaches) we obtain a natural out-of-sample extension to novel examples and are thus able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semisupervised algorithms are able to use unlabeled data effectively. In the absence of labeled examples, our framework gives rise to a regularized form of spectral clustering with an out-of-sample extension.

Semi-supervised protein classification using cluster kernels

by Jason Weston, Christina Leslie, Dengyong Zhou, Andre Elisseeff, William Stafford Noble , 2003
"... A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data — examples with known 3D ..."
Abstract - Cited by 36 (3 self) - Add to MetaCart
A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data — examples with known 3D structures, organized into structural classes — while in practice, unlabeled data is far more plentiful. In this work, we develop simple and scalable cluster kernel techniques for incorporating unlabeled data into the representation of protein sequences. We show that our methods greatly improve the classification performance of string kernels and outperform standard approaches for using unlabeled data, such as adding close homologs of the positive examples to the training data. We achieve equal or superior performance to previously presented cluster kernel methods while achieving far greater computational efficiency.

Transfer Learning for Image Classification with Sparse Prototype Representations

by Ariadna Quattoni, Michael Collins, Trevor Darrell
"... To learn a new visual category from few examples, prior knowledge from unlabeled data as well as previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel ..."
Abstract - Cited by 29 (5 self) - Add to MetaCart
To learn a new visual category from few examples, prior knowledge from unlabeled data as well as previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel distances to a large set of unlabeled data points. To transfer knowledge from previous related problems we observe that a category might be learnable using only a small subset of reference prototypes. Related problems may share a significant number of relevant prototypes; we find such a concise representation by performing a joint loss minimization over the training sets of related problems with a shared regularization penalty that minimizes the total number of prototypes involved in the approximation. This optimization problem can be formulated as a linear program that can be solved efficiently. We conduct experiments on a news-topic prediction task where the goal is to predict whether an image belongs to a particular news topic. Our results show that when only few examples are available for training a target topic, leveraging knowledge learnt from other topics can significantly improve performance.

Covariance Kernels from Bayesian Generative Models

by Matthias Seeger - Advances in Neural Information Processing Systems 14 , 2000
"... We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of ..."
Abstract - Cited by 28 (2 self) - Add to MetaCart
We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.

Unlabeled Data Can Degrade Classification Performance of Generative Classifiers

by Fabio G. Cozman, Ira Cohen - in Fifteenth International Florida Artificial Intelligence Society Conference , 2002
"... This paper analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that unlabeled data can degrade the performance of a classifier when there are discrepancies ..."
Abstract - Cited by 27 (7 self) - Add to MetaCart
This paper analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that unlabeled data can degrade the performance of a classifier when there are discrepancies between modeling assumptions used to build the classifier and the actual model that generates the data

Semi-Supervised Learning: From Gaussian Fields to Gaussian Processes

by Xiaojin Zhu, John Lafferty, Zoubin Ghahramani - School of CS, CMU , 2003
"... We show that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian. We derive hyperparameter learning with evidence maximization, and give an em ..."
Abstract - Cited by 25 (1 self) - Add to MetaCart
We show that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian. We derive hyperparameter learning with evidence maximization, and give an empirical study of various ways to parameterize the graph weights.

Co-EM Support Vector Learning

by Ulf Brefeld, Tobias Scheffer - In Proceedings of the International Conference on Machine Learning , 2004
"... Multi-view algorithms, such as co-training and co-EM, utilize unlabeled data when the available attributes can be split into independent and compatible subsets. Co-EM outperforms co-training for many problems, but it requires the underlying learner to estimate class probabilities, and to learn ..."
Abstract - Cited by 24 (5 self) - Add to MetaCart
Multi-view algorithms, such as co-training and co-EM, utilize unlabeled data when the available attributes can be split into independent and compatible subsets. Co-EM outperforms co-training for many problems, but it requires the underlying learner to estimate class probabilities, and to learn from probabilistically labeled data. Therefore, coEM has so far only been studied with naive Bayesian learners. We cast linear classifiers into a probabilistic framework and develop a co-EM version of the Support Vector Machine.
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University