Results 1–10 of 58
Semi-Supervised Learning Literature Survey, 2006
Cited by 757 (8 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
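The survey's central theme, learning from labeled and unlabeled data together, can be illustrated with self-training, one of the oldest semi-supervised ideas it covers. The sketch below is a toy instance built on a nearest-centroid classifier; the function and parameter names (`self_train`, `conf_margin`) are illustrative, not from the survey.

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, rounds=5, conf_margin=0.5):
    """Minimal self-training sketch: a nearest-centroid classifier
    repeatedly pseudo-labels its most confident unlabeled points
    and retrains on the grown labeled set."""
    X_lab, y_lab = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        # Class centroids from the current labeled set.
        classes = np.unique(y_lab)
        cents = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
        # Distance of each pooled point to each centroid.
        d = np.linalg.norm(pool[:, None, :] - cents[None, :, :], axis=2)
        order = np.argsort(d, axis=1)
        pred = classes[order[:, 0]]
        # Confidence = gap between nearest and second-nearest centroid.
        idx = np.arange(len(pool))
        margin = d[idx, order[:, 1]] - d[idx, order[:, 0]]
        take = margin > conf_margin
        if not take.any():
            break
        X_lab = np.vstack([X_lab, pool[take]])
        y_lab = np.concatenate([y_lab, pred[take]])
        pool = pool[~take]
    return X_lab, y_lab
```

Each round, the classifier labels the unlabeled points it is most sure about and retrains on the enlarged set; the well-known risk, discussed at length in the survey literature, is that an early mistake gets reinforced.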
Large Scale Transductive SVMs, JMLR
Cited by 92 (5 self)
We show how the Concave-Convex Procedure can be applied to Transductive SVMs, which traditionally require solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach. Software is available at
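As a hedged illustration of the Concave-Convex Procedure (CCCP) the paper builds on, the sketch below applies it to a toy scalar objective rather than the TSVM loss: f(x) = x² − √(1 + x²) splits into a convex part u(x) = x² and a concave part v(x) = −√(1 + x²), and each CCCP step minimizes u plus the tangent of v at the current iterate, which here has a closed form.

```python
import math

def cccp_minimize(x0, iters=50):
    """CCCP on the toy objective f(x) = x**2 - sqrt(1 + x**2):
    convex part u(x) = x**2, concave part v(x) = -sqrt(1 + x**2).
    Each step minimizes u(x) + v'(x_t) * x; setting the derivative
    2x + v'(x_t) to zero gives x_{t+1} = -v'(x_t) / 2."""
    x = x0
    history = []
    for _ in range(iters):
        v_grad = -x / math.sqrt(1 + x * x)  # gradient of the concave part at x_t
        x = -v_grad / 2.0                   # minimizer of the convex surrogate
        history.append(x * x - math.sqrt(1 + x * x))  # true objective value
    return x, history
```

Each surrogate upper-bounds f (the tangent of a concave function lies above it), so the recorded objective values decrease monotonically toward the minimum f(0) = −1; this monotone-descent guarantee is what CCCP contributes, while the paper's actual work is in casting the TSVM loss into this convex-plus-concave form.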
Online Domain Adaptation of a Pre-Trained Cascade of Classifiers
Cited by 43 (1 self)
Many classifiers are trained with massive training sets only to be applied at test time on data from a different distribution. How can we rapidly and simply adapt a classifier to a new test distribution, even when we do not have access to the original training data? We present an online approach for rapidly adapting a “black box” classifier to a new test data set without retraining the classifier or examining the original optimization criterion. Assuming the original classifier outputs a continuous number for which a threshold gives the class, we reclassify points near the original boundary using a Gaussian process regression scheme. We show how this general procedure can be used in the context of a classifier cascade, demonstrating performance that far exceeds state-of-the-art results in face detection on a standard data set. We also draw connections to work in semi-supervised learning, domain adaptation, and information regularization.
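A minimal sketch of the reclassification idea, under my own simplifying assumption (not the paper's exact scheme) that scores far from the threshold are trusted as anchors, and a GP regression fitted to those anchors re-predicts the scores inside the uncertain band. The kernel, band width, and noise level are illustrative.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ell ** 2))

def reclassify_near_boundary(X, scores, band=0.3, noise=1e-2, ell=1.0):
    """Points whose black-box score is far from the threshold (0 here)
    are treated as reliable anchors; GP regression posterior means over
    the anchors replace the scores of points inside the uncertain band."""
    confident = np.abs(scores) > band
    Xc, yc = X[confident], scores[confident]
    K = rbf(Xc, Xc, ell) + noise * np.eye(len(Xc))
    alpha = np.linalg.solve(K, yc)
    new_scores = scores.copy()
    new_scores[~confident] = rbf(X[~confident], Xc, ell) @ alpha
    return np.sign(new_scores)
```

For example, a point with a weakly negative score sitting next to strongly positive anchors would be pulled across the threshold and relabeled positive, without ever touching the original classifier.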
Relational Learning with Gaussian Processes, in NIPS 19, 2007
Cited by 42 (9 self)
Correlation between instances is often modelled via a kernel function using input attributes of the instances. Relational knowledge can further reveal additional pairwise correlations between variables of interest. In this paper, we develop a class of models which incorporates both reciprocal relational information and input attributes using Gaussian process techniques. This approach provides a novel nonparametric Bayesian framework with a data-dependent covariance function for supervised learning tasks. We also apply this framework to semi-supervised learning. Experimental results on several real world data sets verify the usefulness of this algorithm.
Learning Context-Sensitive Shape Similarity by Graph Transduction, 2010
Cited by 41 (7 self)
Shape similarity and shape retrieval are very important topics in computer vision. The recent progress in this domain has been mostly driven by designing smart shape descriptors for providing better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this problem by considering the existing shapes as a group, and study their similarity measures to the query shape in a graph structure. Our method is general and can be built on top of any existing shape similarity measure. For a given similarity measure, a new similarity is learned through graph transduction. The new similarity is learned iteratively so that the neighbors of a given shape influence its final similarity to the query. The basic idea here is related to PageRank ranking, which forms a foundation of Google Web search. The presented experimental results demonstrate that the proposed approach yields significant improvements over the state-of-the-art shape matching algorithms. We obtained a retrieval rate of 91.61 percent on the MPEG-7 data set, which is the highest ever reported in the literature. Moreover, the learned similarity by the proposed method also achieves promising improvements on both shape classification and shape clustering.
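The neighbor-influenced similarity update can be sketched as a personalized-PageRank-style iteration over the shape graph; the damping factor and fixed iteration count below are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def transduce_similarity(W, sim_to_query, alpha=0.85, iters=100):
    """Graph-transduction sketch: iteratively blend each shape's
    similarity to the query with the similarities of its graph
    neighbors, in the spirit of personalized PageRank.
    W: (n, n) nonnegative affinity matrix between database shapes.
    sim_to_query: (n,) raw similarity of each shape to the query."""
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    f = sim_to_query.copy()
    for _ in range(iters):
        f = alpha * P @ f + (1 - alpha) * sim_to_query
    return f
```

The effect is that a shape whose raw similarity to the query is weak, but which sits next to strong matches in the graph, is pulled upward, which is the context-sensitivity the title refers to.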
The Joint Manifold Model for Semi-supervised Multi-valued Regression
Cited by 40 (5 self)
Many computer vision tasks may be expressed as the problem of learning a mapping between image space and a parameter space. For example, in human body pose estimation, recent research has directly modelled the mapping from image features (z) to joint angles (θ). Fitting such models requires training data in the form of labelled (z, θ) pairs, from which are learned the conditional densities p(θ|z). Inference is then simple: given test image features z, the conditional p(θ|z) is immediately computed. However large amounts of training data are required to fit the models, particularly in the case where the spaces are high dimensional. We show how the use of unlabelled data—samples from the marginal distributions p(z) and p(θ)—may be used to improve fitting. This is valuable because it is often significantly easier to obtain unlabelled than labelled samples. We use a Gaussian process latent variable model to learn the mapping from a shared latent low-dimensional manifold to the feature and parameter spaces. This extends existing approaches to (a) use unlabelled data, and (b) represent one-to-many mappings. Experiments on synthetic and real problems demonstrate how the use of unlabelled data improves over existing techniques. In our comparisons, we include existing approaches that are explicitly semi-supervised as well as those which implicitly make use of unlabelled examples.
Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes, Advances in Neural Information Processing Systems (NIPS) 20, 2007
Cited by 37 (3 self)
We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process. We first learn a deep generative model of the unlabeled data using the fast, greedy algorithm introduced by [7]. If the data is high-dimensional and highly structured, a Gaussian kernel applied to the top layer of features in the DBN works much better than a similar kernel applied to the raw input. Performance at both regression and classification can then be further improved by using backpropagation through the DBN to discriminatively fine-tune the covariance kernel.

1 Introduction

Gaussian processes (GPs) are a widely used method for Bayesian nonlinear nonparametric regression and classification [13, 16]. GPs are based on defining a similarity or kernel function that encodes prior knowledge of the smoothness of the underlying process that is being modeled. Because of their flexibility and computational simplicity, GPs have been successfully used in many areas of machine learning. Many real-world applications are characterized by high-dimensional, highly structured data with a large supply of unlabeled data but a very limited supply of labeled data. Applications such as information retrieval and machine vision are examples where unlabeled data is readily available. GPs are discriminative models by nature, and within the standard regression or classification scenario, unlabeled data is of no use. Given a set of i.i.d. labeled input vectors X_l = {x_n}_{n=1}^N and their associated target labels {
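The core idea, a Gaussian kernel evaluated on learned features rather than raw inputs, can be sketched as below. The encoder here is a stand-in (a fixed random projection plus tanh), not a trained DBN, and all names are illustrative.

```python
import numpy as np

def feature_space_kernel(X1, X2, encode, sigma=1.0):
    """Gaussian (RBF) kernel evaluated on encoded features,
    k(x, x') = exp(-||phi(x) - phi(x')||^2 / (2 sigma^2)),
    mirroring the idea of putting the GP kernel on top of
    learned features instead of raw inputs."""
    F1, F2 = encode(X1), encode(X2)
    d2 = ((F1[:, None, :] - F2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# Stand-in encoder (NOT a DBN): fixed random projection + tanh.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))
encode = lambda X: np.tanh(X @ W)
```

Swapping the stand-in for a DBN's top-layer activations recovers the paper's construction: the kernel remains a valid (symmetric, positive) covariance because it is still an RBF kernel, just measured in the learned feature space.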
SemiBoost: Boosting for Semi-supervised Learning, to appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 35 (1 self)
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from the existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: (a) performance improvement of any supervised learning algorithm with a multitude of unlabeled data, (b) efficient computation by the iterative boosting algorithm, and (c) exploitation of both the manifold and cluster assumptions in training classification models. An empirical study on 16 different datasets and on text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to state-of-the-art semi-supervised learning algorithms.
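One ingredient of the SemiBoost recipe, similarity-guided pseudo-labeling of the most confident unlabeled points, can be sketched in isolation; the full method also derives example weights from a boosting objective for the wrapped base learner, which this toy omits. All names and parameters are illustrative.

```python
import numpy as np

def similarity_pseudo_label(X_lab, y_lab, X_unlab, rounds=10, per_round=2, ell=1.0):
    """Each round, every unlabeled point receives a signed confidence
    h(x) = sum_i S(x, x_i) * y_i from the current labeled set
    (y in {-1, +1}, S a Gaussian similarity); the most confident
    points are pseudo-labeled with sign(h) and moved into the
    labeled set, so later rounds can build on them."""
    X_lab, y_lab, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        d2 = ((pool[:, None, :] - X_lab[None, :, :]) ** 2).sum(-1)
        S = np.exp(-d2 / (2 * ell ** 2))
        h = S @ y_lab                           # signed confidence per pooled point
        pick = np.argsort(-np.abs(h))[:per_round]
        X_lab = np.vstack([X_lab, pool[pick]])
        y_lab = np.concatenate([y_lab, np.sign(h[pick])])
        keep = np.ones(len(pool), bool)
        keep[pick] = False
        pool = pool[keep]
    return X_lab, y_lab
```

Taking only a few points per round is what makes this a cautious, boosting-like schedule rather than labeling everything at once.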
Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
Cited by 30 (1 self)
We describe a new scalable algorithm for semi-supervised training of conditional random fields (CRFs) and its application to part-of-speech (POS) tagging. The algorithm uses a similarity graph to encourage similar n-grams to have similar POS tags. We demonstrate the efficacy of our approach on a domain adaptation task, where we assume that we have access to large amounts of unlabeled data from the target domain, but no additional labeled data. The similarity graph is used during training to smooth the state posteriors on the target domain. Standard inference can be used at test time. Our approach is able to scale to very large problems and yields significantly improved target domain accuracy.
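The posterior-smoothing step can be sketched as repeated interpolation of each node's tag distribution with its graph neighbors' average; the interpolation weight and iteration count below are illustrative, not the paper's exact propagation objective.

```python
import numpy as np

def smooth_posteriors(P, W, alpha=0.5, iters=10):
    """Sketch of graph-based posterior smoothing: each node's tag
    distribution is repeatedly interpolated with the similarity-
    weighted average of its neighbors' distributions, then
    renormalized. P: (n, k) row-stochastic tag posteriors.
    W: (n, n) nonnegative similarity graph (zero diagonal)."""
    T = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    Q = P.copy()
    for _ in range(iters):
        Q = (1 - alpha) * P + alpha * (T @ Q)
        Q = Q / Q.sum(axis=1, keepdims=True)
    return Q
```

In the tagging setting the nodes would be n-gram types and W their similarity graph, so an ambiguous n-gram inherits the tag preferences of the unambiguous n-grams it resembles.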
Extensions of Gaussian Processes for Ranking: Semi-supervised and Active Learning, in NIPS Workshop on Learning to Rank, 2005
Cited by 23 (0 self)
Unlabelled examples in supervised learning tasks can be optimally exploited using semi-supervised methods and active learning. We focus on ranking learning from pairwise instance preference to discuss these important extensions, semi-supervised learning and active learning, in the probabilistic framework of Gaussian processes. Numerical experiments demonstrate the capacities of these techniques.