Semi-Supervised Learning Literature Survey
, 2006
Cited by 268 (7 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning
- In Proc. Int. Conf. Machine Learning, 2005
Cited by 24 (2 self)
Graph-based methods for semi-supervised learning have recently been shown to be promising for combining labeled and unlabeled data in classification problems. However, inference for graph-based methods often does not scale well to very large data sets, since it requires inversion of a large matrix or solution of a large linear program. Moreover, such approaches are inherently transductive, giving predictions for only those points in the unlabeled set, and not for an arbitrary test point. In this paper a new approach is presented that preserves the strengths of graph-based semi-supervised learning while overcoming the limitations of scalability and non-inductive inference, through a combination of generative mixture models and discriminative regularization using the graph Laplacian. Experimental results show that this approach preserves the accuracy of purely graph-based transductive methods when the data has “manifold structure,” and at the same time achieves inductive learning with significantly reduced computational cost.
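To make the scalability remark concrete, the following is a minimal sketch of a plain transductive graph-based baseline (the harmonic solution on a graph), not the harmonic mixture method of this paper. It assumes NumPy is available; the function name `harmonic_labels` and the toy chain graph are illustrative assumptions. The linear solve over the unlabeled block of the graph Laplacian is the step whose cost grows quickly with the number of unlabeled points.

```python
import numpy as np

def harmonic_labels(W, labeled_idx, f_labeled):
    """Solve (D_UU - W_UU) f_U = W_UL f_L for the unlabeled nodes.

    W is a symmetric matrix of non-negative edge weights.
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    degree = np.diag(W.sum(axis=1))
    laplacian = degree - W  # combinatorial graph Laplacian
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    # This solve is the expensive step when there are many unlabeled nodes.
    f_u = np.linalg.solve(L_uu, W_ul @ np.asarray(f_labeled, dtype=float))
    return unlabeled_idx, f_u

# Toy chain graph 0 - 1 - 2 - 3 with unit weights; the two end nodes are labeled.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
unlabeled_idx, f_u = harmonic_labels(W, labeled_idx=[0, 3], f_labeled=[0.0, 1.0])
```

On this chain the two interior nodes receive the values 1/3 and 2/3, i.e. each unlabeled value is the average of its neighbors. Note that the result is defined only on the nodes of the graph, which is the transductive limitation the abstract refers to.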
Semi-Supervised Multitask Learning
Cited by 10 (2 self)
A semi-supervised multitask learning (MTL) framework is presented, in which M parameterized semi-supervised classifiers, each associated with one of M partially labeled data manifolds, are learned jointly under the constraint of a soft-sharing prior imposed over the parameters of the classifiers. The unlabeled data are utilized by basing classifier learning on neighborhoods, induced by a Markov random walk over a graph representation of each manifold. Experimental results on real data sets demonstrate that semi-supervised MTL yields significant improvements in generalization performance over either semi-supervised single-task learning (STL) or supervised MTL.
Bayesian Co-Training
Cited by 5 (0 self)
We propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training type algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new insights from this model, we propose an improved method for co-training, which is a novel co-training kernel for Gaussian process classifiers. The resulting approach is convex and avoids local-maxima problems, unlike some previous multi-view learning methods. Furthermore, it can automatically estimate how much each view should be trusted, and thus accommodate noisy or unreliable views. Experiments on toy data and real-world data sets illustrate the benefits of this approach.
An iterative algorithm for extending learners to a semisupervised setting
- The 2007 Joint Statistical Meetings (JSM), 2007
Cited by 4 (2 self)
In this paper, we present an iterative self-training algorithm, whose objective is to extend learners from a supervised setting into a semi-supervised setting. The algorithm is based on using the predicted values for observations where the response is missing (unlabeled data) and then incorporates the predictions appropriately at subsequent stages. Convergence properties of the algorithm are investigated for particular learners, such as linear/logistic regression and linear smoothers, with particular emphasis on kernel smoothers. Further, implementation issues of the algorithm with other learners, such as generalized additive models, tree partitioning methods, partial least squares, etc., are also addressed. The connection between the proposed algorithm and graph-based semi-supervised learning methods is also discussed. The algorithm is illustrated on a number of real data sets using a varying degree of labeled responses.
Keywords: Semi-supervised learning, linear smoothers, convergence, iterative algorithm
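The general idea of iterating a learner on its own predictions can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact update rule: the learner is a Gaussian-kernel (Nadaraya-Watson) smoother in one dimension, missing responses are marked with `None`, and the names `kernel_smoother_fit` and `self_train` are hypothetical.

```python
import math

def kernel_smoother_fit(x, y, bandwidth):
    """Nadaraya-Watson fitted values at every point of x (Gaussian kernel)."""
    fitted = []
    for xi in x:
        weights = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(weights)
        fitted.append(sum(w * yj for w, yj in zip(weights, y)) / total)
    return fitted

def self_train(x, y, bandwidth=1.0, tol=1e-8, max_iter=1000):
    """Impute missing responses (None) by repeatedly refitting the smoother.

    Observed responses stay fixed; missing ones are replaced by the
    current predictions until the imputed values stop changing.
    """
    labeled = [yi is not None for yi in y]
    observed = [yi for yi in y if yi is not None]
    start = sum(observed) / len(observed)  # initialise with the labeled mean
    current = [yi if is_lab else start for yi, is_lab in zip(y, labeled)]
    for _ in range(max_iter):
        fitted = kernel_smoother_fit(x, current, bandwidth)
        updated = [ci if is_lab else fi
                   for ci, fi, is_lab in zip(current, fitted, labeled)]
        change = max(abs(u - c) for u, c in zip(updated, current))
        current = updated
        if change < tol:
            break
    return current

# Five points on a line; only the two end points have observed responses.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, None, None, None, 1.0]
imputed = self_train(x, y, bandwidth=1.0)
```

With the labeled responses held fixed, each pass averages neighboring values, so the imputed responses interpolate smoothly between the two labeled ends. This clamped averaging is also what ties such a scheme to graph-based methods, where the kernel weights play the role of edge weights.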
On Classification with Incomplete Data
Cited by 3 (1 self)
We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both Expectation-Maximization (EM) and Variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semi-supervised case by incorporating graph-based regularization. The semi-supervised algorithm utilizes all available data: both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.
ON MULTI-VIEW LEARNING WITH ADDITIVE MODELS
- SUBMITTED TO THE ANNALS OF APPLIED STATISTICS
Cited by 2 (0 self)
In many scientific settings, data can be naturally partitioned into variable groupings called views. Common examples include environmental (1st view) and genetic information (2nd view) in ecological applications, and chemical (1st view) and biological (2nd view) data in drug discovery. Viewed data also occurs in text analysis and proteomics applications where one view consists of a graph with observations as the vertices and a weighted measure of pairwise similarity between observations as the edges. Further, in several of these applications the observations can be partitioned into two sets, one where the response is observed (labeled) and the other where the response is not (unlabeled). The problem of simultaneously addressing viewed data and incorporating unlabeled observations in training is referred to as multi-view transductive learning. In this work we introduce and study a comprehensive generalized fixed point additive modeling framework for multi-view transductive learning, where any view is represented by a linear smoother. The problem of view selection is discussed using a modified generalized Akaike Information Criterion, which provides an approach for testing the contribution of each view. An efficient implementation is provided for fitting these models with both backfitting and local-scoring type algorithms adjusted to semi-supervised graph-based learning. The proposed technique is assessed on both synthetic and real data sets and is shown to be competitive with state-of-the-art co-training and graph-based techniques.
SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION BASED ON A MARKOV RANDOM FIELD AND SPARSE MULTINOMIAL LOGISTIC REGRESSION
This paper introduces a new semi-supervised classification and segmentation approach tailored to hyperspectral images. The posterior distributions of the classes are modeled by multinomial logistic regression. The contextual information inherent to the spatial configuration of the image pixels is modeled by a Multi-Level Logistic (MLL) Markov-Gibbs random field. The multinomial logistic regressors, assumed to be random vectors with independent Laplacian components, are learned using the recently introduced LORSAL algorithm. The maximum a posteriori (MAP) segmentation is computed via the α-Expansion algorithm, a powerful graph-cut-based approach to integer optimization. The effectiveness of the proposed methodology is illustrated by classifying simulated and real data sets. Comparisons with state-of-the-art methods are also included.
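As a small illustration of the first modeling step only, the sketch below computes class posteriors under a multinomial logistic regression model for one feature vector. It does not include the Laplacian prior, the LORSAL solver, or the MLL prior and α-Expansion segmentation described above; the function name `mlr_posterior` and the toy weights are assumptions made for illustration.

```python
import math

def mlr_posterior(x, weights):
    """Class posteriors p(y = k | x), proportional to exp(w_k . x)."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w_k, x)) for w_k in weights]
    shift = max(scores)  # subtract the maximum score for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three classes, two features (values chosen only for illustration).
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [0.0, 0.0]]
x = [2.0, 0.0]
posterior = mlr_posterior(x, weights)
predicted_class = posterior.index(max(posterior))
```

In a segmentation pipeline of the kind described, per-pixel posteriors such as these would supply the data term, and the spatial prior would then be combined with them to obtain the MAP labeling.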
Hyperspectral Image Segmentation Using a New Bayesian Approach with Active Learning
This paper introduces a new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of two main steps: (a) learning, for each class label, the posterior probability distributions using a multinomial logistic regression model; (b) segmenting the hyperspectral image based on the posterior probability distribution learned in step (a) and on a multi-level logistic prior which encodes the spatial information. The multinomial logistic regressors are learned by using the recently introduced logistic regression via splitting and augmented Lagrangian (LORSAL) algorithm. The maximum a posteriori segmentation is efficiently computed by the α-Expansion min-cut based integer optimization algorithm. Aiming at reducing the costs of acquiring large training sets, active learning is performed using a mutual information based criterion. The state-of-the-art performance of the proposed approach is illustrated using both simulated and real hyperspectral data sets in a number of experimental comparisons with recently introduced hyperspectral image classification methods.
Index Terms: Hyperspectral image segmentation, sparse multinomial logistic regression, ill-posed problems, graph cuts, integer optimization, mutual information, active learning.
Active Learning of Features and Labels
Co-training improves multi-view classifier learning by enforcing internal consistency between the predicted classes of unlabeled objects based on different views (different sets of features for characterizing the same object). In some applications, due to the cost involved in data acquisition, only a subset of features may be obtained for many unlabeled objects. Observing additional features of objects that were earlier incompletely characterized increases the data available for co-training, hence improving the classification accuracy. This paper addresses the problem of active learning of features: which additional features should be acquired of incompletely characterized objects in order to maximize the accuracy of the learned classifier? Our method, which extends previous techniques for the active learning of labels, is experimentally shown to be effective in a real-life multi-sensor mine detection problem.

