Semi-Supervised Learning Literature Survey, 2006
Cited by 268 (7 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Efficient co-regularised least squares regression - in ICML’06, 2006
Cited by 12 (0 self)
In many applications, unlabelled examples are inexpensive and easy to obtain. Semi-supervised approaches try to utilise such examples to reduce the predictive error. In this paper, we investigate a semi-supervised least squares regression algorithm based on the co-learning approach. Similar to other semi-supervised algorithms, our base algorithm has cubic runtime complexity in the number of unlabelled examples. To be able to handle larger sets of unlabelled examples, we devise a semi-parametric variant that scales linearly in the number of unlabelled examples. Experiments show a significant error reduction by co-regularisation and a large runtime improvement for the semi-parametric approximation. Last but not least, we propose a distributed procedure that can be applied without collecting all data at a single site.
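The co-regularisation idea in the abstract above can be illustrated with a toy sketch. It is not the paper's kernel-based algorithm: it assumes each view is a single scalar feature with one weight and no bias, so the squared-error objective with a disagreement penalty on unlabelled points reduces to a 2x2 linear system. The names `co_regularised_fit`, `disagreement`, `lam` and `ridge` are hypothetical and chosen for illustration only.

```python
def co_regularised_fit(view1, view2, targets, unl1, unl2, lam, ridge=0.0):
    """Fit one scalar weight per view (toy assumption: no bias, 1-D views).

    Minimises
        sum_i (w1*a_i - y_i)^2 + sum_i (w2*b_i - y_i)^2
        + ridge * (w1^2 + w2^2)
        + lam * sum_j (w1*c_j - w2*d_j)^2
    where (a_i, b_i, y_i) are labelled examples and (c_j, d_j) are
    unlabelled examples seen through the two views.
    """
    saa = sum(a * a for a in view1)
    sbb = sum(b * b for b in view2)
    scc = sum(c * c for c in unl1)
    sdd = sum(d * d for d in unl2)
    scd = sum(c * d for c, d in zip(unl1, unl2))
    say = sum(a * y for a, y in zip(view1, targets))
    sby = sum(b * y for b, y in zip(view2, targets))

    # Normal equations: [[m11, m12], [m12, m22]] @ [w1, w2] = [say, sby]
    m11 = saa + ridge + lam * scc
    m22 = sbb + ridge + lam * sdd
    m12 = -lam * scd

    # Solve the 2x2 system with Cramer's rule.
    det = m11 * m22 - m12 * m12
    w1 = (say * m22 - m12 * sby) / det
    w2 = (m11 * sby - m12 * say) / det
    return w1, w2


def disagreement(w1, w2, unl1, unl2):
    """Squared disagreement of the two view predictors on unlabelled data."""
    return sum((w1 * c - w2 * d) ** 2 for c, d in zip(unl1, unl2))
```

With `lam=0.0` the two views are fitted independently; increasing `lam` pulls their predictions on the unlabelled points together, which is the effect the co-regularisation term is meant to have.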
Analyzing co-training style algorithms - in Proceedings of the 18th European Conference on Machine Learning, 2007
Cited by 6 (4 self)
Co-training is a semi-supervised learning paradigm which trains two learners respectively from two different views and lets the learners label some unlabeled examples for each other. In this paper, we present a new PAC analysis on co-training style algorithms. We show that the co-training process can succeed even without two views, given that the two learners have a large difference, which explains the success of some co-training style algorithms that do not require two views. Moreover, we theoretically explain why the co-training process cannot improve the performance further after a number of rounds, and present a rough estimate of the appropriate round at which to terminate co-training to avoid wasteful learning rounds.
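The co-training loop described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm analysed in the paper: each view is a single number, each learner is a nearest-centroid classifier over classes 0 and 1, confidence is the margin between the two centroid distances, and newly labelled examples go into one shared labelled set. All function names are hypothetical.

```python
def centroid_fit(values, labels):
    """Return the mean feature value of each class (classes 0 and 1)."""
    means = {}
    for cls in (0, 1):
        members = [v for v, l in zip(values, labels) if l == cls]
        means[cls] = sum(members) / len(members)
    return means


def centroid_predict(means, x):
    """Return (label, confidence); confidence is the distance margin."""
    d0, d1 = abs(x - means[0]), abs(x - means[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)


def co_train(labeled, unlabeled, rounds):
    """labeled: list of ((view1, view2), y); unlabeled: list of (view1, view2).

    In each round, each view's learner is refitted on the current labelled
    set, labels the pool example it is most confident about, and moves it
    into the shared labelled set, where the other learner will use it.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            means = centroid_fit(
                [x[view] for x, _ in labeled], [y for _, y in labeled]
            )
            best = max(pool, key=lambda x: centroid_predict(means, x[view])[1])
            label, _ = centroid_predict(means, best[view])
            pool.remove(best)
            labeled.append((best, label))
    return labeled, pool
```

A fixed `rounds` argument stands in for the stopping point; the abstract's observation that performance stops improving after some number of rounds is the reason such a cap is worth having.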
Semisupervised regression with order preferences, 2006
Cited by 4 (1 self)
Following a discussion on the general form of regularization for semi-supervised learning, we propose a semi-supervised regression algorithm. It is based on the assumption that we have certain order preferences on unlabeled data (e.g., point x1 has a larger target value than x2). Semi-supervised learning consists of enforcing the order preferences as regularization in a risk minimization framework. The optimization problem can be effectively solved by a linear program. Experiments show that the proposed semi-supervised regression outperforms standard regression.
1 Semi-supervised learning as regularization on unlabeled data
Semi-supervised learning works when its assumption on unlabeled data, often expressed as regularization, fits the reality of the problem domain. In this paper we first generalize the regularization formulation of some common semi-supervised learning approaches, namely manifold regularization, semi-supervised support vector machines, and multi-view learning [1, 2, 3]. Regularization for each individual approach is not new. However these approaches have been studied largely in isolation. Our general form serves as a bridge to connect them, and to inspire novel semi-supervised approaches. As an example of the latter, we propose a novel algorithm for semi-supervised regression. The proposed regression algorithm is able to incorporate domain knowledge about the relative order of target values on unlabeled points. It thus differs from, and complements, existing semi-supervised regression methods, which do not use such domain knowledge but require multiple views [4, 5]. Let us review the three common semi-supervised learning methods. Manifold regularization [6, 7] generalizes
Improve Computer-Aided Diagnosis with Machine Learning Techniques Using Undiagnosed Samples
Cited by 4 (2 self)
In computer-aided diagnosis, machine learning techniques have been widely applied to learn hypotheses from diagnosed samples in order to assist medical experts in making diagnoses. To learn a well-performing hypothesis, a large number of diagnosed samples is required. Although the samples can be easily collected from routine medical examinations, it is usually impossible for the medical experts to make a diagnosis for each of the collected samples. If a hypothesis could be learned in the presence of a large number of undiagnosed samples, the heavy burden on the medical experts could be relieved. In this paper, a new semi-supervised learning algorithm named Co-Forest is proposed. It extends the co-training paradigm by using a well-known ensemble method named Random Forest, which enables Co-Forest to estimate the labeling confidence of undiagnosed samples and produce the final hypothesis easily. Experiments on benchmark data sets verify the effectiveness of the proposed algorithm. Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building computer-aided diagnosis systems, and that Co-Forest is able to enhance the performance of the hypothesis learned on only a small number of diagnosed samples by utilizing the available undiagnosed samples.
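One way an ensemble can estimate labeling confidence, as the abstract above mentions, is by measuring how strongly its members agree. The sketch below assumes that confidence for a given member is the fraction of the other members voting for the majority label; the abstract does not spell out the exact rule, so this is an illustrative assumption, and the names `concomitant_confidence`, `select_confident` and `threshold` are hypothetical.

```python
from collections import Counter


def concomitant_confidence(votes, i):
    """Majority label and agreement fraction among all members except i.

    votes: list with one predicted label per ensemble member.
    """
    others = votes[:i] + votes[i + 1:]
    label, count = Counter(others).most_common(1)[0]
    return label, count / len(others)


def select_confident(vote_table, i, threshold):
    """Pick unlabeled samples that the other members label confidently.

    vote_table: one list of member votes per unlabeled sample.
    Returns (sample_index, label) pairs whose agreement exceeds threshold;
    these would be the candidates for refining member i.
    """
    picked = []
    for idx, votes in enumerate(vote_table):
        label, conf = concomitant_confidence(votes, i)
        if conf > threshold:
            picked.append((idx, label))
    return picked
```

Excluding member `i` from its own confidence estimate keeps a learner from reinforcing its own mistakes, which is the usual motivation for letting learners label examples for each other in co-training.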
Estimation of mixture models using Co-EM - In Proceedings of the ICML Workshop on Learning with Multiple Views, 2005
Cited by 3 (0 self)
We study estimation of mixture models for problems in which multiple views of the instances are available. Examples of this setting include clustering web pages or research papers that have intrinsic (text) and extrinsic (references) attributes. Our optimization criterion quantifies the likelihood and the consensus among models in the individual views; maximizing this consensus minimizes a bound on the risk of assigning an instance to an incorrect mixture component. We derive an algorithm that maximizes this criterion. Empirically, we observe that the resulting clustering method incurs a lower cluster entropy than regular EM for web pages, research papers, and many text collections.
Semi-supervised Learning with Data Calibration for Long-Term Time Series Forecasting
Cited by 1 (1 self)
Many time series prediction methods have focused on single-step or short-term prediction problems due to the inherent difficulty in controlling the propagation of errors from one prediction step to the next. Yet, there is a broad range of applications, such as climate impact assessments and urban growth planning, that require long-term forecasting capabilities for strategic decision making. Training an accurate model that produces reliable long-term predictions would require an extensive amount of historical data, which are either unavailable or expensive to acquire. For some of these domains, there are alternative ways to generate potential scenarios for the future using computer-driven simulation models, such as global climate and traffic demand models. However, the data generated by these models are currently utilized in a supervised learning setting, where a predictive model trained on past observations is used to estimate the future values. In this paper, we present a semi-supervised learning framework for long-term time series forecasting based on Hidden Markov Model Regression. A covariance alignment method is also developed to deal with the issue of inconsistencies between historical and model simulation data. We evaluated our approach on data sets from a variety of domains, including climate modeling. Our experimental results demonstrate the efficacy of the approach compared to other supervised learning methods for long-term time series forecasting.
Semi-supervised Learning of Joint Density Models for Human Pose Estimation
Learning regression models (for example for body pose estimation, or BPE) currently requires large numbers of training examples: pairs of the form (image, pose parameters). These examples are difficult to obtain for many problems, demanding considerable effort in manual labelling. However, it is easy to obtain unlabelled examples: in BPE, simply by collecting many images, and by sampling many poses using motion capture. We show how the use of unlabelled examples can improve the performance of such estimators, making better use of the difficult-to-obtain training examples. Because the distribution of parameters conditioned on a given image is often multimodal, conventional regression models must be extended to allow for multiple modes. Such extensions have to date had a pre-set number of modes, independent of the contents of the input image, and amount to fitting several regressors simultaneously. Our framework models instead the joint distribution of images and poses, so the conditional estimates are inherently multimodal, and the number of modes is a function of the joint-space complexity, rather than of the maximum number of output modes. We demonstrate the improvements obtainable by using unlabelled samples on synthetic examples and on a real pose estimation problem, and demonstrate in both cases the additional accuracy provided by the use of unlabelled data.

