Results 1–10 of 91
Semi-Supervised Learning Literature Survey
, 2006
Abstract

Cited by 782 (8 self)
We review the literature on semi-supervised learning, which is an area of machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Generalized expectation criteria for semi-supervised learning of conditional random fields
In Proc. ACL, pages 870–878
, 2008
Abstract

Cited by 108 (11 self)
This paper presents a semi-supervised training method for linear-chain conditional random fields that makes use of labeled features rather than labeled instances. This is accomplished by using generalized expectation criteria to express a preference for parameter settings in which the model’s distribution on unlabeled data matches a target distribution. We induce target conditional probability distributions of labels given features from both annotated feature occurrences in context and ad-hoc feature majority label assignment. The use of generalized expectation criteria allows for a dramatic reduction in annotation time by shifting from traditional instance-labeling to feature-labeling, and the methods presented outperform traditional CRF training and other semi-supervised methods when limited human effort is available.
A survey of kernel and spectral methods for clustering
Pattern Recognit.
, 2008
Abstract

Cited by 88 (5 self)
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The kernel clustering methods presented are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and Neural Gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph-cut problem in which an appropriate objective function has to be optimized. An explicit proof that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches share the same mathematical foundation. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm.
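The graph-cut formulation summarized in this abstract can be sketched in a few lines for the two-cluster case: build a Gaussian similarity graph, form the symmetric normalized Laplacian, and split on the sign of the eigenvector for the second-smallest eigenvalue (the Fiedler vector). The Gaussian kernel, the `sigma` value, and the restriction to two clusters are illustrative simplifications, not the survey's full treatment.

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Minimal 2-way spectral clustering sketch (normalized-cut relaxation).
    Assumes a fully connected Gaussian similarity graph; sigma is illustrative."""
    # Pairwise squared distances -> Gaussian affinity matrix W (no self-loops).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_isqrt @ W @ D_isqrt
    # The Fiedler vector relaxes the normalized-cut indicator;
    # its sign pattern gives the two-way partition.
    vals, vecs = np.linalg.eigh(L)
    return (vecs[:, 1] > 0).astype(int)
```

For more than two clusters, the usual extension keeps the first k eigenvectors as a node embedding and runs K-means on the rows, which is exactly where the kernel and spectral views meet.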
The rendezvous algorithm: Multi-class semi-supervised learning with Markov random walks
 in Proc. Int. Conf
, 2007
Abstract

Cited by 31 (0 self)
We consider the problem of multi-class classification where both labeled and unlabeled data points are given. We introduce and demonstrate a new approach for estimating a distribution over the missing labels, where data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them. The algorithm associates each point with a particle which moves between points according to P. Labeled points are set to be absorbing states of the Markov random walk, and the probability of each particle being absorbed by the different labeled points, as the number of steps increases, is then used to derive a distribution over the associated missing label. A computationally efficient algorithm to implement this is derived and demonstrated on both real and artificial data sets, including a numerical comparison with other methods. 1. Introduction: Semi-supervised learning (SSL) is generally concerned with the following problem: given a set of samples {s1,..., sl, sl+1,..., sl+u} and the labels of the first l samples, {y1,..., yl}, estimate {yl+1,..., yl+u}, the labels of the rest of the points. The underlying assumption usually made is that the data is scattered such that it is correlated with the labels. For example, Maximum Variance Unfolding (Weinberger & Saul, 2004), Laplacian Eigenmaps (Belkin & Niyogi, 2003) and Laplacian RLS (Belkin et al., 2005) assume the effective number of dimensions occupied by the data is smaller than the input space dimension (the manifold ...
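The absorbing-random-walk idea in this abstract has a closed form: with labeled points as absorbing states, the long-run absorption probabilities solve a linear system in the unlabeled block of P. The sketch below (Gaussian similarities, dense linear solve, the names `rendezvous_labels` and `sigma`) is an illustrative reconstruction, not the paper's exact construction or its efficient algorithm.

```python
import numpy as np

def rendezvous_labels(X, y, labeled_idx, sigma=1.0):
    """Estimate missing labels via an absorbing Markov random walk.
    X: (n, d) points; y: labels aligned with labeled_idx, which is
    assumed to be in increasing order so the solve's columns align with y."""
    n = X.shape[0]
    # Pairwise Gaussian similarities -> row-normalized transition matrix P.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    P = W / W.sum(axis=1, keepdims=True)

    labeled = np.zeros(n, dtype=bool)
    labeled[labeled_idx] = True
    u = ~labeled
    # Absorption probabilities of each unlabeled node into each labeled node:
    # B = (I - P_uu)^{-1} P_ul, with labeled nodes as absorbing states.
    B = np.linalg.solve(np.eye(u.sum()) - P[np.ix_(u, u)], P[np.ix_(u, labeled)])

    classes = np.unique(y)
    # Class probability = total absorption mass on that class's labeled points.
    probs = np.stack([B[:, np.asarray(y) == c].sum(axis=1) for c in classes], axis=1)
    pred = np.empty(n, dtype=classes.dtype)
    pred[labeled_idx] = y
    pred[u] = classes[probs.argmax(axis=1)]
    return pred
```

The `probs` rows are exactly the "distribution over the associated missing label" the abstract describes; taking the argmax is just one way to read a hard decision off it.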
A Brief Survey on Sequence Classification
Abstract

Cited by 31 (1 self)
Sequence classification has a broad range of applications such as genomic analysis, information retrieval, health informatics, finance, and anomaly detection. Unlike the classification task on feature vectors, sequences do not have explicit features. Even with sophisticated feature-selection techniques, the dimensionality of potential features may still be very high, and the sequential nature of features is difficult to capture. This makes sequence classification a more challenging task than classification on feature vectors. In this paper, we present a brief review of the existing work on sequence classification. We summarize sequence classification in terms of methodologies and application domains. We also provide a review of several extensions of the sequence classification problem, such as early classification on sequences and semi-supervised learning on sequences.
Person Identification in Webcam Images: An Application of Semi-Supervised Learning
ICML 2005 Workshop on Learning with Partially Classified Training Data
, 2005
Abstract

Cited by 25 (1 self)
An application of semi-supervised learning is made to the problem of person identification in low-quality webcam images. Using a set of images of ten people collected over a period of four months, the person identification task is posed as a graph-based semi-supervised learning problem, where only a few training images are labeled. The importance of domain knowledge in graph construction is discussed, and experiments are presented that clearly show the advantage of semi-supervised learning over standard supervised learning. The data used in the study is available to the research community to encourage further investigation of this problem.
Simple, robust, scalable semi-supervised learning via expectation regularization
 The 24th International Conference on Machine Learning
, 2007
Abstract

Cited by 20 (1 self)
Although semi-supervised learning has been an active area of research, its use in deployed applications is still relatively rare because the methods are often difficult to implement, fragile in tuning, or lacking in scalability. This paper presents expectation regularization, a semi-supervised learning method for exponential-family parametric models that augments the traditional conditional label-likelihood objective function with an additional term that encourages model predictions on unlabeled data to match certain expectations, such as label priors. The method is extremely easy to implement, scales as well as logistic regression, and can handle non-independent features. We present experiments on five different data sets, showing accuracy improvements over other semi-supervised methods.
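The augmented objective described here is easy to make concrete for binary logistic regression: labeled log-loss plus a KL penalty pulling the model's mean prediction on the unlabeled pool toward the label prior. The weight `lam`, the plain gradient-descent fit, and the binary restriction are illustrative choices in this sketch, not the paper's exact configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xr_objective(w, X_lab, y_lab, X_unl, prior, lam=10.0):
    """Labeled NLL + lam * KL(prior || mean prediction on unlabeled data)."""
    p = sigmoid(X_lab @ w)
    nll = -np.mean(y_lab * np.log(p + 1e-12) + (1 - y_lab) * np.log(1 - p + 1e-12))
    q = np.clip(sigmoid(X_unl @ w).mean(), 1e-6, 1 - 1e-6)
    kl = prior * np.log(prior / q) + (1 - prior) * np.log((1 - prior) / (1 - q))
    return nll + lam * kl

def xr_grad(w, X_lab, y_lab, X_unl, prior, lam=10.0):
    # Gradient of the objective: logistic NLL term plus chain rule through q.
    p = sigmoid(X_lab @ w)
    g = X_lab.T @ (p - y_lab) / len(y_lab)
    p_u = sigmoid(X_unl @ w)
    q = np.clip(p_u.mean(), 1e-6, 1 - 1e-6)
    dq = (X_unl * (p_u * (1 - p_u))[:, None]).mean(axis=0)  # dq/dw
    return g + lam * (-prior / q + (1 - prior) / (1 - q)) * dq

def fit_xr(X_lab, y_lab, X_unl, prior, lr=0.1, steps=500):
    # Plain gradient descent keeps the sketch dependency-free.
    w = np.zeros(X_lab.shape[1])
    for _ in range(steps):
        w -= lr * xr_grad(w, X_lab, y_lab, X_unl, prior)
    return w
```

Because the penalty only constrains the *mean* prediction over the unlabeled pool, it costs one extra pass over the unlabeled data per step, which is the source of the scalability claim.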
Semi-Supervised Learning Using Label Mean
Abstract

Cited by 17 (9 self)
Semi-Supervised Support Vector Machines (S3VMs) typically estimate the label assignments for the unlabeled instances directly. This is often inefficient even with recent advances in the efficient training of the (supervised) SVM. In this paper, we show that S3VMs, with knowledge of the means of the class labels of the unlabeled data, are closely related to the supervised SVM with known labels on all the unlabeled data. This motivates us to first estimate the label means of the unlabeled data. Two versions of the mean-S3VM, which work by maximizing the margin between the label means, are proposed. The first is based on multiple kernel learning, while the second is based on alternating optimization. Experiments show that both of the proposed algorithms achieve highly competitive, and sometimes even the best, performance compared to state-of-the-art semi-supervised learners. Moreover, they are more efficient than existing S3VMs.
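The "label means" intuition can be illustrated with a toy stand-in: given soft estimates of each unlabeled point's class, form the two weighted class means and separate them with the hyperplane through their midpoint. This degenerate nearest-mean case only shows why the means carry enough information to orient a classifier; it is not the paper's multiple-kernel-learning or alternating-optimization algorithm, and the function name is hypothetical.

```python
import numpy as np

def mean_margin_classifier(X_unl, p_pos):
    """Toy sketch: separate the estimated label means of the unlabeled data.
    p_pos gives each point's estimated probability of being positive."""
    X = np.asarray(X_unl, dtype=float)
    p = np.asarray(p_pos, dtype=float)
    # Soft (probability-weighted) means of the two classes.
    m_pos = (p[:, None] * X).sum(axis=0) / p.sum()
    m_neg = ((1 - p)[:, None] * X).sum(axis=0) / (1 - p).sum()
    w = m_pos - m_neg                   # normal to the mean-separating plane
    b = -w @ (m_pos + m_neg) / 2.0      # bias puts the midpoint on the boundary
    return lambda Xq: np.sign(Xq @ w + b)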
RolX: Structural Role Extraction & Mining in Large Graphs
"... Given a network, intuitively two nodes belong to the same role if they have similar structural behavior. Roles should be automatically determined from the data, and could be, for example, “cliquemembers, ” “peripherynodes, ” etc. Roles enable numerous novel and useful network mining tasks, such as ..."
Abstract

Cited by 15 (2 self)
 Add to MetaCart
(Show Context)
Given a network, intuitively two nodes belong to the same role if they have similar structural behavior. Roles should be automatically determined from the data, and could be, for example, “cliquemembers, ” “peripherynodes, ” etc. Roles enable numerous novel and useful network mining tasks, such as sensemaking, searching for similar nodes, and node classification. This paper addresses the question: Given a graph, how can we automatically discover roles for nodes? We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting structural roles from general network data. We demonstrate the effectiveness of RolX on several network mining tasks, from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not); and show that the two approach are complimentary in nature.