Results 1  10
of
737
SemiSupervised Learning Literature Survey
, 2006
"... We review the literature on semisupervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semisupervised learning. This document is a chapter ..."
Abstract

Cited by 781 (8 self)
 Add to MetaCart
(Show Context)
We review the literature on semisupervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semisupervised learning. This document is a chapter excerpt from the author’s
doctoral thesis (Zhu, 2005). However the author plans to update the online version frequently to incorporate the latest development in the field. Please obtain the latest
version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Learning with local and global consistency.
 In NIPS,
, 2003
"... Abstract We consider the general problem of learning from labeled and unlabeled data, which is often called semisupervised learning or transductive inference. A principled approach to semisupervised learning is to design a classifying function which is sufficiently smooth with respect to the intr ..."
Abstract

Cited by 669 (21 self)
 Add to MetaCart
(Show Context)
Abstract We consider the general problem of learning from labeled and unlabeled data, which is often called semisupervised learning or transductive inference. A principled approach to semisupervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.
Manifold regularization: A geometric framework for learning from labeled and unlabeled examples
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning al ..."
Abstract

Cited by 576 (16 self)
 Add to MetaCart
(Show Context)
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graphbased approaches) we obtain a natural outofsample extension to novel examples and so are able to handle both transductive and truly semisupervised settings. We present experimental evidence suggesting that our semisupervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.
Random walks for image segmentation
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2006
"... A novel method is proposed for performing multilabel, interactive image segmentation. Given a small number of pixels with userdefined (or predefined) labels, one can analytically and quickly determine the probability that a random walker starting at each unlabeled pixel will first reach one of the ..."
Abstract

Cited by 388 (21 self)
 Add to MetaCart
(Show Context)
A novel method is proposed for performing multilabel, interactive image segmentation. Given a small number of pixels with userdefined (or predefined) labels, one can analytically and quickly determine the probability that a random walker starting at each unlabeled pixel will first reach one of the prelabeled pixels. By assigning each pixel to the label for which the greatest probability is calculated, a highquality image segmentation may be obtained. Theoretical properties of this algorithm are developed along with the corresponding connections to discrete potential theory and electrical circuits. This algorithm is formulated in discrete space (i.e., on a graph) using combinatorial analogues of standard operators and principles from continuous potential theory, allowing it to be applied in arbitrary dimension on arbitrary graphs.
SemiSupervised Learning on Riemannian Manifolds
, 2004
"... We consider the general problem of utilizing both labeled and unlabeled data to improve classification accuracy. Under the assumption that the data lie on a submanifold in a high dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. ..."
Abstract

Cited by 194 (7 self)
 Add to MetaCart
We consider the general problem of utilizing both labeled and unlabeled data to improve classification accuracy. Under the assumption that the data lie on a submanifold in a high dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. The central idea of our approach is that classification functions are naturally defined only on the submanifold in question rather than the total ambient space. Using the LaplaceBeltrami operator one produces a basis (the Laplacian Eigenmaps) for a Hilbert space of square integrable functions on the submanifold. To recover such a basis, only unlabeled examples are required. Once such a basis is obtained, training can be performed using the labeled data set. Our algorithm models the manifold using the adjacency graph for the data and approximates the LaplaceBeltrami operator by the graph Laplacian. We provide details of the algorithm, its theoretical justification, and several practical applications for image, speech, and text classification.
Fast random walk with restart and its applications
 In ICDM ’06: Proceedings of the 6th IEEE International Conference on Data Mining
, 2006
"... How closely related are two nodes in a graph? How to compute this score quickly, on huge, diskresident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been successfully used in numerous settings, like automatic captionin ..."
Abstract

Cited by 179 (19 self)
 Add to MetaCart
(Show Context)
How closely related are two nodes in a graph? How to compute this score quickly, on huge, diskresident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been successfully used in numerous settings, like automatic captioning of images, generalizations to the “connection subgraphs”, personalized PageRank, and many more. However, the straightforward implementations of RWR do not scale for large graphs, requiring either quadratic space and cubic precomputation time, or slow response time on queries. We propose fast solutions to this problem. The heart of our approach is to exploit two important properties shared by many real graphs: (a) linear correlations and (b) blockwise, communitylike structure. We exploit the linearity by using lowrank matrix approximation, and the community structure by graph partitioning, followed by the ShermanMorrison lemma for matrix inversion. Experimental results on the Corel image and the DBLP dabasets demonstrate that our proposed methods achieve significant savings over the straightforward implementations: they can save several orders of magnitude in precomputation and storage cost, and they achieve up to 150x speed up with 90%+ quality preservation. 1
SemiSupervised Classification by Low Density Separation
, 2005
"... We believe that the cluster assumption is key to successful semisupervised learning. Based on this, we propose three semisupervised algorithms: 1. deriving graphbased distances that emphazise low density regions between clusters, followed by training a standard SVM; 2. optimizing the Transd ..."
Abstract

Cited by 177 (9 self)
 Add to MetaCart
We believe that the cluster assumption is key to successful semisupervised learning. Based on this, we propose three semisupervised algorithms: 1. deriving graphbased distances that emphazise low density regions between clusters, followed by training a standard SVM; 2. optimizing the Transductive SVM objective function, which places the decision boundary in low density regions, by gradient descent; 3. combining the first two to make maximum use of the cluster assumption. We compare with state of the art algorithms and demonstrate superior accuracy for the latter two methods.
Towards a Theoretical Foundation for LaplacianBased Manifold Methods
, 2007
"... In recent years manifold methods have attracted a considerable amount of attention in machine learning. However most algorithms in that class may be termed “manifoldmotivated” as they lack any explicit theoretical guarantees. In this paper we take a step towards closing the gap between theory and p ..."
Abstract

Cited by 156 (12 self)
 Add to MetaCart
(Show Context)
In recent years manifold methods have attracted a considerable amount of attention in machine learning. However most algorithms in that class may be termed “manifoldmotivated” as they lack any explicit theoretical guarantees. In this paper we take a step towards closing the gap between theory and practice for a class of Laplacianbased manifold methods. These methods utilize the graph Laplacian associated to a data set for a variety of applications in semisupervised learning, clustering, data representation. We show that under certain conditions the graph Laplacian of a point cloud of data samples converges to the LaplaceBeltrami operator on the underlying manifold. Theorem 3.1 contains the first result showing convergence of a random graph Laplacian to the manifold Laplacian in the context of machine learning.
Ranking on Data Manifolds
 Advances in Neural Information Processing Systems 16
, 2004
"... The Google search engine has enjoyed huge success with its web page ranking algorithm, which exploits global, rather than local, hyperlink structure of the web using random walks. Here we propose a simple universal ranking algorithm for data lying in the Euclidean space, such as text or image data. ..."
Abstract

Cited by 146 (1 self)
 Add to MetaCart
(Show Context)
The Google search engine has enjoyed huge success with its web page ranking algorithm, which exploits global, rather than local, hyperlink structure of the web using random walks. Here we propose a simple universal ranking algorithm for data lying in the Euclidean space, such as text or image data. The core idea of our method is to rank the data with respect to the intrinsic manifold structure collectively revealed by a great amount of data. Encouraging experimental results from synthetic, image, and text data illustrate the validity of our method.
Regularization and semisupervised learning on large graphs
 In COLT
, 2004
"... Abstract. We consider the problem of labeling a partially labeled graph. This setting may arise in a number of situations from survey sampling to information retrieval to pattern recognition in manifold settings. It is also of potential practical importance, when the data is abundant, but labeling i ..."
Abstract

Cited by 145 (1 self)
 Add to MetaCart
(Show Context)
Abstract. We consider the problem of labeling a partially labeled graph. This setting may arise in a number of situations from survey sampling to information retrieval to pattern recognition in manifold settings. It is also of potential practical importance, when the data is abundant, but labeling is expensive or requires human assistance. Our approach develops a framework for regularization on such graphs. The algorithms are very simple and involve solving a single, usually sparse, system of linear equations. Using the notion of algorithmic stability, we derive bounds on the generalization error and relate it to structural invariants of the graph. Some experimental results testing the performance of the regularization algorithm and the usefulness of the generalization bound are presented. 1