Semi-Supervised Learning Literature Survey, 2006
Cited by 782 (8 self)
Abstract: We review the literature on semi-supervised learning, an area of machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author's doctoral thesis (Zhu, 2005); however, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf

Classification in Networked Data: A toolkit and a univariate case study, 2006
Cited by 200 (10 self)
Abstract: This paper is about classifying entities that are interlinked with entities for which the class is known. After surveying prior work, we present NetKit, a modular toolkit for classification in networked data, and a case study of its application to networked data used in prior machine learning research. NetKit is based on a node-centric framework in which classifiers comprise a local classifier, a relational classifier, and a collective inference procedure. Various existing node-centric relational learning algorithms can be instantiated with appropriate choices for these components, and new combinations of components realize new algorithms. The case study focuses on univariate network classification, for which the only information used is the structure of class linkage in the network (i.e., only links and some class labels). To our knowledge, no prior work has systematically evaluated the power of class linkage alone for classification on machine learning benchmark data sets. The results demonstrate that very simple network-classification models perform quite well, well enough that they should be used regularly as baseline classifiers for studies of learning with networked data. The simplest method (which performs remarkably well) highlights the close correspondence between several existing methods introduced for different purposes, namely Gaussian-field classifiers, Hopfield networks, and relational-neighbor classifiers. The case study also shows that two sets of techniques are preferable in different situations, namely when few versus many labels are known initially. We also demonstrate that link selection plays a role similar to traditional feature selection.

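The relational-neighbor classifier mentioned in this abstract can be illustrated with a short sketch. This is a hedged toy version, not NetKit itself: the function name, the edge format, and the relaxation-labeling loop are assumptions for illustration. Each unlabeled node's class score is the weighted average of its neighbors' current scores, with labeled nodes clamped to their known values.

```python
from collections import defaultdict

def wvrn_predict(edges, known, n_iters=10):
    """Weighted-vote relational-neighbor scoring with relaxation
    labeling: each unlabeled node's score is the weighted mean of
    its neighbors' current scores; labeled nodes stay clamped."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    prior = sum(known.values()) / len(known)  # class prior from the labeled nodes
    score = {n: known.get(n, prior) for n in adj}
    for _ in range(n_iters):
        score = {
            n: known[n] if n in known
            else sum(w * score[v] for v, w in adj[n]) / sum(w for _, w in adj[n])
            for n in adj
        }
    return score

# Toy chain 0-1-2-3: node 0 labeled positive (1.0), node 3 negative (0.0).
scores = wvrn_predict([(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)], {0: 1.0, 3: 0.0})
# scores[1] ends up closer to the positive node's label than scores[2]
```

Despite its simplicity, this is the kind of baseline the case study argues should always be reported for networked-data experiments.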
MISSL: Multiple-Instance Semi-Supervised Learning
In Proceedings of the International Conference on Machine Learning (ICML), 2006
Cited by 32 (0 self)
Abstract: There has been much work on applying multiple-instance (MI) learning to content-based image retrieval (CBIR), where the goal is to rank all images in a known repository using a small labeled data set. Most existing MI learning algorithms are non-transductive in that the images in the repository serve only as test data and are not used in the learning process. We present MISSL (Multiple-Instance Semi-Supervised Learning), which transforms any MI problem into an input for a graph-based single-instance semi-supervised learning method, encoding the MI aspects of the problem while working simultaneously at both the bag and point levels. Unlike most prior MI learning algorithms, MISSL makes use of the unlabeled data.

Word Sense Disambiguation Using Label Propagation Based Semi-Supervised Learning
In Proceedings of the ACL, 2005
Cited by 30 (4 self)
Abstract: The shortage of manually sense-tagged data is an obstacle to supervised word sense disambiguation (WSD) methods. In this paper we investigate a label-propagation-based semi-supervised learning algorithm for WSD, which combines unlabeled data with labeled data in the learning process by representing labeled and unlabeled examples as vertices in a weighted graph and iteratively propagating the label information from each vertex to nearby vertices until the process converges. This label propagation realizes a global consistency assumption: similar examples should have similar labels. Our experimental results on benchmark corpora indicate that the algorithm consistently outperforms SVM when only very few labeled examples are available, and that its performance is also better than monolingual bootstrapping and comparable to bilingual bootstrapping.

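The label propagation procedure described in this abstract (labeled and unlabeled examples as vertices of a weighted graph, labels spread iteratively until convergence) can be sketched in a few lines. This is a generic toy version, not the paper's implementation; the matrix names and the chain-graph example are assumptions for illustration.

```python
import numpy as np

def label_propagation(W, y, labeled, n_iters=200):
    """Iterative label propagation on a weighted graph.
    W: (n, n) symmetric affinity matrix; y: (n, k) rows are one-hot
    for labeled nodes, zero elsewhere; labeled: boolean mask.
    Each step averages label mass over neighbors; labeled rows are
    re-clamped so their known labels never change."""
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    F = y.astype(float).copy()
    for _ in range(n_iters):
        F = P @ F
        F[labeled] = y[labeled]  # clamp the labeled vertices
    return F.argmax(axis=1)

# Chain 0-1-2-3: node 0 labeled class 0, node 3 labeled class 1.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = np.zeros((4, 2))
y[0, 0] = 1
y[3, 1] = 1
labeled = np.array([True, False, False, True])
pred = label_propagation(W, y, labeled)  # node 1 -> class 0, node 2 -> class 1
```

The clamping step is what enforces the first of the usual constraints (fixed on labeled nodes), while repeated neighborhood averaging enforces smoothness over the graph.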
Relation Extraction Using Label Propagation Based Semi-Supervised Learning, 2006
Cited by 26 (0 self)
Abstract: The shortage of manually labeled data is an obstacle to supervised relation extraction methods. In this paper we investigate a graph-based semi-supervised learning algorithm, a label propagation (LP) algorithm, for relation extraction. It represents labeled and unlabeled examples and their distances as the nodes and edge weights of a graph, and tries to obtain a labeling function satisfying two constraints: 1) it should be fixed on the labeled nodes; 2) it should be smooth over the whole graph. Experimental results on the ACE corpus show that this LP algorithm achieves better performance than SVM when only very few labeled examples are available, and that it also performs better than bootstrapping for the relation extraction task.

Improving learning in network data by combining explicit and mined links
In AAAI, 2007
Cited by 21 (4 self)
Abstract: This paper is about using multiple types of information for classification of networked data in a semi-supervised setting: given a fully described network (nodes and edges) with known labels for some of the nodes, predict the labels of the remaining nodes. One method recently developed for doing such inference is a guilt-by-association model. This method has been independently developed in two different settings: relational learning and semi-supervised learning. In relational learning, the setting assumes that the networked data has explicit links, such as hyperlinks between web pages or citations between research papers. The semi-supervised setting assumes a corpus of non-relational data and creates links based on similarity measures between the instances. Both use only the known labels in the network to predict the remaining labels, but they use very different information sources. The thesis of this paper is that if we combine these two types of links, the resulting network will carry more information than either type of link by itself. We test this thesis on six benchmark data sets, using a within-network learning algorithm, and show that combining the links yields significant improvements in predictive performance. We describe a principled way of combining multiple types of edges with different edge weights and semantics using an objective graph measure called node-based assortativity. We investigate the use of this measure to combine text-mined links with explicit links and show that our approach significantly improves the performance of our classifier over naively combining these two types of links.

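Assortativity, the graph measure this abstract uses to weight edge types, quantifies how strongly linked nodes share the same label. A minimal sketch of the standard assortativity coefficient for a categorical attribute (in the style of Newman's coefficient) follows; the function name and toy labels are hypothetical, and this is not the paper's node-based variant or code.

```python
import numpy as np

def label_assortativity(edges, labels):
    """Assortativity coefficient for a categorical node attribute:
    r = (tr(e) - sum(e @ e)) / (1 - sum(e @ e)), where e is the
    label-label mixing matrix over edges, normalized to sum to 1.
    r = 1 for perfectly homophilous links, r < 0 for cross-label links."""
    cats = sorted(set(labels.values()))
    idx = {c: i for i, c in enumerate(cats)}
    e = np.zeros((len(cats), len(cats)))
    for u, v in edges:
        i, j = idx[labels[u]], idx[labels[v]]
        e[i, j] += 1.0
        e[j, i] += 1.0  # undirected edges count in both directions
    e /= e.sum()
    s = (e @ e).sum()  # equals sum_k a_k * b_k for the marginals of e
    return (np.trace(e) - s) / (1.0 - s)

labels = {0: "a", 1: "a", 2: "b", 3: "b"}
r_homophilous = label_assortativity([(0, 1), (2, 3)], labels)  # 1.0: only same-label links
r_crossing = label_assortativity([(0, 2), (1, 3)], labels)     # -1.0: only cross-label links
```

A score like this, computed separately for explicit and text-mined edge sets, gives a data-driven way to decide how heavily each link type should count.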
A Discriminative Model for Semi-Supervised Learning, 2008
Cited by 20 (2 self)
Abstract: Supervised learning, that is, learning from labeled examples, is an area of machine learning that has reached substantial maturity. It has generated general-purpose and practically successful algorithms, and its foundations are quite well understood and captured by theoretical frameworks such as the PAC learning model and the statistical learning theory framework. However, for many contemporary practical problems, such as classifying web pages or detecting spam, there is often additional information available in the form of unlabeled data, which is often much cheaper and more plentiful than labeled data. As a consequence, there has recently been substantial interest in semi-supervised learning (using unlabeled data together with labeled data), since any useful information that reduces the amount of labeled data needed can be a significant benefit. Several techniques have been developed for doing this, along with experimental results on a variety of different learning problems. Unfortunately, the standard learning frameworks for reasoning about supervised learning do not capture the key aspects and the assumptions underlying these semi-supervised learning methods. In this paper we describe an augmented version of the PAC model designed for semi-supervised learning, which can be used to reason about many of the different approaches taken over the past decade.

Active Learning on Trees and Graphs
Cited by 19 (3 self)
Abstract: We investigate the problem of active learning on a given tree whose nodes are assigned binary labels in an adversarial way. Inspired by recent results of Guillory and Bilmes, we characterize (up to constant factors) the optimal placement of queries so as to minimize the mistakes made on the non-queried nodes. Our query selection algorithm is extremely efficient, and the optimal number of mistakes on the non-queried nodes is achieved by a simple and efficient min-cut classifier. Through a simple modification of the query selection algorithm, we also show optimality (up to constant factors) with respect to the trade-off between the number of queries and the number of mistakes on non-queried nodes. By using spanning trees, our algorithms can be efficiently applied to general graphs, although the problem of finding optimal and efficient active learning algorithms for general graphs remains open. Toward this end, we provide a lower bound on the number of mistakes made on arbitrary graphs by any active learning algorithm using a number of queries that is up to a constant fraction of the graph size.

Fast Prediction on a Tree
Cited by 17 (1 self)
Abstract: Given an n-vertex weighted tree with structural diameter S and a subset of m vertices, we present a technique to compute the corresponding m x m Gram matrix of the pseudoinverse of the graph Laplacian in O(n + m^2 + mS) time. We discuss the application of this technique to fast label prediction on a generic graph. We approximate the graph with a spanning tree and then predict with the kernel perceptron. We consider approximating the graph with either a minimum spanning tree or a shortest-path tree. The fast computation of the pseudoinverse enables us to address prediction problems on large graphs. We present experiments on two web-spam classification tasks, one of which involves a graph with 400,000 vertices and more than 10,000,000 edges. The results indicate that the accuracy of our technique is competitive with previous methods that use the full graph information.

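The object this abstract speeds up, the pseudoinverse of the graph Laplacian used as a Gram (kernel) matrix over vertices, can be shown with a naive dense computation. This sketch is not the paper's O(n + m^2 + mS) tree algorithm; it is the O(n^3) baseline that algorithm avoids, with a hypothetical function name and a toy path tree.

```python
import numpy as np

def laplacian_pinv(n, edges):
    """Dense pseudoinverse of a weighted graph Laplacian, usable as a
    Gram matrix over the n vertices (the kernel fed to, e.g., a kernel
    perceptron). Costs O(n^3); the paper computes just the m x m block
    for m chosen vertices of a tree in O(n + m^2 + mS) instead."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w  # degree terms on the diagonal
        L[v, v] += w
        L[u, v] -= w  # off-diagonal adjacency terms
        L[v, u] -= w
    return np.linalg.pinv(L)

# Path tree 0-1-2 with unit edge weights.
K = laplacian_pinv(3, [(0, 1, 1.0), (1, 2, 1.0)])
# K is symmetric and, like L itself, its rows sum to zero.
```

Because the Laplacian's null space is spanned by the all-ones vector, its pseudoinverse shares that null space, which is why the rows of K sum to zero.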