Results 1-10 of 133
Semi-Supervised Learning Literature Survey
, 2006
Cited by 757 (8 self)
We review the literature on semi-supervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s
doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest
version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation
 IEEE Transactions on Knowledge and Data Engineering
, 2006
Cited by 188 (19 self)
Abstract—This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, having the nice property of increasing when the number of paths connecting those elements increases and when the “length” of paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel matrix (its elements are inner products closely related to commute times). A principal component analysis (PCA) of the graph is introduced for computing the subspace projection of the node vectors in a manner that preserves as much variance as possible in terms of the Euclidean commute-time distance. This graph PCA provides a nice interpretation to the “Fiedler vector,” widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called “statistical relational learning” framework, could also be used to compute document or word similarities, and, more generally, it could be applied to machine-learning and pattern-recognition tasks involving a relational database. Index Terms—Graph analysis, graph and database mining, collaborative recommendation, graph kernels, spectral clustering, Fiedler vector, proximity measures, statistical relational learning.
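The relation between the Laplacian pseudoinverse and the average commute time mentioned in this abstract can be illustrated with a small sketch. It assumes the standard identity n(i, j) = V_G (l+_ii + l+_jj - 2 l+_ij), where V_G is the sum of all node degrees, and it assumes a connected graph so that L+ can be obtained as (L + J/n)^-1 - J/n with J the all-ones matrix. The function names and the dense pure-Python inversion are illustrative choices, not the authors' implementation.

```python
def invert(matrix):
    """Invert a small dense matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; is the graph connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pseudoinverse_laplacian(weights):
    """Return L+ for a connected, undirected graph given its weight matrix.

    Assumption: the graph is connected, so L+ = (L + J/n)^-1 - J/n.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    # Build L = D - A and add 1/n to every entry to make it invertible.
    shifted = [[(degrees[i] if i == j else 0.0) - weights[i][j] + 1.0 / n
                for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [[inverse[i][j] - 1.0 / n for j in range(n)] for i in range(n)]


def commute_time(weights, i, j):
    """Average commute time between nodes i and j of a connected graph."""
    lp = pseudoinverse_laplacian(weights)
    volume = sum(sum(row) for row in weights)  # V_G: sum of all degrees
    return volume * (lp[i][i] + lp[j][j] - 2.0 * lp[i][j])


# Hypothetical toy graph: a path 0 - 1 - 2 with unit edge weights.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
print(commute_time(path_graph, 0, 2))
```

On the toy path graph the end nodes are farther apart in commute time than adjacent nodes, which matches the abstract's remark that similarity decreases as paths get longer. A real recommender would use a sparse solver rather than dense inversion.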
Semi-supervised conditional random fields for improved sequence segmentation and labeling
 In International Committee on Computational Linguistics and the Association for Computational Linguistics
, 2006
Cited by 78 (7 self)
We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
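The shape of such an objective can be sketched for the simpler, non-structured case of a plain probabilistic classifier. This is a simplification and not the paper's CRF procedure, which has to compute entropies over whole label sequences. The trade-off weight `gamma` and the function names are assumptions, and any penalty on the model weights is left out.

```python
import math


def conditional_entropy(probs):
    """Shannon entropy of one predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_regularized_objective(labeled, unlabeled, gamma):
    """Labeled log-likelihood minus gamma times the unlabeled entropy.

    labeled:   list of (predicted distribution, index of the gold label)
    unlabeled: list of predicted distributions for unlabeled inputs
    gamma:     hypothetical trade-off weight (an assumption of this sketch)
    """
    log_likelihood = sum(math.log(probs[gold]) for probs, gold in labeled)
    entropy = sum(conditional_entropy(probs) for probs in unlabeled)
    return log_likelihood - gamma * entropy


# Hypothetical predictions: one labeled example, one unlabeled example.
labeled = [([0.5, 0.5], 0)]
uncertain = entropy_regularized_objective(labeled, [[0.5, 0.5]], gamma=1.0)
confident = entropy_regularized_objective(labeled, [[1.0, 0.0]], gamma=1.0)
print(confident > uncertain)
```

Maximizing this quantity rewards models that fit the labeled data and make confident predictions on unlabeled data. Because the entropy term is not concave in the model parameters, iterative ascent from a supervised starting point is a natural fit, as the abstract notes.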
Spectral Clustering and Transductive Learning with Multiple Views
Cited by 76 (2 self)
We consider spectral clustering and transductive inference for data with multiple views. A typical example is the web, which can be described by either the hyperlinks between web pages or the words occurring in web pages. When each view is represented as a graph, one may convexly combine the weight matrices or the discrete Laplacians for each graph, and then proceed with existing clustering or classification techniques. Such a solution might sound natural, but its underlying principle is not clear. Unlike this kind of methodology, we develop multi-view spectral clustering via generalizing the normalized cut from a single view to multiple views. We further build multi-view transductive inference on the basis of multi-view spectral clustering. Our framework leads to a mixture of Markov chains defined on every graph. The experimental evaluation on real-world web classification demonstrates promising results that validate our method.
Learning with hypergraphs: Clustering, classification, and embedding
 Advances in Neural Information Processing Systems (NIPS) 19
, 2006
Cited by 73 (2 self)
We usually endow the investigated objects with pairwise relationships, which can be illustrated as graphs. In many real-world problems, however, relationships among the objects of our interest are more complex than pairwise. Naively squeezing the complex relationships into pairwise ones will inevitably lead to a loss of information that can be expected to be valuable for our learning tasks. Therefore we consider using hypergraphs instead to completely represent complex relationships among the objects of our interest, and thus the problem of learning with hypergraphs arises. Our main contribution in this paper is to generalize the powerful methodology of spectral clustering, which originally operates on undirected graphs, to hypergraphs, and further develop algorithms for hypergraph embedding and transductive classification on the basis of the spectral hypergraph clustering approach. Our experiments on a number of benchmarks showed the advantages of hypergraphs over usual graphs.
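As a rough illustration of what spectral methods on hypergraphs operate on, the sketch below builds a normalized hypergraph Laplacian of the form I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2). Here H is the vertex-by-hyperedge incidence matrix, W holds the hyperedge weights, and Dv and De hold the vertex and hyperedge degrees. The abstract does not state this formula, so treat it as an assumption of the sketch. The function name and toy data are hypothetical.

```python
import math


def hypergraph_laplacian(incidence, edge_weights):
    """Normalized hypergraph Laplacian from a 0/1 incidence matrix.

    incidence[v][e] is 1 if vertex v belongs to hyperedge e, else 0.
    Assumption: every vertex lies in at least one hyperedge with positive
    weight, and every hyperedge contains at least one vertex.
    """
    n, m = len(incidence), len(edge_weights)
    # Vertex degree: total weight of the hyperedges containing the vertex.
    vertex_degree = [sum(edge_weights[e] * incidence[v][e] for e in range(m))
                     for v in range(n)]
    # Hyperedge degree: number of vertices in the hyperedge.
    edge_degree = [sum(incidence[v][e] for v in range(n)) for e in range(m)]
    laplacian = []
    for u in range(n):
        row = []
        for v in range(n):
            theta = sum(edge_weights[e] * incidence[u][e] * incidence[v][e]
                        / edge_degree[e] for e in range(m))
            theta /= math.sqrt(vertex_degree[u] * vertex_degree[v])
            row.append((1.0 if u == v else 0.0) - theta)
        laplacian.append(row)
    return laplacian


# Hypothetical hypergraph: hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}.
incidence = [[1, 0],
             [1, 0],
             [1, 1],
             [0, 1]]
laplacian = hypergraph_laplacian(incidence, edge_weights=[1.0, 2.0])
for row in laplacian:
    print([round(x, 3) for x in row])
```

When every hyperedge contains exactly two vertices, this matrix is one half of the usual normalized graph Laplacian. Its eigenvectors with the smallest eigenvalues could then be used for clustering or embedding in the same way as in ordinary spectral clustering.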
Combining Content and Link for Classification using Matrix Factorization
, 2007
Cited by 67 (8 self)
The world wide web contains rich textual contents that are interconnected via complex hyperlinks. This huge database violates the assumption held by most conventional statistical methods that each web page is an independent and identically distributed sample. It is thus difficult to apply traditional mining or learning methods for solving web mining problems, e.g., web page classification, by exploiting both the content and the link structure. Research in this direction has recently received considerable attention but is still in an early stage. Though a few methods exploit both the link structure and the content information, some of them combine only the authority information with the content information, and the others first decompose the link structure into hub and authority features, then apply them as additional document features. This paper aims to design an algorithm, practically attractive for its great simplicity, that exploits both the content and linkage information by carrying out a joint factorization on both the linkage adjacency matrix and the document-term matrix, and derives a new representation for web pages in a low-dimensional factor space, without explicitly separating them as content, hub or authority factors. Further analysis can be performed based on the compact representation of web pages. In the experiments, the proposed method is compared with state-of-the-art methods and demonstrates excellent accuracy in hypertext classification on the WebKB and Cora benchmarks.
Community Evolution in Dynamic Multi-Mode Networks
 KDD'08
, 2008
Cited by 64 (15 self)
A multi-mode network typically consists of multiple heterogeneous social actors among which various types of interactions could occur. Identifying communities in a multi-mode network can help understand the structural properties of the network, address the data shortage and unbalanced problems, and assist tasks like targeted marketing and finding influential actors within or between groups. In general, a network and the membership of groups often evolve gradually. In a dynamic multi-mode network, both actor membership and interactions can evolve, which poses a challenging problem of identifying community evolution. In this work, we try to address this issue by employing the temporal information to analyze a multi-mode network. A spectral framework and its scalability issue are carefully studied. Experiments on both synthetic data and real-world large-scale networks demonstrate the efficacy of our algorithm and suggest its generality in solving problems with complex relationships.
Ranking on graph data
 In ICML
, 2006
Cited by 41 (2 self)
In ranking, one is given examples of order relationships among objects, and the goal is to learn from these examples a real-valued ranking function that induces a ranking or ordering over the object space. We consider the problem of learning such a ranking function when the data is represented as a graph, in which vertices correspond to objects and edges encode similarities between objects. Building on recent developments in regularization theory for graphs and corresponding Laplacian-based methods for classification, we develop an algorithmic framework for learning ranking functions on graph data. We provide generalization guarantees for our algorithms via recent results based on the notion of algorithmic stability, and give experimental evidence of the potential benefits of our framework.
Learning multiple graphs for document recommendations
 In WWW
, 2008
Cited by 40 (5 self)
The Web offers rich relational data with different semantics. In this paper, we address the problem of document recommendation in a digital library, where the documents in question are networked by citations and are associated with other entities by various relations. Due to the sparsity of a single graph and noise in graph construction, we propose a new method for combining multiple graphs to measure document similarities, where different factorization strategies are used based on the nature of different graphs. In particular, the new method seeks a single low-dimensional embedding of documents that captures their relative similarities in a latent space. Based on the obtained embedding, a new recommendation framework is developed using semi-supervised learning on graphs. In addition, we address the scalability issue and propose an incremental algorithm. The new incremental method significantly improves the efficiency by calculating the embedding for new incoming documents only. The new batch and incremental methods are evaluated on two real-world datasets prepared from CiteSeer. Experiments demonstrate significant quality improvement for our batch method and significant efficiency improvement with tolerable quality loss for our incremental method.