Results 1 - 10 of 11
A local algorithm for finding well-connected clusters
CoRR, 2013
"... Motivated by applications of largescale graph clustering, we study randomwalkbased local algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. In particular, we develop a method with better theoretical guarantee compared to all previous work, b ..."
Cited by 7 (2 self)
Abstract
Motivated by applications of large-scale graph clustering, we study random-walk-based local algorithms whose running times depend only on the size of the output cluster, rather than on the entire graph. In particular, we develop a method with better theoretical guarantees than all previous work, both in terms of the clustering accuracy and the conductance of the output set. We also prove that our analysis is tight, and perform an empirical evaluation on both synthetic and real data to support our theory. More specifically, our method outperforms prior work when the cluster is well-connected; in fact, the better connected the cluster is internally, the more significant the improvement we obtain. Our results shed light on why, in practice, some random-walk-based algorithms perform better than their previous theory suggests, and help guide future research on local clustering.
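The abstract does not spell out the algorithm itself, but the usual recipe behind random-walk-based local clustering is an approximate personalized PageRank computation started at a seed vertex, followed by a conductance sweep over the vertices it touches. The sketch below illustrates that generic recipe, not this paper's specific method; the function names, the parameters alpha and eps, and the use of networkx on a connected simple graph are illustrative assumptions.

```python
import networkx as nx

def approximate_ppr(G, seed, alpha=0.15, eps=1e-4):
    """Push-style approximation of personalized PageRank from `seed`.
    Only vertices near the seed are ever touched, which is why the cost
    depends on the size of the output cluster, not on the whole graph."""
    p, r, queue = {}, {seed: 1.0}, [seed]
    while queue:
        u = queue.pop()
        ru = r.get(u, 0.0)
        if ru < eps * G.degree(u):
            continue
        p[u] = p.get(u, 0.0) + alpha * ru        # keep a fraction of the mass at u
        share = (1.0 - alpha) * ru / G.degree(u)
        r[u] = 0.0
        for v in G.neighbors(u):                 # push the rest to the neighbours
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * G.degree(v):
                queue.append(v)
    return p

def sweep_cut(G, p):
    """Return the prefix of the degree-normalized PageRank ordering with the
    smallest conductance (simple graph without self-loops assumed)."""
    order = sorted(p, key=lambda u: p[u] / G.degree(u), reverse=True)
    vol_G = 2 * G.number_of_edges()
    S, vol, cut = set(), 0, 0
    best_set, best_phi = set(), float("inf")
    for u in order:
        inside = sum(1 for v in G.neighbors(u) if v in S)
        S.add(u)
        vol += G.degree(u)
        cut += G.degree(u) - 2 * inside          # new boundary edges minus absorbed ones
        phi = cut / max(min(vol, vol_G - vol), 1)
        if phi < best_phi:
            best_phi, best_set = phi, set(S)
    return best_set, best_phi
```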
Coinciding walk kernels: Parallel absorbing random walks for learning with graphs and few labels
In Asian Conference on Machine Learning, 2013
"... Exploiting autocorrelation for nodelabel prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in presentday tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cw ..."
Cited by 2 (1 self)
Abstract
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (CWK), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arranged labels in their local neighbourhoods are likely to have the same label – for learning problems on partially labeled graphs. Inspired by the success of random-walk-based schemes for the construction of graph kernels, CWK is defined in terms of the probability that the labels encountered during parallel random walks coincide. In addition to its intuitive probabilistic interpretation, coinciding walk kernels outperform existing kernel- and walk-based methods on the task of node-label prediction in sparsely labeled graphs with high label-structure similarity. We also show that computing CWKs is faster than many state-of-the-art kernels on graphs. We evaluate CWKs on several real-world networks, including co-citation and co-author graphs, as well as a graph of interlinked populated places extracted from the DBpedia knowledge base.
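One way to read the definition above is that labeled vertices act as absorbing states, every vertex carries a label distribution (one-hot if labeled, uniform otherwise), and the kernel entry for two vertices accumulates, over walk lengths, the probability that the labels observed by their parallel walks coincide. The dense-matrix sketch below follows that reading; the initialization, the number of steps t_max, and the averaging are our assumptions rather than the authors' exact construction.

```python
import numpy as np

def coinciding_walk_kernel(A, labels, n_classes, t_max=10):
    """Sketch of a coinciding-walk-style kernel on one graph.

    A      : (n, n) symmetric adjacency matrix (no isolated vertices)
    labels : length-n integer array, class id for labeled vertices, -1 otherwise
    Returns an (n, n) kernel where K[u, v] averages, over walk lengths, the
    probability that walks started at u and v observe the same label."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)            # row-stochastic transition matrix
    absorbing = np.flatnonzero(labels >= 0)
    P[absorbing] = 0.0
    P[absorbing, absorbing] = 1.0                   # labeled vertices absorb the walk

    D = np.full((n, n_classes), 1.0 / n_classes)    # uniform belief on unlabeled vertices
    D[absorbing] = 0.0
    D[absorbing, labels[absorbing]] = 1.0           # one-hot belief on labeled vertices

    K = np.zeros((n, n))
    for _ in range(t_max + 1):
        K += D @ D.T        # P[the two walks report the same label at this step]
        D = P @ D           # advance every walk by one step
    return K / (t_max + 1)
```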
Analyzing the Harmonic Structure in Graph-Based Learning
"... We find that various wellknown graphbased models exhibit a common important harmonic structure in its target function – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding of such structure and analysis of the loss defined over such st ..."
Cited by 2 (1 self)
Abstract
We find that various well-known graph-based models exhibit a common and important harmonic structure in their target functions – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding this structure and analyzing the loss defined over it help reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss to the cut cost. We use this to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, the pseudo-inverse of the graph Laplacian, and eigenvectors of Laplacian matrices. Our analysis sheds new light on several open questions related to these models, and provides theoretical justifications and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.
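The harmonic property described here is easiest to see in the classic harmonic (absorbing-random-walk) predictor, where the unlabeled block of the Laplacian system forces every unlabeled vertex to equal the weighted average of its neighbours. The sketch below is a generic illustration of that property, with (L f)_i used as one natural per-vertex notion of harmonic loss; it is not the paper's exact analytical tool, and it assumes every connected component contains at least one labeled vertex.

```python
import numpy as np

def harmonic_predictor(W, labeled, y_labeled):
    """Harmonic solution on a weighted graph: each unlabeled vertex ends up
    at exactly the weighted average of its neighbours' values.

    W : (n, n) symmetric weight matrix; labeled : indices with fixed values y_labeled."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                      # combinatorial graph Laplacian
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    f = np.zeros(n)
    f[labeled] = y_labeled
    # Solve L_UU f_U = -L_UL y_L, the absorbing-random-walk / harmonic system.
    L_UU = L[np.ix_(unlabeled, unlabeled)]
    L_UL = L[np.ix_(unlabeled, labeled)]
    f[unlabeled] = np.linalg.solve(L_UU, -L_UL @ f[labeled])
    return f

def harmonic_loss(W, f):
    """Per-vertex deviation from the weighted neighbour average: (L f)_i.
    It is exactly zero at every unlabeled vertex of the harmonic predictor."""
    return (np.diag(W.sum(axis=1)) - W) @ f
```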
Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically
"... As the central notion in semisupervised learning, smoothness is often realized on a graph representation of the data. In this paper, we study two complementary dimensions of smoothness: its pointwise nature and probabilistic modeling. While no existing graphbased work exploits them in conjunctio ..."
Cited by 1 (1 self)
Abstract
As the central notion in semi-supervised learning, smoothness is often realized on a graph representation of the data. In this paper, we study two complementary dimensions of smoothness: its pointwise nature and probabilistic modeling. While no existing graph-based work exploits them in conjunction, we encompass both in a novel framework of Probabilistic Graph-based Pointwise Smoothness (PGP), building upon two foundational models of data closeness and label coupling. This new form of smoothness axiomatizes a set of probability constraints, which ultimately enables class prediction. Theoretically, we provide an error and robustness analysis of PGP. Empirically, we conduct extensive experiments to show the advantages of PGP.
Coinciding Walk Kernels
"... Exploiting autocorrelation for nodelabel prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in presentday tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk ..."
Cited by 1 (1 self)
Abstract
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (CWK), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arranged labels in their local neighbourhoods are likely to have the same label – for learning problems on partially labeled graphs. Inspired by the success of random-walk-based schemes for the construction of graph kernels, CWK is defined in terms of the probability that the labels encountered during parallel random walks coincide. In addition to its intuitive probabilistic interpretation, coinciding walk kernels outperform state-of-the-art kernel- and walk-based methods on the task of node-label prediction in sparsely labeled graphs. We also show that computing CWKs is faster than many state-of-the-art kernels on graphs. We evaluate CWKs on several real-world networks, including co-citation and co-author graphs, as well as a network of interlinked populated places extracted from the DBpedia knowledge base.
Σ-optimality for active learning on Gaussian random fields
In Advances in Neural Information Processing Systems 26, 2013
"... A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used Voptimality criterion queries nodes that reduce the L2 (regression) los ..."
Cited by 1 (1 self)
Abstract
A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property, so greedy selection achieves a (1 − 1/e) factor of the globally optimal reduction. However, L2 loss may not characterise the true nature of 0/1 loss in classification problems and thus may not be the best choice for active learning. We consider a new criterion, which we call Σ-optimality, that queries the node minimizing the sum of the elements in the predictive covariance. Σ-optimality directly optimizes the risk of the surveying problem, which is to determine the proportion of nodes belonging to one class. In this paper we extend submodularity guarantees from V-optimality to Σ-optimality using properties specific to GRFs. We further show that GRFs satisfy the suppressor-free condition in addition to the conditional independence inherited from Markov random fields. We test Σ-optimality on real-world graphs with both synthetic and real data and show that it outperforms V-optimality and other related methods on classification.
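On a GRF the predictive covariance of the unlabeled vertices is the inverse of the corresponding Laplacian submatrix, and conditioning on a queried vertex v removes a rank-one term, so greedily minimizing the sum of covariance entries amounts to picking the vertex with the largest (column sum)² / variance. The sketch below implements that greedy rule naively, re-inverting at every step; the function name and the regularization caveat are our assumptions, not the paper's code.

```python
import numpy as np

def greedy_sigma_optimal_queries(L, labeled, budget):
    """Greedy Σ-optimality query selection on a Gaussian random field.

    L       : (n, n) graph Laplacian (add a small ridge if no vertex is labeled yet)
    labeled : indices of already-labeled vertices
    budget  : number of queries to pick"""
    n = L.shape[0]
    chosen = list(labeled)
    queries = []
    for _ in range(budget):
        unlabeled = np.setdiff1d(np.arange(n), chosen)
        C = np.linalg.inv(L[np.ix_(unlabeled, unlabeled)])   # predictive covariance
        col_sums = C.sum(axis=0)
        # Querying v shrinks 1ᵀC1 by (1ᵀ C[:, v])² / C[v, v]; take the largest drop.
        gains = col_sums ** 2 / np.diag(C)
        best = unlabeled[int(np.argmax(gains))]
        queries.append(best)
        chosen.append(best)
    return queries
```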
Scaling Graph-based Semi-Supervised Learning to Large Number of Labels Using Count-Min Sketch
"... Graphbased Semisupervised learning (SSL) algorithms have been successfully used in a large number of applications. These methods classify initially unlabeled nodes by propagating label information over the structure of graph starting from seed nodes. Graphbased SSL algorithms usually scale linea ..."
Cited by 1 (0 self)
Abstract
Graph-based semi-supervised learning (SSL) algorithms have been successfully used in a large number of applications. These methods classify initially unlabeled nodes by propagating label information over the structure of the graph, starting from seed nodes. Graph-based SSL algorithms usually scale linearly with the number of distinct labels (m), and require O(m) space on each node. Unfortunately, there exist many applications of practical significance with very large m over large graphs, demanding better space and time complexity. In this paper, we propose MAD-Sketch, a novel graph-based SSL algorithm which compactly stores the label distribution on each node using a Count-min Sketch, a randomized data structure. We present theoretical analysis showing that under mild conditions, MAD-Sketch can reduce the space complexity at each node from O(m) to O(log m), and achieve similar savings in time complexity as well. We support our analysis through experiments on multiple real-world datasets. We observe that MAD-Sketch achieves performance similar to existing state-of-the-art graph-based SSL algorithms, while requiring a smaller memory footprint and at the same time achieving up to 10x speedup. We find that MAD-Sketch is able to scale to datasets with one million labels, which is beyond the scope of existing graph-based SSL algorithms.
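The space saving comes from replacing each node's exact m-dimensional label vector with a count-min sketch: a small two-dimensional array of counters indexed by a few hash functions, whose estimates are upper bounds with error controlled by the width and depth. The class below is a generic count-min sketch for (label, weight) updates, not the MAD-Sketch code itself, and the width/depth defaults are illustrative.

```python
import numpy as np

_P = 2_147_483_647  # prime used by the (a*x + b) mod p hash family

class CountMinSketch:
    """Compact approximate map from label id -> accumulated weight, the
    per-node structure that stands in for an explicit length-m label vector."""

    def __init__(self, width=200, depth=3, seed=0):
        rng = np.random.default_rng(seed)
        self.width, self.depth = width, depth
        self._a = rng.integers(1, _P, size=depth)
        self._b = rng.integers(0, _P, size=depth)
        self.table = np.zeros((depth, width))

    def _col(self, row, label):
        return int((self._a[row] * label + self._b[row]) % _P) % self.width

    def add(self, label, weight):
        for r in range(self.depth):
            self.table[r, self._col(r, label)] += weight

    def estimate(self, label):
        # The row-wise minimum over-estimates the true weight by at most
        # (total weight) / width with high probability.
        return min(self.table[r, self._col(r, label)] for r in range(self.depth))
```

Because sketches that share hash functions merge by adding their tables entry-wise, propagating label mass from a node's neighbours stays independent of m, which is the effect the abstract describes.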
Designing Fast Absorbing Markov Chains
"... Markov Chains are a fundamental tool for the analysis of real world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability distribution, we consider the problem of designing an absorbing Markov Chain that minimizes the time required to reach a ..."
Abstract
Markov chains are a fundamental tool for the analysis of real-world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability distribution, we consider the problem of designing an absorbing Markov chain that minimizes the time required to reach a sink node, by selecting transition probabilities subject to some natural regularity constraints. By exploiting the Markovian structure, we obtain closed-form expressions for the objective function as well as its gradient, which can thus be evaluated efficiently, without any simulation of the underlying process, and fed to a gradient-based optimization package. For the special case of designing reversible Markov chains, we show that the global optimum can be computed efficiently by exploiting convexity. We demonstrate how our method can be used for the evaluation and design of local search methods tailored for certain domains.
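The closed form for the objective follows from the standard fundamental-matrix identity for absorbing chains: with Q the transition matrix restricted to the transient states, the expected hitting times of the sink set satisfy t = (I − Q)⁻¹ 1. The sketch below evaluates only that objective (not the gradient or the design step), and the function name and interface are our assumptions.

```python
import numpy as np

def expected_absorption_time(P, sinks, start_dist):
    """Expected number of steps before a chain with transition matrix P,
    started from `start_dist`, first hits one of the absorbing `sinks`.

    Uses t = (I - Q)^{-1} 1, where Q restricts P to the transient states;
    any starting mass already on a sink contributes zero steps."""
    n = P.shape[0]
    transient = np.setdiff1d(np.arange(n), sinks)
    Q = P[np.ix_(transient, transient)]
    t = np.linalg.solve(np.eye(len(transient)) - Q, np.ones(len(transient)))
    return float(start_dist[transient] @ t)
```

A design procedure of the kind described would treat the entries of P as decision variables and differentiate this expression with respect to them.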
Supplementary Material for "Analyzing the Harmonic Structure in Graph-Based Learning"
"... We find that various wellknown graphbased models exhibit a common important harmonic structure in its target function – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding of such structure and analysis of the loss defined over such st ..."
Abstract
We find that various well-known graph-based models exhibit a common and important harmonic structure in their target functions – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding this structure and analyzing the loss defined over it help reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss to the cut cost. We use this to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, the pseudo-inverse of the graph Laplacian, and eigenvectors of Laplacian matrices. Our analysis sheds new light on several open questions related to these models, and provides theoretical justifications and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.
Propagation Kernels
2014
"... We introduce propagation kernels, a general graphkernel framework for eXciently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage earlystage distributions from propagation schemes such as ra ..."
Abstract
We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.
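A common way to realize the scheme sketched in this abstract is to run a few steps of label-distribution propagation on each graph, hash every node's distribution into a discrete bin at each step, and accumulate dot products of the per-graph bin-count vectors. The code below is a minimal version of that idea under assumed choices (random-projection hashing, uniform initialization for unlabeled nodes, no isolated nodes); it is not the authors' released implementation.

```python
import numpy as np

def propagation_kernel(graphs, n_classes, t_max=3, bin_width=0.1, seed=0):
    """Sketch of a propagation kernel between labeled graphs.

    graphs : list of (A, labels) pairs, with A an (n, n) adjacency matrix and
             labels a length-n array of class ids (-1 marks unlabeled nodes).
    Returns the |graphs| x |graphs| kernel matrix."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_classes)              # shared random projection for hashing
    b = rng.uniform(0.0, bin_width)

    # Initial per-node label distributions: one-hot if labeled, uniform otherwise.
    dists = []
    for A, labels in graphs:
        D = np.full((A.shape[0], n_classes), 1.0 / n_classes)
        lab = labels >= 0
        D[lab] = np.eye(n_classes)[labels[lab]]
        dists.append(D)

    K = np.zeros((len(graphs), len(graphs)))
    for _ in range(t_max + 1):
        # Quantized random projection: nodes with similar distributions share a bin.
        bins = [np.floor((D @ w + b) / bin_width).astype(int) for D in dists]
        all_bins = np.unique(np.concatenate(bins))
        counts = np.array([[np.sum(bv == c) for c in all_bins] for bv in bins])
        K += counts @ counts.T                  # compare graphs via their bin counts
        # One propagation step: diffuse each node's distribution to its neighbours.
        dists = [(A / A.sum(axis=1, keepdims=True)) @ D
                 for (A, _), D in zip(graphs, dists)]
    return K
```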