Results 1 - 10 of 11
A local algorithm for finding well-connected clusters
- CoRR, 2013
"... Motivated by applications of large-scale graph clustering, we study random-walkbased local algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. In particular, we develop a method with better theoretical guarantee compared to all previous work, b ..."
Cited by 7 (2 self)
Motivated by applications of large-scale graph clustering, we study random-walk-based local algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. In particular, we develop a method with a better theoretical guarantee than all previous work, both in terms of the clustering accuracy and the conductance of the output set. We also prove that our analysis is tight, and perform empirical evaluation to support our theory on both synthetic and real data. More specifically, our method outperforms prior work when the cluster is well-connected. In fact, the better connected the cluster is internally, the more significant the improvement we obtain. Our results shed light on why, in practice, some random-walk-based algorithms perform better than their previous theory predicts, and help guide future research on local clustering.
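The abstract does not spell out the procedure, but the template it describes is well established: run a truncated random walk from a seed and take the best sweep cut by conductance. The following is a minimal sketch of that generic template, not this paper's exact algorithm; the networkx API, the laziness factor, and the truncation threshold eps are all assumptions.

```python
import networkx as nx

def local_cluster(G, seed, steps=20, eps=1e-4):
    """Generic random-walk local clustering sketch (not this paper's method)."""
    p = {seed: 1.0}
    for _ in range(steps):
        q = {}
        for u, mass in p.items():
            q[u] = q.get(u, 0.0) + 0.5 * mass          # lazy self-loop
            share = 0.5 * mass / G.degree(u)
            for v in G.neighbors(u):
                q[v] = q.get(v, 0.0) + share
        # truncate tiny entries so the work depends on the cluster, not the graph
        p = {u: m for u, m in q.items() if m >= eps * G.degree(u)}

    # sweep cut: scan prefixes of vertices ordered by degree-normalized probability
    best_set, best_cond, S = None, float("inf"), set()
    for u in sorted(p, key=lambda u: p[u] / G.degree(u), reverse=True):
        S.add(u)
        if len(S) == G.number_of_nodes():
            break                                      # conductance undefined for S = V
        cond = nx.conductance(G, S)
        if cond < best_cond:
            best_cond, best_set = cond, set(S)
    return best_set, best_cond
```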
Coinciding walk kernels: Parallel absorbing random walks for learning with graphs and few labels
- In Asian Conference on Machine Learning, 2013
"... Exploiting autocorrelation for node-label prediction in networked data has led to great suc-cess. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cw ..."
Cited by 2 (1 self)
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arranged labels in their local neighbourhoods are likely to have the same label – for learning problems on partially labeled graphs. Inspired by the success of random-walk-based schemes for the construction of graph kernels, cwk is defined in terms of the probability that the labels encountered during parallel random walks coincide. In addition to its intuitive probabilistic interpretation, coinciding walk kernels outperform existing kernel- and walk-based methods on the task of node-label prediction in sparsely labeled graphs with high label-structure similarity. We also show that computing cwks is faster than many state-of-the-art kernels on graphs. We evaluate cwks on several real-world networks, including cocitation and coauthor graphs, as well as a graph of interlinked populated places extracted from the dbpedia knowledge base.
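Reading the definition literally – the probability that labels encountered during parallel absorbing random walks coincide – one plausible way to sketch the kernel is to diffuse per-node label distributions with labeled nodes clamped as absorbing states, accumulating per-step coincidence probabilities. This is a hedged reconstruction from the abstract alone; the array layout and the uniform prior on unlabeled nodes are assumptions.

```python
import numpy as np

def coinciding_walk_kernel(A, labels, num_steps=10):
    """Hedged sketch of a coinciding-walk-style kernel.

    A      : (n, n) float adjacency matrix, no isolated nodes
    labels : length-n int array, class id for labeled nodes, -1 if unlabeled
    Returns an (n, n) matrix whose (u, v) entry averages, over steps, the
    probability that labels observed by walks started at u and v coincide.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels[labels >= 0])
    n, k = len(labels), len(classes)
    P = A / A.sum(axis=1, keepdims=True)       # row-stochastic transition matrix

    # label distributions: one-hot at labeled nodes, uniform elsewhere (assumption)
    L = np.full((n, k), 1.0 / k)
    for i, c in enumerate(classes):
        L[labels == c] = np.eye(k)[i]
    labeled = labels >= 0

    K, Lt = np.zeros((n, n)), L.copy()
    for _ in range(num_steps + 1):
        K += Lt @ Lt.T                         # coincidence probability at this step
        Lt = P @ Lt
        Lt[labeled] = L[labeled]               # labeled nodes act as absorbing states
    return K / (num_steps + 1)
```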
Analyzing the Harmonic Structure in Graph-Based Learning
"... We find that various well-known graph-based models exhibit a common important harmonic structure in its target function – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding of such structure and analysis of the loss defined over such st ..."
Cited by 2 (1 self)
We find that various well-known graph-based models exhibit a common important harmonic structure in their target functions – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding this structure and analyzing the loss defined over it help reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss and the cut cost. We use this to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, the pseudo-inverse of the graph Laplacian, and eigenvectors of the Laplacian matrices. Our analysis provides new insights into several open questions related to these models, and provides theoretical justifications and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.
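The harmonic structure the abstract describes can be made concrete in a few lines: define the per-vertex harmonic loss as the gap between a vertex's value and the weighted average of its neighbors' values, i.e. (Lf)_i for the unnormalized Laplacian L. The scaling here is an assumption and the paper's exact definition may differ; this is only an illustrative sketch.

```python
import numpy as np

def harmonic_loss(W, f):
    """Per-vertex gap between f and the weighted neighbor average:
    loss_i = d_i * f_i - sum_j w_ij * f_j = (L f)_i, which is zero exactly
    where f is harmonic.
    """
    L = np.diag(W.sum(axis=1)) - W             # unnormalized graph Laplacian
    return L @ f

# Example: on the path 0-1-2-3 with f clamped at the endpoints, the harmonic
# predictor is the linear interpolation, so the interior loss vanishes.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
f = np.array([0.0, 1/3, 2/3, 1.0])
print(harmonic_loss(W, f))                     # ~0 at interior vertices 1 and 2
```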
Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically
"... As the central notion in semi-supervised learn-ing, smoothness is often realized on a graph rep-resentation of the data. In this paper, we study two complementary dimensions of smoothness: its pointwise nature and probabilistic modeling. While no existing graph-based work exploits them in conjunctio ..."
Cited by 1 (1 self)
As the central notion in semi-supervised learning, smoothness is often realized on a graph representation of the data. In this paper, we study two complementary dimensions of smoothness: its pointwise nature and probabilistic modeling. While no existing graph-based work exploits them in conjunction, we encompass both in a novel framework of Probabilistic Graph-based Pointwise Smoothness (PGP), building upon two foundational models of data closeness and label coupling. This new form of smoothness axiomatizes a set of probability constraints, which ultimately enables class prediction. Theoretically, we provide an error and robustness analysis of PGP. Empirically, we conduct extensive experiments to show the advantages of PGP.
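The abstract keeps PGP's probability constraints at a high level, so no faithful implementation can be reconstructed from it. Purely as an illustration of one classic probabilistic realization of graph smoothness – absorbing-random-walk label propagation, which is not PGP itself – here is a minimal sketch:

```python
import numpy as np

def absorbed_walk_predict(W, labels):
    """Illustration only (not PGP): class probabilities of unlabeled nodes
    as absorption probabilities of a random walk stopped at labeled nodes.
    """
    labels = np.asarray(labels)
    unl = np.where(labels < 0)[0]              # unlabeled node indices
    lab = np.where(labels >= 0)[0]             # labeled node indices
    P = W / W.sum(axis=1, keepdims=True)       # row-stochastic transitions
    # absorption probabilities: B = (I - P_uu)^{-1} P_ul
    B = np.linalg.solve(np.eye(len(unl)) - P[np.ix_(unl, unl)],
                        P[np.ix_(unl, lab)])
    classes = np.unique(labels[lab])
    Y = (labels[lab][:, None] == classes[None, :]).astype(float)  # one-hot
    probs = B @ Y                              # class distribution per unlabeled node
    return unl, classes[probs.argmax(axis=1)]
```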
Coinciding Walk Kernels
"... Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk ..."
Cited by 1 (1 self)
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arranged labels in their local neighbourhoods are likely to have the same label – for learning problems on partially labeled graphs. Inspired by the success of random-walk-based schemes for the construction of graph kernels, cwk is defined in terms of the probability that the labels encountered during parallel random walks coincide. In addition to its intuitive probabilistic interpretation, coinciding walk kernels outperform state-of-the-art kernel- and walk-based methods on the task of node-label prediction in sparsely labeled graphs. We also show that computing cwks is faster than many state-of-the-art kernels on graphs. We evaluate cwks on several real-world networks, including cocitation and coauthor graphs, as well as a network of interlinked populated places extracted from the dbpedia knowledge base.
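Since a cwk yields an ordinary kernel matrix over nodes, one natural way to use it for node-label prediction is as a precomputed kernel in an SVM. The snippet below is a usage sketch with placeholder data; K would come from a cwk-style construction, and C is an untuned placeholder hyperparameter.

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-ins: K would come from a cwk-style construction over graph nodes.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
K = X @ X.T                                    # any PSD matrix works as a placeholder
labels = np.array([0] * 5 + [1] * 5 + [-1] * 10)   # -1 marks unlabeled nodes

train, test = np.where(labels >= 0)[0], np.where(labels < 0)[0]
clf = SVC(kernel="precomputed", C=1.0)         # C is an untuned placeholder
clf.fit(K[np.ix_(train, train)], labels[train])
pred = clf.predict(K[np.ix_(test, train)])     # rows: test nodes, cols: train nodes
```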
Σ-optimality for active learning on Gaussian random fields
- In Advances in Neural Information Processing Systems 26, 2013
"... A common classifier for unlabeled nodes on undirected graphs uses label propaga-tion from the labeled nodes, equivalent to the harmonic predictor on Gaussian ran-dom fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) los ..."
Cited by 1 (1 self)
A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property showing that greedy selection achieves a (1 − 1/e) approximation of the global optimum. However, L2 loss may not characterise the true nature of 0/1 loss in classification problems and thus may not be the best choice for active learning. We consider a new criterion we call Σ-optimality, which queries the node that minimizes the sum of the elements in the predictive covariance. Σ-optimality directly optimizes the risk of the surveying problem, which is to determine the proportion of nodes belonging to one class. In this paper we extend submodularity guarantees from V-optimality to Σ-optimality using properties specific to GRFs. We further show that GRFs satisfy the suppressor-free condition in addition to the conditional independence inherited from Markov random fields. We test Σ-optimality on real-world graphs with both synthetic and real data and show that it outperforms V-optimality and other related methods on classification.
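For a GRF whose predictive covariance on the unlabeled nodes is C = inv(L_uu), a noise-free observation at node v performs the rank-one update C ← C − C[:,v]C[v,:]/C[v,v], which shrinks the sum of all covariance entries by (1ᵀC[:,v])² / C[v,v]. A greedy Σ-optimality step therefore maximizes that score. The sketch below assumes this noise-free setting and may differ from the paper's exact regularization.

```python
import numpy as np

def sigma_optimal_query(L, labeled):
    """Greedy Σ-optimality sketch under a noise-free GRF assumption.

    L       : graph Laplacian, regularized if needed so L_uu is invertible
    labeled : boolean mask of already-labeled nodes
    Returns the unlabeled node whose observation most reduces the sum of
    entries of the predictive covariance C = inv(L_uu).
    """
    unl = np.where(~np.asarray(labeled))[0]
    C = np.linalg.inv(L[np.ix_(unl, unl)])     # predictive covariance on unlabeled nodes
    # querying v shrinks 1'C1 by (1'C[:, v])^2 / C[v, v]
    scores = C.sum(axis=0) ** 2 / np.diag(C)
    return unl[np.argmax(scores)]
```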
Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch
"... Graph-based Semi-supervised learning (SSL) algorithms have been successfully used in a large number of applications. These methods classify initially unlabeled nodes by propa-gating label information over the structure of graph starting from seed nodes. Graph-based SSL algorithms usually scale linea ..."
Cited by 1 (0 self)
Graph-based semi-supervised learning (SSL) algorithms have been successfully used in a large number of applications. These methods classify initially unlabeled nodes by propagating label information over the structure of the graph, starting from seed nodes. Graph-based SSL algorithms usually scale linearly with the number of distinct labels (m), and require O(m) space on each node. Unfortunately, there exist many applications of practical significance with very large m over large graphs, demanding better space and time complexity. In this paper, we propose MAD-Sketch, a novel graph-based SSL algorithm which compactly stores label distributions on each node using Count-min Sketch, a randomized data structure. We present theoretical analysis showing that under mild conditions, MAD-Sketch can reduce space complexity at each node from O(m) to O(log m), and achieve similar savings in time complexity as well. We support our analysis through experiments on multiple real-world datasets. We observe that MAD-Sketch achieves performance similar to existing state-of-the-art graph-based SSL algorithms, while requiring a smaller memory footprint and at the same time achieving up to a 10x speedup. We find that MAD-Sketch is able to scale to datasets with one million labels, which is beyond the scope of existing graph-based SSL algorithms.
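A count-min sketch stores a nonnegative score vector in a depth × width table of hashed counters; queries return the minimum over rows, an upper bound on the true score that is tight with high probability. The sketch below is illustrative only – the hash construction and the width/depth defaults are simplifications, not MAD-Sketch's analyzed settings.

```python
import numpy as np

class CountMinSketch:
    """Count-min sketch for one node's label scores (illustrative settings)."""

    def __init__(self, width=272, depth=8, seed=0):
        rng = np.random.default_rng(seed)
        self.width = width
        self.salts = rng.integers(1, 2**31 - 1, size=depth)  # one hash per row
        self.table = np.zeros((depth, width))

    def _cols(self, label):
        # simplified per-row hashing; the formal analysis assumes pairwise
        # independent hashes, which this XOR-with-salt scheme only approximates
        return [(hash(label) ^ int(s)) % self.width for s in self.salts]

    def add(self, label, score):
        for row, col in enumerate(self._cols(label)):
            self.table[row, col] += score

    def query(self, label):
        # the minimum over rows upper-bounds the true score, and is close
        # to it with high probability when width and depth are large enough
        return min(self.table[row, col]
                   for row, col in enumerate(self._cols(label)))
```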
Designing Fast Absorbing Markov Chains
"... Markov Chains are a fundamental tool for the analysis of real world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability dis-tribution, we consider the problem of designing an absorb-ing Markov Chain that minimizes the time required to reach a ..."
Markov chains are a fundamental tool for the analysis of real-world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability distribution, we consider the problem of designing an absorbing Markov chain that minimizes the time required to reach a sink node, by selecting transition probabilities subject to some natural regularity constraints. By exploiting the Markovian structure, we obtain closed-form expressions for the objective function as well as its gradient, which can thus be evaluated efficiently without any simulation of the underlying process and fed to a gradient-based optimization package. For the special case of designing reversible Markov chains, we show that the global optimum can be computed efficiently by exploiting convexity. We demonstrate how our method can be used for the evaluation and design of local search methods tailored for certain domains.
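The closed-form objective the abstract alludes to follows from the standard absorbing-chain identity: with Q the transient-to-transient block of the transition matrix, the expected number of steps to absorption from each transient state is t = (I − Q)⁻¹1. The sketch below only evaluates this quantity for a fixed chain; the paper's contribution is optimizing it over the transition probabilities.

```python
import numpy as np

def expected_absorption_times(P, sinks):
    """Expected steps to reach a sink from each transient state.

    P     : (n, n) row-stochastic transition matrix
    sinks : length-n boolean mask of absorbing states
    Uses the standard identity t = (I - Q)^{-1} 1, where Q is the
    transient-to-transient block of P.
    """
    trans = np.where(~np.asarray(sinks))[0]
    Q = P[np.ix_(trans, trans)]
    t = np.linalg.solve(np.eye(len(trans)) - Q, np.ones(len(trans)))
    return trans, t
```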
Supplementary Material for “Analyzing the Harmonic Structure in Graph-Based Learning”
"... We find that various well-known graph-based models exhibit a common important harmonic structure in its target function – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding of such structure and analysis of the loss defined over such st ..."
We find that various well-known graph-based models exhibit a common important harmonic structure in their target functions – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding this structure and analyzing the loss defined over it help reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss and the cut cost. We use this to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, the pseudo-inverse of the graph Laplacian, and eigenvectors of the Laplacian matrices. Our analysis provides new insights into several open questions related to these models, and provides theoretical justifications and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.
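Stated schematically in our own notation (the excerpt fixes neither symbols nor constants), the bound announced in this abstract has the following shape, where Δ_S(f) is the variation of the target function f across the cut (S, S̄), H(f) its harmonic loss, c(S) the cut cost, and α ≤ β suitable constants:

```latex
\[
  \alpha \, \frac{\mathcal{H}(f)}{c(S)}
  \;\le\; \Delta_S(f) \;\le\;
  \beta \, \frac{\mathcal{H}(f)}{c(S)},
  \qquad
  c(S) = \sum_{i \in S,\; j \notin S} w_{ij} .
\]
```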
Propagation Kernels
- 2014
"... We introduce propagation kernels, a general graph-kernel framework for eXciently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as ra ..."
We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.
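A minimal sketch of the idea: diffuse node label distributions in each graph, bin the distributions at every propagation step with a shared locality-sensitive hash, and count cross-graph bin matches. The hyperplane hash, the bin width w, and the iteration count here are assumptions standing in for the paper's actual binning scheme.

```python
import numpy as np

def propagation_kernel(A1, L1, A2, L2, t_max=5, w=1e-3, seed=0):
    """Hedged sketch of a propagation-style graph kernel.

    A1, A2 : adjacency matrices of the two graphs
    L1, L2 : (n_i, k) node label distributions
    Diffuses the label distributions, bins them at each step with one shared
    random hyperplane hash, and counts cross-graph bin matches.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=L1.shape[1])           # shared random projection
    b = rng.uniform(0, w)
    bins = lambda L: np.floor((L @ u + b) / w).astype(int)

    P1 = A1 / A1.sum(axis=1, keepdims=True)
    P2 = A2 / A2.sum(axis=1, keepdims=True)

    value = 0.0
    for _ in range(t_max + 1):
        b1, b2 = bins(L1), bins(L2)
        # each pair of nodes (one per graph) sharing a bin contributes 1
        for c in set(b1) & set(b2):
            value += (b1 == c).sum() * (b2 == c).sum()
        L1, L2 = P1 @ L1, P2 @ L2              # one propagation step
    return value
```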