Results 1 -
7 of
7
Active Learning on Trees and Graphs
"... We investigate the problem of active learning on a given tree whose nodes are assigned binary labels in an adversarial way. Inspired by recent results by Guillory and Bilmes, we characterize (up to constant factors) the optimal placement of queries so to minimize the mistakes made on the non-queried ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
We investigate the problem of active learning on a given tree whose nodes are assigned binary labels in an adversarial way. Inspired by recent results by Guillory and Bilmes, we characterize (up to constant factors) the optimal placement of queries so to minimize the mistakes made on the non-queried nodes. Our query selection algorithm is extremely efficient, and the optimal number of mistakes on the non-queried nodes is achieved by a simple and efficient mincut classifier. Through asimplemodificationofthequeryselectionalgorithmwealsoshowoptimality(uptoconstant factors) with respect to the trade-off between number of queries and number of mistakes on nonqueried nodes. By using spanning trees, our algorithms can beefficientlyappliedtogeneralgraphs, although the problem of finding optimal and efficient active learning algorithms for general graphs remains open. Towards this end, we provide a lower bound on the numberofmistakesmade on arbitrary graphs by any active learning algorithm using a number of queries which is up to a constant fraction of the graph size. 1
Predicting the Labelling of a Graph via Minimum p-Seminorm Interpolation
"... We study the problem of predicting the labelling of a graph. The graph is given and a trial sequence of (vertex,label) pairs is then incrementally revealed to the learner. On each trial a vertex is queried and the learner predicts a boolean label. The true label is then returned. The learner’s goal ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
We study the problem of predicting the labelling of a graph. The graph is given and a trial sequence of (vertex,label) pairs is then incrementally revealed to the learner. On each trial a vertex is queried and the learner predicts a boolean label. The true label is then returned. The learner’s goal is to minimise mistaken predictions. We propose minimum p-seminorm interpolation to solve this problem. To this end we give a p-seminorm on the space of graph labellings. Thus on every trial we predict using the labelling which minimises the p-seminorm and is also consistent with the revealed (vertex, label) pairs. When p = 2 this is the harmonic energy minimisation procedure of [22], also called (Laplacian) interpolated regularisation in [1]. In the limit as p → 1 this is equivalent to predicting with a label-consistent mincut. We give mistake bounds relative to a label-consistent mincut and a resistive cover of the graph. We say an edge is cut with respect to a labelling if the connected vertices have disagreeing labels. We find that minimising the p-seminorm with p = 1 + ɛ where ɛ → 0 as the graph diameter D → ∞ gives a bound of O(Φ 2 log D) versus a bound of O(ΦD) when p = 2 where Φ is the number of cut edges. 1
Random Spanning Trees and the Prediction of Weighted Graphs
"... We show that the mistake bound for predicting the nodes of an arbitrary weighted graph is characterized (up to logarithmic factors) by the cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph nodes. In deriving our characterization, ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We show that the mistake bound for predicting the nodes of an arbitrary weighted graph is characterized (up to logarithmic factors) by the cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph nodes. In deriving our characterization, we obtain a simple randomized algorithm achieving the optimal mistake bound on any weighted graph. Our algorithm draws a random spanning tree of the original graph and then predicts the nodes of this tree in constant amortized time and linear space. Experiments on real-world datasets show that our method compares well to both global (Perceptron) and local (label-propagation) methods, while being much faster. 1.
Getting lost in space: Large sample analysis of the commute distance
"... The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an exp ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. As an alternative we introduce the amplified commute distance that corrects for the undesired large sample effects. 1
Fast and Optimal Algorithms for Weighted Graph Prediction
"... We show that the mistake bound for predicting the nodes of an arbitrary weighted graph is characterized (up to logarithmic factors) by the weighted cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph nodes. In deriving our character ..."
Abstract
- Add to MetaCart
We show that the mistake bound for predicting the nodes of an arbitrary weighted graph is characterized (up to logarithmic factors) by the weighted cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph nodes. In deriving our characterization, we obtain a simple randomized algorithm achieving the optimal mistake bound on any graph. Our algorithm draws a random spanning tree of the original graph and then predicts the nodes of this tree in constant amortized time and linear space. Preliminary experiments on real-world datasets show that our method outperforms both global (Perceptron) and local (majority voting) methods. 1
Active Semi-Supervised Learning using Submodular Functions
"... We consider active, semi-supervised learning in an offline transductive setting. We show that a previously proposed error bound for active learning on undirected weighted graphs can be generalized by replacing graph cut with an arbitrary symmetric submodular function. Arbitrary non-symmetric submodu ..."
Abstract
- Add to MetaCart
We consider active, semi-supervised learning in an offline transductive setting. We show that a previously proposed error bound for active learning on undirected weighted graphs can be generalized by replacing graph cut with an arbitrary symmetric submodular function. Arbitrary non-symmetric submodular functions can be used via symmetrization. Different choices of submodular functions give different versions of the error bound that are appropriate for different kinds of problems. Moreover, the bound is deterministic and holds for adversarially chosen labels. We show exactly minimizing this error bound is NP-complete. However, we also introduce for any submodular function an associated active semi-supervised learning method that approximately minimizes the corresponding error bound. We show that the error bound is tight in the sense that there is no other bound of the same form which is better. Our theoretical results are supported by experiments on real data. 1
Abstract
"... Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how t ..."
Abstract
- Add to MetaCart
Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet. We fill this hole and introduce an efficient node predictor, SHAZOO, which is nearly optimal on any weighted tree. Moreover, we show that SHAZOO can be viewed as a common nontrivial generalization of both previous approaches for unweighted trees and weighted lines. Experiments on real-world datasets confirm that SHAZOO performs well in that it fully exploits the structure of the input tree, and gets very close to (and sometimes better than) less scalable energy minimization methods. 1

