Results 11–20 of 48
Machine teaching: an inverse problem to machine learning and an approach toward optimal education
 THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI “BLUE SKY” SENIOR MEMBER PRESENTATION TRACK)
, 2015
"... I draw the reader’s attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing ed ..."
Abstract

Cited by 5 (4 self)
I draw the reader’s attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.
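The inverse problem sketched in this abstract is commonly written as a constrained optimization; the following is a minimal standard formulation from the machine-teaching literature (the notation is assumed here, not quoted from the paper):

```latex
% Find the smallest training set D that drives the learning algorithm A
% to the target model \theta^*
\min_{D} \; |D|
\quad \text{subject to} \quad
A(D) = \theta^*
```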
Constrained fractional set programs and their application in local clustering and community detection
 In ICML
, 2013
"... local clustering and community detection ..."
Towards Realistic Team Formation in Social Networks based on Densest Subgraphs
"... Given a task T, a set of experts V with multiple skills and a social network G(V,W) reflecting the compatibility among the experts, team formation is the problem of identifying a team C ⊆ V that is both competent in performing the task T and compatible in working together. Existing methods for this ..."
Abstract

Cited by 2 (0 self)
Given a task T, a set of experts V with multiple skills, and a social network G(V,W) reflecting the compatibility among the experts, team formation is the problem of identifying a team C ⊆ V that is both competent in performing the task T and compatible in working together. Existing methods for this problem make overly restrictive assumptions and thus cannot model practical scenarios. The goal of this paper is to consider the team formation problem in a realistic setting and present a novel formulation based on densest subgraphs. Our formulation allows modeling of many natural requirements, such as (i) inclusion of a designated team leader and/or a group of given experts, (ii) restriction of the size or, more generally, the cost of the team, and (iii) enforcement of locality of the team, e.g., in a geographical or social sense. The proposed formulation leads to a generalized version of the classical densest subgraph problem with cardinality constraints (DSP), which is an NP-hard problem with many applications in social network analysis. In this paper, we present a new method for (approximately) solving the generalized DSP (GDSP). Our method, FORTE, is based on solving an equivalent continuous relaxation of GDSP. The solution found by our method has a quality guarantee and always satisfies the constraints of GDSP. Experiments show that the proposed formulation (GDSP) is useful in modeling a broader range of team formation problems and that our method produces more coherent and compact teams of high quality. We also show, with the help of an LP relaxation of GDSP, that our method gives close-to-optimal solutions to
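For context, the classical (unconstrained) densest subgraph problem that this paper generalizes admits a well-known greedy 2-approximation by Charikar: repeatedly peel off the minimum-degree vertex and keep the best intermediate subgraph. The sketch below illustrates that baseline only; it is not the authors' FORTE method and handles none of the paper's constraints:

```python
from collections import defaultdict

def greedy_peeling(edges):
    """Charikar's greedy peeling for the unconstrained densest subgraph
    problem: repeatedly remove a minimum-degree vertex and return the
    densest intermediate vertex set (density = |edges| / |vertices|).
    This is a classical 2-approximation, shown here as background for
    the generalized DSP the paper studies."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nodes = set(adj)
    m = len(edges)
    best_density, best_set = 0.0, set(nodes)
    while nodes:
        density = m / len(nodes)
        if density > best_density:
            best_density, best_set = density, set(nodes)
        v = min(nodes, key=lambda x: len(adj[x]))  # minimum-degree vertex
        m -= len(adj[v])                           # its edges disappear
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
        nodes.remove(v)
    return best_set, best_density
```

On a 4-clique with one pendant vertex attached, the peeling correctly discards the pendant and returns the clique (density 6/4 = 1.5).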
Gauge optimization and duality
"... Abstract. Gauge functions significantly generalize the notion of a norm, and gauge optimization, as defined by [R. M. Freund, Math. Programming, 38 (1987), pp. 47–67], seeks the element of a convex set that is minimal with respect to a gauge function. This conceptually simple problem can be used to ..."
Abstract

Cited by 2 (1 self)
Gauge functions significantly generalize the notion of a norm, and gauge optimization, as defined by [R. M. Freund, Math. Programming, 38 (1987), pp. 47–67], seeks the element of a convex set that is minimal with respect to a gauge function. This conceptually simple problem can be used to model a remarkable array of useful problems, including a special case of conic optimization, and related problems that arise in machine learning and signal processing. The gauge structure of these problems allows for a special kind of duality framework. This paper explores the duality framework proposed by Freund, and proposes a particular form of the problem that exposes some useful properties of the gauge optimization framework (such as the variational properties of its value function), and yet maintains most of the generality of the abstract form of gauge optimization.
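For reference, the gauge of a convex set C (its Minkowski functional) and the resulting gauge optimization problem can be written as follows; this is the standard textbook form, not a quotation from the paper:

```latex
% Gauge (Minkowski functional) of a convex set C containing the origin
\kappa_C(x) \;=\; \inf\{\, \lambda \ge 0 \;:\; x \in \lambda C \,\},
\qquad
\min_{x} \;\; \kappa_C(x)
\quad \text{subject to} \quad
x \in \mathcal{X}
```

Taking C to be the unit ball of a norm recovers norm minimization over $\mathcal{X}$ as a special case, which is why gauges strictly generalize norms.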
Fast Flux Discriminant for Large-Scale Sparse Nonlinear Classification
"... In this paper, we propose a novel supervised learning method, Fast Flux Discriminant (FFD), for largescale nonlinear classification. Compared with other existing methods, FFD has unmatched advantages, as it attains the efficiency and interpretability of linear models as well as the accuracy of no ..."
Abstract

Cited by 2 (0 self)
In this paper, we propose a novel supervised learning method, Fast Flux Discriminant (FFD), for large-scale nonlinear classification. Compared with other existing methods, FFD has unmatched advantages, as it attains the efficiency and interpretability of linear models as well as the accuracy of nonlinear models. It is also sparse and naturally handles mixed data types. It works by decomposing the kernel density estimation in the entire feature space into selected low-dimensional subspaces. Since there are many possible subspaces, we propose a submodular optimization framework for subspace selection. The selected subspace predictions are then transformed into new features on which a linear model can be learned. Moreover, since the transformed features naturally expect nonnegative weights, we require only smooth optimization even with ℓ1 regularization. Unlike other nonlinear models such as kernel methods, the FFD model is interpretable, as it gives importance weights on the original features. Its training and testing are also much faster than those of traditional kernel models. We carry out extensive empirical studies on real-world datasets and show that the proposed model achieves state-of-the-art classification results with sparsity, interpretability, and exceptional scalability. Our model can be learned in minutes on datasets with millions of samples, for which most existing nonlinear methods would be prohibitively expensive in space and time.
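The subspace-selection step described above relies on submodular optimization, for which the classic tool is the greedy algorithm of Nemhauser et al. with its (1 - 1/e) guarantee for monotone submodular functions under a cardinality budget. The sketch below shows that generic greedy loop; the objective is a placeholder, not the paper's actual FFD criterion:

```python
def greedy_submodular(candidates, gain, budget):
    """Greedy maximization of a monotone submodular set function under a
    cardinality budget ((1 - 1/e)-approximation, Nemhauser et al. 1978).
    `gain(S, x)` must return the marginal value of adding x to the
    currently selected list S. Generic illustration of the kind of
    subspace-selection step the abstract describes."""
    selected = []
    remaining = list(candidates)
    for _ in range(budget):
        best = max(remaining, key=lambda x: gain(selected, x))
        if gain(selected, best) <= 0:  # no candidate still adds value
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a set-cover objective (marginal gain = number of newly covered elements), the greedy loop picks the largest set first and then the one adding the most uncovered elements.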
The Total Variation on Hypergraphs – Learning on Hypergraphs Revisited
"... Hypergraphs allow one to encode higherorder relationships in data and are thus a very flexible modeling tool. Current learning methods are either based on approximations of the hypergraphs via graphs or on tensor methods which are only applicable under special conditions. In this paper, we presen ..."
Abstract

Cited by 2 (1 self)
Hypergraphs allow one to encode higher-order relationships in data and are thus a very flexible modeling tool. Current learning methods are either based on approximations of the hypergraphs via graphs or on tensor methods which are only applicable under special conditions. In this paper, we present a new learning framework on hypergraphs which fully uses the hypergraph structure. The key element is a family of regularization functionals based on the total variation on hypergraphs.
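A common way to write the total variation of a vertex function f on a hypergraph H = (V, E) with hyperedge weights w_e is the following (this matches the usual construction in this line of work, but the exact functional family should be checked against the paper itself):

```latex
% Each hyperedge e penalizes the spread of f over its vertices
TV_H(f) \;=\; \sum_{e \in E} w_e
\Bigl( \max_{i \in e} f_i \;-\; \min_{j \in e} f_j \Bigr)
```

For hyperedges of size two this reduces to the familiar graph total variation $\sum_{(i,j)} w_{ij} |f_i - f_j|$, which is why it is a natural higher-order generalization.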
Scalable variational inference in log-supermodular models
, 2015
"... We consider the problem of approximate Bayesian inference in logsupermodular models. These models encompass regular pairwise MRFs with binary variables, but allow to capture highorder interactions, which are intractable for existing approximate inference techniques such as belief propagation, mean ..."
Abstract

Cited by 2 (0 self)
We consider the problem of approximate Bayesian inference in log-supermodular models. These models encompass regular pairwise MRFs with binary variables, but can also capture higher-order interactions, which are intractable for existing approximate inference techniques such as belief propagation, mean field, and their variants. We show that a recently proposed variational approach to inference in log-supermodular models – L-FIELD – reduces to the widely studied minimum norm problem for submodular minimization. This insight allows us to leverage powerful existing tools, and hence to solve the variational problem orders of magnitude more efficiently than previously possible. We then provide another natural interpretation of L-FIELD, demonstrating that it exactly minimizes a specific type of Rényi divergence measure. This insight sheds light on the nature of the variational approximations produced by L-FIELD. Furthermore, we show how to perform parallel inference as message passing in a suitable factor graph at a linear convergence rate, without having to sum over all configurations of the factors. Finally, we apply our approach to a challenging image segmentation task. Our experiments confirm the scalability of our approach, the high quality of the marginals, and the benefit of incorporating higher-order potentials.
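The minimum norm problem mentioned in the abstract is, in its standard form from the submodular-minimization literature (Fujishige's minimum-norm-point formulation over the base polytope of a submodular function F on ground set V):

```latex
\min_{s \in B(F)} \;\; \|s\|_2^2,
\qquad
B(F) \;=\; \bigl\{\, s \in \mathbb{R}^{V} \;:\;
s(A) \le F(A)\ \ \forall A \subseteq V,\;\; s(V) = F(V) \,\bigr\}
```

The thresholding of the optimal point s* at zero recovers the minimizers of F, which is what makes this problem a workhorse for submodular minimization.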
Higher-Order Inference for Multiclass Log-supermodular Models
"... Although shown to be a very powerful tool in computer vision, existing higherorder models are mostly restricted to computing MAP configuration for specific energy functions. In this thesis, we propose a multiclass model along with a variational marginal inference formulation for capturing higher ..."
Abstract

Cited by 1 (0 self)
Although higher-order models have been shown to be a very powerful tool in computer vision, existing ones are mostly restricted to computing the MAP configuration for specific energy functions. In this thesis, we propose a multiclass model along with a variational marginal inference formulation for capturing higher-order log-supermodular interactions. Our modeling technique utilizes set functions by incorporating constraints that each variable is assigned to exactly one class. Marginal inference for our model can be done efficiently by either Frank-Wolfe or a soft move-making algorithm, both of which are easily parallelized. To simultaneously address the associated MAP problem, we extend the marginal inference formulation to a parameterized version as smoothed MAP inference. Accompanying the extension, we present a rigorous analysis of the trade-off between efficiency and accuracy obtained by varying the smoothing strength. We evaluate the scalability and effectiveness of our approach on the task of natural scene image segmentation, demonstrating state-of-the-art performance for both
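The Frank-Wolfe (conditional gradient) method mentioned above has a very small generic skeleton: at each step, call a linear minimization oracle over the feasible polytope and move toward the returned vertex. The sketch below is that generic loop only, with placeholder oracle and gradient arguments; it is not the thesis's specific marginal-inference solver:

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=100):
    """Generic Frank-Wolfe (conditional gradient) iteration for a smooth
    convex objective over a polytope. `grad(x)` returns the gradient;
    `lmo(g)` is a linear minimization oracle returning the polytope
    vertex minimizing <g, s>. Uses the standard 2/(t+2) step size."""
    x = np.asarray(x0, dtype=float)
    for t in range(iters):
        s = lmo(grad(x))            # vertex minimizing the linearization
        gamma = 2.0 / (t + 2.0)     # diminishing open-loop step size
        x = (1.0 - gamma) * x + gamma * s
    return x
```

For instance, minimizing a squared distance over the probability simplex only needs an oracle that returns the coordinate vector with the most negative gradient entry.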