Machine teaching: an inverse problem to machine learning and an approach toward optimal education. (2015)

by X Zhu
Results 1 - 5 of 5

The teaching dimension of linear learners.

by Ji Liu, Xiaojin Zhu. In Proceedings of The 33rd International Conference on Machine Learning (ICML '16), 2016
Abstract (Cited by 1, 1 self):
Teaching dimension is a learning-theoretic quantity that specifies the minimum training set size to teach a target model to a learner. Previous studies on teaching dimension focused on version-space learners which maintain all hypotheses consistent with the training data, and cannot be applied to modern machine learners which select a specific hypothesis via optimization. This paper presents the first known teaching dimension for ridge regression, support vector machines, and logistic regression. We also exhibit optimal training sets that match these teaching dimensions. Our approach generalizes to other linear learners.

Citation Context

... error, the teaching dimension is a constant TD = 2 regardless of the error tolerance ε, while active learning would require O(log 1/ε) queries, which can be arbitrarily larger than TD. While the present paper focused on the theory of optimal teaching, there are practical applications, too. One such application is computer-aided personalized education. The human student is modeled by a computational cognitive model, or equivalently the learning algorithm. The educational goal is specified by the target model. The optimal teaching set is then well defined, and represents the best personalized lesson for the student (Zhu, 2015, 2013; Khan et al., 2011). In one experiment, Patil et al. showed that real human students learn statistically significantly better under such an optimal teaching set than under an i.i.d. training set (Patil et al., 2014). Because contemporary cognitive models often employ optimization-based machine learners, our teaching dimension study helps to characterize these optimal lessons. Another application of optimal teaching is in computer security. In particular, optimal teaching is the mathematical formalism for studying so-called data poisoning attacks (Barreno et al., 2010; Mei and Zhu, 2015a,b...
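The TD = 2 versus O(log 1/ε) contrast can be made concrete with a 1-D threshold classifier. The sketch below is a hedged toy illustration, not the paper's linear-learner construction; all function names and values are made up:

```python
import math

def teach_threshold(theta_star, eps):
    """Teaching set for a 1-D threshold classifier: two points that
    bracket the target threshold pin any consistent learner to eps."""
    return [(theta_star - eps / 2, 0), (theta_star + eps / 2, 1)]

def consistent_threshold(teaching_set):
    """A consistent learner: midpoint of the remaining version space."""
    lo = max(x for x, y in teaching_set if y == 0)
    hi = min(x for x, y in teaching_set if y == 1)
    return (lo + hi) / 2

def active_queries(eps):
    """Binary search over [0, 1] needs about log2(1/eps) label queries
    for the same accuracy, growing without bound as eps shrinks."""
    return math.ceil(math.log2(1 / eps))

D = teach_threshold(0.3, 1e-4)
print(len(D), active_queries(1e-4))   # 2 examples vs. 14 queries
```

However small ε gets, the teaching set stays at two examples, while the active learner's query count keeps growing.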

The Security of Latent Dirichlet Allocation

by Shike Mei, Xiaojin Zhu
Abstract (Cited by 1, 1 self):
Latent Dirichlet allocation (LDA) is an increasingly popular tool for data analysis in many domains. If LDA output affects decision making (especially when money is involved), there is an incentive for attackers to compromise it. We ask the question: how can an attacker minimally poison the corpus so that LDA produces topics that the attacker wants the LDA user to see? Answering this question is important to characterize such attacks, and to develop defenses in the future. We give a novel bilevel optimization formulation to identify the optimal poisoning attack. We present an efficient solution (up to local optima) using descent methods and implicit functions. We demonstrate poisoning attacks on LDA with extensive experiments, and discuss possible defenses.

Citation Context

...’s optimization problem. Eq (6) is called the lower-level task, which is nothing but the LDA learner’s optimization problem given the corpus M. Our framework is similar to machine teaching (Zhu 2013, Zhu 2015, Mei & Zhu 2015, Patil, Zhu, Kopec & Love 2014), where the teacher plays the role of the attacker in our framework. Unfortunately, bilevel programming is in general difficult. Furthermore, it is well-...
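The bilevel structure can be sketched on a learner whose lower-level problem has a closed-form solution. The following uses ridge regression as a stand-in for LDA (the paper's actual inner problem) on synthetic data; since the inner solution is linear in the poisoned labels, the implicit-gradient step is exact here rather than approximate:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, lr = 40, 3, 1.0, 1.0

# Clean training data and the model the attacker wants the learner to adopt.
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
theta_target = np.array([0.0, 0.0, 3.0])

# Lower-level problem in closed form: ridge regression on poisoned labels.
A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
def learner(y_pois):
    return A_inv @ X.T @ y_pois

# Upper-level problem: gradient descent on ||theta(y') - theta_target||^2.
# The inner solution is linear in y', so d(theta)/d(y') is a constant matrix.
G = A_inv @ X.T
y_pois = y.copy()
for _ in range(500):
    y_pois -= lr * 2 * G.T @ (learner(y_pois) - theta_target)
```

After the descent, `learner(y_pois)` sits essentially at `theta_target`, even though the clean labels `y` train a very different model.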

Becoming the Expert- Interactive Multi-Class Machine Teaching

by Edward Johns, Oisin Mac Aodha, Gabriel J. Brostow
Abstract:
Compared to machines, humans are extremely good at classifying images into categories, especially when they possess prior knowledge of the categories at hand. If this prior information is not available, supervision in the form of teaching images is required. To learn categories more quickly, people should see important and representative images first, followed by less important images later – or not at all. However, image importance is individual-specific, i.e. a teaching image is important to a student if it changes their overall ability to discriminate between classes. Further, students keep learning, so while image importance depends on their current knowledge, it also varies with time. In this work we propose an Interactive Machine Teaching algorithm that enables a computer to teach challenging visual concepts to a human. Our adaptive algorithm chooses, online, which labeled images from a teaching set should be shown to the student as they learn. We show that a teaching strategy that probabilistically models the student’s ability and progress, based on their correct and incorrect answers, produces better ‘experts’. We present results using real human participants across several varied and challenging real-world datasets.
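One minimal way to sketch such an adaptive teacher is a Beta-Bernoulli model of per-class student accuracy, always teaching the class the student currently appears weakest on. This is an illustrative simplification, not the authors' algorithm; the class names and skill values are made up:

```python
import random
random.seed(0)

# Hidden per-class probability that this simulated student answers
# correctly; the teacher never sees these, only the answers.
true_skill = {"hawk": 0.9, "falcon": 0.5, "kite": 0.7}
classes = list(true_skill)

# Beta(1, 1) prior per class, updated from correct/incorrect answers.
counts = {c: [1, 1] for c in classes}          # [correct + 1, incorrect + 1]

def pick_class():
    """Teach an image from the class with the lowest estimated accuracy."""
    return min(classes, key=lambda c: counts[c][0] / sum(counts[c]))

for _ in range(200):
    c = pick_class()
    correct = random.random() < true_skill[c]
    counts[c][0 if correct else 1] += 1

shown = {c: sum(counts[c]) - 2 for c in classes}
print(shown)   # classes the student is weak on tend to be shown more often
```

The point of the sketch is only the feedback loop: the teaching schedule adapts online to the student's observed mistakes, so teaching effort concentrates where discrimination ability is lowest.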

Citation Context

...an be improved by better modeling the teaching process required to make them experts. The family of methods referred to as Machine Teaching offers a general solution to the problem of teaching humans [43, 18, 42, 39, 33]. Machine Teaching is not the same as Active Learning [38]. In Active Learning, the computer’s goal is to learn more accurate models given the smallest amount of supervision. This is achieved by caref...

Analysis of a Design Pattern for Teaching with Features and Labels

by Christopher Meek , Patrice Simard , Xiaojin Zhu
Abstract:
We study the task of teaching a machine to classify objects using features and labels. We introduce the Error-Driven-Featuring design pattern for teaching using features and labels, in which a teacher prefers to introduce features only if they are needed. We analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and by providing bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learner in interactive machine learning systems. Our analysis provides a deeper understanding of potential trade-offs of using different learning algorithms and between the effort required for featuring and labeling.

Citation Context

...n cost. Other existing concepts include the exclusion dimension (Angluin 1994), the unique specification dimension (Hegedüs 1995), and the certificate size (Hellerstein et al. 1996), which are similar to our invalidation cost. In addition, Liu et al. (2016) define the teaching dimension of a hypothesis, which is equivalent to the specification number and our concept specification cost. They also provide bounds on the concept specification cost for linear classifiers. Their results are related to our Proposition 7 but, unlike our result, assume that the space of objects is dense. In the terms of Zhu (2015), we provide the hypothesis-specific teaching dimension for pool-based teaching. For many domains such as image classification, document classification, and entity extraction, and their associated feature sets, the assumption of a dense representation is unnatural (e.g., we cannot have a fractional number of words in a document). Like other work on classical teaching dimension, this work does not consider teaching with both labels and features. The other body of related work is active learning. The aim of this body of work is to develop algorithms to choose which items to label and the quality of an al...
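Pool-based teaching of this kind can be illustrated by brute-force search for the smallest pool subset whose target-labeled examples leave only the target hypothesis consistent. A toy sketch over 1-D thresholds, with a hypothetical pool and hypothesis class:

```python
from itertools import combinations

# Hypothetical pool of items and candidate threshold hypotheses; the
# learner is a version-space learner over 1-D threshold classifiers.
pool = [0.1, 0.25, 0.4, 0.55, 0.7, 0.85]
hypotheses = [0.2, 0.5, 0.8]
target = 0.5

def label(threshold, x):
    return int(x >= threshold)

def consistent(h, teaching_set):
    return all(label(h, x) == y for x, y in teaching_set)

def min_teaching_set():
    """Smallest pool subset whose target-labeled examples leave only the
    target hypothesis consistent (a pool-based teaching set)."""
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            D = [(x, label(target, x)) for x in subset]
            if [h for h in hypotheses if consistent(h, D)] == [target]:
                return D
    return None

print(min_teaching_set())   # two examples straddling the target threshold
```

Note the pool restriction: the teacher cannot invent points arbitrarily close to the target threshold, which is exactly the non-dense setting the passage contrasts with Liu et al.'s assumption.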

Some Submodular Data-Poisoning Attacks on Machine Learners

by Shike Mei, Xiaojin Zhu, 2015
Abstract:
We study data-poisoning attacks using a machine teaching framework. For a family of NP-hard attack problems we pose them as submodular function maximization, thereby inheriting efficient greedy algorithms with theoretical guarantees. We demonstrate some attacks with experiments.

Citation Context

...ttacks on machine learners. The framework accommodates a wide range of attack effectiveness measures, attacker effort measures, and victim machine learning algorithms. It is based on machine teaching [17], which creates an optimal training set to make the learner learn the model in the teacher’s mind. The solutions for optimal attacks can be difficult (e.g. NP-hard in Section 3) depending on the atta...
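The greedy approach the abstract refers to can be sketched with a monotone submodular stand-in objective (topic coverage), for which greedy selection under a cardinality budget carries the classic (1 - 1/e) approximation guarantee of Nemhauser et al. The documents and topics below are hypothetical; a real attack objective would come from the victim learner:

```python
# Hypothetical mapping from candidate poison documents to the topics
# they influence.
candidate_docs = {
    "d1": {"politics", "finance"},
    "d2": {"finance", "sports"},
    "d3": {"sports"},
    "d4": {"politics", "finance", "tech"},
}

def coverage(selected):
    """Monotone submodular objective: number of topics influenced."""
    if not selected:
        return 0
    return len(set().union(*(candidate_docs[d] for d in selected)))

def greedy_attack(budget):
    """Greedily add the document with the largest marginal gain; for
    monotone submodular objectives this is a (1 - 1/e)-approximation."""
    chosen = []
    for _ in range(budget):
        gains = {d: coverage(chosen + [d]) - coverage(chosen)
                 for d in candidate_docs if d not in chosen}
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break
        chosen.append(best)
    return chosen

print(greedy_attack(2))   # picks "d4" first (largest marginal gain)
```

Submodularity (diminishing marginal gains as the poison set grows) is what makes this simple loop come with a guarantee despite the underlying attack problem being NP-hard.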

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University