Efficient noise-tolerant learning from statistical queries
Journal of the ACM, 1998
Cited by 357 (5 self)
In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of “robust” learning algorithms in the most general way, we formalize a new but related model of learning from statistical queries. Intuitively, in this model, a learning algorithm is forbidden to examine individual examples of the unknown target function, but is given access to an oracle providing estimates of probabilities over the sample space of random examples. One of our main results shows that any class of functions learnable from statistical queries is in fact learnable with classification noise in Valiant’s model, with a noise rate approaching the information-theoretic barrier of 1/2. We then demonstrate the generality of the statistical query model, showing that practically every class learnable in Valiant’s model and its variants can also be learned in the new model (and thus can be learned in the presence of noise). A notable exception to this statement is the class of parity functions, which we prove is not learnable from statistical queries, and for which no noise-tolerant algorithm is known.
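To make the idea of a statistical query concrete, here is a minimal Python sketch. It is not the construction from the paper. It assumes a uniform distribution over bit vectors, a known noise rate `eta` below 1/2, and hypothetical helper names (`noisy_examples`, `sq_estimate_from_noisy`). A query is a predicate `chi(x, label)`, and the goal is to estimate Pr[chi(x, f(x)) = 1] even though labels are flipped with probability `eta`. The correction uses the identity p_noisy = (1 - 2·eta)·p + eta·s, where s = E[chi(x, 0) + chi(x, 1)] does not depend on the labels.

```python
import random

def noisy_examples(target, n_bits, eta, m, rng):
    """Draw m uniform examples of `target`, flipping each label with probability eta."""
    sample = []
    for _ in range(m):
        x = tuple(rng.randint(0, 1) for _ in range(n_bits))
        label = target(x)
        if rng.random() < eta:
            label = 1 - label  # classification noise
        sample.append((x, label))
    return sample

def sq_estimate_from_noisy(chi, sample, eta):
    """Estimate Pr[chi(x, f(x)) = 1] from noisy labels.

    Assumes eta < 1/2 is known exactly (a simplifying assumption of this sketch).
    """
    m = len(sample)
    # Empirical probability of the query under the noisy labels.
    p_noisy = sum(chi(x, y) for x, y in sample) / m
    # Label-independent term: chi evaluated under both possible labels.
    s = sum(chi(x, 0) + chi(x, 1) for x, _ in sample) / m
    # Invert p_noisy = (1 - 2*eta) * p + eta * s.
    return (p_noisy - eta * s) / (1 - 2 * eta)
```

For instance, with the target `x[0] & x[1]` and the query "the label is 1", the uncorrected frequency at `eta = 0.2` drifts toward 0.35, while the corrected estimate stays near the true value 0.25. As `eta` approaches 1/2 the divisor `1 - 2 * eta` shrinks, so more examples are needed for the same accuracy.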
Efficient Distribution-free Learning of Probabilistic Concepts
Journal of Computer and System Sciences, 1993
Cited by 212 (8 self)
In this paper we investigate a new formal model of machine learning in which the concept (boolean function) to be learned may exhibit uncertain or probabilistic behavior; thus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic concepts (or p-concepts) may arise in situations such as weather prediction, where the measured variables and their accuracy are insufficient to determine the outcome with certainty. We adopt from the Valiant model of learning [27] the demands that learning algorithms be efficient and general in the sense that they perform well for a wide class of p-concepts and for any distribution over the domain. In addition to giving many efficient algorithms for learning natural classes of p-concepts, we study and develop in detail an underlying theory of learning p-concepts. 1 Introduction Consider the following scenarios: A meteorologist is attempting to predict tomorrow's weather as accurately as pos...
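As a small illustration of what a p-concept is, the following Python sketch treats a p-concept as a function `c` mapping each input to the probability that it is labeled positive. The one-dimensional domain [0, 1), the helper names, and the bucketed frequency estimate are assumptions made for this example only; they are not algorithms from the paper.

```python
import random

def p_concept_example(c, draw_x, rng):
    """Draw x from the input distribution and label it 1 with probability c(x)."""
    x = draw_x(rng)
    label = 1 if rng.random() < c(x) else 0
    return x, label

def bucket_frequencies(sample, n_buckets):
    """Fraction of positive labels in each equal-width bucket of [0, 1)."""
    positives = [0] * n_buckets
    totals = [0] * n_buckets
    for x, y in sample:
        b = min(int(x * n_buckets), n_buckets - 1)
        totals[b] += 1
        positives[b] += y
    # None marks a bucket that received no examples.
    return [p / t if t else None for p, t in zip(positives, totals)]
```

With `c = lambda x: x` and uniformly drawn inputs, the same `x` can receive either label on different draws, yet the per-bucket frequencies approach the bucket averages of `c`. This mirrors the weather example: the measurements fix a probability of rain, not the outcome.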
Learning Read-Once Formulas with Queries
J. ACM, 1989
Cited by 117 (19 self)
A read-once formula is a boolean formula in which each variable occurs at most once. Such formulas are also called μ-formulas or boolean trees. This paper treats the problem of exactly identifying an unknown read-once formula using specific kinds of queries. The main results are a polynomial-time algorithm for exact identification of monotone read-once formulas using only membership queries, and a polynomial-time algorithm for exact identification of general read-once formulas using equivalence and membership queries (a protocol based on the notion of a minimally adequate teacher [1]). Our results improve on Valiant's previous results for read-once formulas [26]. We also show that no polynomial-time algorithm using only membership queries or only equivalence queries can exactly identify all read-once formulas. 1 Introduction The goal of computational learning theory is to define and study useful models of learning phenomena from an algorithmic point of view. Since there are a variety ...
Special Purpose Parallel Computing
Lectures on Parallel Computation, 1993
Cited by 81 (6 self)
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose stored-program sequential computer which captured the fundamental principles of...
A Polynomial Approach to the Constructive Induction of . . .
Machine Learning, 1994
Cited by 71 (2 self)
The representation formalism as well as the representation language is of great importance for the success of machine learning. The representation formalism should be expressive, efficient, useful, and applicable. First-order logic needs to be restricted in order to be efficient for inductive and deductive reasoning. In the field of knowledge representation, term subsumption formalisms have been developed which are efficient and expressive. In this paper, a learning algorithm, KLUSTER, is described which represents concept definitions in this formalism. KLUSTER enhances the representation language if this is necessary for the discrimination of concepts. Hence, KLUSTER is a constructive induction program. KLUSTER builds the most specific generalization and a most general discrimination in polynomial time. It embeds these concept learning problems into the overall task of learning a hierarchy of concepts.
On the Fourier Spectrum of Monotone Functions
1996
Cited by 61 (0 self)
In this paper, monotone Boolean functions are studied using harmonic analysis on the cube.
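For readers unfamiliar with harmonic analysis on the cube, the following Python sketch computes the Fourier coefficients of a Boolean function by brute force. The conventions are assumptions of this example rather than taken from the paper: inputs are in {0,1}^n, outputs are in {-1,+1}, and the character for a set S is (-1) raised to the sum of the bits indexed by S. The 3-bit majority function is used only as a convenient monotone example.

```python
from itertools import combinations, product

def fourier_coefficients(f, n):
    """Return {S: E_x[f(x) * chi_S(x)]} over all subsets S of range(n).

    f maps a tuple in {0,1}^n to -1 or +1; the expectation is uniform.
    Runs in time exponential in n, so this is only for small n.
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0
            for x in points:
                parity = sum(x[i] for i in S) % 2
                total += f(x) * (-1 if parity else 1)
            coeffs[S] = total / len(points)
    return coeffs

def majority3(x):
    """Monotone example: +1 if at least two of the three bits are 1, else -1."""
    return 1 if sum(x) >= 2 else -1
```

Calling `fourier_coefficients(majority3, 3)` shows that all of the weight of this function sits on the three singleton sets and the full set, and the squared coefficients sum to 1, as Parseval's identity requires for a ±1-valued function.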
Forming Concepts for Fast Inference
In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), 1992
Cited by 51 (2 self)
Knowledge compilation speeds inference by creating tractable approximations of a knowledge base, but this advantage is lost if the approximations are too large. We show how learning concept generalizations can allow for a more compact representation of the tractable theory. We also give a general induction rule for generating such concept generalizations. Finally, we prove that unless NP ⊆ non-uniform P, not all theories have small Horn least upper-bound approximations. 1 Introduction Work in machine learning has traditionally been divided into two main camps: concept learning (e.g. [Kearns, 1990]) and speed-up learning (e.g. [Minton, 1988]). The work reported in this paper bridges these two areas by showing how concept learning can be used to speed up inference by allowing a more compact and efficient representation of a knowledge base. We have been studying techniques for boosting the performance of knowledge representation systems by compiling expressive but intractable repre...
Statistical Queries and Faulty PAC Oracles
In Proceedings of the Sixth Annual ACM Workshop on Computational Learning Theory, 1993
Cited by 40 (6 self)
In this paper we study learning in the PAC model of Valiant [18] in which the example oracle used for learning may be faulty in one of two ways: either by misclassifying the example or by distorting the distribution of examples. We first consider models in which examples are misclassified. Kearns [12] recently showed that efficient learning in a new model using statistical queries is a sufficient condition for PAC learning with classification noise. We show that efficient learning with statistical queries is sufficient for learning in the PAC model with malicious error rate proportional to the required statistical query accuracy. One application of this result is a new lower bound for tolerable malicious error in learning monomials of k literals. This is the first such bound that is independent of the number of irrelevant attributes n. We also use the statistical query model to give sufficient conditions for using distribution-specific algorithms on distributions outside their prescr...
A Formal Definition of Intelligence Based on an Intensional Variant of Algorithmic Complexity
In Proceedings of the International Symposium of Engineering of Intelligent Systems (EIS'98), 1998
Cited by 38 (19 self)
Machine Due to the current technology of the computers we can use, we have chosen an extremely abridged emulation of the machine that will effectively run the programs, instead of more proper languages, like λ-calculus (or LISP). We have adapted the "toy RISC" machine of [Hernández & Hernández 1993] with two remarkable features inherited from its object-oriented coding in C++: it is easily tunable for our needs, and it is efficient. We have made it even more reduced, removing any operand in the instruction set, even for the loop operations. We have only three registers, which are AX (the accumulator), BX and CX. The operations Q b we have used for our experiment are in Table 1: LOOPTOP: Decrements CX. If it is not equal to the first element, jump to the program top.