Abstract:
Perceptron-like learning rules are known to require exponentially many correction steps in order to identify Boolean threshold functions exactly. We introduce criteria that are weaker than exact identification and investigate whether learning becomes significantly faster if exact identification is replaced by one of these criteria: PAC identification, order identification, and sign identification. PAC identification is based on the learning paradigm introduced by Valiant and known to be easier than exact identification. Order identification uses the fact that each threshold function induces an ordering relation on the input variables which can be represented by weights of linear size. Sign identification is based on a property of threshold functions known as unateness and requires only weights of constant size. We show that Perceptron-like learning rules cannot satisfy these criteria when the number of correction steps is to be bounded by a polynomial. We also present an exponential lower bound for order identification with the learning rules introduced by Littlestone. Our results show that efficiency imposes severe restrictions on what can be learned with local learning rules.
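
To make the notion of a "correction step" concrete, the following is a minimal sketch of a Perceptron-like rule run on all inputs of a small Boolean threshold function. It is an illustration only, not the construction used in the paper: the function names (`threshold`, `perceptron_corrections`), the cyclic presentation order, the zero initialization, and the `max_passes` guard are all assumptions made for this example.

```python
from itertools import product


def threshold(weights, theta, x):
    """Return 1 if the weighted sum of x reaches theta, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else 0


def perceptron_corrections(target_w, target_theta, n, max_passes=10_000):
    """Cycle through all inputs in {0,1}^n and count correction steps
    until the hypothesis agrees with the target on every input.

    Assumption: integer target weights, so the target is separable
    with a positive margin and the rule terminates.
    """
    w, theta = [0] * n, 0          # hypothesis starts at zero
    corrections = 0
    inputs = list(product((0, 1), repeat=n))
    for _ in range(max_passes):
        changed = False
        for x in inputs:
            y = threshold(target_w, target_theta, x)
            y_hat = threshold(w, theta, x)
            if y != y_hat:
                delta = y - y_hat  # +1 on a false negative, -1 on a false positive
                # Local update: each weight changes only by its own input bit.
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                theta -= delta
                corrections += 1
                changed = True
        if not changed:
            return w, theta, corrections
    raise RuntimeError("no convergence within max_passes")


if __name__ == "__main__":
    # Hypothetical target: x1 AND x2, written as x1 + x2 >= 2.
    w, theta, steps = perceptron_corrections([1, 1], 2, n=2)
    print("weights:", w, "threshold:", theta, "corrections:", steps)
```

For small targets like this one the count stays tiny; the abstract's claim concerns how the number of such corrections must grow with the number of variables for suitably chosen threshold functions.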
Citations
Learnability and the Vapnik-Chervonenkis Dimension – Blumer, Ehrenfeucht, et al. – 1989
Queries and concept learning – Angluin – 1988
Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm – Littlestone – 1988
A logical calculus of the ideas immanent in nervous activity – McCulloch, Pitts – 1943
The perceptron: A probabilistic model for information storage and organization in the brain – Rosenblatt – 1958
Perceptrons: An Introduction to Computational Geometry – Minsky, Papert – 1969
Computational limitations on learning from examples – Pitt, Valiant – 1988
Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington, D.C. – Rosenblatt – 1962
Threshold logic and its applications – Muroga – 1971
On convergence proofs on perceptrons – Novikoff – 1962
The perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant – Kivinen, Warmuth, et al. – 1997
On the size of weights for threshold gates – Håstad – 1994
Linear function neurons: structure and training – Hampson, Volper – 1986
How fast can a threshold gate learn – Maass, Turán – 1994
On specifying Boolean functions by labelled examples – Anthony, Brightwell, et al. – 1995
Perspectives of current research about the complexity of learning in neural nets – Maass – 1994
Investigating the distributional assumptions of the PAC learning model – Bartlett, Williamson – 1991
Memory capacities of local rules for synaptic modification. A comparative review – Palm – 1991
Using the perceptron algorithm to find consistent hypotheses – Anthony, Shawe-Taylor – 1993
Unate truth functions – McNaughton – 1961
Circuit Complexity and Neural Networks, Foundations of Computing Series – Parberry – 1994
A lower bound on the number of corrections required for convergence of the single threshold gate adaptive procedure – Lewis – 1966
Boolean Functions Realizable with Single Threshold Devices – Paull, McCluskey – 1960
Neural networks - then and now – Nagy – 1991
On the size of weights for McCulloch-Pitts neurons – Schmitt – 1994