by David C. Noelle, Garrison W. Cottrell, Fred R. Wilms
ftp://condor.cnbc.cmu.edu:/pub/user/noelle/tr-s97.mono.ps.gz
Abstract:
Connectionist attractor networks have played a central role in many cognitive models involving associative memory and soft constraint satisfaction. While early attractor networks used step activation functions, permitting the construction of attractors for only binary (or bipolar) patterns, much recent work has focused on networks with continuous sigmoidal activation functions. The incorporation of sigmoidal processing elements allows for the use of expressive real vector representations in attractor networks. The empirical studies reported here, however, reveal that the learning performance of sigmoidal attractor networks is best when such general real vectors are avoided, that is, when training patterns are explicitly placed in the extreme corners of the network's activation space. Using binary (or bipolar) patterns produces benefits in the number of attractors learnable by a network, in the accuracy of the learned attractors, and in the amount of training required. These benefits persist under conditions of sparse patterns. Furthermore, these experiments show that the advantages of extreme-valued patterns are not solely effects of the large separation between training patterns afforded by corner attractors.
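To make the separation point concrete, the following minimal sketch compares how far apart randomly drawn patterns lie when they sit at the corners of the activation space versus anywhere inside it. It is an illustration only, not the authors' experimental procedure: the pattern count, the number of units, the uniform sampling, the [0, 1] activation range, and the helper names `make_patterns` and `mean_pairwise_distance` are all assumptions introduced here.

```python
import math
import random


def mean_pairwise_distance(patterns):
    """Average Euclidean distance over all distinct pairs of patterns."""
    total, count = 0.0, 0
    for i in range(len(patterns)):
        for j in range(i + 1, len(patterns)):
            total += math.dist(patterns[i], patterns[j])
            count += 1
    return total / count


def make_patterns(num_patterns, num_units, corner, rng):
    """Draw patterns from the unit hypercube [0, 1]^num_units (assumed range).

    corner=True  -> each component is exactly 0 or 1 (a hypercube corner).
    corner=False -> each component is uniform in [0, 1] (a general real vector).
    """
    if corner:
        return [[float(rng.randint(0, 1)) for _ in range(num_units)]
                for _ in range(num_patterns)]
    return [[rng.random() for _ in range(num_units)]
            for _ in range(num_patterns)]


rng = random.Random(0)  # fixed seed; sizes below are illustrative assumptions
corner_patterns = make_patterns(20, 50, corner=True, rng=rng)
interior_patterns = make_patterns(20, 50, corner=False, rng=rng)

corner_sep = mean_pairwise_distance(corner_patterns)
interior_sep = mean_pairwise_distance(interior_patterns)
print(f"corner: {corner_sep:.2f}  interior: {interior_sep:.2f}")
```

Under these assumptions the corner patterns come out farther apart on average, which is why pattern separation is a natural confound to rule out before attributing the learning benefits to extreme values themselves.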