46 citations found.
Servan-Schreiber, D., Cleeremans, A., McClelland, J.L. 1991. Graded State Machines: The Representation of Temporal Contingencies in Simple Recurrent Networks.

This paper is cited in the following contexts:

First 50 documents

Natural Language Recursion and Recurrent Neural Networks - Christiansen, Chater (1994)   (1 citation)  (Correct)

....sentences cannot be handled. We shall make a step towards addressing this challenge in the simulations below. 3 The Processing of Recursion in Recurrent Networks The issue of recursion has been addressed before within a connectionist framework. For example, both Elman (1991) and Cleeremans, Servan-Schreiber & McClelland (1991) have demonstrated the ability of Simple Recurrent Networks (SRN) to deal with right recursive structures (which can, however, be handled by an FSM) as well as limited instances of center embedded recursion. In addition, the latter form of recursion has been studied further by Weckerly & Elman ....

Servan-Schreiber, D., Cleeremans, A. & McClelland, J. L. (1991) Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.


Forced Simple Recurrent Neural Networks and Grammatical Inference - Maskara (1993)   (4 citations)  (Correct)

....contains additional output units, which are trained to show the current input and the current context. The effect of the size of the training set for grammatical inference is also considered. The SRN has been shown to be effective when trained on an infinite (very large) set of positive examples [Servan-Schreiber et al. 1991]. When a finite (small) set of positive training data is used, the SRN architectures demonstrate a lack of generalization capability. This problem is solved through a new training algorithm that uses both positive and negative examples of the sequences. Simulation results show that when there is ....

.... be found in [Angluin and Smith, 1983]. Following the development of the back propagation algorithm [Rumelhart and McClelland, 1986], various recurrent neural architectures using back propagation have been shown to have some capability for grammatical inference when trained from examples [Servan-Schreiber et al. 1991, Pollack, 1991, Giles et al. 1990]. The idea of training recurrent neural networks with back propagation was first introduced by Jordan [1986]. In the recurrent neural network, the symbols of a sequence are presented sequentially as network inputs. Also, the output of a higher level layer ....

[Article contains additional citation context not shown here]

David Servan-Schreiber, Axel Cleeremans, and James L. McClelland. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7(2/3):161--193, 1991.


Dynamical Models of Sentence Processing - Tabor, Tanenhaus (1999)   (6 citations)  (Correct)

....from the continuous representations of learning models to the discrete representations of linguistic models. The current results suggest that the VSG model provides an improvement over hierarchical clustering methods of discretizing connectionist representations (e.g. Elman, 1990; Pollack, 1990; Servan-Schreiber, Cleeremans, and McClelland, 1991), for these provide no obvious way of picking out a linguistically or statistically relevant subset of a cluster hierarchy. More specifically, we extended the results of Tabor et al. (1997) by modeling reading times for thematic effects on processing sentences with reduced relatives and we showed ....

Servan-Schreiber, D., Cleeremans, A., & McClelland, J.L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.


Connectionist Learning of Natural Language Lexical Phonotactics - Stoianov, Nerbonne (1998)   (1 citation)  (Correct)

.... to find similar phonemes, formed in the hidden layers during the training. The training corpus is smaller, and no information about the learning and performance are given. Some other connectionist approaches applied to natural language learning ([Lawrence95,96], [Elman90], [Cleerm89,93], [Schreiber91], among others) consider mainly the grammatical level and some toy sized problems are reported. As pioneering projects in connectionist language processing, most of them were mainly aimed at comparison between different approaches, including NNs, to language learning. But usually, the attempts ....

Servan-Schreiber D., A. Cleeremans and J.L. McClelland. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, pages 161-193.


Refining Algorithms with Knowledge-Based Neural Networks.. - Maclin, Shavlik (1992)   (10 citations)  (Correct)

....in Neural Networks Several researchers have proposed neural network architectures for incorporating information about state. The idea of retaining a state or context across training patterns occurs in work aimed at solving natural language problems [Cleeremans89, Elman90, Mozer91, Porat91, Servan-Schreiber91, St. John90]. Each of these approaches provides a mechanism for preserving one or more of the past activations of some units for the next input sequence. Jordan [Jordan86] and Elman [Elman90] introduced the particular recurrent network topology we use in FSKBANN. Their networks have a set of ....

D. Servan-Schreiber, A. Cleeremans and J. McClelland, "Graded state machines: The representation of temporal contingencies in simple recurrent networks", Machine Learning 7, 2/3 (1991).


Boundary Effects in the Linguistic Representations of Simple.. - Reilly (1993)   (1 citation)  (Correct)

....are vectors of activation values, typically of high dimension, which vary as a function of time. Consequently, the processing of an input sequence can be thought of as the traversal of a trajectory through a sequence of states in this representational state space. Elman (1990, 1991) and others (Servan-Schreiber, Cleeremans, & McClelland, 1991) have explored the capacity of these representations primarily from a linguistic point of view. To a large extent this work has been in response to the agenda set by Fodor and Pylyshyn's (1988) critique of connectionism, in which they claimed, among other things, that connectionist representations ....

Servan-Schreiber, D., Cleeremans, A., & McClelland, J. L. 1991. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning 7:161-193.


Learning Simple Phonotactics - Tjong Kim Sang, Nerbonne (1999)   (1 citation)  (Correct)

....] The basic version of such a network contains three layers of cells: an input layer, an output layer and a hidden layer. It is able to process time-dependent patterns because of the extra backward connections from the hidden layer to the input layer. In experiments described in [CSSM89], [SSCM91] and [Cle93], Axel Cleeremans, David Servan-Schreiber and James McClelland trained a network to recognize strings which were generated using a small grammar that was originally used by [Reb76]. They trained an SRN to predict the next character in a sequence of 60,000 strings which were ....

D. Servan-Schreiber, A. Cleeremans, and J.L. McClelland. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, pages 161--193, 1991.


Toward a Connectionist Model of Recursion in Human.. - Christiansen, Chater (1999)   (25 citations)  (Correct)

.... in press; Cottrell & Plunkett, 1991; Elman, 1990, 1991; Norris, 1990; Shillcock, Levy & Chater, 1991). Moreover, SRNs seem well suited to learning finite state grammars (e.g. Cleeremans, Servan-Schreiber & McClelland, 1989; Giles, Miller, Chen, Chen, Sun & Lee, 1992; Giles & Omlin, 1993; Servan-Schreiber, Cleeremans & McClelland, 1991). But relatively little headway has been made towards grammars involving complex recursion that are beyond simple finite state devices. Previous efforts in modeling complex recursion have fallen within two general categories: simulations using language like grammar fragments and simulations ....

Servan-Schreiber, D., Cleeremans, A. & McClelland, J. L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.


Representation of Finite State Automata in Recurrent.. - Frasconi, Gori.. (1996)   (25 citations)  (Correct)

....the system's initial conditions. Basically, the apparent automata behavior arising from the learning of sequences of relatively small length, may change to more complex dynamics for longer sequences. To some extent, the techniques for extracting automata after learning (Cleeremans et al. 1989; Servan-Schreiber et al. 1991; Giles et al. 1992b; Watrous and Kuhn, 1992) are interesting attempts to overcome this problem. For example, Giles et al. (1992b) report explicitly that the extracted automaton can exhibit better performance than the recurrent network from which it has been extracted. However, an implicit ....

....the learning algorithm gets stuck in a local minimum of the function V. In these cases the points of the state trajectory are not necessarily clustered round the hypercube vertices, and consequently, the automata extraction is more involved. Following other researchers (Cleeremans et al. 1989; Servan-Schreiber et al. 1991; Giles et al. 1992b; Watrous and Kuhn, 1992), an automaton was extracted from each trained network using the following clustering algorithm based on k-means: The parameter $d_c$ represents the minimum tolerated distance between two cluster centers. For all the experiments on Tomita's languages $d_c$ ....

[Article contains additional citation context not shown here]

Servan-Schreiber, D., Cleeremans, A., and McClelland, J. L. (1991). Graded state machines: the representation of temporal contingencies in simple recurrent networks. Machine Learning, 7(2/3):161--194. Special issue on Connectionist Approaches to Language Learning.


Stable Encoding of Large Finite-State Automata in Recurrent.. - Omlin, Giles (1996)   (4 citations)  (Correct)

....We empirically demonstrate the existence of extreme DFAs for which the weight strength scales with DFA size. 1 INTRODUCTION It is possible to train recurrent neural networks to behave like deterministic finite state automata [Elman, 1990, Frasconi et al. 1991, Giles et al. 1992, Pollack, 1991, Servan-Schreiber et al. 1991, Watrous and Kuhn, 1992]. The internal representation of learned DFA states can deteriorate due to the dynamical nature of recurrent networks, making predictions about the generalization performance of trained recurrent networks difficult [Zeng et al. 1993]. Methods for constructing DFAs in ....

Servan-Schreiber, D., Cleeremans, A., and McClelland, J. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7:161.


ASCOC: A Locally Recurrent Neural Network Model for.. - Dit-Yan Yeung (1994)   (Correct)

....to as locally recurrent neural networks (LRNNs) here, are primarily feedforward networks, with the exception that feedback connections exist between some limited sets or layers of units. Up to now, almost all RNN models proposed for formal language learning are LRNNs, including both first order [3, 1, 4, 7] and second order [5, 2, 8, 12] networks. LRNNs are easier to analyze because of their limited feedback connections and layered structures. These networks have been demonstrated to be capable of solving some nontrivial formal language learning problems. In this paper, our focus is on using ....

....Unlike the feedforward connections, these feedback connections do not have trainable weights. In fact, C is simply a copy of H with one time step delay, i.e. $C(t) = H(t-1)$. 2.2 Training The SRN model has been demonstrated to be capable of learning regular grammars from example strings [7]. The Reber grammar [6] shown in Figure 2(a) is one of the benchmark problems used. [Figure 2, panels: (a) The Reber grammar; (b) A grammar with embedded structures] Figure 2: Finite state automata for two regular ....

[Article contains additional citation context not shown here]

D. Servan-Schreiber, A. Cleeremans, and J.L. McClelland, "Graded state machines: the representation of temporal contingencies in simple recurrent networks," Machine Learning, vol. 7, pp. 161--193, 1991.
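
The context-copy mechanism described in the ASCOC excerpt above (the context layer C is a one-step-delayed copy of the hidden layer H, made through a fixed, untrained connection) can be sketched as follows. This is a minimal illustration, not code from any of the cited papers: the class name, layer sizes, weight range, zero initial context and sigmoid output layer are all assumptions, and training is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class SimpleRecurrentNetwork:
    """Forward pass of an Elman-style SRN (hypothetical sketch, no training)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Assumed initialisation: small uniform random weights.
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = init(n_hidden, n_in)       # input   -> hidden (trainable)
        self.w_ctx = init(n_hidden, n_hidden)  # context -> hidden (trainable)
        self.w_out = init(n_out, n_hidden)     # hidden  -> output (trainable)
        self.context = [0.0] * n_hidden        # assumed zero initial context

    def step(self, x):
        """Process one input vector; return (hidden, output) activations."""
        net = [a + b for a, b in zip(matvec(self.w_in, x),
                                     matvec(self.w_ctx, self.context))]
        hidden = [sigmoid(v) for v in net]
        output = [sigmoid(v) for v in matvec(self.w_out, hidden)]
        # Fixed copy with no weights: the next step sees C(t+1) = H(t).
        self.context = list(hidden)
        return hidden, output
```

Calling `step` once per symbol of a one-hot encoded string carries the previous hidden state forward through `context`, which is the only memory the network has of earlier symbols.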


Efficient Connectionist Representations of Syntactic Parse Trees .. - Ho, Chan   (Correct)

.... Figure 3: Holistic parsing paradigm. Compared with other grammatical inference approaches, CPP is unique in several ways: • Syntactic parsing as addressed by CPP is significantly more difficult than the classification problem tackled by [8, 23, 31] and the prediction problem attempted by [7, 18, 26]. Firstly, CPP aims at tackling context free grammars instead of regular languages. In general, to parse context free languages, external memory is required [13]. In contrast to other approaches (such as SPEC [19] and the Neural Network Pushdown Automata [28]), CPP is not equipped with a stack. ....

D. Servan-Schreiber, A. Cleeremans, and J. L. McClelland. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7:161--193, 1991.


Constructing Deterministic Finite-State Automata in Recurrent.. - Omlin, Giles (1996)   (32 citations)  (Correct)

....sigmoidal discriminant functions are used. Continuous sigmoids offer other advantages besides their use in gradient-based training algorithms; they also permit analog VLSI implementation, the foundations necessary for the universal approximation theories of neural networks, the interpretation of neural network outputs as a posteriori probability estimates, etc. For more details, see Haykin [1994]. Stability of an .... [Footnote 1: See, for example, Elman [1990], Frasconi et al. [1991], Giles et al. [1991; 1992], Pollack [1991], Servan-Schreiber et al. [1991], and Watrous and Kuhn [1992].]

SERVAN-SCHREIBER, D., CLEEREMANS, A., AND MCCLELLAND, J. 1991. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Mach. Learn. 7, 161.


Constructing Deterministic Finite-State Automata in Recurrent.. - Omlin, Giles (1996)   (32 citations)  (Correct)

....encoding methodology would permit rules to be mapped into neural network VLSI chips, offering the potential of greatly increasing the versatility of neural network implementations. 1.2 Background Recurrent neural networks can be trained to behave like deterministic finite state automata (DFAs) [4, 6, 9, 11, 25, 26, 31]. The dynamical nature of recurrent networks can cause the internal representation of learned DFA states to deteriorate for long strings [32]; therefore, it can be difficult to make predictions about the generalization performance of trained recurrent networks. Recently, we have developed a simple ....

D. Servan-Schreiber, A. Cleeremans, and J. McClelland, "Graded state machines: The representation of temporal contingencies in simple recurrent networks," Machine Learning, vol. 7, p. 161, 1991.


Dynamic Adaptation of Recurrent Neural Network Architectures.. - Omlin, Giles   (Correct)

....and symbolic knowledge representation. Furthermore, automata and their languages can model a large class of dynamical processes, and no feature extraction is generally needed for learning. 4 Recurrent Neural Network We use discrete time, recurrent networks with weights $W_{ijk}$ to learn grammars ([7, 9, 18, 37, 38, 45, 48]). A network accepts a time ordered sequence of inputs and evolves with dynamics defined by the following equations: $S_i^{(t+1)} = g(\Xi_i + b_i)$, $\Xi_i = \sum_{j,k} W_{ijk} S_j^{(t)} I_k^{(t)}$, where $g$ is a sigmoid discriminant function and $b_i$ is the bias associated with hidden recurrent state ....

D. Servan-Schreiber, A. Cleeremans, and J. McClelland, "Graded state machines: The representation of temporal contingencies in simple recurrent networks," Machine Learning, vol. 7, p. 161, 1991.
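
The state update quoted in the Omlin and Giles excerpt above uses second-order weights: each next-state unit sums products of a current state unit and a current input unit, adds a bias, and applies a sigmoid. A minimal sketch under that reading is given below. The function names, the nested-list weight layout `weights[i][j][k]` and the one-hot input encoding are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def second_order_step(weights, bias, state, symbol):
    """One update S_i(t+1) = g(b_i + sum_{j,k} W_ijk * S_j(t) * I_k(t)).

    weights[i][j][k] is an assumed layout: next-state unit i,
    current-state unit j, input unit k.
    """
    next_state = []
    for w_i, b_i in zip(weights, bias):
        xi = sum(
            w_i[j][k] * state[j] * symbol[k]
            for j in range(len(state))
            for k in range(len(symbol))
        )
        next_state.append(sigmoid(xi + b_i))
    return next_state


def run(weights, bias, initial_state, sequence):
    """Feed a sequence of input vectors and return the final state vector."""
    state = list(initial_state)
    for symbol in sequence:
        state = second_order_step(weights, bias, state, symbol)
    return state
```

With a one-hot input vector, only the weight slice for the active symbol contributes, so each input symbol effectively selects its own state-to-state transition matrix.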


Distributed Representations and Nested Compositional Structure - Plate (1994)   (25 citations)  (Correct)

....and the other is for storing information about previous clauses. Sopena found that it was necessary to delete most of the recurrent connections (leaving less than 5% connectivity between context and hidden layers) in order for the network to be able to learn to retain information over several timesteps. The output for network is organized as a set of case roles. As .... [Figure 2.21: Servan-Schreiber et al.'s grammar and recurrent network. Adapted from Servan-Schreiber et al. [1991].]

....whether the task allows for clauses to be forgotten once they are complete. The network can perform anaphora resolution, but Sopena does not give enough details to determine whether a referent lies outside of the current and incomplete clauses. Servan Schreiber et al.: Graded State Machines Servan Schreiber, Cleeremans, and McClelland [1991] conduct a deeper investigation into the grammar learning abilities of recurrent networks like those used by Elman. They trained a network to learn a regular grammar from examples, and showed that the fully trained network has learned to be a perfect recognizer for the language. Their analysis of ....

[Article contains additional citation context not shown here]

Servan-Schreiber, D., Cleeremans, A., and McClelland, J. L. 1991. Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7(2/3):161--194.


The Serial Reaction Time Task: Learning Without.. - Boyer, Destrebecqz..   Self-citation (Cleeremans)   (Correct)

.... performance as it changes over training, and shows that the network becomes progressively able to predict perfectly which elements are possible at each serial position (bottom row). For instance, the network perfectly predicts that 6 is the only possible successor of 12345. As described in (Servan-Schreiber, Cleeremans & McClelland, 1991), the development of sequence knowledge in the SRN involves a gradually increasing sensitivity to the sequential constraints contained in an increasingly large and self developed representation of the temporal context defined by previous elements of the sequence. Initially, the network learns to ....

Servan-Schreiber, D., Cleeremans, A. & McClelland, J.L. (1991). Graded State Machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning , 7, 161--193.


Principles for Implicit Learning - Cleeremans (1996)   (3 citations)  Self-citation (Cleeremans)   (Correct)

....based on exemplars can nevertheless be sufficient to produce rule like behavior and rule like representations. For instance, an SRN trained on only some of the strings that may possibly be generated from a finite state grammar will generalize to the infinite set of all possible instances (see Servan-Schreiber, Cleeremans & McClelland, 1991). It will sometimes develop internal representations that are organized in clusters, with each cluster representing a node of the grammar, as abstract a representation as could be. In other cases, however, the network's internal representations tend to be organized in numerous very small ....

....in clusters, with each cluster representing a node of the grammar, as abstract a representation as could be. In other cases, however, the network's internal representations tend to be organized in numerous very small clusters that each correspond to one or to a few training instances (see Servan-Schreiber et al. 1991; Cleeremans, 1993; for detailed examples). The SRN has often been described as processing fragmentary information. This is descriptively correct, but it is not how things work inside the network. The network does not develop a database of subsequences that it can consult and ponder about as a ....

[Article contains additional citation context not shown here]

Servan-Schreiber, D., Cleeremans, A. & McClelland, J.L. (1991). Graded State Machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.


Using Periodically Attentive Units to Extend the Temporal.. - O'Connell   (Correct)

No context found.

Servan-Schreiber, D., Cleeremans, A., McClelland, J.L. 1991. Graded State Machines: The Representation of Temporal Contingencies in Simple Recurrent Networks.


Conclusion - Vi Summary As   (Correct)

No context found.

A. Cleeremans, D. Servan-Schreiber, and J.L. McClelland. Graded state machines: The representation of temporal contingencies in feedback networks. In D. E. Rumelhart and Y. Chauvin, editors, Backpropagation: Theory, Architectures, and Applications. Lawrence Erlbaum Associates, Inc, Hillsdale, NJ, 1995.


Graded grammaticality in Prediction Fractal Machines - Parfitt, Tino, Dorffner (2000)   (Correct)

No context found.

D. Servan-Schreiber et al (1989). Graded state machines: The representation of temporal contingencies in Simple Recurrent Networks. In Advances in Neural Information Processing Systems, 643--652.


Dynamical Models of Sentence Processing - Tabor, Tanenhaus (1997)   (6 citations)  (Correct)

No context found.

Servan-Schreiber, D., Cleeremans, A., & McClelland, J.L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.


Recurrent Autoassociative Networks: Developing Distributed.. - Stoianov   (Correct)

No context found.

Servan-Schreiber D., Cleeremans, A., and McClelland, J. L., Graded state machines: the representation of temporal contingencies in simple recurrent networks, Machine Learning, 7, 161, 1991.


Machine Learning of Phonotactics: Bibliography - Sang (1998)   (Correct)

No context found.

Servan-Schreiber, D., A. Cleeremans and J.L. McClelland (1991). `Graded state machines: The representation of temporal contingencies in Simple Recurrent Networks'. In: Machine Learning, pp 161--193, 1991.


ASCOC: A Recurrent Neural Network Model for Grammatical Inference - Yeung, Yeung (1994)   (Correct)

No context found.

Servan-Schreiber, D., Cleeremans, A., & McClelland, J.L. (1991). Graded state machines: the representation of temporal contingencies in simple recurrent networks. Machine Learning, 7, 161--193.

CiteSeer.IST - Copyright Penn State and NEC