Results 1–10 of 20
Rule Extraction from Recurrent Neural Networks: a Taxonomy and Review
Neural Computation, 2005
Cited by 37 (5 self)
In this paper, the progress of this development is reviewed and analysed in detail. In order to structure the survey and to evaluate the techniques, a taxonomy specifically designed for this purpose has been developed. Moreover, important open research issues are identified that, if addressed properly, could give the field a significant push forward.
Learning to predict a context-free language: Analysis of dynamics in recurrent hidden units
1999
Cited by 22 (12 self)
Recurrent neural network processing of regular languages is reasonably well understood. Recent work has examined the less familiar question of context-free languages. Previous results regarding the language a^n b^n suggest that while it is possible for a small recurrent network to process context-free languages, learning them is difficult. This paper considers the reasons underlying this difficulty by examining the relationship between the dynamics of the network and weight space. We are able to show that the dynamics required for the solution lie in a region of weight space close to a bifurcation point, where small changes in weights may result in radically different network behaviour. Furthermore, we show that the error gradient information in this region is highly irregular. We conclude that any gradient-based learning method will experience difficulty in learning the language due to the nature of the space, and that a more promising approach to improving learning performance may be to make weight changes in a non-independent manner.
Hill Climbing in Recurrent Neural Networks for Learning the a^n b^n c^n Language
Cited by 17 (5 self)
A simple recurrent neural network is trained on a one-step look-ahead prediction task for symbol sequences of the context-sensitive a^n b^n c^n language. Using an evolutionary hill-climbing strategy for incremental learning, the network learns to predict sequences of strings up to depth n = 12. The experiments and the algorithms used are described. The activation of the hidden units of the trained network is displayed in a 3D graph and analysed.
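The setup described in this abstract can be sketched in a few lines. The following is only an illustrative reconstruction, not the authors' implementation: the network size, mutation scale, end-of-run scoring, and acceptance rule are all assumptions.

```python
import math
import random

def anbncn(n):
    """One string of the context-sensitive language a^n b^n c^n."""
    return "a" * n + "b" * n + "c" * n

ONEHOT = {"a": (1.0, 0.0, 0.0), "b": (0.0, 1.0, 0.0), "c": (0.0, 0.0, 1.0)}

def random_weights(rng, hidden=2):
    """Input, recurrent, and output weight matrices for a tiny Elman-style net."""
    mat = lambda rows, cols: [[rng.gauss(0.0, 0.5) for _ in range(cols)]
                              for _ in range(rows)]
    return {"in": mat(hidden, 3), "rec": mat(hidden, hidden), "out": mat(3, hidden)}

def predict(ws, seq):
    """One-step look-ahead: for each proper prefix of seq, the network's
    guess (an index into 'abc') for the next symbol."""
    h = [0.0] * len(ws["rec"])
    guesses = []
    for ch in seq[:-1]:
        x = ONEHOT[ch]
        h = [math.tanh(sum(w * v for w, v in zip(row_in, x)) +
                       sum(w * v for w, v in zip(row_rec, h)))
             for row_in, row_rec in zip(ws["in"], ws["rec"])]
        out = [sum(w * v for w, v in zip(row, h)) for row in ws["out"]]
        guesses.append(out.index(max(out)))
    return guesses

def fitness(ws, depths):
    """Fraction of deterministically predictable next symbols guessed
    correctly: once the first 'b' appears, the rest of a^n b^n c^n is forced."""
    correct = total = 0
    for n in depths:
        s = anbncn(n)
        for i, guess in enumerate(predict(ws, s)):
            if s[i + 1] != "a":        # score only the forced positions
                total += 1
                correct += (guess == "abc".index(s[i + 1]))
    return correct / total

def hill_climb(steps=200, depths=(1, 2, 3), seed=0):
    """Evolutionary hill climbing: mutate every weight, keep the child only
    if it scores at least as well as the parent."""
    rng = random.Random(seed)
    best = random_weights(rng)
    best_fit = fitness(best, depths)
    for _ in range(steps):
        child = {name: [[w + rng.gauss(0.0, 0.3) for w in row] for row in m]
                 for name, m in best.items()}
        child_fit = fitness(child, depths)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best_fit
```

By construction the score can only improve or stay level from step to step; the paper's incremental scheme additionally extends the training depth as learning progresses.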
Representation Beyond Finite States: Alternatives to Push-Down Automata
In: Kolen and Kremer, 2001
Cited by 10 (3 self)
It has been well established that Dynamical Recurrent Networks (DRNs) can act as deterministic finite-state automata (DFAs; see Chapters 6 and 7). A DRN can reliably represent the states of a DFA as regions in its state space, and the DFA transitions as transitions between these regions. However, as we shall see in this chapter, DRNs can learn to process languages which are non-regular (and therefore cannot be processed by any DFA). Moreover, DRNs are capable of generalizing in ways which go beyond the DFA framework. We will show how DRNs can learn to predict context-free and context-sensitive languages, making use of the transient dynamics as the network activations move towards an attractor or away from a repeller. The resulting trajectory can be thought of as analogous to winding up a spring in one dimension and unwinding it in another. In contrast to push-down automata, which rely on unbounded external memory, DRNs must instead rely on arbi...
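The "winding a spring" mechanism admits a one-dimensional illustration: contract the state for each a, expand it for each b, and the state returns to its start exactly when the counts balance. The contraction factor and the acceptance test below are illustrative choices, not taken from the chapter.

```python
def wind_unwind(s, lam=0.5):
    """Transient-dynamics counting: each 'a' contracts the state toward an
    attractor at 0 (winding the spring), each 'b' expands it away again
    (unwinding). No explicit counter is kept; the count lives in the
    distance from the attractor."""
    x = 1.0
    for ch in s:
        if ch == "a":
            x *= lam      # move towards the attractor
        elif ch == "b":
            x /= lam      # move away from it again
    return x

def accepts_anbn(s):
    """Accept a^n b^n (the input is assumed to have all a's before all b's);
    the state has unwound back to its start iff #b == #a."""
    return wind_unwind(s) == 1.0
```

With lam = 0.5 the arithmetic is exact in binary floating point, so the equality test is reliable for moderate n.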
Incremental Training of First Order Recurrent Neural Networks to Predict a Context-Sensitive Language
2003
Cited by 9 (2 self)
In recent years it has been shown that first order recurrent neural networks trained by gradient descent can learn not only regular but also simple context-free and context-sensitive languages. However, the success rate was generally low and severe instability issues were encountered. The present study examines the hypothesis that a combination of evolutionary hill climbing with incremental learning and a well-balanced training set enables first order recurrent networks to reliably learn context-free and mildly context-sensitive languages. In particular, we trained the networks to predict symbols in string sequences of the context-sensitive language {a^n b^n c^n | n ≥ 1}. Comparative experiments with and without incremental learning indicated that incremental learning can accelerate and facilitate training. Furthermore, incrementally trained networks generally resulted in monotonic trajectories in hidden unit activation space, while the trajectories of non-incrementally trained networks were oscillating. The non-incrementally trained networks were more likely to generalise.
On the Origins of Linguistic Structure: Computational models of the evolution of language
2001
Cited by 8 (0 self)
This thesis explores a perspective for explaining the origins of linguistic structure that is based on considerations beyond the constraints of the language acquisition device. In contrast to the theory of Universal Grammar proposed by Chomsky, this perspective considers how the processes of language acquisition and use create a dynamical system that is capable of adapting linguistic structure to the inductive biases of learners. In this view it is possible to conceive of language adapting to aid its own survival: those languages that are more reliably and easily acquired will tend to persist for longer than their less easily learned counterparts. Thus, linguistic structures are seen as emergent, adaptive phenomena rather than preordained features of language.
Learning the Dynamics of Embedded Clauses
2000
Cited by 7 (0 self)
Recent work by Siegelmann has shown that the computational power of neural networks matches that of Turing machines. The proofs are based on a fractal encoding of states to simulate the memory and operations of stacks. In the present work, it is shown that similar stack-like dynamics can be learned in recurrent neural networks from simple sequence prediction tasks. Two main types of network solutions are found and described qualitatively as dynamical systems: damped oscillation and entangled spiraling around fixed points. The potential and limitations of each solution type are established in terms of generalization on two different context-free languages. Both solution types constitute novel stack implementations, generally in line with Siegelmann's theoretical work, which supply insights into how embedded structures of languages can be handled in analog hardware. Keywords: Recurrent neural network, dynamical system, stack, learning, context-free language, analysis, computa...
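The fractal stack encoding mentioned in this abstract can be made concrete in a few lines of exact rational arithmetic. The base-4 digit convention below (digit 1 for bit 0, digit 3 for bit 1) is one common choice in Siegelmann-style constructions, not necessarily the exact one used in the paper.

```python
from fractions import Fraction

EMPTY = Fraction(0)  # the empty stack is encoded as 0

def push(x, bit):
    """Push a bit by prepending a base-4 digit (1 or 3), keeping the whole
    stack encoded as a single number in the unit interval."""
    return x / 4 + Fraction(2 * bit + 1, 4)

def top(x):
    """Read the top bit: encodings whose leading digit is 3 lie in [1/2, 1)."""
    return 1 if x >= Fraction(1, 2) else 0

def pop(x):
    """Strip the leading digit, recovering the encoding of the rest."""
    return 4 * x - (2 * top(x) + 1)
```

For example, pushing 1 and then 0 onto the empty stack gives 7/16; popping recovers 3/4 (the stack holding just the bit 1), and popping again recovers the empty stack.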
Evolving context-free language predictors
In GECCO-2000: Proceedings of the Genetic and Evolutionary Computation Conference, 2000
Cited by 6 (2 self)
Recurrent neural networks can represent and process simple context-free languages. However, the difficulty of finding appropriate weights for context-free language prediction with gradient-based learning motivates an investigation of the applicability of evolutionary algorithms. In empirical studies, an evolutionary algorithm proves to be more reliable at finding prediction solutions for a simple CFL. Moreover, the evolutionary algorithm demonstrates greater diversity by making use of a larger repertoire of dynamical behaviours for solving the problem.
On Learning Context-Free and Context-Sensitive Languages
IEEE Transactions on Neural Networks, 2002
Cited by 5 (0 self)
Contrary to recent claims, the Long Short-Term Memory is not the only neural network which learns a context-sensitive language. Both the Simple Recurrent Network and the Sequential Cascaded Network are able to generalize beyond the training data, but by utilizing different dynamics. Differences in performance and dynamics are discussed. Keywords: Recurrent neural network, language, prediction.
The Crystallizing Substochastic Sequential Machine Extractor: CrySSMEx
Neural Computation, 2006
Cited by 5 (1 self)
This article presents an algorithm, CrySSMEx, for extracting minimal finite state machine descriptions of dynamic systems such as recurrent neural networks. Unlike previous algorithms, CrySSMEx is parameter-free and deterministic, and it efficiently generates a series of increasingly refined models. A novel finite stochastic model of dynamic systems and a novel vector quantization function have been developed to take into account the state space dynamics of the system. The experiments show that (a) extraction from systems that can be described by regular grammars is trivial, (b) extraction from high-dimensional systems is feasible, and (c) extraction of approximate models from chaotic systems is possible. The results are promising, but an analysis of shortcomings suggests some possible further improvements. Some largely overlooked connections between the field of rule extraction from recurrent neural networks and other fields are also identified.
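The general idea of quantization-based machine extraction can be sketched as a breadth-first crawl of a black-box system through a fixed state quantizer. This is only a bare-bones illustration: CrySSMEx itself is parameter-free and refines its quantizer adaptively, which this sketch does not attempt.

```python
from collections import deque

def extract_machine(step, x0, alphabet, quantize, max_states=64):
    """Extract a finite transition table from a dynamic system by merging
    all concrete states that fall in the same quantizer cell and exploring
    reachable cells breadth-first."""
    start = quantize(x0)
    reps = {start: x0}            # one concrete representative per cell
    delta = {}                    # (cell, input symbol) -> successor cell
    frontier = deque([start])
    while frontier and len(reps) <= max_states:
        cell = frontier.popleft()
        for sym in alphabet:
            succ_state = step(reps[cell], sym)
            succ = quantize(succ_state)
            delta[(cell, sym)] = succ
            if succ not in reps:
                reps[succ] = succ_state
                frontier.append(succ)
    return delta

# A toy continuous system computing the parity of 1-bits by reflecting its
# state about 1/2 -- a stand-in for an RNN state update, chosen for clarity.
def parity_step(x, bit):
    return 1.0 - x if bit == 1 else x

machine = extract_machine(parity_step, 0.25, (0, 1),
                          lambda x: int(x >= 0.5))
```

On this toy system the extractor recovers the two-state parity machine: cell 0 and cell 1 swap on input 1 and loop on input 0.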