| Giles, C. L., Horne, B. G. & Lin, T. (1995), `Learning a class of large finite state machines with a recurrent neural network', Neural Networks 8(9), 1359--1365. |
....machine learning, syntactic pattern recognition, neural networks, computational learning theory, natural language processing, and related areas. The interested reader is referred to [34], [44], [54], and [70] for surveys of grammar inference in general and to [3], [5], [11], [16], [18], [17], [29], [37], [39], [55], [60], [61], [64], [66], [82], [84], [87], [93], [102], and [104] for recent results on grammar inference using neural networks. The neural architecture for syntax analysis that is proposed in this paper does not appear to lend itself to use in grammar inference using ....
C. L. Giles, B. G. Horne, and T. Lin, "Learning a class of large finite state machines with a recurrent neural network," Neural Networks, vol. 8, no. 9, pp. 1359--1365, 1995.
.... 1992]. There has been considerable work on extending the computational capabilities of recurrent neural network models by providing some form of external memory in the form of a tape [Williams and Zipser, 1989] or a stack [Berg, 1992; Fanty, 1986; Pollack, 1990; Das, Giles, and Sun, 1993; Giles, Horne and Lin, 1995; Hester, 1994; Miikkulainen, 1995; Schulenberg, 1992; Siegelman, 1991; Sun et al. 1993; Zeng et al. 1994]. To the best of our knowledge, to date, most of the research on neural architectures for syntax analysis with the exception of [Siegelman, 1991; Pollack, 1987] which explore ....
.... to date, most of the research on neural architectures for syntax analysis with the exception of [Siegelman, 1991; Pollack, 1987] which explore connectionist Turing machine models which simulate a stack using binary representations of a fractional number; and [Chen and Honavar, 1994; Omlin and Giles, 1995] which focus on neural network realizations of finite state automata (with the latter emphasizing fault tolerance aspects) focus on the investigation of neural networks that are designed to learn to parse particular classes of syntactic structures (e.g. strings from deterministic ....
Giles, C. L., Horne, B. G., and Lin, T. Learning a Class of Large Finite State Machines With a Recurrent Neural Network. UMIACS TR 94-94. To appear in Neural Networks, 1995.
.... input facilitates the extraction of rules from the trained networks. Furthermore, it can be argued that the instantiation of rules from predictors not only gives the user .... [Footnote 1: For example, x(t), x(t-1), x(t-2), ..., x(t-N+1) form the inputs for a delay embedding of the previous N values of a series [62, 52].] [Footnote 2: Rules can also be extracted from feedforward networks [25, 41, 56, 4, 49, 29] and other types of recurrent neural networks [20]; however, the recurrent network approach and deterministic finite state automata extraction seem particularly suitable for a time series problem.]
C. Lee Giles, B.G. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Neural Networks, 8(9):1359--1365, 1995.
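The delay embedding mentioned in the context above feeds a predictor the N most recent values of a series. A minimal Python sketch of that windowing, assuming a plain list as the series (the function name `delay_embedding` is illustrative and not taken from the citing paper):

```python
def delay_embedding(series, n):
    """Return one window [x(t), x(t-1), ..., x(t-n+1)] for every t with a full history."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Start at t = n - 1 so that every window has n values available.
    return [[series[t - k] for k in range(n)] for t in range(n - 1, len(series))]


# Each window lists the newest value first, matching x(t), x(t-1), ...
windows = delay_embedding([1.0, 2.0, 3.0, 4.0, 5.0], 3)
```

Each window could then serve as one input vector to a network; how the citing paper actually presents the windows to its model is not shown in the excerpt.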
....noisy time series prediction problem considered is the prediction of foreign exchange rates. A brief overview of foreign exchange rates is presented in the next section. [Footnote 2: Rules can also be extracted from feedforward networks [25, 41, 56, 4, 49, 29] and other types of recurrent neural networks [20]; however, the recurrent network approach and deterministic finite state automata extraction seem particularly suitable for a time series problem.] 1.2 Foreign Exchange Rates. The foreign exchange market as of 1997 is the world's largest market, with more than US$1.3 trillion changing hands ....
C. Lee Giles, B.G. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Neural Networks, 8(9):1359--1365, 1995.
.... Introduction. In the neural network language induction literature, induction of finite state automata is commonly thought of as the domain of recurrent network architectures [Cleeremans et al. 1989], [Elman, 1991], [Pollack, 1991], [Giles et al. 1992], [Watrous and Kuhn, 1992]. However, recent work [Giles et al. 1994] has shown that a restricted class of recurrent nets can learn a subclass of finite state automata called finite memory automata. In this paper, we show that feedforward only architectures can represent and learn a class of automata, DeBruijn automata. These automata accept strings of arbitrary ....
Giles, C. L., Horne, B. G., and Lin, T. (1994). Learning a class of large finite state machines with a recurrent neural network. Technical Report UMIACS-TR-94-94 and CS-TR-3328, Institute for Advanced Computer Studies, University of Maryland.
....state. A proof is presented in the appendix. II. Learning a large DMM. An interesting demonstration of this understanding of TDNNs and DMMs is to learn a DMM with many states using a small subset of the possible training examples. This section is similar in spirit to Giles, Horne, and Lin [10], in which the authors show that a discrete time recurrent neural network is capable of learning a large finite memory machine (FMM), a larger subset of the FSMs. The machine learned here is a DMM of order 11. It is defined by the function given in equation 2 which maps from recent inputs (or state ....
C. Lee Giles, Bill G. Horne, and T. Lin, "Learning a class of large finite state machines with a recurrent neural network," Neural Networks, vol. 8, no. 9, pp. 1359--1365, 1995.
....systems. The algorithm is currently only practical for relatively small grammars (Pereira & Schabes 1992) stacks. The grammars learned were not large. Our task differs from these in that the grammar is considerably more complex. Recently large regular grammars have been learned by RNNs (Giles, Horne & Lin 1995, Clouse, Giles, Horne & Cottrell 1994). However, these grammars have unusual properties when implemented as sequential machines in the sense that they have little logic. It has been shown that RNNs have the representational power required for hierarchical solutions (Elman 1991) and that they are ....
....end of the language spectrum when compared to English and we expect a model trained on the English data to perform poorly on the Japanese data. Indeed, all models attain 50% or less correct classification on average for the Japanese data. More details can be found in Sect. 9 and (Lawrence, Giles & Fong 1995). [Footnote 3: Following classical GB theory, these classes are synthesized from the theta grids of individual predicates via the Canonical Structural Realization (CSR) mechanism of Pesetsky (Pesetsky 1982).] [Footnote 4: For an output range of 0 to 1 and softmax outputs.] 4 Nearest Neighbors. In the ....
Giles, C. L., Horne, B. & Lin, T. (1995), `Learning a class of large finite state machines with a recurrent neural network', Neural Networks 8(9), 1359--1365.
....order of hundreds of states. However, it has recently been reported that certain subclasses of DFAs with on the order of hundreds and even thousands of states can be learned rather easily with restricted feedback recurrent neural networks [9, 20]. [Figure 3: 2048 state FSM learned by a neural network.] A. Finite Memory Machines. DFAs with the property that their present state can always be uniquely determined from the knowledge of the last p inputs and the last q outputs for all possible sequences of length max(p, q) are called finite memory machines (FMMs) [33]. The size of the largest ....
.... the property that their present state can always be uniquely determined from the knowledge of the last p inputs and the last q outputs for all possible sequences of length max(p, q) are called finite memory machines (FMMs) [33]. The size of the largest learned machine had approximately 2000 states [20]. The characteristic property of large DMMs that can be learned easily with recurrent networks is that they can be implemented in sequential machines with little logic. B. Definite Memory Machines. FMMs with input order p > 0 and output order q = 0 are called definite memory machines (DMMs) ....
C. Giles, B. Horne, and T. Lin, "Learning a class of large finite state machines with a recurrent neural network," Neural Networks, 1995. In press.
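The excerpts above describe definite memory machines as FMMs with output order q = 0, so the present state follows from the last p inputs alone. A minimal Python sketch of that idea, assuming a binary alphabet and an all-zero initial history; the names `run_dmm` and `example_logic` and the XOR output function are hypothetical illustrations, not the machines studied in the cited work:

```python
from collections import deque


def run_dmm(inputs, p, logic):
    """Emit one output per step from a definite memory machine of order p."""
    # Assumption: the input history starts as all zeros.
    window = deque([0] * p, maxlen=p)
    outputs = []
    for bit in inputs:
        window.appendleft(bit)  # window[0] is the newest input; the oldest drops off
        outputs.append(logic(tuple(window)))
    return outputs


def example_logic(window):
    """Hypothetical output function: XOR of the newest and oldest stored inputs."""
    return window[0] ^ window[-1]


outputs = run_dmm([1, 0, 1, 1, 0], p=3, logic=example_logic)
```

Because the state is just the tuple of the last p inputs, a binary machine of order p has at most 2^p states, and a simple output function such as the one above corresponds to the "little logic" property the excerpts mention.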
....networks are trained using weight decay [31]. All experiments were trained using Back Propagation Through Time (BPTT) [68]. 4.1 Grammatical Inference: Learning a 512-State Finite Memory Machine. NARX networks have been shown to be able to simulate and learn a class of finite state machines [8, 21], called respectively definite and finite memory machines. When being trained on strings which are encoded as temporal sequences, NARX networks are able to learn rather large (hundreds to thousands of states) machines provided that they have enough memory and the logic implementation is not too ....
....had enough degrees of freedom to learn the large machine within a reasonable amount of time. The networks were trained with the Back Propagation Through Time (BPTT) algorithm at a learning rate of 0.1 and weight decay of 0.001. The training time was set to 5000 epochs. For more details, see [20, 21]. For each of 50 experiments, the weights were randomly initialized within the range of [-0.5, 0.5]. The average training time was approximately 600 epochs. After training, the trained networks were tested on the remaining strings of the complete set. A zero error rate showed that the ....
C.L. Giles, B.G. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Neural Networks, 8(9):1359--1365, 1995.
....in Figure 2. The depth, d, of the machine is 9. The training set was 300 strings randomly chosen from the complete set. The complete set, which consists of all strings of length from 1 to d + 1 (10 in this case), is shown to be able to sufficiently identify a finite memory machine with depth d [20]. The strings were encoded such that input values of 0s and 1s and target output labels negative and positive corresponded to floating point values of 0.0 and 1.0 respectively. Initially, before pruning, the NARX networks were chosen to have 4 hidden nodes, 10 input taps, and 10 output taps. ....
....had enough degrees of freedom to learn the large machine within a reasonable amount of time. The networks were trained with the Back Propagation Through Time (BPTT) algorithm at a learning rate of 0.1 and weight decay of 0.001. The training time was set to 5000 epochs. For more details, see [20, 21]. For each of 50 experiments, the weights were randomly initialized within the range of [-0.5, 0.5]. The average training time was approximately 600 epochs. After training, the trained networks were tested on the remaining strings of the complete set. A zero error rate showed that the ....
C. L. Giles, B. G. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Technical Report UMIACS--TR--94--94 and CS--TR--3328, Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, 1994.
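The data setup described in the two excerpts above (a complete set of binary strings, 300 of them drawn at random for training, the rest held out for testing) can be sketched in Python as follows. The helper names, the fixed seed, and the use of `random.Random` are assumptions made for illustration; the excerpts do not say how the sample was drawn:

```python
import itertools
import random


def complete_set(depth):
    """All binary strings of length 1 through depth + 1, as tuples of 0s and 1s."""
    strings = []
    for length in range(1, depth + 2):
        strings.extend(itertools.product((0, 1), repeat=length))
    return strings


def split(strings, n_train, seed=0):
    """Randomly pick n_train strings for training; the remainder is the test set."""
    rng = random.Random(seed)  # assumption: fixed seed, only for reproducibility
    shuffled = list(strings)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def encode(string):
    """Map input symbols 0 and 1 to the floating point values 0.0 and 1.0."""
    return [float(symbol) for symbol in string]


all_strings = complete_set(9)          # depth d = 9, so lengths 1 to 10
train, test = split(all_strings, 300)  # 300 training strings, as in the excerpt
```

For d = 9 the complete set has 2 + 4 + ... + 1024 = 2046 strings, so this split leaves 1746 strings for testing.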
....extremely large. Thus, it may be prohibitively difficult to search with gradient descent learning algorithms. So far, experience indicates that it is difficult to learn even small FSMs from example strings in either of these types of networks (unless the FSM has little logic in its implementation [10]). Often, a solution is found that classifies the training set perfectly, but the network in fact learns a chaotic system which cannot necessarily be equated with any finite state machine [19]. We also showed some related results that NARX networks with neurons with hard limiting nonlinearities ....
C.L. Giles, B.G. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Technical Report UMIACS--TR--94--94 and CS--TR--3328, Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, 1994.
....extremely large. Thus, it may be prohibitively difficult to search with gradient descent learning algorithms. So far, experience indicates that it is difficult to learn even small FSMs from example strings in either of these types of networks (unless the FSM has little logic in its implementation [25]). Often, a solution is found that classifies the training set perfectly, but the network in fact learns a chaotic system which cannot necessarily be equated with any finite state machine [26]. We also showed some related results that NARX networks with neurons with hard limiting nonlinearities ....
C. Giles, B. Horne, and T. Lin, "Learning a class of large finite state machines with a recurrent neural network," Tech. Rep. UMIACS--TR--94--94 and CS--TR--3328, Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, 1994.
No context found.
Giles, C. L., Horne, B. G. & Lin, T. (1995), `Learning a class of large finite state machines with a recurrent neural network', Neural Networks 8(9), 1359--1365.
No context found.
C. Giles, B. Horne, and T. Lin. Learning a class of large finite state machines with a recurrent neural network. Neural Networks, 8(9):1359--1365, 1995.
No context found.
C.L. Giles, B.G. Horne, and T. Lin, "Learning a Class of Large Finite State Machines with a Recurrent Neural Network," Neural Networks, vol. 8 no. 9, 1995, pp. 1359--1365.