20 citations found.
C. W. Omlin and C. L. Giles. Pruning recurrent neural networks for improved generalization performance. Technical Report No. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, April 1993.

This paper is cited in the following contexts:
Meta-Learning Evolutionary Artificial Neural Networks - Abraham (2003)   (Correct)

....layers and interconnection between them. Several methods have been proposed to automatically construct ANNs for reduction in network complexity, that is, to determine the appropriate number of hidden units, layers, etc. Topological optimization algorithms such as Extentron [9], Upstart [35], Pruning [63], [75] and Cascade Correlation [29] have their own limitations. The interest in evolutionary search procedures for designing ANN architecture has been growing in recent years as they can evolve towards the optimal architecture without outside interference, thus eliminating the tedious trial and ....

.... which add complexity to the network starting from a very simple architecture until the entire network is able to learn the task [35], [56], [59]. Destructive algorithms start with large architectures and remove nodes and interconnections until the ANN is no longer able to perform its task [63], [75]. Then the last removal is undone. Figure 16 demonstrates how a typical neural network architecture could be directly encoded and how the genotype is represented. For an optimal network, the required node transfer function (Gaussian, sigmoidal, etc.) can be formulated as a global search ....

Omlin C W and Giles C L (1993), Pruning Recurrent neural networks for improved generalization performance, Tech. Report No 93-6, CS Department, Rensselaer Institute, Troy, NY.


Machine Translation using Neural Networks and.. - Castano, Casacuberta.. (1997)   (Correct)

....to those obtained for FSMs trained using 5,000 paired sentences. However, the NNs learned were bigger than the FSMs inferred. In addition, NNs required large amounts of training time (days and even weeks) in contrast with the low learning time (minutes) required by FSMs. Destructive methods [Omlin & Giles, 1993] and a more compact (distributed) representation of the input and output alphabets should be explored to attempt to decrease the size of the NNs and, consequently, the learning time. Word categorization for both the input and output languages [Vilar et al., 1995] and injection of a priori ....

C.W. Omlin and C.L. Giles: "Pruning Recurrent Neural Networks for Improved Generalization Performance". Technical Report no. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, Troy, N.Y. (1993)


An Algorithm for the Addition of Time-Delayed.. - Bone, Crucianu, de.. (2000)   (Correct)

....to these new connections. By systematically adding FIR connections, each encompassing a whole range of delays, one obtains oversized networks which are slow to train and have poor generalization abilities. Various regularization techniques are then employed in order to improve generalization [9] [10], and this further increases the computational cost.

2. A constructive algorithm for time delayed connections

We opted here for the alternative, constructive approach: start with an RNN having no time delayed connections and progressively add a few such connections. We choose the location and the ....

Giles, C.L. and C.W. Omlin, Pruning Recurrent Neural Networks for Improved Generalization Performance, IEEE Transactions on Neural Networks, 1994, 5(5): 848-851.


A Neuro-Symbolic Hybrid Intelligent Architecture with Applications - Ghosh, Taha (1999)   (Correct)

....the final connectionist architecture, with the updated weights and links, can be viewed as a revised domain theory. It can be used to update the initial expert system with new learned concepts. Moreover, it can be converted back, if needed, to a rule based format to achieve the power of explanation [40, 12, 2]. Furthermore, one can use an integrated decision maker to combine the decisions taken by the updated expert system and the trained connectionist architecture and provide the combined decisions to the user. The rest of this chapter describes in detail the different modules of HIA. In Section 2 ....

Giles, C. and Omlin, C. (1994). Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851.


Combined Biological Metaphors - Boers, Sprinkhuizen-Kuyper (2001)   (1 citation)  (Correct)

....task. Considering the small size of the given examples and the fact that it is already difficult to design a suitable network architecture to solve them, it should be obvious that it is almost impossible to design network architectures for large problems by hand. That is why several authors (e.g. [14, 15, 16, 27, 29, 30, 32, 33]) have designed learning methods that automatically generate an architecture as part of the learning algorithm. For small problems, these methods give good results, but for large problems they have their limitations. We will describe these later in this chapter. Very little theoretical knowledge ....

....generalization, by decreasing the number of free variables of the network. Nodes or edges are removed until the network is no longer able to perform its task. Then the last removal is undone (e.g. [27, 32, 33]). The two described classes of algorithms can perform very well on small problems. But they are not capable of finding network architectures for difficult tasks that need specific modular architectures in order to be able to learn those tasks. For both classes of algorithms we will explain why. ....

C. W. Omlin and C. L. Giles. Pruning recurrent neural networks for improved generalization performance. Revised Technical Report 93-6, Computer Science Department, Rensselaer Polytechnic Institute, Troy, N.Y., 1993.


Neural Networks Classifying Symbolic Data - Hammer (2000)   (Correct)

....performs this procedure on several different splittings of the training set in order to reduce the variance [18]. Afterwards, the optimum parameters of this optimum architecture are determined via BPTS again. Additional regularization steps may be added, e.g. pruning methods or weight decay [6, 17]. Some theoretical properties are to be fulfilled such that this training method can succeed in principle. They will be discussed in the fourth section in more detail. Hence any function from trees into a real vector space can be learned via this algorithm. Compared to standard feed forward ....

C. L. Giles and C. W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851, 1994.


Local Structure Optimization in Evolutionary Generated Neural.. - Borst (1994)   (2 citations)  (Correct)

....so until, i.e. in most cases, the pruned net is no longer able to classify the training data correctly (usually the pruned net is briefly retrained). Then the previous net is taken to be the optimal, given the starting network. Destructive algorithms produce networks that generalize reasonably well [Omlin93], but this leaves us with the problem that one should construct a starting network that is too large. Furthermore, since the initial network will be large, a lot of training time will be necessary to train the initial network and retrain the intermediate ....

....a predefined place, see the first paragraph of 5.1). What about destructive algorithms? Some of these methods try to force small weights to zero (for example [Nguyen93]). If a hidden unit does not receive any input signals or does not give any output signals it is removed completely. Omlin and Giles [Omlin93] presented a pruning algorithm that prunes the hidden units with the smallest input vector. Well, if small incoming weights indicate that a unit of a module is not that important, maybe large incoming weights to a module do indicate that the module can not cope with all the work, that means: the ....

C.W. Omlin and C.L. Giles; Pruning recurrent Neural Networks for Improved Generalization Performance. Revised Technical Report No. 93-6, April 1993, Computer Science Department, Rensselaer Polytechnic Institute, Troy, N.Y., 1993.


A Hybrid Intelligent Architecture and Its Application to Water.. - Taha, Ghosh (1995)   (1 citation)  (Correct)

....final connectionist architecture, with the updated weights and links, can be viewed as a revised domain theory. It can be used to update the initial expert system with new learned concepts. Moreover, it can be converted back, if needed, to a rule based format to achieve the power of explanation [23, 31, 11, 1]. Furthermore, one can use an integrated decision maker to combine the decisions taken by the updated expert system and the trained connectionist architecture and provide the combined decisions to the user. The Hybrid Intelligent Architecture was implemented to control the water reservoirs of the ....

C.L. Giles and C.W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851, 1994.


Evolving Artificial Neural Networks using the "Baldwin.. - Boers, Borst.. (1995)   (1 citation)  (Correct)

..... destructive algorithms, which start with large architectures and remove complexity, usually to improve generalization, by decreasing the number of free variables of the network. Nodes or edges are removed until the network is no longer able to perform its task. Then the last removal is undone [6, 20, 21]. Destructive algorithms leave us with the problem of finding an initial architecture. Existing constructive algorithms produce architectures that, with respect to their shape, are problem independent. Only the size of the produced architecture varies. Since the architecture of a network greatly ....

C.W. Omlin and C.L. Giles; Pruning recurrent neural networks for improved generalization performance. Revised Technical Report No. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, Troy, N.Y., 1993.
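The destructive scheme quoted in this context (remove nodes or edges until the network can no longer perform its task, then undo the last removal) can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the procedure of any cited paper: `destructive_search`, `remove`, and `performs_task` are hypothetical names, and the toy "network" is just a tuple of hidden-unit ids.

```python
def destructive_search(network, candidates, remove, performs_task):
    """Remove candidates one at a time; keep the last network that still works.

    `remove` and `performs_task` are caller-supplied (hypothetical) callbacks:
    remove(network, c) returns a copy without component c, and
    performs_task(network) reports whether that copy still solves the task
    (for example, after a brief retraining).
    """
    current = network
    for component in candidates:
        trial = remove(current, component)
        if not performs_task(trial):
            # Undo the last removal by discarding `trial` and keeping `current`.
            break
        current = trial
    return current


# Toy stand-in: the "network" is a tuple of hidden-unit ids, and the task is
# assumed to need at least three hidden units.
units = (0, 1, 2, 3, 4, 5)
pruned = destructive_search(
    units,
    candidates=[5, 4, 3, 2, 1, 0],
    remove=lambda net, u: tuple(x for x in net if x != u),
    performs_task=lambda net: len(net) >= 3,
)
print(pruned)
```

Because the loop only ever shrinks the network, the result depends on the starting architecture and on the removal order, which is the limitation the quoted context goes on to raise.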


An Evolutionary Algorithm that Constructs Recurrent.. - Angeline, Saunders.. (1994)   (81 citations)  (Correct)

....topology of nodes and links. Current methods to solve this task fall into two broad categories. Constructive algorithms initially assume a simple network and add nodes and links as warranted [2-8] while destructive methods start with a large network and prune off superfluous components [9-12]. Though these algorithms address the problem of topology acquisition, they do so in a highly constrained manner. Because they monotonically modify network structure, constructive and destructive methods limit the traversal of the available architectures in that once an architecture has been ....

C. W. Omlin and C. L. Giles, "Pruning recurrent neural networks for improved generalization performance," Technical Report No. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, April 1993.


Synaptic Noise in Dynamically-driven Recurrent Neural Networks.. - Kam Jim (1994)   (4 citations)  Self-citation (Giles)   (Correct)

.... second derivative information to remove unimportant weights from the network [3]. Weight decay was shown to improve generalization on feed forward networks by suppressing irrelevant components of the weight vector [13]. Still another network simplification method is pruning, which has demonstrated [7] improvement in generalization in recurrent neural networks.

1.1 Previous Work on Training with Noise

Previous research has investigated the effects of noise on feedforward neural networks. Training with noise can lead to more realistic biological models. For example, in [8] noise temperature in ....

C.L. Giles and C.W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 1994. Accepted for Publication.


Using Recurrent Neural Networks to Learn the Structure of.. - Goudreau, Giles (1995)   (1 citation)  Self-citation (Giles)   (Correct)

....30) It should be pointed out that the limited success of these approaches is due to the learning algorithms. Generally, the RNNs have rich representational capabilities. However, recent work has shown that certain types of large SSMs, with thousands of states, are learnable (Clouse et al. 1994; Giles et al. 1994). Furthermore, the performance of the RNNs can sometimes be improved by using hints if partial information about the structure of the SSM is known (Giles & Omlin, 1993). Other approaches that use neural networks for grammatical inference exist that will not be used in this paper. For example, ....

....of the weaknesses that is common to many neural network approaches: often it is not clear what size neural network would be best. One approach is to start with as many switch neurons as reasonably possible; if training is successful, then reduce the number of neurons using a destructive heuristic (Giles & Omlin, 1994). Assume that the values $y'^{T}_{i}$ for $1 \le i \le K$ constitute the binary vector that represents the desired output vector. Training occurs for this table entry if, for any $i$ ($1 \le i \le K$), we have $\left| y^{T}_{i} - y'^{T}_{i} \right| > \epsilon$, where $\epsilon = 0.2$ for our simulations. If it is determined that ....

[Article contains additional citation context not shown here]

Giles, C. & Omlin, C. (1994). Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5), 848-851.
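The training criterion quoted in this entry's second context compares each of the K output components with its desired binary value and triggers training when any component is off by more than a tolerance of 0.2. A minimal sketch of that check follows; the function name `needs_training` is hypothetical, and the strict inequality is an assumption, since the comparison sign did not survive in the extracted text.

```python
def needs_training(actual, desired, eps=0.2):
    """Return True if any output component misses its target by more than eps.

    `actual` and `desired` are equal-length sequences of K output values.
    The strict '>' is an assumption; the quoted text lost the comparison sign.
    """
    if len(actual) != len(desired):
        raise ValueError("actual and desired must have the same length K")
    return any(abs(y - t) > eps for y, t in zip(actual, desired))


# Outputs within 0.2 of the binary targets need no further training.
print(needs_training([0.9, 0.1], [1, 0]))
# One component off by about 0.3 triggers training for this table entry.
print(needs_training([0.7, 0.1], [1, 0]))
```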


Dynamic Adaptation of Recurrent Neural Network Architectures.. - Omlin, Giles   Self-citation (Giles Omlin)   (Correct)

....(or approximation) of the Hessian matrix $\partial^2 E / \partial w^2$, which is computationally very expensive. A simple heuristic for pruning recurrent neural networks has been shown very effective at improving both a trained network's generalization performance as well as the quality of the extracted rules [21]: After successful training, the state neuron $S_i$ with the smallest incoming weight vector (i.e. $\sum_{j,k} W_{ijk}$) is removed and the network is retrained using the same training set. This process is repeated until either a network with satisfactory generalization performance is obtained or until the ....

C. Giles and C. Omlin, "Pruning recurrent neural networks for improved generalization performance," IEEE Transactions on Neural Networks, vol. 5, no. 5, pp. 848-851, 1994.
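The selection step of the heuristic quoted in this entry (remove the state neuron with the smallest incoming weight vector) can be illustrated with a small sketch. Several things here are assumptions rather than details from the source: the weights are stored as a nested list `W[i][j][k]` (target state neuron i, source state neuron j, input symbol k), the size of the incoming weight vector is measured as a sum of absolute values, and the function names are hypothetical. Retraining and the stopping rule are left out because the quoted text is truncated.

```python
def incoming_magnitude(W, i):
    """Sum of absolute incoming weights of state neuron i (assumed measure)."""
    return sum(abs(w) for per_source in W[i] for w in per_source)


def weakest_state_neuron(W):
    """Index of the state neuron with the smallest incoming weight magnitude."""
    return min(range(len(W)), key=lambda i: incoming_magnitude(W, i))


def prune_state_neuron(W, i):
    """Return a copy of W without neuron i.

    Both the neuron's own incoming weights (W[i]) and every weight that reads
    its state as a source (W[n][i] for all remaining n) are dropped.
    """
    return [
        [list(per_source) for j, per_source in enumerate(neuron) if j != i]
        for n, neuron in enumerate(W)
        if n != i
    ]


# Three state neurons, two input symbols; neuron 1 has the weakest inputs.
W = [
    [[1.0, -2.0], [0.5, 0.5], [1.0, 1.0]],
    [[0.1, -0.1], [0.2, 0.0], [0.1, 0.1]],
    [[2.0, 2.0], [-1.0, 1.0], [0.0, 3.0]],
]
victim = weakest_state_neuron(W)
W_pruned = prune_state_neuron(W, victim)
print(victim, W_pruned)
```

In the procedure described above, the pruned network would then be retrained on the same training set before the next neuron is considered.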


A Delay Damage Model Selection Algorithm for NARX Neural.. - Lin, Giles, Horne, Kung   (1 citation)  Self-citation (Giles)   (Correct)

....those memory orders with small sensitivity measure after training. After pruning, the network is retrained. Of course, this procedure can be iterated. This method should be contrasted to other recurrent neural network pruning procedures where recurrent nodes are pruned based on output values [23] and where second order methods are used to prune input taps and single order feedback taps for fully recurrent neural networks [50]. The sensitivity measure of each memory order is calculated by estimating the second order derivative of the error function with respect to each memory order. Le Cun ....

C.L. Giles and C.W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851, 1994.


Using Recurrent Neural Networks to Learn the Structure of.. - Goudreau, Giles (1995)   (1 citation)  Self-citation (Giles)   (Correct)

....capabilities. However, recent work has shown that certain types of large SSMs, with thousands of states, are learnable (Giles & Horne, 1994). Furthermore, the performance of the RNNs can sometimes be improved by using hints if partial information about the structure of the SSM is known (Giles & Omlin, 1993). Other approaches that use neural networks for grammatical inference exist that will not be used in this paper. For example, the use of update graphs has been proposed by Rivest and Schapire (Rivest & Schapire, 1987a; Rivest & Schapire, 1987b; Schapire, 1988). An update graph is an alternate ....

....of the weaknesses that is common to many neural network approaches: often it is not clear what size neural network would be best. One approach is to start with as many switch neurons as reasonably possible; if training is successful, then reduce the number of neurons using a destructive heuristic (Omlin & Giles, 1993). The SLRNN will work in the following manner. The input to the SLRNN will be an initial state vector and a series of input vectors. The initial state vector and the first input vector will be applied to the SLRNN at time step one, the second input vector will be applied at time step two, and so ....

Omlin, C. W. & Giles, C. L. (1993). Pruning recurrent neural networks for improved generalization performance. Technical Report TR 93-6, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY.


Extraction of Rules from Discrete-Time Recurrent Neural Networks - Omlin, Giles (1996)   (36 citations)  Self-citation (Giles Omlin)   (Correct)

....over indefinite periods of time and no feature extraction is necessary for learning. Thus, grammatical inference is a good test bed in which to investigate issues related to the representation of symbolic knowledge in recurrent networks and to explore their computational capabilities. For example, Giles and Omlin (1994) show how grammar learning can be useful in quantifying the computational power of a pruning algorithm of recurrent neural networks. Jim et al. (1995) show that training noisy recurrent networks with grammatical strings is a good measure of the effect of different types of noise insertion ....

....The performance of recurrent networks on long temporal sequences is naturally explored by using grammatical strings (Manolios & Fanelli 1994). Frasconi et al. (1995) used grammar learning and representation to explore a new recurrent network architecture. Furthermore, recent results by Giles et al. (1994) and Clouse et al. (1994) imply that neural networks under certain constraints can learn extremely large grammars and may be competitive with other methods. The extraction of (symbolic) rules from both feed forward as well as recurrent neural networks has become an active area of research ....

Giles, C.L. & Omlin, C.W. (1994). Pruning Recurrent Neural Networks for Improved Generalization Performance. IEEE Transactions on Neural Networks. 5(5). 848-851.


An Analysis of Noise in Recurrent Neural Networks.. - Jim, Giles, Horne (1996)   (4 citations)  Self-citation (Giles)   (Correct)

....least two approaches: pruning and regularization. An example of pruning is optimal brain damage, which can improve generalization ability and speed of learning by using second-derivative information to remove unimportant weights from the network (Cun, Denker, and Solla [8]). Also, Giles and Omlin [12] have demonstrated improvement in generalization of recurrent neural networks after pruning. Other pruning methods can be found in Reed [26]. Weight decay is an example of a regularization method, and was shown by Krogh and Hertz [21] to improve generalization on feed forward networks by ....

C.L. Giles and C.W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851, 1994.


An Analysis of Noise in Recurrent Neural Networks.. - Jim, Giles, Horne (1996)   (4 citations)  Self-citation (Giles)   (Correct)

....least two approaches: pruning and regularization. An example of pruning is optimal brain damage, which can improve generalization ability and speed of learning by using second-derivative information to remove unimportant weights from the network (Cun, Denker, and Solla [12]). Also, Giles and Omlin [16] have demonstrated improvement in generalization of recurrent neural networks after pruning. Other pruning methods can be found in Reed [30]. Weight decay is an example of a regularization method, and was shown by Krogh and Hertz [25] to improve generalization on feed forward networks by ....

C.L. Giles and C.W. Omlin. Pruning recurrent neural networks for improved generalization performance. IEEE Transactions on Neural Networks, 5(5):848-851, 1994.


An Evolutionary Algorithm that Constructs Recurrent Neural.. - Angeline, al. (1993)   (81 citations)  (Correct)

No context found.

C. W. Omlin and C. L. Giles. Pruning recurrent neural networks for improved generalization performance. Technical Report No. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, April 1993.


Apprentissage Dans Les Réseaux Récurrents Pour La Modélisation.. - Szilas (1995)   (Correct)

No context found.

Christian W. Omlin & C. Lee Giles. Pruning Recurrent Neural Networks for Improved Generalization Performance. Technical Report No. 93-6, Computer Science Department, Rensselaer Polytechnic Institute, Troy, N.Y., April 1993.

CiteSeer.IST - Copyright Penn State and NEC