S. E. Fahlman. The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, Carnegie Mellon, 1991.
....Even with these settings, it took a lot of time to train networks, even though CC was designed to speed up learning. The results of experiments with the cascade correlation algorithm are shown in figure 4.11. T1rcc is an experiment with a recurrent version of cascade correlation (RCC, see [12]). It is only mentioned for completeness: more experiments with RCC have been done, and the results were consistently worse than those of experiments with standard CC. [Figure 4.12: Results using three repeats.] Note the large variation in performance: apart from the near 100 performance on ....
Fahlman, S.E. (1991) 'The recurrent cascade-correlation architecture', Technical Report CMU-CS-91-100, School of Computer Science, Carnegie Mellon University.
....and repeated until the error is negligible. Cascade correlation learning is characterized by fast learning because only one layer is trained at a time. Another advantage is that it eliminates the need to determine the size of the network in advance, especially the number of hidden units. [71] extended cascade correlation to the classification of data sequences by using the Elman model [72], and good results were obtained. [17] investigated the possibility of learning acyclic graphs with the cascade correlation learning algorithm. Our experiments have been based on the learning algorithm ....
S. Fahlman, "The recurrent cascade-correlation architecture," Tech. Rep. CMU-CS-91-100, Carnegie Mellon, 1991.
....length from only one time series. The network in Fig. 2(b) implements the function $f_a$ with $f_a(x) = 1 \iff a$ occurs in $x$. For inputs of length $k$ it is $VC(\{f_a \mid a \in \mathbb{R}\}) \geq k$. This shows that the VC dimension of perceptron Elman networks with 4 hidden neurons and recurrent cascade correlation networks [4] with 3 hidden neurons is infinite (Fig. 2(c)). Further, any class has infinite VC dimension if it can at least tell whether one of an infinite number of single entries occurred in a time series. In the perceptron case a restriction of the number of different possible entries to a finite number r ....
S. E. Fahlman. The recurrent cascade-correlation architecture. In Advances in Neural Information Processing Systems, volume 3, 1991.
....learn classification tasks that could not be solved in a reasonable amount of time by LRAAM based networks. Another improvement in this direction has been the development of the cascade correlation network for structure [48], [19] (a generalization of recurrent cascade correlation for sequences [50]), which obtained better results with respect to the LRAAM based networks on a subset of the classification problems reported in [48]. One advantage of the cascade correlation network for structure over backpropagation through structure networks is that the necessary number of hidden units is ....
S. E. Fahlman, "The recurrent cascade-correlation architecture," in Advances in Neural Information Processing Systems, R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Eds. San Mateo, CA: Morgan Kaufmann, vol. 3, 1991, pp. 190--196.
.... regular string grammars it is more natural to use recurrent networks (which are essentially trainable deterministic finite automata). Various architectures have been used: simple first order recurrent networks (Elman 1990, Jordan 1988); more complex first order networks (Williams and Zipser 1989, Fahlman 1991); and second order recurrent networks (Giles et al. 1992). Elman (1992) has also applied recurrent networks to context free grammars and found that they can represent up to about three levels of recursive embedding; other authors (Kwasny and Faisal 1990, Das et al. 1993, Zeng et al. 1994) deal ....
Fahlman, S.E., 1991, The Recurrent Cascade-Correlation Architecture. CMU-CS-91-100, School of Computer Science, Carnegie Mellon University.
....satisfied with the network's performance, it can help to train a so-called candidate unit and then insert it into the network as a hidden unit. As mentioned, this requires the residual errors of the output unit, so these must be computed and stored. Recurrent Cascade Correlation is described in [Fahlman 91]. The change relative to ordinary Cascade Correlation is that the hidden units have local feedback, see figure 2.2; if w_s is set to zero, standard CC is obtained. Let x(t) be a vector containing both the external inputs and the outputs of earlier hidden units at time t, and let I ....
Fahlman, S. E.: The Recurrent Cascade-Correlation Architecture. In NIPS 3, Eds. D. S. Touretzky et al., 190--196. San Mateo, Morgan Kaufmann (1991).
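The local feedback described in the context above can be illustrated with a minimal sketch of a single hidden unit that has one self-recurrent weight w_s. This is an illustration under stated assumptions, not Fahlman's implementation: the function name `rcc_unit_outputs`, the tanh activation, the bias term, and the zero initial state are all hypothetical choices made for this example. Only the role of w_s (setting it to zero removes the dependence on the past, as in standard CC) comes from the quoted text.

```python
import math


def rcc_unit_outputs(inputs, weights, w_s, bias=0.0):
    """Forward pass of one hidden unit with a self-recurrent weight w_s.

    inputs  -- list of input vectors x(t), one per time step
    weights -- one weight per component of x(t)
    w_s     -- weight on the unit's own previous output (local feedback)

    Assumptions for illustration only: tanh activation, previous output
    initialised to 0.0 before the first time step.
    """
    v_prev = 0.0  # assumed initial state of the unit
    outputs = []
    for x in inputs:
        # Weighted sum of the current inputs plus the unit's own last output.
        net = bias + sum(w * xi for w, xi in zip(weights, x)) + w_s * v_prev
        v_prev = math.tanh(net)
        outputs.append(v_prev)
    return outputs


# The same input vector presented at two consecutive time steps.
sequence = [[1.0, 0.5], [1.0, 0.5]]
no_feedback = rcc_unit_outputs(sequence, [0.3, -0.2], w_s=0.0)
with_feedback = rcc_unit_outputs(sequence, [0.3, -0.2], w_s=0.8)
```

With w_s = 0 the unit responds identically to identical inputs regardless of the time step, whereas with a nonzero w_s the second output also depends on the first.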
.... time slicing paradigm two different neural network architectures were tried: the Time Sliced Recurrent Recognizer (TSRR) [2] using an Elman style recurrent network, and the Time Sliced Recurrent Cascade Correlation Network (TS RCCN) [3] using the Recurrent Cascade Correlation Architecture (RCC) [4]. The RCC is a powerful feature learner, and by manipulating its original structure some improvements were gained in its ability to generalize and recognize phonemes. This new structure was called the Parallel RCC [5]. This paper presents the latest recognition results obtained with the ....
.... the time slicing paradigm two different neural network architectures were tried: the Time Sliced Recurrent Recognizer (TSRR) [2] using an Elman style recurrent network, and the Time Sliced Recurrent Cascade Correlation Network (TSRCCN) [3] using the Recurrent Cascade Correlation Architecture (RCC) [4]. The RCC is a powerful feature learner, and by manipulating its original structure some improvements were gained in its ability to generalize and recognize phonemes. This new structure was called the Parallel RCC [5]. This paper presents the latest recognition results obtained with the Parallel RCC ....
[Article contains additional citation context not shown here]
S. Fahlman, "The Recurrent Cascade-Correlation Architecture", Technical Report # CMU-CS-91-100, Carnegie Mellon University, Pittsburgh, U.S.A., May 1991.
....the entire phoneme at one time while training and running the network, as is the case of the existing approaches. The TSRR employs Elman's recurrent network with error backpropagation [Rumelhart et al. 86], and the TS RCCN uses the Recurrent Cascade Correlation Network Architecture (RCC) [Fahlman, 91] with promising results. Along with the Time Slicing Paradigm, this work also introduces the concept of Natural Connectionist Glue. The Connectionist Glue is a concept developed by Waibel et al. for the training of large modular neural networks. The connectionist glue is an extra group of ....
.... reason why this approach didn't work satisfactorily is that the learning algorithm, Fahlman's Quickprop [Fahlman, 88], is not fully compatible with Elman's recurrent architecture [Fahlman, 94]. The second architecture, which uses the Recurrent Cascade Correlation Network Architecture (RCC) [Fahlman, 91], was adopted in order to reduce the necessary number of hidden units, and to tailor the size of the network according to the task. However, the principle remained the same in both implementations. Their purpose is to obtain an immediate hypothesis of the speech input without having to ....
[Article contains additional citation context not shown here]
S. Fahlman, "The Recurrent Cascade-Correlation Architecture", Technical Report # CMU-CS-91-100, Carnegie Mellon University, Pittsburgh, U.S.A., May 1991.
....for that matter remains an open problem. 3.5 Recurrent Cascade Correlation Networks. The recurrent cascade neural network architecture was designed to automatically generate a network architecture during learning which is large enough to represent a set of spatio-temporal training patterns [5]. This approach avoids the need to define a fixed network architecture prior to training. Recurrent cascade neural networks also have only local feedback connections, but unlike the architectures discussed in the previous section, are not layered in the strict sense of Definition 1.2 (see Figure ....
S. Fahlman, "The recurrent cascade-correlation architecture," in Advances in Neural Information Processing Systems 3 (R. Lippmann, J. Moody, and D. Touretzky, eds.), (San Mateo, CA), pp. 190--196, Morgan Kaufmann Publishers, 1991.
....a structure closely related to recurrent networks built by the Unfolding of Time approach (Minsky and Papert, 1969; Rumelhart et al. 1986). This will be a subject of further research. Another, different way to deal with temporal patterns will follow the recurrent cascade correlation approach (Fahlman, 1991), providing the cascade units with adaptive recurrent self connections (Mozer, 1989). 4.2 Preprocessing. Any mapping of a feature vector can be interpreted as a preprocessing if it is followed by further operations. The cascade architectures can be regarded as methods to incrementally build a ....
Fahlman, S. E. (1991). The recurrent cascade-correlation architecture. In Lippmann, R. P., Moody, J. E., and Touretzky, D. S., editors, Advances in Neural Information Processing Systems 3, pages 190--196.
.... j 0.05) and using no momentum term (i.e. α = 0). We are also investigating other methods of training the networks, such as Fahlman's quickprop (Fahlman, 1988); however, our early results tend to confirm Fahlman's report of the instability of quickprop and related methods on recurrent networks (Fahlman, 1991). The actual simulation programs for training and running XERIC networks are coded in the C language, and run on SUN SPARCstation 1 workstations. On a SPARCstation, XERIC networks train on the 1000-sentence corpora at about 1.5 hours per epoch. This relatively slow rate of speed is because of the ....
Fahlman, S. E. (1991). The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
....a fairly large network and dynamically remove unimportant connections or units [21, 14, 3], whereas constructive or growth methods start from a small network and dynamically grow the network. Since the latter usually require fewer computations, extensive research has been carried out in this area [1, 4, 8, 18, 17, 22, 12, 16, 15, 7, 13]. Another class of dynamic multilayer perceptrons is block feedback neural networks [27, 26] that can be learned incrementally. A constructive method that we study in detail in this paper is Fahlman and Lebiere's cascade-correlation learning algorithm (CAS) [9, 10]. This algorithm starts from a ....
S. E. Fahlman. The recurrent cascade-correlation architecture. In R. Lippmann, J. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190--196. Morgan Kaufmann, San Mateo, CA, 1991.
....problem, the network is further grown, adding new hidden units which are connected (with frozen weights) with all the inputs and previously installed hidden units. The resulting network is a cascade of nodes. Fahlman extended the algorithm to classification of sequences, obtaining good results [Fah91]. In the following, we show that Cascade Correlation can further be extended to structures by using complex recursive neurons. For the sake of simplicity, we will discuss the case of acyclic graphs, leaving to the reader the extension to cyclic graphs (see Section 4.1, cyclic graphs). The output ....
S. E. Fahlman. The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, Carnegie Mellon, 1991.
....The new residual error is calculated for training the next base model until a certain stopping criterion is met. Through this algorithm, training a large complicated model is reduced to training an individual base model sequentially. As an example, to train a two-layer feedforward neural network [36], a neuron is treated as a base model. Since incremental algorithms reduce the task of training an entire neural network into training individual neurons sequentially, they have been shown to be one or two orders of magnitude faster than gradient descent based algorithms [36], [23], [46]. 4.3.3 EM ....
....neural network [36], a neuron is treated as a base model. Since incremental algorithms reduce the task of training an entire neural network into training individual neurons sequentially, they have been shown to be one or two orders of magnitude faster than gradient descent based algorithms [36], [23], [46]. 4.3.3 EM Algorithms. As the previous two algorithms were developed based on heuristics, fast training algorithms based on EM (expectation maximization) algorithms were formulated through a better established theoretical framework [58], [71], [69], [70]. The EM algorithm originated from ....
[Article contains additional citation context not shown here]
S.E. Fahlman. The recurrent cascade-correlation architecture. In NIPS, volume 3, 1991.
....indicated it should, suggesting there is still a bug in my code. I suspect from my results and from the Kadirkamanathan and Niranjan paper that the algorithm is sensitive to the threshold used to control when new units are added. 2.2 Recurrent Networks. 1. Recurrent Cascade Correlation (RCC) [7] (see figure 2.1): Similar to the cascade correlation algorithm, but with the output of each hidden unit fed back as an input to itself. 2. Simple Recurrent Networks (SRN) [4] (see figure 2.2): Like backpropagation, but with the outputs of the hidden layer fed back as inputs to that layer. 3. ....
Fahlman, Scott E., "The Recurrent Cascade-Correlation Architecture", in Advances in Neural Information Processing Systems 3, Morgan Kaufmann (1991).
....N 1 cannot solve the problem, new hidden units are added which are connected (with frozen weights) with all the inputs and previously installed hidden units. The resulting network is a cascade of nodes. Fahlman extended the algorithm to the classification of sequences, obtaining good results [8]. In the following, we show that the Cascade Correlation can be further extended to structures by using generalized recursive neurons. For the sake of simplicity, we will discuss the case of acyclic graphs, leaving to the reader the extension to cyclic graphs (see Section 5.1, cyclic graphs). The ....
S. E. Fahlman. The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, Carnegie Mellon, 1991.
....eventually converging to a program for f [6, 10]. A class of functions S is Ex identifiable just in case there is a machine that Ex identifies each member of S. Here is a particularly simple example of a self referential coding trick. Let SD = f com 1 In other empirical work (for example, [42, 43, 41, 11, 12, 13, 16, 40, 39]) one pre-trains on a succession of prior tasks to achieve success on a current task. Mastery of previous tasks provides useful context for the next. One sees similar attempts in animal training by shaping desired behavior through a succession of approximations [21, 15], e.g. to teach a dog to ....
S. Fahlman. The recurrent cascade-correlation architecture. In R. Lippmann, J. Moody, and D. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190--196. Morgan Kaufmann Publishers, Inc., San Mateo, California, 1991.
....within a neural network, though the network is of fixed size and not capable of indefinite extension. The Recurrent Cascade Correlation (RCC) algorithm, however, is not so limited. This algorithm also learns sequences by adding new units, and it has been shown capable of incremental learning (Fahlman, 1991). It does not build explicit behavior hierarchies, however, and it cannot be used for reinforcement learning because of the intricacies of its training algorithm. Despite the fact that it was designed for tasks that required hierarchy learning, Method II has been tested on some traditional ....
....reinforcement learning was not used; R(t) was constant for all t. Instead, the task was supervised. That is, the targets were given by a teacher at every time step. Traditional recurrent networks, including RCC, have been tested on the same task by other researchers (Cleeremans et al. 1989; Fahlman, 1991). The Method II algorithm learned the task approximately 100 times faster than the best results reported by the other researchers. That is, the other networks needed to see 100 times more training examples before learning the task. The number of new units in the network after learning completed ....
Fahlman, S. E. (1991). The recurrent cascade-correlation architecture. In Lippmann, R. P., Moody, J. E., and Touretzky, D. S., editors, Advances in Neural Information Processing Systems 3, pages 190--196. San Mateo, California: Morgan Kaufmann Publishers.
....of choosing architectures tailored to the task to be solved. The design criteria based on the RNA can be seen as an attempt to tune the network to the task. Other remarkable attempts are based on prior knowledge (e.g. [15]), on the use of hints (e.g. [17]), and on dynamic network generation (e.g. [11]). Finally, it is worth mentioning that some failures reported in finding optimal solutions may not be related to the presence of local minima, but to very flat plateaus that may lead to numerical errors. For recurrent networks, this problem is even more serious than for feedforward networks, ....
S. E. Fahlman, "The Recurrent Cascade-Correlation Architecture," Technical Report CMU-CS-91-100, May 1991.
No context found.
S. E. Fahlman. The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, Carnegie Mellon, 1991.
No context found.
S. Fahlman. The recurrent cascade-correlation architecture. In R. Lippmann, J. Moody, and D. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190--196. Morgan Kaufmann, San Mateo, CA, 1991.
No context found.
S.E. Fahlman. The recurrent cascade-correlation architecture. In R.P. Lippmann, J.E. Moody, and D.S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190--196, San Mateo, CA, 1991. Morgan Kaufmann Publishers.
No context found.
S.E. Fahlman. The recurrent cascade-correlation architecture. In R.P. Lippmann, J.E. Moody, and D.S. Touretzky, editors, Advances in Neural Information Processing Systems, volume 3, pages 190--196, San Mateo, Denver, 1991. Morgan Kaufmann Publishers, Inc.
No context found.
Fahlman, S.E. (1991). The Recurrent Cascade-Correlation Architecture. CMU-CS-91-100, School of Computer Science, Carnegie Mellon University, Pittsburgh.
No context found.
Fahlman, S.E. (1991). The recurrent cascade-correlation architecture. Technical Report CMU-CS-91-100, School of Computer Science, Carnegie Mellon University.