| M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Depts. of Psychology and Computer Science, University of Toronto, Toronto, Jun 1988. |
....one. BPS is essentially classical backpropagation on the multilayer network, with a recursive computation only inside each dynamic neuron. Due to the architectural constraints, this algorithm does not implement BPTT, but is somewhere between BP and RTRL. The same approach was proposed by Mozer in [24], who independently derived a quite similar algorithm named Focused Back Propagation for a locally recurrent network analogous to the Auto Regressive MLP (AR MLP) [25]. BPS was rediscovered in [19], where it was derived for a structure that is a particular case of the output feedback LF MLN and was ....
....convergence. Also the BPS algorithm [13] can be obtained as a particular case of CRBP when the architecture restriction that the dynamic units can only be placed in the first layer is given. The CRBP applied to the AR MLP can also be viewed as a generalization of Mozer's and Leighton and Conrath's work [24,25]. Moreover, if all the synapses contain only the MA part (I nm =0 for each n,m and l) the architecture reduces to FIR MLP and this algorithm reduces to the Temporal Back Propagation as in [7,9,21]. Obviously, if all the synaptic filters have no memory (I nm =0 and =1 for each n,m and l) this ....
M.C. Mozer. A Focused Back-propagation Algorithm for Temporal Pattern Recognition. Tech Rep. CRG-TR-88-3, University of Toronto, 1988 and Complex Systems 3: 349-381, 1989.
....behave randomly. In this case we will assume that these quantities have been visible some time before, otherwise the mapping of the world is not a function any more. Thus, the world model has to be able to store context information through time. By using a recurrent network [Jor86, Elm88, Ghe89, Moz88, Pin87, Pea88, Pea90, RF87, WZ88, TS90] instead of a feed forward network as a model this problem can be overcome. We will focus on recurrent networks of the Elman type [Elm88] (Fig. 2b), which seem to be best suited for our purpose. If we speak of states in turn, we mean externally observable ....
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Depts. of Psychology and Computer Science, University of Toronto, Toronto, Jun 1988.
....the output and error. These updates are computed using a sequence of calculations at each iteration. The weights are updated either after each iteration or after the final iteration of the epoch. This algorithm was proposed for discrete time RNNs by a number of different researchers [Kuhn, 1987; Mozer, 1988; Robinson and Fallside, 1987; Williams and Zipser, 1989]. The continuous time version of recursive backpropagation was first proposed by Pineda [Pineda, 1988]. The major disadvantage of this algorithm is that it requires an extensive amount of computation at each iteration. Another algorithm ....
Mozer, M. (1988). A focused back-propagation algorithm for temporal pattern recognition. Technical Report, Departments of Psychology and Computer Science, University of Toronto, Toronto.
....gradient computation techniques trade space vs. time locality. Interestingly, if the recurrent connections are restricted to self loops for some neurons, both space and time locality can be achieved. Local gradient computation algorithms for such networks were independently derived by Mozer [38] and Gori, Bengio and De Mori [39]. In a local feedback architecture (see Fig. 1), neurons are connected in layers like in feedforward nets, eventually using shortcut links (i.e. skipping one or more layers). Units directly connected to the external inputs may have a self loop link, thus ....
....(17) [Figure 1: Example of Local Feedback multilayered network.] Many variants of the above equation are possible. For example, $y_i(t-1)$ may be replaced by $a_i(t-1)$, obtaining an activation feedback connection that yields a linear dynamic behavior (such an approach was pursued in [38]). Another variant is to introduce multiple delays, feeding back the past output [activation] of the unit at times $t-2$, $t-3$, and so on. This corresponds to a higher than first order nonlinear [linear] autoregressive model for the unit. A more general case is to use an autoregressive ....
M. Mozer, "A focused back-propagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, pp. 349--381, 1989.
....that same unit receives feedback from other examples. When the don't cares line up, the weights to those units will never change. One possible fix, so-called Back Propagation through time (Rumelhart et al. 1986), involves a complete unrolling of a recurrent loop and has had only modest success (Mozer, 1988), probably because of conflicts arising from equivalence constraints between interdependent layers. My fix involves a single backspace, unrolling the loop only once. For a particular string, this leads to the calculation of only one error term for each weight (and thus no conflict) as follows. ....
Mozer, M. (1988). A focused Back-propagation Algorithm for Temporal Pattern Recognition. CRG-Technical Report-88-3: University of Toronto.
....(see Section 3). 2.2 Spatio-temporal networks. Processing a spatio-temporal signal requires a model capable of processing time varying signals. A number of researchers have proposed network models to represent and process such signals (e.g. Elman, 1990; Jordan, 1987; Lapedes and Farber, 1987; Mozer, 1989; Waibel et al. 1989; Watrous and Shastri, 1986). The connectionist model we employed was inspired by the Temporal Flow Model (TFM), which has achieved good results in speech recognition (Watrous, 1990; Watrous, 1991). TFM supports arbitrary link connectivity across layers, admits feedforward as ....
Mozer, M. (1989). A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3:349--381.
....between input symbols and input neurons. In this class of recurrent network architectures, the recurrent connections are restricted to self loops. One advantage of these locally recurrent networks is that they can be trained with gradient descent algorithms which are local in both space and time [1, 17, 23]. [Figure 6: Locally Recurrent Networks: They consist of an input layer, a layer of dynamic neurons with self recurrent connections whose outputs are the inputs to a standard layered, feedforward neural network. The time delays on the feedback connections are shown as squares.] We use the ....
M. Mozer, "A focused backpropagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, 1989.
....approach (Minsky and Papert, 1969; Rumelhart et al. 1986). This will be a subject of further research. Another, different way to deal with temporal patterns will follow the recurrent cascade correlation approach (Fahlman, 1991), providing the cascade units with adaptive recurrent self connections (Mozer, 1989). 4.2 Preprocessing. Any mapping of a feature vector can be interpreted as a preprocessing if it is followed by further operations. The cascade architectures can be regarded as methods to incrementally build a powerful preprocessing device. The preprocessing capability of the single component ....
Mozer, M. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems, 3:349--381.
....network through time, this approach is quite expensive in terms of memory consumption. Another drawback, as we mentioned earlier in Chapter 3, is the fact that tall networks with tied weights between layers, such as would result when trying to learn long sequences, tend to be difficult to train [Mozer, 1988; Pollack, 1990a]. This is a point against the method of back propagation through time in general. Since we assume here that movies may be of arbitrary length, this is potentially a serious problem with this method. Time Delay Neural Networks. Time delay neural networks, as presented in Chapter 3, ....
Michael Mozer, "A Focused Back-Propagation Algorithm for Temporal Pattern Recognition," Technical Report CRG-TR-88-3, Connectionist Research Group, University of Toronto, 1988.
....examples of input and desired output trajectories. One example of such a task is sequence classification, where the input is the sequence to be classified and the desired output is the correct classification, which is to be produced at the end of the sequence, as in some of the work reported by Mozer (1989; chapter , this volume). Another example is sequence production, as studied by Jordan (1986), in which the input is a constant pattern and the corresponding desired output is a time-varying sequence. More generally, both the input and desired output may be time varying, as in the prediction ....
....than fixed, they can form delay line structures when necessary while also being able to create flip flops or other memory structures capable of preserving state over potentially unbounded periods of time. This point has been emphasized in (Williams, 1990) and similar arguments have been made by Mozer (1989; chapter , this volume). There are a number of possible reasons to pursue the development of learning algorithms for recurrent networks, and these may involve a variety of possible constraints on the algorithms one might be willing to consider. For example, one might be interested in ....
[Article contains additional citation context not shown here]
Mozer, M. C. (1989). A focused back-propagation algorithm for temporal pattern recognition.
....strengths is its generality, but a corresponding weakness is its growing memory requirement when given an arbitrarily long training sequence. Other approaches to training recurrent nets to handle time varying input or output have been suggested or investigated by Jordan (1986), Bachrach (1988), Mozer (1988), Elman (1988), Servan-Schreiber, Cleeremans, and McClelland (1988), Robinson and Fallside (1987), Stornetta, Hogg, and Huberman (1987), Gallant and King (1988), and Pearlmutter (1988). Many of these approaches use restricted architectures or are based on more computationally limited approximations ....
....while not suffering from its growing memory requirement in arbitrarily long training sequences. It coincides with an approach suggested in the system identification literature (McBride & Narendra, 1965) for tuning the parameters of general dynamical systems. The work of Bachrach (1988) and Mozer (1988) represents special cases of the algorithm presented here, and Robinson and Fallside (1987) have given an alternative description of the full algorithm as well. However, to the best of our knowledge, none of these investigators has published an account of the behavior of this algorithm in ....
Mozer, M. C. (1988). A focused back-propagation algorithm for temporal pattern recognition (Tech.
....to as locally recurrent neural networks (LRNNs) here, are primarily feedforward networks, with the exception that feedback connections exist between some limited sets or layers of units. Up to now, almost all RNN models proposed for formal language learning are LRNNs, including both first order [3, 1, 4, 7] and second order [5, 2, 8, 12] networks. LRNNs are easier to analyze because of their limited feedback connections and layered structures. These networks have been demonstrated to be capable of solving some nontrivial formal language learning problems. In this paper, our focus is on using ....
M.C. Mozer, "A focused back-propagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, pp. 349--381, 1989.
....time delay neural networks. The NETtalk program [18] is a good example of a connectionist system using a buffer (or temporal window) in order to transform a temporal problem into its spatial representation. [Figure 2.2: A Delay Line network architecture, with inputs x(t), x(t-1), x(t-2), x(t-3), x(t-4).] Mozer [13] and others [19] have exposed the different drawbacks of these conventional methods. The first and obvious drawback is that the buffer must have a sufficient capacity to hold the longest possible input sequence. Similarly, a buffer of t elements may be used to recognize an input pattern of greater ....
....sequence as being the same. From these facts, one can easily notice that spatial transformation of time sequences is not viable. A more flexible representation is needed. Despite the drawbacks stated above, the following properties required by any model of temporal pattern recognition emerge [13]: • Some memory of the input history is required. This memory will retain information about the past inputs that are relevant for the future recognition or prediction. • A function must be specified to combine the current memory and the current input to form a new temporal context (memory) ....
[Article contains additional citation context not shown here]
M.C. Mozer. "A Focused Back-propagation Algorithm for Temporal Pattern Recognition". Complex Systems, vol 3:pp 349--381, 1989.
....an alternative approach is to propagate activity gradient information forward. This leads to a learning algorithm which we have called real time recurrent learning (RTRL). This algorithm has been independently derived in various forms by Robinson and Fallside (1987), Kuhn (1987), Bachrach (1988), Mozer (1988), and Williams and Zipser (1989a), and continuous time versions have been proposed by Gherrity (1989) and by Doya and Yoshizawa (1989). 5.1 The Algorithm. For each $k \in U$, $i \in U$, $j \in U \cup I$, and $t_0 \le t \le t_1$, we define $p^k_{ij}(t) = \partial y_k(t) / \partial w_{ij}$. (28) This quantity measures the sensitivity ....
.... $p^k_{ij}(t) = f'_k(s_k(t)) \left[ \sum_{l \in U_H} w_{kl}\, p^l_{ij}(t-1) + \delta_{ik}\, x_j(t-1) \right]$, (39) which are just the RTRL equations (30) specialized to take into account the fact that $w_{kl}$ is 0 if $l \in U_O$. One noteworthy special case of this type of architecture has been investigated by Mozer (1988). For this architecture, the only connections allowed between units in the hidden stage are self-recurrent connections. In this case, $p^k_{ij}$ is 0 except when $k = i$. This algorithm can then be implemented in an entirely local fashion by regarding each $p^i_{ij}$ value as being stored with $w_{ij}$, ....
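The local special case described in this excerpt can be illustrated with a minimal Python sketch. It assumes a single self-recurrent unit with a tanh squashing function and zero initial state; the names `run_unit`, `w`, `w_s`, and `inputs` are hypothetical and do not come from the cited paper. Each sensitivity is kept as one extra number per weight and updated forward in time, following the pattern of equation (39) restricted to k = i.

```python
import math

def run_unit(inputs, w, w_s):
    """Run one self-recurrent tanh unit and track dy/dw forward in time.

    inputs: list of input vectors, one per time step (hypothetical data).
    w:      input weights; w_s: self-recurrent weight.
    Returns the final output y and the sensitivities dy/dw_j and dy/dw_s.
    Assumption: the vector consumed at a step plays the role of x(t-1)
    in the excerpt's indexing, and the initial state is 0.
    """
    y = 0.0
    p = [0.0] * len(w)   # one stored number per input weight
    p_s = 0.0            # one stored number for the self weight
    for u in inputs:
        s = sum(wj * uj for wj, uj in zip(w, u)) + w_s * y
        y_new = math.tanh(s)
        d = 1.0 - y_new * y_new          # tanh'(s)
        # Local update: only this unit's own previous sensitivity is needed.
        p = [d * (w_s * pj + uj) for pj, uj in zip(p, u)]
        p_s = d * (w_s * p_s + y)        # uses the previous output y(t-1)
        y = y_new
    return y, p, p_s
```

Comparing the returned sensitivities with central finite differences of the final output is a quick way to check the recursion on small examples.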
[Article contains additional citation context not shown here]
Mozer, M. C. (1988). A focused back-propagation algorithm for temporal pattern recognition (Technical Report). Toronto: University of Toronto, Departments of Psychology and Computer Science.
....such as the forward propagation algorithms [14, 23] are much more computationally expensive (for a fully connected recurrent network) but are local in time, i.e. they can be applied in an on line fashion, producing a partial gradient after each time step. Another algorithm was proposed [10, 18] for training constrained recurrent networks in which dynamic neurons with a single feedback to themselves have only incoming connections from the input layer. It is local in time like the forward propagation algorithms and it requires computation only proportional to the number of ....
Mozer M.C. "A focused back-propagation algorithm for temporal pattern recognition", Complex Systems, 3, 1989, pp. 349-381.
....devise efficient methods for computing the gradient. For the case of fully recurrent networks, some very interesting algorithms have been proposed in [19, 23, 30, 32, 33] while more efficient but restrictive algorithms have been devised for the case of networks having self loop connections only [14, 21, 29]. A potential problem, which is likely to affect practical applications, is that the learning process may be seriously plagued by the presence of local minima in the cost function. In general, there is no reason to exclude the presence of stationary points that may also be local minima. Obviously, ....
M. C. Mozer, "A focused backpropagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, pp 349-381, 1989.
....initial plans. The Feed Forward Algorithm for Gradient Search in Action Space. As mentioned above, the environment is modeled by a non recurrent multilayer backpropagation network. This restriction is sufficient for our simulation results; the extension of the algorithm to recurrent networks [3, 4, 5, 9, 13, 14, 15, 21] is straightforward and shown in [18]. The external input of the world model network is a state vector s(t) and an action vector a(t). Both state and action vector are the external input I(t) of the model network; for all non input units this external input is 0. The output of the network ....
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Depts. of Psychology and Computer Science, University of Toronto, Toronto, Jun 1988.
....is frozen, along with all the other weights. Each new hidden unit is in effect a single state variable in a finite state machine that is built specifically for the task at hand. In this use of self recurrent connections only, the RCC model resembles the Focused Back Propagation algorithm of Mozer [Mozer, 1988]. The output, $V(t)$, of each self recurrent unit is computed as follows: $V(t) = \sigma\left( \sum_i I_i(t)\, w_i + V(t-1)\, w_s \right)$, where $\sigma$ is some non linear squashing function applied to the weighted sum of inputs $I$ plus the self weight, $w_s$, times the previous output. In the studies described ....
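As a rough illustration of the self-recurrent unit described in this excerpt, the following Python sketch applies a squashing function to the weighted sum of the inputs plus the self weight times the previous output. The logistic function, the zero initial output `v0`, and all function and parameter names are assumptions made for the example; the excerpt only says "some non linear squashing function".

```python
import math

def logistic(z):
    """Logistic squashing function (an assumed choice, not from the excerpt)."""
    return 1.0 / (1.0 + math.exp(-z))

def self_recurrent_outputs(input_seq, w, w_s, v0=0.0, squash=logistic):
    """Return the output V(t) of one self-recurrent unit at every time step.

    input_seq: list of input vectors I(t); w: input weights;
    w_s: self weight applied to the previous output V(t-1).
    """
    v = v0
    outputs = []
    for inputs in input_seq:
        weighted_sum = sum(x * wi for x, wi in zip(inputs, w))
        v = squash(weighted_sum + w_s * v)
        outputs.append(v)
    return outputs
```

With all weights set to zero the unit ignores both its inputs and its history, so every output is the squashing function evaluated at zero.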
Mozer, M. C. (1988) "A Focused Back-Propagation Algorithm for Temporal Pattern Recognition," Tech Report CRG-TR-88-3, Univ. of Toronto, Dept. of Psychology and Computer Science.
....approach and more traditional symbolic approaches to language processing will be discussed. Network Architecture. Time is an important element in language, and so the question of how to represent serially ordered inputs is crucial. Various proposals have been advanced (for reviews, see Elman, 1990; Mozer, 1988). The approach taken here involves treating the network as a simple dynamical system in which previous states are made available as an additional input (Jordan, 1986). In Jordan's work, the network state at any point in time was a function of the input on the current time step, ....
Mozer, M. (1988). A focused back-propagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Departments of Psychology and Computer Science, University of Toronto.
....to perform the task and it has been included primarily as a control case. 1. Frasconi-Gori-Soda (FGS) locally recurrent networks [16]: A multilayer perceptron augmented with local feedback around each hidden node. The local output version has been used. The FGS network has also been studied by [43]; the network is called FGS in this paper in line with [63]. 2. Narendra and Parthasarathy [44]: A recurrent network with feedback connections from each output node to all hidden nodes. The N&P network architecture has also been studied by Jordan [33, 34]; the network is called N&P in this ....
M.C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3(4):349--381, August 1989.
....gradient computation techniques trade space vs. time locality. Interestingly, if the recurrent connections are restricted to self loops for some neurons, both space and time locality can be achieved. Local gradient computation algorithms for such networks were independently derived by Mozer [39] and Gori, Bengio and De Mori [40]. In a local feedback architecture (see Fig. 1), neurons are connected in layers like in feedforward nets, eventually using shortcut links (i.e. skipping one or more layers). Units directly connected to the external inputs may have a self loop link, thus ....
.... $w_{ii}\, y_i(t-1) + \sum_j w_{ij}\, y_j(t)$, $y_i(t) = f(a_i(t))$ (17) Many variants of the above equation are possible. For example, $y_i(t-1)$ may be replaced by $a_i(t-1)$, obtaining an activation feedback connection that yields a linear dynamic behavior (such an approach was pursued in [39]). Another variant is to introduce multiple delays, feeding back the past output [activation] of the unit at times $t-2$, $t-3$, and so on. This corresponds to a higher than first order nonlinear [linear] autoregressive model for the unit. A more general case is to use an autoregressive ....
M. Mozer, "A focused back-propagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, pp. 349--381, 1989.
.... the error gradient are: (1) the backpropagation through time algorithm (Werbos, 1974; Rumelhart, Hinton, & Williams, 1986; Robinson & Fallside, 1987), extended to continuous time by Pearlmutter (1989), and (2) the real time recurrent learning algorithm (Robinson & Fallside, 1987; Bachrach, 1988; Mozer, 1988; Williams & Zipser, 1989a). The backpropagation through time algorithm can be derived from the more familiar backpropagation algorithm for multilayer networks by unfolding an arbitrary recurrent network into a multilayer feedforward network that grows by one layer on each time step. It is not ....
....Servan-Schreiber, McClelland, 1989) but which can be learned by using the full gradient computation (Smith & Zipser, 1989). The real time recurrent learning algorithm can also be restricted to certain specialized settings where its disadvantages disappear. In particular, Bachrach (1988) and Mozer (1988) have noted that for a single unit it reduces to an entirely local computation involving the update and storage of one additional number per weight. Mozer has applied this idea to develop an entirely local learning algorithm for certain restricted architectures in which the only recurrent ....
Mozer, M. C. (1988). A focused back-propagation algorithm for temporal pattern recognition (Technical Report). University of Toronto, Departments of Psychology and Computer Science.
....as changing or adverse weather conditions or periods of sudden lighting change as may be encountered when entering a tunnel [Pomerleau, 1995]. 6.3 Relations to Other Recurrent Neural Networks. The use of the feedback connections proposed is related to other recurrent neural networks [Jordan, 1989][Mozer, 1989]. The largest difference between these and the architecture used here is in the problems which these architectures are designed to address. Most of the recurrent networks which have been explored in the literature have attempted to address the problem of sequence recognition and reproduction. The ....
Mozer, M.C. (1989) "A Focused Back-Propagation Algorithm for Temporal Pattern Recognition". Complex Systems 3, 349-381.
....complete gradient in fully recurrent networks. The real time recurrent learning (RTRL) algorithm [6, 7, 8] is local in time and produces a partial gradient after each time step, thus allowing on line weights updating. Another algorithm was proposed for training local feedback recurrent networks [9, 10]. It is also local in time, but requires computation only proportional to the number of weights, like back propagation through time. Local feedback recurrent networks are suitable for implementing short term memories but they have limited representational power for dealing with general sequences ....
M. Mozer, "A focused back-propagation algorithm for temporal pattern recognition," Complex Systems, vol. 3, pp. 349--381, 1989.
....sizes Weight decay does not work. It could only slow down the growth of cell states indirectly by decreasing the overall activity in the network. We tested several weight decay algorithms (Hinton, 1986; Weigend et al. 1991) without any encouraging results. Variants of focused backpropagation (Mozer, 1989) also do not work well. These let the internal state decay via a self connection whose weight is smaller than 1. But there is no principled way of designing appropriate decay constants: A potential gain for some tasks is paid for by a loss of ability to deal with arbitrary, unknown causal delays ....
Mozer, M. C. (1989). A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3:349--381.
....of the error has to be stored which increases the required storage linearly to the sequence length. This generally undesired effect was avoided by a modified mathematical approach, which propagated specific auxiliary gradients together with the activations through time, i.e. purely forward [32, 48, 26, 34]. Although these networks do not show an inversion of the propagation direction any more, the term backpropagation sometimes was retained. In [32, 48] a feed forward gradient descent algorithm for fully recurrent connectionist networks is ....
....to mention that online learning is not an exact gradient descent procedure, but a heuristic which often increases the convergence speed. 5 Teacher Forcing. In combination with online learning, the technique of teacher forcing is often used [26, 48, 28, 13]. Imagine a recurrent network working through time. One may argue that a desired subsequent state cannot be correctly computed if the current state already shows an error, since, in this case, even correctly adapted parameters will produce a wrong state. This happens because the last state ....
[Article contains additional citation context not shown here]
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Depts. of Psychology and Computer Science, University of Toronto, Toronto, Jun 1988.
....and could also discriminate those instances out from the entire music. 1. Introduction. In the domain of artificial neural models, recurrence is probably the most studied approach for analysing time series. We can find in the literature several such models, varying from supervised models [26, 22, 11] to unsupervised ones [9, 14]. Our model is a recurrent one. It is based on Kohonen's self organizing map (SOM) [15]. It employs two SOMs endowed with time integrator mechanisms and placed hierarchically one over the other. This paper reports the results obtained by applying our model to the ....
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3:349--381, 1989.
....connections include feedback loops, then this goal is achieved naturally. The state of the network will be some function of the current inputs plus the network's prior state. Various algorithms and architectures have been developed which exploit this insight (e.g. Elman, 1990; Jordan, 1986; Mozer, 1989; Pearlmutter, 1989; Rumelhart, Hinton, & Williams, 1986). Figure 2 shows one architecture, the Simple Recurrent Network, which was used for the studies to be reported here. [Figure 2] In the SRN architecture, at time t hidden units receive external input, and also collateral input ....
Mozer, M.C. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems, 3, 349-381.
....a minor third . Connectionist learning algorithms offer the potential of overcoming the various limitations of transition table approaches and Kohonen musical grammars. Connectionist algorithms are able to discover relevant structure and statistical regularities in sequences (e.g. Elman, 1990; Mozer, 1989). Indeed, connectionist algorithms can be viewed as an extension of the transition table approach, a point also noted by Dolson (1989) Just as the transition table approach uses a training set to calculate the probability of the next note in a sequence as a function of the previous notes, so does ....
.... network is trained on sequences in which one event predicts another, the relationship is not hard to learn if the two events are separated by only a few unrelated intervening events, but as the number of intervening events grows, a point is quickly reached where the relationship cannot be learned (Mozer, 1989, 1992, 1993; Schmidhuber, 1992). Bengio, Frasconi, and Simard (1993) present theoretical arguments for inherent limitations of learning in recurrent networks. This poses a serious limitation on the use of back propagation to induce musical structure in a note-by-note prediction paradigm because ....
[Article contains additional citation context not shown here]
Mozer, M. C. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems, 3, 349-381.
....the forms of memories discussed in the next three sections. Exponential trace memory. An exponential trace memory is formed using the kernel function $c_i(t) = (1 - \mu_i)\, \mu_i^t$, where $\mu_i$ lies in the interval $[-1, 1]$ (Figure 3b). This form of memory has been studied by Jordan (1987), Mozer (1989), and Stornetta, Hogg, and Huberman (1988). Unlike the delay line memory, the exponential trace memory does not sharply drop off at a fixed point in time; rather, the strength of an input decays exponentially. This means that more recent inputs will always have greater strength than more ....
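The exponential trace memory in this excerpt can be sketched in Python under the assumption that the kernel has the form c(t) = (1 - mu) * mu**t, where `mu` is a hypothetical name for the decay rate. The sketch computes the memory twice: once by weighting the whole input history with the kernel, and once with the equivalent one-step recursion that needs only the previous memory value. Function names and the zero initial memory are illustrative assumptions.

```python
def trace_by_kernel(xs, mu):
    """Weight the full input history by c(t) = (1 - mu) * mu**t.

    xs[tau] is the input at time tau; the result is the memory at the
    last time step, so an input that is `age` steps old gets weight c(age).
    """
    t_now = len(xs) - 1
    return sum((1.0 - mu) * mu ** (t_now - tau) * x
               for tau, x in enumerate(xs))

def trace_recursive(xs, mu):
    """Same memory, updated one step at a time from a zero initial state."""
    m = 0.0
    for x in xs:
        m = (1.0 - mu) * x + mu * m
    return m
```

The recursive form is what makes this kind of memory cheap: no input history has to be stored, only one running value per unit.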
.... be implemented in a recurrent neural net architecture in which the x i and x 0 correspond to activities in two layers of hidden units (Figure 4a). Note that this architecture is quite similar to a fairly common recurrent neural net sequence processing architecture (Figure 4b; e.g. Elman, 1990; Mozer, 1989), which I'll refer to as the standard architecture. Third, for autopredictive tasks in which the target output, p( is a one step prediction of the input, i.e. p( x( 1) [Figure 4: network diagrams with input x(t), hidden layers 1 to 3, and output y(t)] ....
[Article contains additional citation context not shown here]
Mozer, M. C. (1989). A focused back-propagation algorithm for temporal pattern recognition.
No context found.
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Technical Report CRG-TR-88-3, Depts. of Psychology and Computer Science, University of Toronto, Toronto, Jun 1988.
No context found.
M. C. Mozer, A Focused Back-Propagation Algorithm for Temporal Pattern Recognition. Technical Report CRG-TR-88-3, 1988
No context found.
Mozer M.C. (1989). A Focused Backpropagation Algorithm for Temporal Pattern Recognition. Complex Systems 3, 349-381.
No context found.
Michael C. Mozer. A focused back-propagation algorithm for temporal pattern recognition, Technical Report, Univ. of Toronto, Depts. of Psychology and Computer Science, CRG-TR-88-3, 1988.
No context found.
M. C. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3:349--381, 1989.
No context found.
Mozer, M.C. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems, 3, 349--381.