P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the IEEE/INNS International Joint Conference on Neural Networks, volume 1, pages 401--5, New York, NY, 1989. IEEE Press.


This paper is cited in the following contexts:
Information Theoretic Approaches to Neural Network Learning - Plumbley (1997)   (Correct)

....cation, each would find the same principal component of the input data. Some mechanism must be used to force the outputs to learn something different from each other. One possibility is to use a lateral inhibition network between the output neurons, which forces their outputs to be decorrelated [9]. An alternative is to modify the weight decay term of the Oja algorithm: this approach is used by Williams' Symmetric Error Correction (SEC) algorithm [25], Oja and Karhunen's M-output PCA algorithm [14], and Sanger's Generalised Hebbian Algorithm (GHA) [20]. In fact, these algorithms have much ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, IJCNN-89, pages 401-405, New York, June 1989. IEEE Press.


Information Theory and Neural Networks - Taylor, Plumbley (1993)   (3 citations)  (Correct)

....cation, each would find the same principal component of the input data. Some mechanism must be used to force the outputs to learn something different from each other. One possibility is to use a lateral inhibition network between the output neurons, which forces their outputs to be decorrelated [16]. An alternative is to modify the weight decay term of the Oja algorithm: this approach is used by Williams' Symmetric Error Correction (SEC) algorithm [48], Oja and Karhunen's M-output PCA algorithm [24], and Sanger's Generalised Hebbian Algorithm (GHA) [39]. In fact, these algorithms have much ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, IJCNN-89, pages 401-405, Washington DC, 18-22 June 1989.


Natural Gradient Learning for Spatio-temporal.. - Choi, Amari, Cichocki (2000)   (Correct)

.... $y_n(k)]^T$ such that the correlation matrix of $y(k)$ becomes the identity matrix, i.e. $R_{yy} = E\{y(k)\,y^T(k)\} = I$. Various adaptive algorithms for spatial decorrelation have been developed. They include principal component analysis (PCA) neural networks [1], [2], the linear anti-Hebbian rule [3], the Almeida-Silva algorithm [4]. Recently local and global adaptive algorithms for spatial decorrelation was analyzed in [5]. The linear anti-Hebbian rule [3] might be one of well known adaptive spatial decorrelation algorithms. In contrast to the Hebbian rule, the anti-Hebbian rule is known to ....

....decorrelation have been developed. They include principal component analysis (PCA) neural networks [1], [2], the linear anti-Hebbian rule [3], the Almeida-Silva algorithm [4]. Recently local and global adaptive algorithms for spatial decorrelation was analyzed in [5]. The linear anti-Hebbian rule [3] might be one of well known adaptive spatial decorrelation algorithms. In contrast to the Hebbian rule, the anti-Hebbian rule is known to be a energy minimizer [6]. The minimization of output energy leads to uncorrelated output variables. In [3] a linear feedback network was considered for spa ....

[Article contains additional citation context not shown here]

P. Földiák, "Adaptive network for optimal linear feature extraction," in Proc. Int. Joint Conf. Neural Networks, 1989, pp. 401-405.


An Unsupervised Hybrid Network for Blind Separation of.. - Choi, Cichocki (1999)   (Correct)

....linear prediction for the overdetermined case (m n). C. The Unsupervised Hybrid Network. The task of spatio-temporal decorrelation and blind source separation is performed by an hybrid network as shown in Figure 1. For spatio-temporal decorrelation, we extend the anti-Hebbian learning rule [24] that was shown to be efficient in spatial decorrelation task. The first stage of the network is learned by spatio-temporal anti-Hebbian rule (25) and produce the output y(k) whose components are spatio-temporally uncorrelated. The output y(k) is fed into the second stage which is implemented by a ....

....uncorrelated. The output y(k) is fed into the second stage which is implemented by a linear memoryless feedforward network and is transformed into z(k) whose components are statistically independent. III. Spatio-temporal Decorrelation. We first briefly review the anti-Hebbian rule [24] that was shown to decorrelate two associated signals without instability problem if the learning rate is small enough. Then we extend the anti-Hebbian rule to spatio-temporal domain and derive the spatio-temporal anti-Hebbian rule from an information theoretic viewpoint. A. Anti-Hebbian Rule ....

[Article contains additional citation context not shown here]

P. Földiák, "Adaptive network for optimal linear feature extraction," in International Joint Conference on Neural Networks, 1989, pp. 401--405.


A Hebbian/anti-Hebbian Network Which Optimizes Information.. - Plumbley (1993)   (2 citations)  (Correct)

....W = dW/dt, and $xx^T$ with $\Sigma_x = E(xx^T)$. Let us consider first the combined network with direct recurrent lateral connections (Fig. 4(a)). In this combined network we now have $y = Wx - Vy$, or $y = (I + V)^{-1} W x$. (9) This is similar to the PCA network introduced by Földiák [5], except that his network had no self-inhibitory connections, so the diagonal entries are fixed at zero. He suggested a combination of the Oja [9] update rule $\Delta W = \eta_W \left( y x^T - \alpha\,\mathrm{diag}(y y^T)\, W \right)$ (10) and the anti-Hebbian decorrelating rule (6). This combination extracts the ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proc. IJCNN-89, pages 401--405, Washington D.C., 18-22 June 1989.
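
The two forms of equation (9) in the excerpt above can be checked numerically. The following is a minimal sketch, not code from the cited paper: the weight values, the restriction to two output units, and the helper names `matvec`, `settle`, and `closed_form_2x2` are all assumptions made for illustration. It iterates the recurrent form y = Wx - Vy until the outputs settle, then compares the result with a direct solve of (I + V) y = Wx.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def settle(W, V, x, steps=200):
    """Iterate y <- W x - V y from y = 0 (hypothetical relaxation scheme)."""
    drive = matvec(W, x)
    y = [0.0] * len(drive)
    for _ in range(steps):
        lateral = matvec(V, y)
        y = [d - l for d, l in zip(drive, lateral)]
    return y


def closed_form_2x2(W, V, x):
    """Solve (I + V) y = W x directly for exactly two output units."""
    drive = matvec(W, x)
    a, b = 1.0 + V[0][0], V[0][1]
    c, d = V[1][0], 1.0 + V[1][1]
    det = a * d - b * c
    return [(d * drive[0] - b * drive[1]) / det,
            (a * drive[1] - c * drive[0]) / det]


# Illustrative values only: three inputs, two outputs, zero-diagonal V.
W = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
V = [[0.0, 0.3], [0.3, 0.0]]
x = [1.0, 2.0, 3.0]

y_iter = settle(W, V, x)
y_direct = closed_form_2x2(W, V, x)
print(y_iter, y_direct)
```

The iteration only settles when the lateral weights are small enough (in this example each step shrinks the error by a factor of 0.3), whereas the direct solve works whenever I + V is invertible.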


Information Theory and Neural Network Learning Algorithms - Plumbley (1992)   (2 citations)  (Correct)

....to deal with in general, it is possible to analyse this in the case of spatially invariant statistics [8, 23] Atick and Redlich [24] tested this against real contrast sensitivity curves, and found a remarkable match. It may be that networks which learn to find decorrelated principal components [25, 18] may be an approximation to this optimum, if their outputs were normalized to have the same variance. 5. Conclusions Information theory can give a very useful insight into both supervised and unsupervised learning approaches. In particular, it is a very useful aid when dealing with unsupervised ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, IJCNN-89, pages 401--405, Washington D.C., 18-22 June 1989.


Recurrent Neural Networks For Blind Separation of Sources - Amari, Cichocki, Yang (1995)   (10 citations)  (Correct)

....(1991) [9] and Matsuoka et al. (1995) [14]. However, all these researchers did not use self-inhibitory connections in their models. This is also the case in the novelty filter described by Kohonen (1984) [11] and decorrelating network proposed by Barlow and Földiák (1989) [7] and Földiák (1989) [6]. In contrast to these models, our network is fully connected with self-inhibitory connections. We shall show that these self-loops play an essential role in improving the performance of the network in separating sources. For the model (1) we have developed the following on-line learning ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proc. IEEE/INNS Int. Joint Conf. Neural Net., volume 1, pages 401--405, 1989.


A Review of Dimension Reduction Techniques - Carreira-Perpiñán (1997)   (Correct)

....[2] showed that this network finds a basis of the subspace spanned by the first h PCs, not necessarily coincident with them 30 ; see [11, 15] for applications. • Networks based in Oja's rule [77] with some kind of decorrelating device (e.g. Kung and Diamantaras' APEX [72], Földiák's network [32], Sanger's Generalised Hebbian Algorithm [88] .... Figure 9 (panels: linear autoassociator; APEX network): Two examples of neural networks able to perform a principal component analysis of its training set: left, a linear autoassociator, trained by ....

P. Földiák, Adaptive network for optimal linear feature extraction, in Proc. Int. J. Conf. on Neural Networks, vol. I, 1989, pp. 401--405.


Optimal Linear Compression Under Unreliable.. - Diamantaras, Hornik, ..   (Correct)

....representation, Hebbian learning. I. Introduction. The last 20 years have seen a great surge of research interest in the area of learning models for optimal linear data compression and feature extraction, relating to classical statistical methods such as Principal Component Analysis (PCA) [16], [6], [19], [1], [3]. These rules can roughly be categorized into the ones which are derived from a Hebbian learning principle and those implemented on special types of multilayer perceptrons. The Hebbian rules are usually realized by single layer networks with perhaps lateral weights among the output ....

P. Földiák, "Adaptive Network for Optimal Linear Feature Extraction", in Proc. Int. Joint Conf. Neural Networks, vol. 1, pp. 401-406, Washington DC, 1989.


Bayesian Unsupervised Learning of Higher Order Structure - Lewicki (1996)   (17 citations)  (Correct)

....the next layer. A High Order Lines Problem. The first example illustrates that the algorithm can discover the underlying features in complicated patterns and that the higher layers can capture interesting higher order structure. The first dataset is a variant of the lines problem proposed by Földiák (1989). The patterns in the dataset are composed of horizontal and vertical lines as illustrated in figure 1. Note that, although .... Figure 1 (panels A and B; panel A values 0.6, 0.1, 0.1, 0.1, 0.1): Dataset for the high order lines problem. (A) Patterns are generated by selecting one of the pattern types according to the probabilities ....

Földiák, P. (1989). Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, volume I, pages 401--405, Washington, D. C.


Adaptive Perceptual Pattern Recognition by Self-Organizing.. - Marshall (1995)   (5 citations)  (Correct)

....such inefficient structural states. A parsimonious method (Marshall, 1989ab, 1990acdef, 1991, 1992ab) achieves the desired result, using only strictly local self-organization processes. The technique involves imposing an anti-Hebbian (Amari & Takeuchi, 1978; Carlson, 1990; Easton & Gordon, 1984; Földiák, 1989, 1990, 1992; Kohonen, 1984; Marshall, 1989ab, 1990acdef, 1991, 1992ab; Nigrin, 1990abc, 1992, 1993; Rubner & Schulten, 1990; Soodak, 1991; Wilson, 1988) inhibitory learning rule, to govern changes in the weights of the lateral inhibitory connections, in addition to the excitatory learning rule. ....

Földiák, P. (1989). Adaptive network for optimal linear feature extraction. Proceedings of the International Joint Conference on Neural Networks, Washington, DC, June 1989, I., 401--405.


Learning in Linear Neural Networks: a Survey - Baldi, Hornik (1995)   (21 citations)  (Correct)

....extract the first principal component. However, an additional mechanism which introduces competition or some hierarchical order between the output units, for instance via some lateral inhibition mechanism (see Figure 5), might force the network to perform full PCA. For example, as in Földiák [44], we can consider a linear architecture where the outputs are updated according to $y = Ax - Wy$. (28) Here, y is the vector of activities in the p output units (there are no hidden units), A is the connection matrix from the inputs x to the outputs y, and W is the zero-diagonal matrix of lateral ....

....the subspace algorithm in favor of general Brockett-type rules where Θ has distinct positive entries. For the lateral inhibition algorithms, this implies that in symmetric mode, the simple anti-Hebbian decorrelation mechanism as e.g. proposed by Barlow & Földiák [48] and used in Földiák [44] has to be combined with an additional weight decay W term as proposed in Leen [46]. Finally, we notice that if these algorithms are used in asymmetric mode, additional units can be added without retraining the already mature part of the network; i.e. one can incrementally build the principal ....

Földiák, P. (1989). Adaptive network for optimal linear feature extraction. In Proceedings of the Joint International Conference on Neural Networks (pp. I: 401--405). San Diego, CA: SOS Printing.


Noisy Linear Networks - Hornik (1993)   (Correct)

....g( Σ(A) or its sample counterpart could be employed; however, this leads to quite complicated and biologically implausible algorithms. On the other hand, the local PCA algorithms that have been introduced within the last few years typically force the outputs components to be uncorrelated (Földiák, 1989; Rubner & Tavan, 1989; Sanger, 1989; Kung & Diamantaras, 1990; Leen, 1991; cf. also Hornik & Kuan, 1992). Hence, these algorithms may work reasonably well for small noise (because then minimizing the size of Σ(A) roughly amounts to full PCA of Σ) but are inappropriate for dealing ....

Földiák, P. (1989). Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks (pp. I: 401--405).


Unsupervised Neural Network Learning Procedures For.. - Suzanna Becker, Mark.. (1996)   (6 citations)  (Correct)

....component. The third output $y_3$ is decorrelated from the previous two, and so on until all M desired principal components have been extracted in order. Kung and Diamantaras [40] suggested an Adaptive Principal Component Extractor (APEX) which is related to the Rubner and Tavan approach. Földiák [20] suggested that units obeying the Oja rule should be decorrelated using a symmetric decorrelating stage (Fig. 3). In this network we have $y_j(t) = w_j(t)\,x(t) - \sum_{k \neq j} v_{jk}(t)\,y_k(t)$ (21), which in matrix notation is $y(t) = W(t)\,x(t) - V(t)\,y(t)$ (22), where the lateral inhibition matrix V(t) ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, IJCNN-89, pages 401--405, Washington, DC, 1989.
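
The excerpt above describes a symmetric decorrelating stage but does not show the learning rule itself. The sketch below is a toy illustration under stated assumptions, not the cited authors' algorithm: it uses two output units, fixes the feedforward weights to the identity, and assumes the commonly quoted anti-Hebbian form in which the lateral weight grows in proportion to the product of the two outputs. The names `lateral_output`, `sample_input`, and `correlation`, the learning rate, and the synthetic data are all hypothetical.

```python
import random


def lateral_output(x, v):
    """Steady state of y = x - V y for two units with off-diagonal weight v."""
    scale = 1.0 - v * v
    return ((x[0] - v * x[1]) / scale, (x[1] - v * x[0]) / scale)


def sample_input(rng):
    """Two inputs sharing a common source, so they are positively correlated."""
    shared = rng.gauss(0.0, 1.0)
    return (shared + rng.gauss(0.0, 1.0), shared + rng.gauss(0.0, 1.0))


def correlation(pairs):
    """Sample correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)   # fixed seed so the run is repeatable
v = 0.0                  # symmetric lateral inhibitory weight
rate = 0.0005            # assumed small learning rate
for _ in range(20000):
    y = lateral_output(sample_input(rng), v)
    # Assumed anti-Hebbian update: inhibition grows while the outputs co-vary.
    v += rate * y[0] * y[1]

test_inputs = [sample_input(rng) for _ in range(5000)]
input_corr = correlation(test_inputs)
output_corr = correlation([lateral_output(x, v) for x in test_inputs])
print(v, input_corr, output_corr)
```

With these synthetic inputs the lateral weight drifts toward the value at which the expected product of the two outputs vanishes, so the output correlation ends up much smaller than the input correlation. A full version of the network would also adapt the feedforward weights, which this sketch deliberately leaves out.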


Generalization and Exclusive Allocation of Credit in.. - Marshall, Gupta (1998)   (Correct)

....in category scission between independent category groupings and allows the EXIN network to generate near optimal parsings of multiple superimposed patterns, in terms of multiple simultaneous activations (Marshall, 1995). 3.1.3 Linear decorrelator networks. Linear decorrelator networks (Oja, 1982; Földiák, 1989) also use an anti-Hebbian inhibitory learning rule that can cause the lateral inhibitory connections to vanish during learning. This allows simultaneous neural activations. However, the linear decorrelator network responds essentially to differences, or distinctive features (Anderson, Silverstein, ....

Földiák, P. (1989). Adaptive network for optimal linear feature extraction. Proceedings of the International Joint Conference on Neural Networks, Washington, DC, I, 401--405.


A Network which Performs Orthonormalized Principal Subspace.. - Plumbley (1994)   (Correct)

.... and Rubner and Tavan [18]. In addition, M-output networks have been proposed which find the subspace spanned by the first M principal components, the so-called principal subspace, including the Williams [22] Symmetric Error Correction (SEC) Network, the Oja [12] Subspace Network, and the Földiák [5] linear feature extraction network. Although it can be argued that the behaviour of linear networks such as these PCA and principal subspace networks is more limited than non-linear networks, they have the significant advantage that extensive theoretical analysis of their properties and behaviour ....

....[16]. 4 Convergence analysis. The two weight update algorithms (10a), (10b) for W and U interact in a rather complex way, such that convergence is likely to be determined by the relative magnitudes of their learning rates $\epsilon_w$ and $\epsilon_u$. For example, Leen [8] gives an analysis of the Földiák [5] network, which also uses lateral feedback connections. This analysis shows that the relative update rates in the input and output stages is important when determining convergence behaviour. If the update rate for the input stage is too large, the algorithm is unstable. For the purposes of this ....

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the International Joint Conference on Neural Networks, IJCNN-89, pages 401--405, Washington D.C., June 1989.


Journal of Machine Learning Research 7 (2006) 793--815.. - Michael Spratling..   (Correct)

No context found.

P. Földiák. Adaptive network for optimal linear feature extraction. In Proceedings of the IEEE/INNS International Joint Conference on Neural Networks, volume 1, pages 401--5, New York, NY, 1989. IEEE Press.

