12 citations found.
J. Kay, "Feature discovery under contextual supervision using mutual information," in Proc. of Int. Joint Conference on Neural Networks, pp. 79--84. IEEE, Piscataway, NJ, 1992.


This paper is cited in the following contexts:
Associative Clustering by Maximizing a Bayes Factor - Sinkkonen, Nikkilä, Lahti.. (2003)   (2 citations)

....sets [19]. The current likelihood formulation, however, breaks down in the case of two-margin clustering. Yet another example of dependency maximization is canonical correlation analysis, which uses a second-moment criterion equivalent to mutual information assuming normally distributed data [13]. [Figure 1, panels A and B; labels: Downtown, Sörnäinen, Kulosaari, Matinkylä, Westend] Figure 1: A demonstration of the difference between dependency and joint density modeling. The hypothetical joint domain of two one-dimensional continuous margin variables x and y is shown by the outer square, and joint pdf is shown by shades ....

J. Kay. Feature discovery under contextual supervision using mutual information. In Proc IJCNN'92, pages 79--84. IEEE, 1992.


Canonical Correlations between Input and Output Processes of .. - De Cock, De Moor

....by H. Hotelling [10]. Although a wide variety of applications exists in econometrics, biometrics, chemometrics, statistics, meteorology, etc., the technique has only quite recently been introduced in the communities of signal processing, system theory and identification, and neural networks [4, 14, 20]. In a classic paper by Gel'fand and Yaglom [9], CCA is extended to stochastic processes and related to the notion of mutual information, a concept from information theory that is closely related to CCA and that was introduced by Shannon [18] in 1948. A slightly different interpretation in terms of ....

J. Kay, "Feature discovery under contextual supervision using mutual information", In Proceedings of the 1992.


Finding Efficient Nonlinear Visual Operators using Canonical.. - Borga, Knutsson (2000)

....is the best predictor and at the same time the linear combination of another set which is the most predictable. It has been shown that finding the canonical correlations is equivalent to maximizing the mutual information between the sets if the underlying distributions are elliptically symmetric [9]. 2 Canonical correlation analysis. Consider two random variables, x and y, from a multi-normal distribution: $\begin{pmatrix} x \\ y \end{pmatrix} \sim N\left( \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix} \right)$ (1), where $C = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix}$ is the covariance matrix. $C_{xx}$ and $C_{yy}$ are nonsingular matrices and $C_{xy}$ = C ....

J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79-84. IEEE, 1992.
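
For readers following the excerpt above, the link between canonical correlations and mutual information can be written out for the jointly Gaussian case. This is the standard textbook identity, not a formula quoted from any of the papers listed here; the symbols $\rho_i$ (canonical correlations of x and y) and the use of the natural logarithm (result in nats) are assumptions of this sketch.

```latex
% Mutual information between jointly Gaussian x and y,
% expressed through their canonical correlations \rho_i:
I(x; y) = -\frac{1}{2} \sum_{i} \log\left(1 - \rho_i^{2}\right)
```

Each term grows as $\rho_i$ approaches 1, so increasing the canonical correlations increases the mutual information under the Gaussian assumption.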


Local Online Learning of Coherent Information - Der, Smyth

....between stream outputs can be used to discover the coherent information. They successfully apply this algorithm online. Floreano (1996) describes a local algorithm extending the PCA algorithm of Oja (1982) to extract the direction of maximum variance between, rather than within, multivariate datasets. Kay (1992), and Diamantaras and Kung (1994) present global algorithms extending the corresponding multiple output PCA networks to find these directions of coherent variation. In line with the work of Phillips and colleagues we differ from this global approach by learning the features that are coherent ....

Kay, J. (1992). Feature discovery under contextual supervision using mutual information. In Proceedings of the 1992 International Joint Conference on Neural Networks (Baltimore), 4, 79-84.


A Unified Approach to PCA, PLS, MLR and CCA - Borga, Landelius, Knutsson (1992)

.... $\Delta w = \alpha \left( \left[ \begin{pmatrix} 0 & x y^T \\ y x^T & 0 \end{pmatrix} - \frac{u_1 u_1^T}{w_1^T u_1} \right] w - \begin{pmatrix} x x^T & 0 \\ 0 & y y^T \end{pmatrix} w \right)$ (75). Note that this algorithm simultaneously finds both the directions of canonical correlations and the canonical correlations $\rho_i$, in contrast to the algorithm proposed by Kay [15], which only finds the directions. 3.4 MLR: Finding the directions for minimum square error. Also here, the algorithm in eq. 42 can be used for a stochastic gradient search. With the A, B and w according to eq. 38, we get the update direction as: $E\{\Delta w\} = \gamma$ r w $= \alpha \begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix}$ ....

.... In particular, the concepts independent components and mutual information have been the basis for a number of successful applications, e.g. blind separation and blind deconvolution [2]. It is appropriate to point out that there is a strong relation between these concepts and canonical correlation [1, 15]. The relevance of the present paper in this context is apparent. 6 Proofs. 6.1 Orthogonality in the metrics A and B (eq. 5): $w_i^T B w_j = \begin{cases} 0 & \text{for } i \neq j \\ \beta_i > 0 & \text{for } i = j \end{cases}$ and $w_i^T A w_j = \begin{cases} 0 & \text{for } i \neq j \\ r_i \beta_i & \text{for } i = j \end{cases}$ (5). Proof: For solution i we have $A w_i = r_i B w_i$. The scalar ....

J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79--84. IEEE, 1992.


SVD Algorithms: APEX-like versus Subspace Methods - Weingessel, Hornik (1997)   (3 citations)

....value decomposition (SVD). SVD has various applications in numerical analysis and is closely related to canonical correlations in statistics. The basic idea of canonical correlation is to find linear functions $a'x$ and $b'y$ of two random vectors x and y which are most highly correlated, cf. [3]. The SVD of an $m \times n$ matrix Z is defined as follows, see for example [4]. Let $r \le \min(m, n)$ be the rank of Z. Then, Z can be written as $Z = U \Sigma V'$ where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthonormal matrices and $\Sigma \in \mathbb{R}^{m \times n}$ is a matrix whose entries are all zero except the first ....

Jim Kay. Feature discovery under contextual supervision using mutual information. In Proceedings of the International Joint Conference on Neural Networks, volume 4, pages 79--84, 1992.
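
The relation between the SVD and canonical correlations mentioned in the excerpt above can be illustrated with a minimal NumPy sketch. It is not the APEX-like or subspace algorithm of the cited paper: it uses the standard QR-then-SVD route, and the function name `canonical_correlations`, the row-per-observation layout, and the assumption that both centered data matrices have full column rank are choices made for this example only.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Sample canonical correlations between X (n x p) and Y (n x q).

    Rows are observations. Assumes the centered X and Y have full
    column rank (an assumption of this sketch).
    """
    # Center each variable so that inner products act like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Guard against rounding slightly above 1.
    return np.clip(s, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 2))
    # Y is an exact invertible linear function of X, so every
    # canonical correlation should be (numerically) 1.
    Y = X @ np.array([[1.0, 2.0], [3.0, 4.0]])
    print(canonical_correlations(X, Y))
```

The number of values returned is min(p, q), matching the number of canonical variate pairs.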


Analysis, Visualization and Meta-analysis of Functional.. - Nielsen (1999)

....importance in a mental process. 3.10 Canonical Manifold Analysis. Canonical correlation analysis (CCA) was developed by Harold Hotelling (Hotelling 1935; Hotelling 1936). Introductions to CCA are available in (Mardia et al. 1979, chapter 10) and (Anderson 1984, chapter 12). 2 Neural networks: (Kay 1992) (Haykin 1994, pages 463-464) (Becker 1992) (Becker 1996) Elliptic distribution: (Cambanis, Huang, and Simons 1981) (Kay 1992) Canonical ridge analysis (DeMers and Cottrell 1993) (Principal manifolds) (Haykin 1994, chapter 9) connectionist approach to principal component analysis Heteroassociation ....

.... Hotelling (Hotelling 1935; Hotelling 1936). Introductions to CCA are available in (Mardia et al. 1979, chapter 10) and (Anderson 1984, chapter 12). 2 Neural networks: (Kay 1992) (Haykin 1994, pages 463-464) (Becker 1992) (Becker 1996) Elliptic distribution: (Cambanis, Huang, and Simons 1981) (Kay 1992) Canonical ridge analysis (DeMers and Cottrell 1993) (Principal manifolds) (Haykin 1994, chapter 9) connectionist approach to principal component analysis Heteroassociation (Haykin 1994, page 66) (Parra 1996) Symplectic Nonlinear components analysis (Schölkopf, Smola, and Müller 1996) Nonlinear ....

[Article contains additional citation context not shown here]

Kay, J. (1992). Feature discovery under contextual supervision using mutual information. In Proceedings of the 1992 International Joint Conference on Neural Networks, Volume 4, pp. 79-82. Baltimore, MD: IEEE.


Activation Functions, Computational Goals and Learning Rules.. - Kay, Phillips (1994)   (1 citation)  Self-citation (Kay)

....between underlying feature classes. If class labels are one of the data sets then the method of canonical correlation (Hotelling, 1936) may be used to provide the Fisher discriminant functions. This is relevant to the case of supervised learning with an external teacher. However, as shown by Kay (1992), canonical correlation can be used in a neural network for the extraction of linear latent variables under contextual supervision. This suggests that discovery of predictive relationships between diverse data sets could be one goal of cerebral cortex and that this goal can be formulated at the ....

Kay, J. 1992. Feature discovery under contextual supervision using mutual information. In Proceedings of the 1992 International Joint Conference on Neural Networks (Baltimore) Book 4, 79-84.


Non-Parametric Dependent Components - Klami, Kaski (2005)

No context found.

J. Kay, "Feature discovery under contextual supervision using mutual information," in Proc. of Int. Joint Conference on Neural Networks, pp. 79--84. IEEE, Piscataway, NJ, 1992.


Canonical Correlations between Input and Output Processes of .. - De Cock, De Moor (2002)

No context found.

J. Kay, "Feature discovery under contextual supervision using mutual information", In Proceedings of the 1992.


Information Bottleneck and Linear Projections of Gaussian.. - Chechik, Globerson (2003)

No context found.

J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79-84, 1992.


CiteSeer.IST - Copyright Penn State and NEC