Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems, (pp. 633--640). Cambridge, MA: MIT Press.

This paper is cited in the following contexts:
Incremental Active Learning for Optimal Generalization - Sugiyama, Ogawa (2000)   (2 citations)  (Correct)

....signals in many scientific experiments or learning of sensorimotor maps of multijoint robot arms. Learning can be performed more efficiently if we can actively design input signals. The problem of designing input signals for optimal generalization is called active learning (Cohn, Ghahramani, & Jordan, 1996; Fukumizu, 1996; Vijayakumar & Ogawa, 1999). It is also referred to as optimal experiments (Kiefer, 1959; Fedorov, 1972; Cohn, 1994) or query construction (Sollich, 1994). Reinforcement learning (Kaelbling, 1996), which has been extensively studied recently in the field of machine learning, can ....

....a crucial problem for acquiring the optimal generalization capability. Within the framework of Bayesian statistics, MacKay (1992) derived a criterion for selecting the most informative training data for specifying the parameters of neural networks. Cohn (1994, 1996) and Cohn, Ghahramani, and Jordan (1996) gave an active learning criterion for minimizing the variance of the estimator. Fukumizu (1996) proposed an active learning method in multilayer perceptrons using an asymptotic approximation for estimating the generalization error. Essentially, the criteria derived in these papers are Incremental ....

[Article contains additional citation context not shown here]

Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 417--423). The MIT Press. Cohn, D. A., Ghahramani, Z., & Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4, 129--145.


Reconstruction of Chaotic Dynamics Using a Noise-Robust.. - Yoshida, Ishii, Sato (2000)   (Correct)

.... $\tilde{W}_i \tilde{x} \equiv W_i x + b_i$. The NGnet can be interpreted as a stochastic model, in which a pair of an input and an output, $(x, y)$, is a stochastic event. For each event $(x, y)$, a unit index $i \in \{1, \cdots, M\}$ is assumed to be selected, which is regarded as a hidden variable. The stochastic model [8] is defined by the probability distribution for a triplet $(x, y, i)$: $P(x, y, i \mid \theta) = \frac{G_i(x)}{M} (2\pi)^{-D/2} \sigma_i^{-D} \exp\left[ -\frac{1}{2\sigma_i^2} (y - \tilde{W}_i \tilde{x})^2 \right]$ (2), where $\theta \equiv \{\mu_i, \Sigma_i, \sigma_i^2, \tilde{W}_i \mid i = 1, \cdots, M\}$ is a set of model parameters. The expectation value of the output $y$ for ....

Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in Neural Information Processing Systems 7 (pp. 633-640), MIT Press.


Reinforcement Learning based on On-line EM Algorithm - Sato, Ishii (1999)   (1 citation)  (Correct)

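....$\sum_{i=1}^{M} P(i \mid x(t), y(t), \bar{\theta}) \log P(x(t), y(t), i \mid \theta)$ (4). Since an increase of $L(\theta \mid \bar{\theta}; X, Y)$ implies an increase of the log likelihood for the observed data $(X, Y)$ (Dempster et al., 1977), $L(\theta \mid \bar{\theta}; X, Y)$ is maximized with respect to $\theta$. A solution of the necessary condition $\partial L / \partial \theta = 0$ is given by (Xu et al., 1995): $\mu_i = \langle x \rangle_i(T) / \langle 1 \rangle_i(T)$ (5a), $\Sigma_i^{-1} = [\langle x x' \rangle_i(T) / \langle 1 \rangle_i(T) - \mu_i(T) \mu_i'(T)]^{-1}$ (5b), $\tilde{W}_i = \langle y \tilde{x}' \rangle_i(T) [\langle \tilde{x} \tilde{x}' \rangle_i(T)]^{-1}$ (5c), $\sigma_i^2 = \frac{1}{D} [\langle |y|^2 \rangle_i(T) - \mathrm{Tr}(\tilde{W}_i \langle \tilde{x} y' \rangle_i(T))] / \langle 1 \rangle_i(T)$ (5d), where ....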

....$\tilde{W}_i \langle \tilde{x} y' \rangle_i(T))] / \langle 1 \rangle_i(T)$ (5d), where $\langle \cdot \rangle_i$ denotes a weighted mean with respect to the posterior probability (3), defined by $\langle f(x, y) \rangle_i(T) \equiv \frac{1}{T} \sum_{t=1}^{T} f(x(t), y(t)) P(i \mid x(t), y(t), \bar{\theta})$ (6). The EM algorithm introduced above is based on batch learning (Xu et al., 1995); namely, the parameters are updated after seeing all of the observed data. We introduce here an on-line version (Sato & Ishii, 1998) of the EM algorithm. Let $\theta(t)$ be the estimator after the $t$-th observed data $(x(t), y(t))$. In this on-line EM algorithm, the weighted mean (6) is replaced by f(x, ....

Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in Neural Information Processing Systems 7 (pp. 633-640), MIT Press.
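
The batch update quoted in the Sato and Ishii context, with the weighted mean (6) feeding the closed-form solutions (5a) to (5d), might be sketched in NumPy roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function names `weighted_mean` and `m_step`, the array layout, and the convention that the augmented input is x with a trailing 1 appended are all hypothetical, and the posteriors P(i | x(t), y(t)) are taken as given rather than computed.

```python
import numpy as np

def weighted_mean(f_vals, post_i):
    """Eq. (6): (1/T) * sum_t f(x(t), y(t)) * P(i | x(t), y(t)).

    f_vals has shape (T, ...); post_i has shape (T,).
    """
    T = post_i.shape[0]
    return np.tensordot(post_i, f_vals, axes=(0, 0)) / T

def m_step(X, Y, post):
    """Batch M-step for every unit i, following (5a)-(5d).

    X: (T, N) inputs, Y: (T, D) outputs,
    post: (T, M) posteriors P(i | x(t), y(t)), assumed precomputed.
    """
    T, _ = X.shape
    D = Y.shape[1]
    # Assumption: the augmented input is (x, 1), so W~ absorbs the bias b_i.
    Xt = np.hstack([X, np.ones((T, 1))])
    params = []
    for i in range(post.shape[1]):
        p = post[:, i]
        one = weighted_mean(np.ones(T), p)                       # <1>_i
        mu = weighted_mean(X, p) / one                           # (5a)
        xx = weighted_mean(X[:, :, None] * X[:, None, :], p)     # <x x'>_i
        sigma_inv = np.linalg.inv(xx / one - np.outer(mu, mu))   # (5b)
        yx = weighted_mean(Y[:, :, None] * Xt[:, None, :], p)    # <y x~'>_i
        xtxt = weighted_mean(Xt[:, :, None] * Xt[:, None, :], p) # <x~ x~'>_i
        W = yx @ np.linalg.inv(xtxt)                             # (5c)
        yy = weighted_mean(np.sum(Y ** 2, axis=1), p)            # <|y|^2>_i
        sigma2 = (yy - np.trace(W @ yx.T)) / (D * one)           # (5d)
        params.append({"mu": mu, "Sigma_inv": sigma_inv, "W": W, "sigma2": sigma2})
    return params
```

With a single unit whose posterior is 1 everywhere, (5c) reduces to an ordinary least-squares fit, which gives a quick sanity check on the sketch. The matrix inverses assume enough data that the weighted second moments are non-singular.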


Improved Learning Algorithms For Mixture Of Experts In Multiclass - Chen, Xu, Chi (1999)   Self-citation (Xu)   (Correct)

No context found.

Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems, (pp. 633--640). Cambridge, MA: MIT Press.


Spectral Relaxation Models And Structure Analysis For K-Way.. - Gu, Zha, Ding (2001)   (7 citations)  Self-citation (Clustering)   (Correct)

No context found.

A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14, 2001.


Graphical Models And Variational Approximation - Jordan (1998)   Self-citation (Jordan)   (Correct)

No context found.

C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9, MIT Press, Cambridge MA. Jaakkola, T. S. (1997). Variational methods for inference and estimation in graphical models. Unpublished doctoral dissertation, Massachusetts Institute of Technology.


Improved Learning Algorithms for Mixture of Experts in.. - Chen, Xu, Chi (1999)   Self-citation (Xu)   (Correct)

....(EM) algorithm (Dempster, Laird, & Rubin, 1977) to both the ME and the HME architecture so that the learning process is decoupled in a manner that fits well with the modular structure. The favorable properties of the EM algorithm have been shown by theoretical analyses (Jordan & Xu, 1995; Xu & Jordan, 1996). In the ME architecture, the EM algorithm makes the original complicated maximum likelihood problem decomposed into several .... [Neural Networks 12 (1999) 1229-1252, Pergamon; published by Elsevier Science Ltd.]

....to implicitly avoid the instability problem. Alternatively, Chen et al. (1996b) used a so-called generalized Bernoulli density as the statistical model of expert networks for multiclass classification and applied such ME and HME classifiers to speaker identification. Xu and Jordan (1994) and Xu, Jordan, and Hinton (1995) have proposed an alternative ME model, where a localized gating network is employed so that parameter estimation in the gating network can be analytically solvable. For a regression task, the IRLS algorithm is avoided in the alternative ME model so that the EM algorithm becomes a single-loop ....

[Article contains additional citation context not shown here]

Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems, (pp.
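
The "localized gating network" mentioned in the context above can be illustrated with a small sketch. It assumes the commonly described form in which each gate is the posterior of a Gaussian mixture component, g_i(x) = a_i N(x; m_i, v_i) / sum_j a_j N(x; m_j, v_j); the quoted snippet does not spell this formula out, so the form itself, the one-dimensional restriction, and the names `gaussian_pdf` and `localized_gate` are assumptions made for illustration only.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def localized_gate(x, alphas, means, variances):
    """Gate values g_i(x): normalized, prior-weighted Gaussian densities.

    alphas are mixing weights (assumed positive), one per expert.
    """
    weighted = [a * gaussian_pdf(x, m, v)
                for a, m, v in zip(alphas, means, variances)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Two experts centred at -1 and +1: the gate favours the nearer expert.
gates = localized_gate(-1.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

Under this assumed form, the gate parameters are ordinary Gaussian-mixture parameters, which is one way to read the claim that their estimation becomes analytically solvable, in contrast to a softmax gate fitted by IRLS.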


Kernel Trick Embedded Gaussian Mixture Model - Wang, Lee, Zhang   (Correct)

No context found.

Ng, A. Y., Jordan, M. I. and Weiss, Y.: On Spectral Clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems (NIPS) 14, MIT Press, Cambridge MA (2002)


Critical Lines in Symmetry of Mixture Models and its.. - Fukumizu, Akaho, Amari (2002)   (Correct)

No context found.

D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Advances in Neural Information Processing Systems, 14, 2002. MIT Press.


Color Image Segmentation: Kernel Do the Feature Space - Lee, Wang, Zhang (2003)   (Correct)

No context found.

Ng, A. Y., Jordan, M. I. and Weiss, Y., On Spectral Clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems (NIPS) 14, (2002)


Time Series Analysis And Prediction Using Recurrent Gated Experts - Gilde (1996)   (Correct)

No context found.

Mozer, M., Touretzky, D., and Perrone, M., editors (1996). Advances in Neural Information Processing Systems 8, Cambridge MA. MIT Press.


Incremental Active Learning for Optimal Generalization - Sugiyama, Ogawa (2000)   (2 citations)  (Correct)

No context found.

C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 295--301). Cambridge: The MIT Press. Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1--58.


A Selective Attention Based Method for Visual Pattern Recognition - Salah (2001)   (1 citation)  (Correct)

No context found.

Mozer, & M.E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8, 771-777. Bishop, C.M. (1995). Neural Networks for Pattern Recognition. Oxford University Press.


Timescales for Sparseness of Natural Sound: Implications for.. - Iordanov, al. (2000)   (Correct)

No context found.

Mozer, M. C., M. I. Jordan, and T. Petsche (Eds.) (1997). Advances in Neural Information Processing Systems, Volume 9. The MIT Press.


Mensch-Maschine-Kommunikation in IVUs: Der kommunikative Agent .. - Milde, Ahlers (1999)   (Correct)

No context found.

Michael C. Mozer, Michael I. Jordan, and Thomas Petsche, editors, Advances in Neural Information Processing Systems, volume 9, page 1012. The MIT Press, 1997.


Principal Curves and Principal Oriented Points - Delicado (1998)   (7 citations)  (Correct)

No context found.

Jordan, and Thomas Petsche (Eds.), Advances in Neural Information Processing Systems 9, pp. 354-360. Cambridge, MA: The MIT Press. Corwin, L. and R. Szczarba (1979). Calculus in Vector Spaces. Marcel Dekker, Inc.


Interpreting neuronal population activity by.. - Zhang, Ginzburg.. (1998)   (23 citations)  (Correct)

No context found.

Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems, volume 9, pages 676--682. Cambridge, Massachusetts: The MIT Press. Zhang, K.-C. (1996). Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: A theory. Journal of Neuroscience, 16, 2112--2126.


Pruning Using Parameter and Neuronal Metrics - Laar (1999)   (Correct)

No context found.

Mozer & M. E. Hasselmo (eds), Advances in Neural Information Processing Systems 8: Proceedings of the 1995 Conference, MIT Press, Cambridge, pp. 521--527.


Classification and Regression using Mixtures of Experts - Waterhouse (1997)   (7 citations)  (Correct)

No context found.

Mozer, M. C., Jordan, M. I. and Petsche, T., eds [1997], Advances in Neural Information Processing Systems 9, MIT Press, 55 Hayward St., Cambridge, MA, 02142-1399.

