Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems (pp. 633--640). Cambridge, MA: MIT Press.
....signals in many scientific experiments or learning of sensorimotor maps of multijoint robot arms. Learning can be performed more efficiently if we can actively design input signals. The problem of designing input signals for optimal generalization is called active learning (Cohn, Ghahramani, & Jordan, 1996; Fukumizu, 1996; Vijayakumar & Ogawa, 1999). It is also referred to as optimal experiments (Kiefer, 1959; Fedorov, 1972; Cohn, 1994) or query construction (Sollich, 1994). Reinforcement learning (Kaelbling, 1996), which has been extensively studied recently in the field of machine learning, can ....
....a crucial problem for acquiring the optimal generalization capability. Within the framework of Bayesian statistics, MacKay (1992) derived a criterion for selecting the most informative training data for specifying the parameters of neural networks. Cohn (1994, 1996) and Cohn, Ghahramani, and Jordan (1996) gave an active learning criterion for minimizing the variance of the estimator. Fukumizu (1996) proposed an active learning method for multilayer perceptrons that uses an asymptotic approximation to estimate the generalization error. Essentially, the criteria derived in these papers are Incremental ....
[Article contains additional citation context not shown here]
Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 417--423). The MIT Press. Cohn, D. A., Ghahramani, Z., & Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4, 129--145.
.... $\tilde{W}_i \tilde{x} \equiv W_i x + b_i$. The NGnet can be interpreted as a stochastic model in which a pair of an input and an output, $(x, y)$, is a stochastic event. For each event $(x, y)$, a unit index $i \in \{1, \ldots, M\}$ is assumed to be selected, which is regarded as a hidden variable. The stochastic model [8] is defined by the probability distribution for a triplet $(x, y, i)$: $P(x, y, i \mid \theta) = G_i(x)\, M^{-1} (2\pi)^{-D/2} \sigma_i^{-D} \exp\left[ -\frac{1}{2\sigma_i^2} (y - \tilde{W}_i \tilde{x})^2 \right]$ (2), where $\theta \equiv \{\mu_i, \Sigma_i, \sigma_i^2, \tilde{W}_i \mid i = 1, \ldots, M\}$ is a set of model parameters. The expectation value of the output $y$ for ....
Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in Neural Information Processing Systems 7 (pp. 633-640), MIT Press.
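....$\sum_{i=1}^{M} P(i \mid x(t), y(t), \bar{\theta}) \log P(x(t), y(t), i \mid \theta)$ (4). Since an increase of $L(\theta \mid \bar{\theta}; X, Y)$ implies an increase of the log likelihood for the observed data $(X, Y)$ (Dempster et al., 1977), $L(\theta \mid \bar{\theta}; X, Y)$ is maximized with respect to $\theta$. A solution of the necessary condition $\partial L / \partial \theta = 0$ is given by (Xu et al., 1995): $\mu_i = \langle x \rangle_i(T) / \langle 1 \rangle_i(T)$ (5a); $\Sigma_i^{-1} = \left[ \langle x x' \rangle_i(T) / \langle 1 \rangle_i(T) - \mu_i(T)\, \mu_i'(T) \right]^{-1}$ (5b); $\tilde{W}_i = \langle y \tilde{x}' \rangle_i(T) \left[ \langle \tilde{x} \tilde{x}' \rangle_i(T) \right]^{-1}$ (5c); $\sigma_i^2 = \frac{1}{D} \left[ \langle |y|^2 \rangle_i(T) - \mathrm{Tr}\big( \tilde{W}_i \langle \tilde{x} y' \rangle_i(T) \big) \right] / \langle 1 \rangle_i(T)$ (5d), where ....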
....$\tilde{W}_i \langle \tilde{x} y' \rangle_i(T) \big) \big] / \langle 1 \rangle_i(T)$ (5d), where $\langle \cdot \rangle_i$ denotes a weighted mean with respect to the posterior probability (3), defined by $\langle f(x, y) \rangle_i(T) \equiv \frac{1}{T} \sum_{t=1}^{T} f(x(t), y(t))\, P(i \mid x(t), y(t), \bar{\theta})$ (6). The EM algorithm introduced above is based on batch learning (Xu et al., 1995); namely, the parameters are updated after seeing all of the observed data. We introduce here an on-line version (Sato & Ishii, 1998) of the EM algorithm. Let $\theta(t)$ be the estimator after the $t$-th observed datum $(x(t), y(t))$. In this on-line EM algorithm, the weighted mean (6) is replaced by f(x, ....
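As an illustration only, the weighted mean of Eq. (6) and the center update of Eq. (5a) quoted above can be sketched in NumPy. This is a minimal sketch, not the cited authors' implementation: the function names `weighted_mean` and `update_centers` are hypothetical, and the posterior probabilities $P(i \mid x(t), y(t))$ are assumed to be supplied as a precomputed array rather than derived from the model.

```python
import numpy as np


def weighted_mean(f_values, posterior):
    """Weighted mean <f>_i(T) = (1/T) * sum_t f(x(t), y(t)) * P(i | x(t), y(t)).

    f_values:  array of shape (T, ...) holding f(x(t), y(t)) for each t
    posterior: array of shape (T, M) holding P(i | x(t), y(t)),
               assumed to be computed elsewhere (hypothetical input)
    returns:   array of shape (M, ...), one weighted mean per unit i
    """
    T = f_values.shape[0]
    # Contract over the time axis: out[i, ...] = sum_t posterior[t, i] * f[t, ...]
    return np.tensordot(posterior.T, f_values, axes=(1, 0)) / T


def update_centers(x, posterior):
    """Center update mu_i = <x>_i(T) / <1>_i(T), as in Eq. (5a)."""
    ones = np.ones(x.shape[0])
    mean_x = weighted_mean(x, posterior)     # shape (M, N)
    mean_1 = weighted_mean(ones, posterior)  # shape (M,)
    return mean_x / mean_1[:, None]
```

Because both the numerator and the denominator carry the same factor 1/T, the factor cancels in the ratio; it is kept here only so that `weighted_mean` matches the definition term by term.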
Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in Neural Information Processing Systems 7 (pp. 633-640), MIT Press.
No context found.
Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems (pp. 633--640). Cambridge, MA: MIT Press.
No context found.
A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14, 2001.
No context found.
C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9, MIT Press, Cambridge MA. Jaakkola, T. S. (1997). Variational methods for inference and estimation in graphical models. Unpublished doctoral dissertation, Massachusetts Institute of Technology.
....(EM) algorithm (Dempster, Laird, & Rubin, 1977) to both the ME and the HME architecture so that the learning process is decoupled in a manner that fits well with the modular structure. The favorable properties of the EM algorithm have been shown by theoretical analyses (Jordan & Xu, 1995; Xu & Jordan, 1996). In the ME architecture, the EM algorithm decomposes the original complicated maximum likelihood problem into several .... [Neural Networks 12 (1999) 1229--1252]
....to implicitly avoid the instability problem. Alternatively, Chen et al. (1996b) used a so-called generalized Bernoulli density as the statistical model of expert networks for multiclass classification and applied such ME and HME classifiers to speaker identification. Xu and Jordan (1994) and Xu, Jordan, and Hinton (1995) have proposed an alternative ME model, where a localized gating network is employed so that parameter estimation in the gating network is analytically solvable. For a regression task, the IRLS algorithm is avoided in the alternative ME model so that the EM algorithm becomes a single-loop ....
[Article contains additional citation context not shown here]
Xu, L., Jordan, M. I., & Hinton, G. E. (1995). Advances in neural information processing systems. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in Neural Information Processing Systems (pp.
No context found.
Ng, A. Y., Jordan, M. I. and Weiss, Y.: On Spectral Clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems (NIPS) 14, MIT Press, Cambridge MA (2002)
No context found.
D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Advances in Neural Information Processing Systems, 14, 2002. MIT Press.
No context found.
Ng, A. Y., Jordan, M. I. and Weiss, Y., On Spectral Clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems (NIPS) 14, (2002)
No context found.
Mozer, M., Touretzky, D., and Perrone, M., editors (1996). Advances in Neural Information Processing Systems 8, Cambridge MA. MIT Press.
No context found.
C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 295--301). Cambridge: The MIT Press. Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1--58.
No context found.
Mozer, & M.E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8, 771-777. Bishop, C.M. (1995). Neural Networks for Pattern Recognition. Oxford University Press.
No context found.
C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 295--301). Cambridge: The MIT Press. Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1--58.
No context found.
Mozer, M. C., M. I. Jordan, and T. Petsche (Eds.) (1997). Advances in Neural Information Processing Systems, Volume 9. The MIT Press.
No context found.
Michael C. Mozer, Michael I. Jordan, and Thomas Petsche, editors, Advances in Neural Information Processing Systems, volume 9, page 1012. The MIT Press, 1997.
No context found.
Jordan, and Thomas Petsche (Eds.), Advances in Neural Information Processing Systems 9, pp. 354--360. Cambridge, MA: The MIT Press. Corwin, L. and R. Szczarba (1979). Calculus in Vector Spaces. Marcel Dekker, Inc.
No context found.
Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems, volume 9, pages 676--682. Cambridge, Massachusetts: The MIT Press. Zhang, K.-C. (1996). Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: A theory. Journal of Neuroscience, 16, 2112--2126.
No context found.
Mozer & M. E. Hasselmo (eds), Advances in Neural Information Processing Systems 8: Proceedings of the 1995 Conference, MIT Press, Cambridge, pp. 521--527.
No context found.
Mozer, M. C., Jordan, M. I. and Petsche, T., eds [1997], Advances in Neural Information Processing Systems 9, MIT Press, 55 Hayward St., Cambridge, MA, 02142-1399.