14 citations found.
R. Streit and T. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Transactions on Neural Networks, Vol. 5, No. 5, pp. 764--783, 1994.

This paper is cited in the following contexts:
Comparison of Machine Learning and Traditional.. - Chan, Lee.. (2002)

.... in between flexibility are useful. The mixture of Gaussians (MOG) [33] [34] has been popular for its simplicity. Adopted to our classification problem, the probability densities for the positive and negative classes are each modeled first as a mixture of multivariate normal densities [35] [36], (6) where for each cluster , (7) The expectation maximization (EM) algorithm [37] is used to find the parameters , and . needed in (1) can also be obtained by ML. Similar to the MLP, multiple trials are required to avoid local minima. However, a learning rate is not required as EM automatically ....

R. L. Streit and T. E. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Trans. Neural Networks, vol. 5, pp. 764--783, Sept. 1994.


Heuristic Classifier Performance Bounds in High Dimensional.. - Baggenstoss

.... A widely accepted technique for estimating the parameters of the GM model is the EM algorithm [11] [18]. The EM algorithm suffers from numerical problems when there is insufficient data, leading some researchers to avoid it [17] or constrain the covariances of the kernels to be identical [19], or of uniform size with variable rotation [14]. Adding to the covariance estimates based on a Bayesian prior density argument is the preferred method of dealing with the problem [20] [21]. This involves simply adding a diagonal matrix, representing an independent measurement noise prior, to the ....

R. L. Streit and T. E. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Trans. on Neural Networks, vol. 5, pp. 764--783, 1994.


A Kurtosis-Based Dynamic Approach to Gaussian Mixture Modeling - Vlassis, Likas (1999)   (4 citations)

....kernel is constant and known and the mixing weights are equal to the reciprocal of the total number of inputs. The network can be regarded as a distributed implementation of the Parzen windows method [6]. Some of the limitations of the original network model were relaxed in subsequent works [7], [8], [9], [10], leading to network models that implement some variants of the EM algorithm. However, in most approaches the number K of kernels of the mixture is considered known in advance, and it turns out that the automatic estimation of K is a difficult problem [1], [11]. Statistical methods or ....

R. L. Streit and T. E. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Trans. Neural Networks, vol. 5, no. 5, pp. 764--783, 1994.


EM Algorithms for Probabilistic Mapping Networks - Li, Gong, Haton (1995)

.... factor analysis [1]. The EM approach to the classification problem has been explored recently in neural network research, where the EM algorithm is developed for some probabilistic neural networks, such as hierarchical mixtures of expert [6] and generalized Fisher training probabilistic neural networks [7]. Probabilistic neural networks is a research topic which is currently given serious consideration in classifier design for several reasons. First, it inherits the properties of a general neural network classifier. For example, with its massive and adaptive architecture, it possesses better ....

....of forming more complex nonlinear manifolds of the input domain than sigmoidal ones through intermediate mappings. Some probabilistic neural networks, such as the Gaussian potential function network (GPFN) [11], the probabilistic neural network (PNN) [4], the probabilistic mapping networks (PMN) [7, 8] and the Gaussian clustering network (GCN) [12], have been developed using Gaussian-like nonlinear functions and appealed to our interests in classical pattern classification applications. However, there are no stochastic constraints imposed to the mixture weightings between the Gaussian basis units ....

[Article contains additional citation context not shown here]

R. L. Streit and T. E. Luginbuhl. Maximum likelihood training of probabilistic neural networks. IEEE Trans. on Neural Networks, 5(5):764--783, September 1994.


Comparison of Statistical and Neural Classifiers and Their.. - Alpaydin, Gürgen (1996)

....is chosen. 2. Non parametric Methods. When no such assumptions can be done, the densities need be estimated directly from the data. These are also known as kernel based estimators [12, 45, 46]. 3. Semi parametric Methods. The densities are written as a mixture model whose parameters are estimated [12, 40, 50, 36, 48]. In the case of normal mixtures, this approach is equivalent to cluster based classification strategies like LVQ of Kohonen [24] and is similar to Gaussian radial basis function networks [32]. 5 A decision rule as given in Eq. (1) has the effect of dividing the input space into mutually ....

....$\sum_{h=1} p(x \mid jh; C_j)\, P(jh)$ (13) 10 where the conditional densities $p(x \mid jh; C_j)$ are called the component densities and the prior probabilities $P(jh)$ are called the mixing parameters. Note that here we have one mixture model for each class leading to an overall mixture of mixtures [48]. We want to estimate the parameters $\Phi_j$, that includes the sufficient statistics of the component densities and the mixing proportions, that maximizes the likelihood of a given iid sample $X_j$ of class $j$: $L(\Phi_j \mid X_j) = \sum_{i=1}^{n_j} \log p(x_i \mid \Phi_j) = \sum_i \log \sum_h p(x_i \mid jh;$ ....

Streit, R. L., Luginbuhl, T. E. (1994) "Maximum Likelihood Training of Probabilistic Neural Networks," IEEE Transactions on Neural Networks, 5(5), 764--783.


A Kurtosis-Based Dynamic Approach to Gaussian Mixture Modeling - Vlassis, Likas (1999)   (4 citations)

....kernel is constant and known and the mixing weights are equal to the reciprocal of the total number of inputs. The network can be regarded as a distributed implementation of the Parzen windows method [6]. Some of the limitations of the original network model were relaxed in subsequent works [7], [8], [9], [10], leading to network models that implement some variants of the EM algorithm. However, in most approaches the number K of kernels of the mixture is considered known in advance, and it turns out that the automatic estimation of K is a difficult problem [1], [11]. Statistical methods or ....

R. L. Streit and T. E. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Trans. on Neural Networks, vol. 5, no. 5, pp. 764--783, 1994.


Gaussian Mixture Models and Probabilistic Decision-Based.. - Yiu, Mak, Li (1999)   (1 citation)

....the estimates of the Bayesian a posteriori probabilities [7] [8]. Research has also shown that neural networks are closely related to Bayesian classifiers. For example, Specht [9] proposed a probabilistic neural network (PNN) that approaches the Bayes optimal decision surface asymptotically. In [10], a four-layer feedforward architecture incorporated with Gaussian kernels, or Parzen windows, was shown to be able to approximate a Bayesian classifier. In [11] a face recognition system based on probabilistic decision based neural networks (PDBNNs) was proposed. A common property of these neural ....

R. L. Streit and T. E. Luginbuhl. Maximum likelihood training of probabilistic neural networks. IEEE Trans. on Neural Networks, 5(5):764--783, 1994.


Probabilistic Decision-Based Neural Networks For Speech.. - Yiu, Mak, Li

....to link up neural networks with Bayesian classifiers. For example, Specht [1] proposed a probabilistic neural network (PNN) in which the sigmoid activation function is replaced by an exponential one. It was shown in [1] that the PNN's decision surface approach the Bayesian one asymptotically. In [2], a four layer feedforward architecture incorporated with Gaussian kernels, or Parzen windows, was shown to be able to approximate a Bayesian classifier. In [3] a face recognition system based on probabilistic decision-based neural networks (PDBNNs) was proposed. A common property of the above ....

R. L. Streit and T. E. Luginbuhl. Maximum likelihood training of probabilistic neural networks. IEEE Trans. on Neural Networks, 5(5):764--783, 1994.


Constructive Training of Probabilistic Neural Networks - Berthold, Diamond (1998)   (2 citations)

....resulting network contains as many neurons as there are patterns in the training dataset. Newer algorithms that attempt to reduce the network's size unfortunately require an a priori defined architecture, i.e. the number of used Gaussians must be specified before actual training can take place [20]. The Dynamic Decay Adjustment algorithm (DDA, see [3]) presented in this paper allows the automatic construction of PNNs from even very large datasets. The PNN is dynamically constructed during training and the number of required hidden units is optimized automatically. In addition the region of ....

....to deal with larger training sets predefine the topology of the network and only adjust the remaining network parameters ( Sigma k j and k j ) during training. All of the approaches focus on a homoscedastic network; that is, use only one global covariance matrix Sigma. Streit and Luginbuhl [20] propose predefining the number of neurons for each class and then adjusting the parameters using a maximum likelihood training method. Using a global Sigma the training data of all classes can be used to adjust this matrix, making the approach feasible for smaller datasets as well. All these ....

Roy L. Streit and Tod E. Luginbuhl. Maximum likelihood training of probabilistic neural networks. IEEE Transactions on Neural Networks, 5(5):764--783, September 1994.


Mixture Density Estimation Based on Maximum.. - Vlassis.. (1998)   (3 citations)

....realization of the method of Parzen [8] The mixing weights are assumed equal among all kernels and equal to the reciprocal of the total number of input samples. The constraints the original PNN model imposed on the parameters of the kernels and the mixing weights were relaxed in subsequent works [15, 1, 13, 11]. In [1] different mixing weights are used, while in [15, 13] the kernels are represented by multivariate Gaussian functions whose parameters are estimated by the Maximum Likelihood technique, similarly to our approach here. However, most of the above approaches assume that the total number K of ....

....equal among all kernels and equal to the reciprocal of the total number of input samples. The constraints the original PNN model imposed on the parameters of the kernels and the mixing weights were relaxed in subsequent works [15, 1, 13, 11]. In [1] different mixing weights are used, while in [15, 13] the kernels are represented by multivariate Gaussian functions whose parameters are estimated by the Maximum Likelihood technique, similarly to our approach here. However, most of the above approaches assume that the total number K of kernels is known a priori, and it turns out that the automatic ....

Streit R.L., Luginbuhl T.E.: Maximum likelihood training of probabilistic neural networks. IEEE Transactions on Neural Networks 5(5) (1994) 764--783.


Pattern Recognition via Neural Networks - Ripley

....from examples Popular accounts of neural networks often stress their ability to learn from examples. For example 2 It is an unacknowledged renaming of the classical technique of kernel discriminant analysis; see for example [17]. To add yet more confusion, the probabilistic neural network of [38] is a different classical statistical method, the use of mixtures of Gaussian distributions. Chou Chen [13] call that a new fast algorithm for the effective training of neural classifiers Harmful emissions could become a thing of the past thanks to ....

Streit, R. L. and Luginbuhl, T. E. (1994) Maximum likelihood training of probabilistic neural networks. IEEE Transactions on Neural Networks 5, 764--783.


Neural Processing Letters 9: 63--76, 1999. - Mixture Density Estimation

No context found.

R. Streit and T. Luginbuhl, "Maximum likelihood training of probabilistic neural networks," IEEE Transactions on Neural Networks, Vol. 5, No. 5, pp. 764--783, 1994.


A Comparison of Informative and Discriminative Estimation of.. - Goodman (2000)

No context found.

Streit, R. L. and Luginbuhl, T. E. (1994). Maximum likelihood training of probabilistic neural networks. IEEE Transactions on Neural Networks, 5(5):764--783.

CiteSeer.IST - Copyright Penn State and NEC