| L. A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28:729--734, 1982. |
....delta function) corresponds to a standard Gaussian distribution. By appropriately choosing p(v) the tails and the peakiness of the distribution can be controlled. An EM scheme for ML estimates of and that does not require explicitly obtaining the distribution p(v) is described in [12]. A discrete version of (13) is described in [19] where p(v) = sum_r w_r delta_{v_r}(v) with w_r >= 0, sum_r w_r = 1. The paper also gives formulae for 16 ML estimates of , and w_r. This form of distribution was used in [6] for discrete speech modeling, though in the experiments described the discrete ....
....described in (13) Power exponential distributions cannot in general be modelled with Richter distributions. This fact can be verified by noticing that functions in the class (13) are all log-concave, whereas the power exponentials are not log-concave for 0 1. This makes the framework of [12] unsuitable for parameter update for 0 1. The estimation formula for w is identical to the standard HMM re-estimation formulae. Update formulae for and are suggested in [5] q ; 26) and ; 27) is defined in equation ....
L. A. Liporace, "Maximum Likelihood Estimation for Multivariate Observations of Markov Sources," IEEE Transactions Information Theory 28, pp. 729-734, 1982.
....L( j x) i=1 log f(x i j ) of the data x = x 1 ; x 2 ; xN ) 2 R with respect to and the constraints j=1 j = 1, 5 j 0. In addition there is the requirement that A j A must be positive definite. The optimization strategy will be according to the EM algorithm [21, 3]. An outline of the derivation for the EM algorithm is presented here. If x is generated by a random variable X with probability density function (1) with parameters it will maximize the function L( E [log f(X j ) at the value = L( lim N 1 L( j x) indicates that may ....
L. A. Liporace, "Maximum Likelihood Estimation for Multivariate Observation of Markov Sources," IEEE Transactions of Information Theory 5, pp. 729-734, 1982.
....: dg with m n. 2 Parameter Estimation The parameters are chosen to maximize the log-likelihood L( j x) i=1 log f(x i j ) of the data x = x 1 ; x 2 ; xN ) 2 R with respect to and the constraints j = 1, j 0 and i 0. For the optimization the EM methodology [11, 2] is used. For the computational details here see [12], where theoretical details as well as implementation details are explained. [12] also considers MAP adaptation and models with a Bayesian prior for the same model as well as representing as a sparse matrix for efficient computation. The EM ....
L. A. Liporace, "Maximum Likelihood Estimation for Multivariate Observation of Markov Sources," IEEE Transactions of Information Theory 5, pp. 729-734, 1982.
....vector : CLHL One possible optimization scheme is a coordinatewise descent algorithm with a golden section line search [27] but a more efficient scheme may be a gradient algorithm [27] B. Likelihood Gradient The EM algorithm relies on an auxiliary function, which is usually denoted [42] [43] built on two hyperparameter vectors and by completing the observed data set with parameters to be marginalized : With the proposed notations, usual hidden Markov chains calculations yield (17) where we have the following. and ( are parameters of the model under hyperparameters ....
L. A. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Inform. Theory, vol. IT-28, pp. 729--734, Sept. 1982.
....nature, Bayesian learning is shown to serve as a unified approach for a wide range of speech recognition applications. 1 Introduction Estimation of a probabilistic function of a Markov chain, also called a hidden Markov model (HMM), is usually obtained by the method of maximum likelihood (ML) [1, 2, 23, 15], which assumes that the size of the training data is large enough to provide robust estimates. This paper investigates maximum a posteriori (MAP) estimation of continuous density hidden Markov models (CDHMM). The derivations given here can straightforwardly be extended to the subcases of discrete ....
L. A. Liporace, "Maximum Likelihood Estimation for Multivariate Observations of Markov Sources," IEEE Trans. Inform. Theory, vol. IT-28, no. 5, pp. 729-734, September 1982.
....database; 1000 word recognition on the Resource Management Database; and 5,000 and 20,000 word recognition on the Wall Street Journal Database. 2 Overview of the HTK Toolkit The HTK toolkit is designed to facilitate the construction of systems using continuous density Gaussian mixture HMMs [Liporace 1982, Juang 1985, Juang et al. 1986, Bahl et al. 1987]. It consists of a number of tools (programs) and a comprehensive set of library interface modules. The library modules ensure that all tools behave in a uniform way and they also simplify the development of new tools. 2.1 The HTK Library The HTK ....
Liporace LA. Maximum-Likelihood Estimation for Multivariate Observations of Markov Sources. IEEE Trans Information Th, Vol IT-28, No 5, pp 729-734.
....the sufficient statistics can be obtained easily as j (t) j (t) j (t) 102) ij (t) i (t 1)a ij b t (o t ) j (t) 103) To update the mixture components the mixture posteriors jm (t) P q t ; j; mjO;M) are needed. Since the mixture components can be regarded as additional states [31], the joint likelihood of being in state j and mixture component m is jm (t) c jm b jm (o t ) where b jm (o t ) is the posterior of the observation o t given the state j and the mixture component m. The traditional derivation of the Baum-Welch algorithm can be found in [35] A ....
L.A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729-734, 1982.
....can be obtained easily as j (t) 1 p(O) j (t) j (t) 101) ij (t) 1 p(O) i (t 1)a ij b t (o t ) j (t) 102) To update the mixture components the mixture posteriors j;m (t) P q t ; j; mjO;M) are needed. Since the mixture components can be regarded as additional states [31], the joint likelihood of being in state j and mixture component m is jm (t) 1 p(O) c jm b jm (o t ) Ns X i=1 a ij i (t 1) j (t) 103) where b jm (o t ) is the posterior of the observation o t given the state j and the mixture component m. The traditional derivation of the Baum-Welch ....
L.A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729-734, 1982.
....is used to explicitly model the stochastic relationship between two (input and output) event sequences, then yielding to a model usually referred to as input output HMM. The parameters of these models can be trained by different variants of the powerful Expectation Maximization (EM) algorithm [1, 12], which, depending on the criterion being used, is referred to as Maximum Likelihood (ML) or Maximum A Posteriori (MAP) training. However, although being part of the same family of models, all these models exhibit different properties. The present paper thus aims at presenting and comparing some ....
....In this case, the set of parameters comprises all the Gaussian means and variances, mixing coefficients, and transition probabilities. These parameters are then usually trained according to the maximum likelihood criterion (1) resulting in the efficient Expectation Maximization (EM) algorithm [8, 12]. Given this formalism, the likelihood of an observation sequence X given the model M can be calculated by extending the forward recurrence (4) defined for Markov models to also include the emission probabilities. Assuming a Moore automaton (emission on states) we thus have the ....
Liporace, L.A., 1982, Maximum Likelihood Estimation for Multivariate Observations of Markov Sources, IEEE Trans. on Information Theory, vol. IT-28, no. 5, pp. 729-734.
....when using full or block diagonal covariance matrices there tends to be a dramatic increase in the number of parameters per Gaussian component, limiting the number of components which may be robustly estimated. To overcome this problem multiple diagonal-covariance Gaussian distributions may be used [16, 13]. In addition to being able to model non-Gaussian distributions they can model correlations. However, it is preferable to decorrelate the feature vector as far as possible, as otherwise components must be used to model correlations rather than the possible non-Gaussian nature of the density ....
L A Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions Information Theory, 28:729-734, 1982.
....taken, rather than the sum. 2.3. Adapting Richter Distributions It is also common to use linear transformations to adapt model parameters to be more representative of a particular speaker or acoustic environment. A variety of linear transformations and re-estimation formulae are described in [6]. Modifying these formulae to handle Richter distributions is trivial. The main modification is to deal with fl (m) r ( v (m)2 r rather than the standard posterior component probability. As an example the estimation formulae for the transform A in maximum likelihood linear regression, ....
....base 11.6 18.5 18.7 base adapt 10.1 17.0 16.4 Richter 11.3 18.1 18.4 Richter adapt 10.1 16.9 16.3 Table 1: Results on the Hub4 1997 partitioned evaluation test set and the equivalent baseline system. The adaptation scheme used in both was a global mean and full variance transform described in [6]. This was applied in an unsupervised batch adaptation mode. Using Richter components showed a small gain in performance over the standard Gaussian components. After adaptation the performance of the two systems was almost indistinguishable. The experiments using power exponential components used ....
L A Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions Information Theory, 28:729--734, 1982.
....the process to be modeled. Another frequently used method is the HMM likelihood function introduced by Liporace [10]. [Figure 3: Required number of HMM states N versus Eb/N0 (dB) for the AWGN, flat fading, and vehicular channels.] We used a related probabilistic distance measure, the scoring indicator introduced in [11, 7]. For all channels and each Eb/N0 value considered, we first evaluated the simplest one-state model, investigating then more elaborate models with more states by inspection of the KLD and the scoring ....
L. A. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Informat. Theory, vol. IT-28, no. 5, pp. 729-734, 1982.
....acoustic phonetic TIMIT database. The speech recognition results demonstrate the advantages of the time warping trended HMMs over the regular trended HMMs: about 10 to 15% improvement in terms of the recognition rate. 1. Introduction The standard hidden Markov model (HMM) developed in [1, 10] and widely in use for speech recognition [11] contains the mathematical structure of a (hidden) Markov chain with each state associated with a distinct independent and identically distributed (IID) or a stationary random process. The model is used as a type of data generator for speech signals ....
L.A. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Information Theory, Vol. 28, pp. 729-734, 1982.
No context found.
L. A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28:729--734, 1982.
No context found.
L.A. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Inf. Theory, vol.28, no.5, pp.729--734, Sept. 1982.
No context found.
L.A. Liporace, Maximum Likelihood Estimation for Multivariate Observations of Markov Sources. IEEE Transactions on Information Theory, Vol. IT-28(5), pp. 729-734, Sept. 1982.
No context found.
L.A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729--734, 1982.
No context found.
L.A. Liporace, Maximum Likelihood Estimation for Multivariate Observations of Markov Sources. IEEE Transactions on Information Theory, Vol. IT-28(5), pp. 729-734, Sept. 1982.
No context found.
Liporace, L.A.: Maximum Likelihood Estimation for Multivariate Observations of Markov Sources. In: IEEE Transactions on Information Theory, Vol. IT-28(5) (Sept. 1982), 729--734
No context found.
Louis A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Trans. Information Theory, 28(5):729-734, 1982.
No context found.
L.A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729--734, 1982.
No context found.
L.A. Liporace (1982), "Maximum likelihood estimation for multivariate observation of Markov sources," IEEE Transactions on Information Theory, Vol. IT-28, No. 5, pp. 729-734.
No context found.
L.A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729--734, 1982.
No context found.
L. A. Liporace, "Maximum likelihood estimation for multivariate observation of Markov sources," IEEE Transactions of Information Theory, 1982.
No context found.
L. A. Liporace. Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory, IT-28(5):729--734, 1982.