Montacie, C., Choukri, K., and Chollet, G. (1989). Speech recognition using temporal decomposition and multi-layer feed-forward automata. In Proc. of ICASSP 89, pages I--409--412.
.... the amount of training data required from a new speaker to achieve acceptable recognition performance. The inter-speaker variation of the acoustic data is reduced by estimating a feature vector transformation between the acoustic parameter space of the new speaker and that of the reference speaker (Montacie et al., 1989; Class et al., 1990; Nakamura and Shikano, 1990; Huang, 1992; Matsukoto and Inoue, 1992). This multivariate transformation, also called spectral mapping given the type of features considered in the parameterization of speech data, provides an acoustic front end to the recognition system. ....
.... (Furui and Sondhi, 1991). Good performance has been achieved with spectral mapping techniques based on mean squared error (MSE) optimization (Class et al., 1990; Matsukoto and Inoue, 1992). Alternative approaches estimate the spectral normalization mapping with multi-layer perceptron neural networks (Montacie et al., 1989; Nakamura and Shikano, 1990; Huang, 1992; Watrous, 1994). This paper introduces a supervised speaker normalization method based on neural network regression with a generalized local basis model of elliptical kernels (Generalized Resource Allocating Network: GRAN model). Kernels are recursively ....
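To make the idea of an MSE-optimized spectral mapping concrete, here is a minimal sketch. It is not the method of any of the cited papers. It assumes that frames from the new speaker and the reference speaker have already been time-aligned, and it simplifies the multivariate transformation to an independent scale and offset per feature dimension, fitted in closed form by least squares. The function names `fit_diagonal_affine_map` and `apply_map` and the toy data are hypothetical.

```python
from statistics import fmean


def fit_diagonal_affine_map(new_frames, ref_frames):
    """Fit y_d ~ a_d * x_d + b_d for each feature dimension d by least squares.

    new_frames and ref_frames are equal-length lists of time-aligned feature
    vectors (new speaker and reference speaker). Returns (scales, offsets).
    Assumption: a diagonal affine map, not a full matrix or a neural network.
    """
    if not new_frames or len(new_frames) != len(ref_frames):
        raise ValueError("need the same non-zero number of aligned frames")
    dim = len(new_frames[0])
    scales, offsets = [], []
    for d in range(dim):
        xs = [frame[d] for frame in new_frames]
        ys = [frame[d] for frame in ref_frames]
        mean_x, mean_y = fmean(xs), fmean(ys)
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # With no variation in x, fall back to a pure offset (scale of 1).
        scale = cov_xy / var_x if var_x > 0 else 1.0
        scales.append(scale)
        offsets.append(mean_y - scale * mean_x)
    return scales, offsets


def apply_map(frame, scales, offsets):
    """Map one new-speaker feature vector into the reference speaker's space."""
    return [a * x + b for x, a, b in zip(frame, scales, offsets)]


# Toy aligned data (hypothetical): dimension 0 follows y = 2x + 1,
# dimension 1 follows y = 0.5x - 3.
new_speaker = [[0.0, 2.0], [1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
reference = [[1.0, -2.0], [3.0, -1.0], [5.0, 0.0], [7.0, 1.0]]

scales, offsets = fit_diagonal_affine_map(new_speaker, reference)
normalized = apply_map([4.0, 10.0], scales, offsets)
```

In a recognition front end, a map of this kind would be fitted once on the new speaker's adaptation data and then applied to every incoming frame before recognition. A full-matrix or nonlinear mapping, as in the approaches cited above, would replace the per-dimension fit.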