Chen, S S et al., "Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News," Proceedings of the Broadcast News Transcription and Understanding Workshop, 1999.
....the performance of the two systems are indistinguishable. The experiments using power exponential components used a modified baseline system consisting of approximately 120,000 Gaussians. The test was performed on a subset of the 1997 partitioned evaluation that was used for development [22]. Finally a smaller language model than for that of the tests with the Richter distribution where used, thus degrading the performance for the spontaneous speech category, F1, and for some of the more difficult conditions, F2 FX. Two power exponential systems were built. The first used a fixed value of ....
Chen, S S et al., "Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News," Proceedings of the Broadcast News Transcription and Understanding Workshop, 1999.
....efficiently using the bisection approach discussed in section 3.2. The convergence of this algorithm is still under investigation. 4 Experiments In this section, we present experiments applying our BIC based techniques on the 1997 DARPA broadcast news evaluation task. The IBM speech recognizer [2] was used in all our experiments. 4.1 Choosing the number of mixture components We conducted experiments comparing the BIC approach in section 3.2 with the heuristic thresholding method. We designed a system by the thresholding method which had 90K Gaussians. By choosing the penalty weight = ....
S.S. Chen et al, "Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News", Proc. ICASSP, 1999.
....exponential distribution. Both the proposed distributions are symmetric, so they do not address the skew symmetric problem. First, the Richter distribution is examined. This class of distributions was first suggested by Alan Richter in [9] and was referred to as the Richter distribution in [4]. The Richter distribution is a mixture of Gaussians where all the means are equal and the covariance matrices are multiples of each other. This may be considered as a particular form of Gaussian mixture parameter tying. A Richter distribution consisting of R Gaussians will only have 2R - 2 ....
....$\delta_{v_r}(v)$ with $w_r \ge 0$, $\sum_r w_r = 1$. Then $f(o; \mu, \Sigma; p(v)) = \sum_r w_r \, \mathcal{N}(o; \mu, v_r^2 \Sigma)$ (2). In addition to giving the formulae for calculating $\mu$ and $\Sigma$, formulae for ML estimates of the discrete distribution of $v$ are described. This form of distribution was used in [4] for discrete speech modelling, though in the experiments described the discrete distribution of $v$ was determined a priori rather than trained from the data. For large vocabulary speech recognition systems multiple Gaussian components are typically used to model each state. This paper therefore ....
[Article contains additional citation context not shown here]
S S Chen, E M Eide, M J F Gales, R A Gopinath, D Kanevsky, and P Olsen. Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news. In Proceedings of the Broadcast News Transcription and Understanding Workshop, 1999.
....Adaptive Training (SAT) scheme has been broadly used in BN transcription task. The idea is to clarify the linguistic acoustic variation from speaker variation. In 1998, we implemented this scheme into the evaluation system, most of the approach is similar as IBM 1998 English evaluation system [3], the only difference is, for Chinese, we use two blocks to process cepstral based parameters and pitch based parameters separately. We did not try a single block for both cepstral and pitch, and more study and experiments should be done in the future. The training data we used to train the SAT ....
Scott Chen et al., "Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News", in the same proceedings.
....with many evaluation results reported [7] so far. Regardless of all these developments, automatic recognition and efficient retrieval of broadcast radio and television news speech remains to be a very challenging research topic because of the wide variety of speaking styles and acoustic conditions [8-10], and the various different problems in spoken document retrieval. There have been several different approaches developed for spoken document retrieval (SDR) in recent years. Word based retrieval approaches have been very popular and successful, although with the potential problems of either ....
S.S. Chen, E. M. Eide, M. J. F. Gales, R. A. Gopinath, D. Kanevsky, and P. A. Olsen, "Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News," In Proc. ICASSP, 1999.
No context found.
Chen, S.S., Eide, E.M., Gales, M.J.F., Gopinath, R.A., Kanevsky, D. & Olsen, P. (1999). Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (pp. 37-40).
No context found.
Chen S. S., Eide E. M., Gales M. J. F., Gopinath R. A., Kanevsky D., Olsen P., Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News, Proceedings of ICASSP 1999, pp. 37-40.