G. Adda et al., "Text Normalization and Speech Recognition in French," EuroSpeech'97, Rhodos, Sept. 1997.
.... comparable conditions [6]. This difference between French and English mainly stems from the number and gender agreement in French for nouns, adjectives and past participles, and the high number of different verb forms [6]. This lexical variety can be partly reduced by appropriate text normalization [1], but there is a need for larger text corpora for training French LMs [2]. In this paper we address the impact of the text training data epoch and size on lexical coverage, language model (LM) perplexity and recognition results. Recognition results are presented and compared on 20k and 65k ....
.... between coverage and language modeling is investigated more deeply here. Lexical Coverage: The problem of lexical coverage has been addressed along different axes: word list size, word definition and word list selection. Text training corpora from Le Monde have been divided in different subsets [1] in order to assess the impact of training data size and epoch on vocabulary design: 0: years 1994-95 (40M words). Word list size: Better lexical coverage is obtained by increasing the number of words in the recognition word list. OOV rates are displayed in Table 1 for lexicon sizes ....
[Article contains additional citation context not shown here]
G. Adda et al., "Text Normalization and Speech Recognition in French," EuroSpeech'97, Rhodos, Sept. 1997.
.... automatic transcription system consists of an audio partitioner [7] and a speech recognizer [6]. Combined with an indexation module, it has been used for building the SDR indexation system [8]. The recognition system initially developed for American English has been ported to the French language [1, 2], which is one of the target languages of the ALERT project and was chosen for the experiments of this paper. Prior to word recognition, the .... [Footnotes from the citing page: 2 http: speechbot.research.compaq.com; 3 http: www.real.com; 4 http: www.cselt.it mpeg; 5 http: www.tnt.uni hannover.de project mpeg audio]
G. Adda, M. Adda-Decker, J.L. Gauvain, L. Lamel, "Text Normalization and Speech Recognition in French", Proc. ESCA Eurospeech'97, pp. 2711-2714, Rhodes, Greece, Sep. 1997.
.... for which the number of occurrences in the corpus is increased. Language model perplexity: Table 7 gives 4-gram language model (LM) perplexities (ppx) on the 30k development set from the transcripts. Perplexities are normalized to take into account changes in corpus size due to decompounding [5]. OOV words are discarded for ppx com.... [Page header: ICSLP 2000 Beijing, Adda Decker Adda Lamel, Vol I, 268] [Table 6: OOV rates obtained with 65k lexica on 3 forms of the 30k development .... Columns: 30k dev.; 65kn; 65kn t; rel. (n, n t). Rows: original: 5.2, 4.5. all decomp: 4.7, 4.0, 10, 11. all decomp inflect.: 3.4, 2.9, 35, 36.]
Adda G., Adda-Decker M., Gauvain J.L., Lamel L. (1997), "Text normalization and speech recognition in French", ESCA Eurospeech, Rhodes.
.... performance is obtained with the larger lexica. A gain of over 20% is observed with the 20k lexicon which includes the 20 most frequent multi-character words. The addition of more words yields much smaller gains. There is a corresponding decrease in the normalized word and character perplexities [2], also shown in the table. Including tone information in the lexicon improves the recognizer performance by about 5% relative. Table 2 summarizes some experimental results comparing different language models for the 50k lexicon and running slower (35xRT instead of 10xRT). The first line gives the ....
G. Adda, M. Adda-Decker, J.L. Gauvain, L. Lamel, "Text normalization and speech recognition in French," Proc. Eurospeech '97, pp. 2711--2714, Rhodes, Greece.
No context found.
G. Adda et al., "Text Normalization and Speech Recognition in French," Eurospeech'97, Rhodes, pp. 56-59, Sept. 1997.
No context found.
Adda, G., Adda-Decker, M., Gauvain, J.L. and Lamel, L., "Text Normalization and Speech Recognition in French," Proceedings of the European Conference on Speech Technology, EuroSpeech, Rhodes, Greece, 1997.
No context found.
G. Adda, M. Adda-Decker, J.L. Gauvain, L. Lamel. Text Normalization and Speech Recognition in French. Eurospeech'97, Rhodes.