by Sin-horng Chen, Yuan-fu Liao, Song-mao Chiang, Saga Chang
IEEE Trans. Speech Audio Processing
Download: ftp://speech.cm.nctu.edu.tw/liao/fsm.ps.gz
Abstract:
This paper proposes a novel RNN-based front-end pre-classification scheme for fast continuous Mandarin speech recognition. First, a recurrent neural network (RNN) discriminates each input frame among three broad classes: initial, final, and silence. A finite state machine (FSM) then classifies the frame into one of four states: the three stable states Initial (I), Final (F), and Silence (S), and a Transient (T) state. The decision depends on whether the RNN discriminates well between the classes. For frames in the three stable states, the search space of the subsequent dynamic programming (DP) search is restricted to speed up recognition. The scheme was evaluated in simulations that incorporated it into an HMM-based continuous-speech recognizer for the 411 Mandarin base syllables. Experimental results showed that it can be used in conjunction with beam search to greatly reduce the computational complexity of the HMM recognizer while leaving the recognition rate almost undegraded.
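The pre-classification idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the FSM's actual transition rules, so everything specific here is an assumption made for illustration: the names `classify_frames` and `allowed_model_classes`, the `margin` threshold on the gap between the two highest RNN outputs, and the `hold` count of consecutive confident frames required before a stable state is entered.

```python
from typing import List, Sequence, Set

CLASSES = ("I", "F", "S")  # Initial, Final, Silence


def classify_frames(scores: Sequence[Sequence[float]],
                    margin: float = 0.4,
                    hold: int = 2) -> List[str]:
    """Label each frame as I, F, S, or T (Transient).

    scores[t] holds the three RNN outputs for frame t, ordered as CLASSES.
    Assumed rule (not taken from the paper): a frame is "confident" when the
    best output exceeds the runner-up by at least `margin`; a stable state is
    entered only after `hold` consecutive confident frames of the same class.
    """
    states: List[str] = []
    candidate = None  # class currently accumulating confident frames
    run = 0           # length of the current confident run
    for frame in scores:
        ranked = sorted(range(len(CLASSES)), key=lambda k: frame[k], reverse=True)
        best, second = ranked[0], ranked[1]
        if frame[best] - frame[second] >= margin:
            label = CLASSES[best]
            if label == candidate:
                run += 1
            else:
                candidate, run = label, 1
            states.append(label if run >= hold else "T")
        else:
            # The RNN does not discriminate well: fall back to Transient.
            candidate, run = None, 0
            states.append("T")
    return states


def allowed_model_classes(state: str) -> Set[str]:
    """Model classes the DP search may consider for a frame in `state`.

    Stable states restrict the search to one class; Transient keeps all three.
    """
    return set(CLASSES) if state == "T" else {state}


if __name__ == "__main__":
    demo = [
        [0.90, 0.05, 0.05],
        [0.90, 0.05, 0.05],
        [0.50, 0.45, 0.05],
        [0.10, 0.85, 0.05],
        [0.10, 0.85, 0.05],
    ]
    for state in classify_frames(demo):
        print(state, sorted(allowed_model_classes(state)))
```

In this sketch only frames labeled T keep the full search space, which is where the speed-up described in the abstract would come from; the threshold and hold values would need tuning against recognition accuracy.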