James, D. A. and S. J. Young (1994). A fast lattice-based approach to vocabulary independent wordspotting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 377--380.
....overall. 7 Conclusions and Further Work Our first retrieval system is admittedly limited in scale, and works mainly on an artificial retrieval task. Current plans are to enhance the retrieval system to accommodate open keyword and user sets, allowing a search for arbitrary words spoken by anyone [16]. Using the system in a real world office environment will undoubtedly raise other issues which must be addressed, such as noise robustness and the actual information content of video mail messages. These should not be insurmountable, however, and the first steps presented here suggest that ....
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proceedings of ICASSP 94, pages 377-380, Adelaide, 1994. IEEE.
....error rate on the 5000 word bigram (8.8%), 5000 word trigram (5.0%) and 20000 word bigram (14.5%) and the second lowest error rate on the 20000 word trigram (12.7%) [Woodland et al. 1994]. In addition to building speech recognition systems, HTK has been used to build word spotting systems [James & Young 1994], speaker separation systems [Wang & Young 1992] and even face recognition systems [Samaria 1993]. It has also been used as a research environment for studies in discriminative training [Woodland 1992, Kapadia et al. 1993], hybrid HMM Neural Net systems [Young 1991, Valtchev et al. 1993], prosody ....
James D, Young SJ. A Fast Lattice-Based Approach to Vocabulary Independent Wordspotting. ICASSP 94, Adelaide.
....time. Though this takes substantial computation, it is less expensive than a large-vocabulary recognition system, and has the additional advantage that it requires no language model. Once computed, the phone lattice may be rapidly scanned using a dynamic programming algorithm to find index terms [10]. This requires a phonetic decomposition of any desired words, but these are easily found from a dictionary or by a rule-based algorithm [4]. For comparison, the lexicon of our large-vocabulary experiments was 20,000 words, while our phonetic dictionary has more than 200,000 entries. This ....
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proc. ICASSP 94, volume I, pages 377-380, Adelaide, 1994. IEEE.
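The excerpt above notes that each search word needs a phonetic decomposition, usually taken from a dictionary. A minimal Python sketch of that lookup step follows. The dictionary entries, the phone symbols, and the function names `phones_for` and `query_to_phones` are all hypothetical and only illustrate the idea; they are not taken from the cited system, and the rule-based fallback mentioned in the excerpt is left as a placeholder.

```python
# Hypothetical toy pronunciation dictionary; entries and phone symbols are illustrative only.
PRONUNCIATIONS = {
    "cat": ["K", "AE", "T"],
    "mail": ["M", "EY", "L"],
}


def phones_for(word):
    """Return the phone sequence for a word, or None if the word has no dictionary entry."""
    return PRONUNCIATIONS.get(word.lower())


def query_to_phones(query):
    """Split a text query into words and look up each word's phone sequence.

    Returns (found, missing): a dict of word -> phones, and a list of words
    that have no dictionary entry.
    """
    found, missing = {}, []
    for word in query.split():
        word = word.lower()
        phones = phones_for(word)
        if phones is None:
            missing.append(word)  # a rule-based letter-to-sound fallback would go here
        else:
            found[word] = phones
    return found, missing
```

Words that come back in `missing` are the ones a rule-based pronunciation generator would need to handle.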
....focused on the content-based retrieval and browsing of video mail messages. Earlier versions of the VMR system used a fixed keyword vocabulary and conventional word spotting [1, 2]. The final VMR system is shown in Fig. 1 and it uses phone lattices to allow the use of unrestricted keywords [3]. To retrieve a desired message, a user enters a text search request and the precomputed phone lattice attached to each message is examined for occurrences of the search words. These occurrences are then combined using an information retrieval (IR) metric to give a score for each document. ....
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proc. ICASSP 94, volume I, pages 377-380, Adelaide, 1994. IEEE.
....for SDR. Phoneme level language models The most popular subword unit to be applied to SDR is the phoneme. A simple approach implements a recognizer that weights every phoneme as equally probable (does not implement an n-gram language model) and performs retrieval on the resulting N-best lattices [10]. Other SDR systems are built on top of recognizers using n-gram phone-based language models [25]. A more sophisticated system is described in [2]. The LVCSR system builds a phoneme lattice which is not searched directly for keywords, but rather used in a rough match to identify time locations ....
D.A. James and S.J. Young. A fast lattice-based approach to vocabulary independent word spotting, ICASSP 1994.
....system. If the phone lattice is generated before need, it can then be searched extremely rapidly to find phone strings corresponding to desired query words. James reports the lattice scanner approach working at about 1000 times real time; in other words an hour of audio may be searched in 3.6 seconds [23]. Figure 2 shows an example lattice for the word cat. Keeping multiple hypotheses makes the system much more robust to recognition errors. For example, in the figure, even though the phone T was not the first choice, it is still in the lattice, thus the phone string K AE T can still be found. ....
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proc. ICASSP 94, volume I, pages 377--380, Adelaide, 1994. IEEE.
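The speed figure quoted in the excerpt above is simple arithmetic: one hour is 3600 seconds, and a scanner running 1000 times faster than real time needs 3600 / 1000 = 3.6 seconds. The short Python check below makes that explicit; the function name `search_time_seconds` is a hypothetical helper, not part of any cited system.

```python
def search_time_seconds(audio_seconds, speedup):
    """Time needed to scan audio when the scanner runs `speedup` times faster than real time."""
    return audio_seconds / speedup


ONE_HOUR = 60 * 60  # seconds of audio in one hour

if __name__ == "__main__":
    # An hour of audio at about 1000 times real time.
    print(search_time_seconds(ONE_HOUR, 1000))
```

The same helper gives the scan time for other collection sizes, assuming the speedup stays constant, which the excerpt does not claim.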
....words. Word models were built by concatenating the appropriate subword models, and a garbage model was built by an agglomerative clustering of the model states, producing a set of garbage monophones which would be deliberately worse at modelling the speech [79]. The models were placed in a conventional wordspotter recognition network, as illustrated earlier in Figure 4.2, and the HTK Viterbi recogniser HVite used to decode each spoken message into a string of query term and garbage word occurrences. The recognition output was rescored using the output ....
D. A. James and S. J. Young. A Fast Lattice-based Approach to Vocabulary Independent Wordspotting. In Proc. Int. Conf. Acoust., Speech., Sig. Processing, volume I, pages 377--380, Adelaide, 1994. IEEE.
....in [RJN 93] uses a forward-backward posterior probability to score keywords. Several of these wordspotting techniques are compared in [RJN 93]. A completely different approach for keyword spotting in speech documents by means of hidden Markov models has been taken by David James et al. [JY94]. This wordspotter creates a phone level transcription of each speech document which contains phoneme guesses for every point in time. The search for keywords is performed on the phone level transcription using dynamic programming techniques. The advantage of this system over the aforementioned ....
....documents are taken from the 3000 most frequent words that occur in texts of an arbitrary domain. From these words the stop words are removed. The remaining words constitute the indexing vocabulary. The indexing features are detected in speech documents by means of the wordspotter presented in [JY94]. In order to improve the recognition a language model is used. If a query contains a word that is not contained in the indexing vocabulary yet, it is added to the vocabulary and the document descriptions are updated accordingly. In the latter project the Cambridge Olivetti Retrieval System ....
[Article contains additional citation context not shown here]
D. James and S.J. Young. A Fast Lattice-Based Approach to Vocabulary-Independent Word Spotting. In International Conference on Acoustics, Speech, and Signal Processing, 1994.
....be guaranteed. Instead, term models were built by concatenating monophones, and placed in a parallel network with a garbage model. The garbage model was built from the monophone states using an agglomerative clustering procedure which allows the garbage model accuracy to be precisely controlled [12]. In this case, the number of garbage model states was set to allow a large number of term hypotheses per message, and the term log likelihood scores post-processed to obtain the durationally normalised log likelihood ratio score (DNLLR) [13]. Setting a single, keyword independent threshold on ....
....of paths, each labelled with a phone hypothesis and acoustic log likelihood score, at every point. Figure 1 illustrates an example of a lattice. At query time, each term is detected by searching each precomputed lattice for the exact phone sequence corresponding to the pronunciation of that term [12]. The advantage of the method is that messages can be indexed at query time far more quickly by lattice wordspotting than by Viterbi wordspotting. In addition, since each message lattice contains, by definition, the Viterbi phone sequence, term detections can be assigned the DNLLR score with no ....
D. A. James and S. J. Young. A Fast Lattice-Based Approach to Vocabulary Independent Wordspotting. In Proc ICASSP, pp I-377-380. IEEE, Adelaide, 1994.
....simple [Godin 89], but it is not very extensible and offers no direct way to integrate or control emission probabilities. As an alignment algorithm it is used in combination with other methods. In particular, it is used for keyword spotting based on phoneme lattices [James 94], where the lattice may be built by another method, in this case a modified Viterbi algorithm. It is also often used for post- and pre-processing steps for which heavy training is not desired. 3.1.2 Neural Networks There are ....
....511-514, 1993. [Clary 92] G.J. Clary, J.H.L. Hansen, "A Novel Speech Recognizer for Keyword Spotting", ICSLP 92, pp. 13-16, 1992. [Godin 89] C. Godin, P. Lockwood, "DTW schemes for continuous speech recognition: a unified view", IEEE Computer Speech and Language (1989), vol. 3, pp. 169-198, 1989. [James 94] D.A. James, S.J. Young, "A Fast Lattice-based Approach to Vocabulary Independent Wordspotting", ICASSP 94, vol. 1, pp. 377-380, 1994. [Kimura 87] T. Kimura, K. Niyada, S. Hirakoa, S. Morii, T. Watanabe, "A Telephone Recognition System Using Word Spotting Technique based on Statistical Measure", ....
D.A. James, S.J. Young, "A Fast Lattice-based Approach to Vocabulary Independent Wordspotting", ICASSP 94, vol. 1, pp. 377-380, 1994.
....tasks. 4. CONCLUSIONS AND FURTHER WORK Our tests so far have been very limited in scale, and in an artificial laboratory retrieval environment. Work is underway on enhancing the retrieval system to accommodate open keyword and user sets, allowing a search for arbitrary words spoken by anyone [9]. Using the system in a real world office environment will undoubtedly raise other issues which must be addressed, such as noise robustness and the actual information content of video mail messages. These should not be insurmountable, however, and the first steps presented here suggest that ....
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proceedings of ICASSP 94, pages I-(377-380), Adelaide, 1994. IEEE.
No context found.
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent wordspotting. In Proc. ICASSP 94, volume I, pages 377-380, Adelaide, 1994. IEEE.
....performance. 4.3 Phone Lattice based Word Spotting (PLS) Table 1: FOM summary for WS systems (WS: SD 81.2, SI 69.9; PLS: SD mo 73.6, SI mo 48.0, SI bi 60.4). The PLS word spotting technique involves searching a phone lattice for the sequence of phones corresponding to a particular search term [James & Young, 1994]. A phone lattice is a directed acyclic graph whose nodes consist of start/end times, and whose arcs are putative phone occurrences, which are labelled with the phone's acoustic score. Phone lattices may be computed in advance, and rapidly scanned for an arbitrary phone sequence at search time. ....
James, D. A., & Young, S. J. (1994). A fast lattice-based approach to vocabulary independent wordspotting. In Proc. ICASSP 94, volume I, pp. 377--380, Adelaide. IEEE.
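The lattice described in the excerpt above (time nodes, phone-labelled arcs with acoustic scores) can be scanned for a phone sequence with a small dynamic-programming pass. The Python sketch below is an illustration under stated assumptions, not the algorithm from the cited paper: the `Arc` tuple, the function `find_phone_sequence`, the integer frame times, and the choice to sum arc scores as log likelihoods are all hypothetical.

```python
from collections import defaultdict, namedtuple

# One lattice arc: a putative phone between two time nodes, with an acoustic score.
# Assumption: scores are log likelihoods, so a path score is the sum of its arc scores.
Arc = namedtuple("Arc", "start end phone score")


def find_phone_sequence(arcs, phones):
    """Find every exact occurrence of `phones` as a connected path in the lattice.

    Returns a list of (score, start_time, end_time) tuples, best score first,
    keeping only the best-scoring path for each (start_time, end_time) pair.
    """
    if not phones:
        return []

    # Index arcs by start node so paths can be extended quickly.
    by_start = defaultdict(list)
    for arc in arcs:
        by_start[arc.start].append(arc)

    # partial[(match_start, current_end)] = best total score after the phones matched so far
    partial = {}
    for arc in arcs:
        if arc.phone == phones[0]:
            key = (arc.start, arc.end)
            partial[key] = max(partial.get(key, float("-inf")), arc.score)

    # Extend every partial match by one phone at a time.
    for phone in phones[1:]:
        extended = {}
        for (match_start, end), score in partial.items():
            for arc in by_start[end]:
                if arc.phone == phone:
                    key = (match_start, arc.end)
                    total = score + arc.score
                    if total > extended.get(key, float("-inf")):
                        extended[key] = total
        partial = extended

    return sorted(
        ((score, start, end) for (start, end), score in partial.items()),
        reverse=True,
    )


# Toy lattice (times in frames): two competing phone hypotheses per segment.
EXAMPLE_LATTICE = [
    Arc(0, 10, "K", -5.0),
    Arc(0, 10, "G", -4.0),
    Arc(10, 25, "AE", -6.0),
    Arc(10, 25, "EH", -5.5),
    Arc(25, 40, "D", -3.0),
    Arc(25, 40, "T", -4.5),
]
```

In the toy lattice, `find_phone_sequence(EXAMPLE_LATTICE, ["K", "AE", "T"])` finds one path spanning frames 0 to 40 even though T is not the best-scoring final phone, which is the benefit of keeping alternative hypotheses in the lattice.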
....addition to their speed advantage, these methods require little memory so they can be applied on any platform, including hand held devices. This contrasts with other fast implementation approaches reported previously such as lattice based wordspotting systems which have been shown to be very fast [3], but require a large amount of memory for lattice storage. 2. STANDARD WORD SPOTTING SYSTEM Two recognition passes are run in the standard wordspotting system. In the first, the keyword and filler models are run together to determine putative keyword hits. The filler models are also applied ....
James, D.A. and Young, S.J. A fast lattice-based approach to vocabulary independent wordspotting, Proc ICASSP'94, Adelaide, 1994.
No context found.
James, D. A. and S. J. Young (1994). A fast lattice-based approach to vocabulary independent wordspotting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 377--380.
No context found.
James DA, Young SJ. A Fast Lattice-Based Approach to Vocabulary Independent Wordspotting. In: Proceedings ICASSP'94, Adelaide 1994, IEEE.
No context found.
D. A. James and S. J. Young. A fast lattice-based approach to vocabulary independent word spotting. In Proc. of ICASSP, vol. 1, pages 377--380, Adelaide, Australia, 1994.
No context found.
D. A. James and S. J. Young. A Fast Lattice-based Approach to Vocabulary Independent Wordspotting. In Proc. Int. Conf. Acoust., Speech., Sig. Processing, volume I, pages 377-380, Adelaide, 1994. IEEE.
No context found.
D. James, S. Young, "A fast lattice-based approach to vocabulary independent wordspotting", Proc. ICASSP '94, 1994.