| M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996. |
....4.1. Disk-based search network In order to tackle the problem of large memory requirements needed for storing the precompiled search network, we proposed and implemented an approach of keeping the network on disk and only loading the required parts on demand. This is inspired by an approach in [8] for reducing fast memory requirements when coping with large n-gram language models. The idea is to keep the search network on disk in a dedicated file structure, to load a node and its outgoing arcs only when a token reaches it and to unload it as soon as it is no longer activated. The ....
M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD Thesis, CMU, Pittsburgh, 1996.
....and medical experts, Pearl also features two sturdy handle bars, a compact design that allows for cargo space, a removable tray, and a sophisticated head unit. On the software side, the robot features off the shelf autonomous mobile robot navigation system [5,28] speech recognition software [24], speech synthesis software [4] fast image capture and compression software for online video streaming, face detection tracking software [25] as well as the three major new software modules described in this paper. These modules are principally concerned with people interaction and control. They ....
M. Ravishankar, Efficient algorithms for speech recognition, Ph.D. Thesis, School of Computer Science, Carnegie Mellon University, 1996.
....hypothesis records, with a top scored path in bold. 4. Word Model Pool When a search space of active word models grows too large, it must be reduced to a limited set for computational reasons. Typically, pruning methodologies apply two forms of threshold criteria, used individually or combined [3, 4]. The relative likelihood approach, also referred to as beam pruning, is to process partial hypotheses that differ by no more than a relative distance from the best scored partial hypothesis. Although this method is computationally efficient, it will result in an uneven time distribution of the ....
Ravishankar, M. K., Efficient algorithms for speech recognition. Ph.D. Thesis, Carnegie Mellon University, 1996.
....because it weeds out sentences which are not occurring in the normal language. 4.2 How it works Figure 4.1: Speech recognizer schematics In the figure 4.1 we see the schematics of the speech recognizer using the same principles as Sphinx II uses. More detailed description can be found in [6]. The processing is as follows: 1. The input waveform is preprocessed by the front end. Usually, signal is normalized and then the feature vector is computed. In the speech recognition, cepstrum is often used, together with signal power and some other values. Output of the front end is the ....
M. K. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, 1996.
....easily grow to sizes that do not allow to completely load them into fast memory, at least not so on standard workstations. For this reason, we have set up a specific file structure for storing the recognition network on disk, that allows a fast on-demand access. This approach was inspired by [5], where a disk-based access mode for backoff n-gram language models has been proposed. In our approach though, the whole recognition network, comprising language model, HMM topology, pronunciation network etc. is kept on disk and accessed only on demand. The structure of the involved files is ....
M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD Thesis, CMU, Pittsburgh, 1996.
....nurses and medical experts following deployment of the first robot, Flo. Pearl was largely designed and built by the Standard Robot Company in Pittsburgh, PA. On the software side, both robots feature off the shelf autonomous mobile robot navigation system [5, 24] speech recognition software [20], speech synthesis software [3] fast image capture and compression software for online video streaming, face detection tracking software [21] and various new software modules described in this paper. A final software component is a prototype of a flexible reminder system using advanced planning ....
M. Ravishankar. Efficient algorithms for speech recognition, 1996. Internal Report.
....of cases where the letter sequence recognised matches exactly with the name spelled. All the experiments have been run over a Pentium II 350 MHz with 128 MB of RAM, so all the Processing Time results provided are referred to this computer. These results are given in xRealTime units (xRT) as in [5]. 1xRT is the average time spent to pronounce the spelled name. 3. ARCHITECTURES PROPOSED In all the architectures, we use letter CHMMs with a number of states proportional to the length of each letter. The shortest model has 9 states and it was associated to the vowel letter I, the longest one ....
Ravishankar, M.K. "Efficient Algorithms for Speech Recognition". Unpublished PhD Dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 15, 1996.
....It is therefore of great importance that the robot communicates in ways familiar to elderly people. To that end, spoken interaction with the robot is absolutely essential. Flo possesses a real-time speech interface. The speech recognition system is based on CMU's SPHINX II system (Lee 1989; Ravishankar 1996). This system has the principal virtues of being speaker-independent, and requiring no pre-training by any user. Furthermore, most existing speech recognition systems require very small distances between the speaker and the microphone for reasonable recognition rates, however, our experience has ....
.... devices increase compliance with prescribed medication only 5 the total cost of noncompliance with medication is approximately 1 billion US per year (Dunbar Jacob 2000) The system design integrates existing systems from previous mobile robots (Thrun et al. 1999) and other research areas (Ravishankar 1996), however, several problems remain unsolved in the current robot. The dialog management problem has been addressed partially, however, we would like the system to tailor its dialogs to individual user preferences automatically. The POMDP based approach has eliminated the need for full scale ....
Ravishankar, M. 1996. Efficient Algorithms for Speech Recognition. Ph.D. Dissertation, Carnegie Mellon.
....approximates the POMDP policy for the corresponding belief p(s) 3 The Example Domain The system that was used throughout these experiments is based on a mobile robot, Florence Nightingale (Flo) developed as a prototype nursing home assistant. Flo uses the Sphinx II speech recognition system (Ravishankar, 1996), and the Festival speech synthesis system (Black et al. 1999) Figure 1 shows a picture of the robot. Figure 1: Florence Nightingale, the prototype nursing home robot used in these experiments. Since the robot is a nursing home assistant, we use task domains that are relevant to everyday life. ....
M. Ravishankar. 1996. Efficient Algorithms for Speech Recognition. Ph.D. thesis, Carnegie Mellon.
....early as in the AT&T approach. We plan to mitigate this problem in the future by using the unigram probability as an early approximation. Our approach does not require the use of approximations in the language model and allows the use of disk-based language models to reduce the memory requirements [11]. The decoder has the following main inputs: The output of the neural network. A prior probability vector, to convert the probabilities estimated by the neural network to scaled likelihoods. An n-gram language model. A distribution to word WFST that is built outside of the system and ....
M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD thesis, School of Computer Science, Carnegie Mellon University, 1996.
....about Fluency as a testbed. 1. Introduction In this paper we describe an experiment that was carried out to determine whether using the Fluency system [1] really helps non-native speakers of English improve their pronunciation skills. Fluency is a system that uses the SPHINX II [2] recognizer to detect pronunciation errors and then offers hints on how to correct them. We taught two sounds that are a problem for speakers of many languages when learning English: the unvoiced /θ/ as in thin and the voiced /ð/ as in that. The interface design included user-driven ....
Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition, Ph.D. Thesis, Carnegie Mellon University, Technical Report CMU-CS-96-143.
....interpreter can only infer this state from the (noisy) speech of the user. The particular domain that we use in the following discussion is based on a mobile robot, Florence Nightingale (Flo) developed as a prototype nursing home assistant. Flo uses the Sphinx II speech recognition system (Ravishankar, 1996), and the Festival speech synthesis system (Black et al. 1999) Figure 8 shows a picture of the robot. Since the robot is a nursing home assistant, we use task domains that are relevant to everyday life. Table 1 shows a list of the task domains the user can ask about: the time, the weather, what ....
Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon.
....of a Computerfone 1 with a serial cable connection to the host computer. The record process is pipelined to the speech recognition server and the play process is pipelined to the text-to-speech server. 2.2 Speech Recognition We are currently using the Carnegie Mellon University Sphinx II system [3] in our speech recognition server. This is a semicontinuous Hidden Markov Model recognizer with a class trigram language model. The recognition server receives the input vectors from the audio server. The recognition server produces a word lattice from which a single best hypothesis is picked and ....
Ravishankar, M.K., "Efficient Algorithms for Speech Recognition". Unpublished Dissertation CMU-CS-96-138, Carnegie Mellon University, 1996
....laboratory recognizers so that they may be deployed on more affordable machines of lower processing power and smaller memory size without losing accuracy. Techniques exist to reduce memory requirement alone, for example, by using simpler but less accurate models, or through data compression [73]. There are also techniques to speed up computation alone: for example, by simply exercising more vigorous pruning schemes, by computing state likelihoods only from a small subset of the most relevant state probability density distributions [6, 8, 41, 63, 79] or by fast match techniques [21] ....
.... Time to Share More The most common approach to reducing the number of parameters in acoustic models is parameter tying: Similar structures are discovered among the acoustic models, 4 This is the recognition speed before the recent implementation of efficient search algorithms as described in [73]. The efficient search later increases the speed to 1.6 times real time. 5 and they are then tied together to share the same value. With the (limited) amount of training data on hand, parameter tying allows more complex acoustic models to be estimated reliably while the number of model ....
[Article contains additional citation context not shown here]
M.K. Ravishankar. "Efficient Algorithms for Speech Recognition". PhD thesis, School of Computer Science, Carnegie Mellon University, 1996.
No context found.
M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996.
No context found.
M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Ph.D. thesis, School of Computer Science, Carnegie Mellon University, 1996.
No context found.
M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Doctoral Thesis, Technical Report CMU-CS-96-143, Pittsburgh, USA, 1996
No context found.
M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Ph.D. thesis, School of Computer Science, Carnegie Mellon University, 1996.
No context found.
Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition, Ph.D. Thesis, Carnegie Mellon University, Technical Report CMU-CS-96-143.
No context found.
Ravishankar, M.K., "Efficient Algorithms for Speech Recognition". Unpublished Dissertation CMU-CS-96-138, Carnegie Mellon University, 1996
No context found.
M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996.
No context found.
M. Ravishankar. Efficient algorithms for speech recognition. Ph.D. Thesis. School of Computer Science. Carnegie Mellon University. 1996.
No context found.
M. Ravishankar. Efficient algorithms for speech recognition. 1996. Internal Report.
No context found.
M. Ravishankar. Efficient algorithms for speech recognition, 1996. Internal Report.
No context found.
Ravishankar, M., Efficient Algorithms for Speech Recognition. Ph.D Thesis, Carnegie Mellon University, May 1996, Tech Report. CMU-CS-96-143.