38 citations found.
M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996.

This paper is cited in the following contexts:

Time and Memory Efficient Viterbi Decoding for.. - Willegg..   (Correct)

....4.1. Disk based search network In order to tackle the problem of large memory requirements needed for storing the precompiled search network, we propose and implemented an approach of keeping the network on disk and only loading the required parts on demand. This is inspired by an approach in [8] for reducing fast memory requirements when coping with large n-gram language models. The idea is to keep the search network on disk in a dedicated file structure, to load a node and its outgoing arcs only when a token reaches it and to unload it, as soon as it is not longer activated. The ....

M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD Thesis, CMU, Pittsburgh, 1996.


Robotics and Autonomous Systems 42 (2003) 271--281 - Towards Robotic Assistants (2002)   (4 citations)  (Correct)

....and medical experts, Pearl also features two sturdy handle bars, a compact design that allows for cargo space, a removable tray, and a sophisticated head unit. On the software side, the robot features off the shelf autonomous mobile robot navigation system [5,28], speech recognition software [24], speech synthesis software [4], fast image capture and compression software for online video streaming, face detection tracking software [25], as well as the three major new software modules described in this paper. These modules are principally concerned with people interaction and control. They ....

M. Ravishankar, Efficient algorithms for speech recognition, Ph.D. Thesis, School of Computer Science, Carnegie Mellon University, 1996.


A Tree-Trellis N-best Decoder for Stochastic Context-Free Grammars - Seward (2000)   (3 citations)  (Correct)

....hypothesis records, with a top scored path in bold. 4. Word Model Pool When a search space of active word models grows too large, it must be reduced to a limited set for computational reasons. Typically, pruning methodologies apply two forms of threshold criteria, used individually or combined [3, 4]. The relative likelihood approach, also referred to as beam pruning, is to process partial hypotheses that differ by no more than a relative distance from the best scored partial hypothesis. Although this method is computationally efficient, it will result in an uneven time distribution of the ....

Ravishankar, M. K., Efficient algorithms for speech recognition. Ph.D. Thesis, Carnegie Mellon University, 1996.


Postgraduate Course - Virtual Reality And   (Correct)

....because it weeds out sentences which are not occurring in the normal language. 4.2 How it works Figure 4.1: Speech recognizer schematics In the figure 4.1 we see the schematics of the speech recognizer using the same principles as Sphinx II uses. More detailed description can be found in [6]. The processing is as follows: 1. The input waveform is preprocessed by the front end. Usually, signal is normalized and then the feature vector is computed. In the speech recognition, cepstrum is often used, together with signal power and some other values. Output of the front end is the ....

M. K. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, 1996.


A Time-Synchronous Viterbi-Decoder For Arbitrary.. - Willett.. (2001)   (Correct)

....easily grow to sizes that do not allow to completely load them into fast memory, at least not so on standard workstations. For this reason, we have set up a specific file structure for storing the recognition network on disk, that allows a fast on demand access. This approach was inspired by [5], where a disk based access mode for backoff n-gram language models has been proposed. In our approach though, the whole recognition network, comprising language model, HMM topology, pronunciation network etc. is kept on disk and accessed only on demand. The structure of the involved files is ....

M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD Thesis, CMU, Pittsburgh, 1996.


Experiences with a Mobile Robotic Guide for the Elderly - Montemerlo, Pineau, Roy, .. (2002)   (13 citations)  (Correct)

....nurses and medical experts following deployment of the first robot, Flo. Pearl was largely designed and built by the Standard Robot Company in Pittsburgh, PA. On the software side, both robots feature off the shelf autonomous mobile robot navigation system [5, 24], speech recognition software [20], speech synthesis software [3], fast image capture and compression software for online video streaming, face detection tracking software [21], and various new software modules described in this paper. A final software component is a prototype of a flexible reminder system using advanced planning ....

M. Ravishankar. Efficient algorithms for speech recognition, 1996. Internal Report.


Spanish Recogniser Of Continuously Spelled Names.. - San-Segundo, Cols, .. (2000)   (Correct)

....of cases where the letter sequence recognised matches exactly with the name spelled. All the experiments have been run over a Pentium II 350 Mhz with RAM of 128 Mb, so all the Processing Time results provided are referred to this computer. These results are given in xRealTime units (xRT) as in [5]. 1xRT is the average time spent to pronounce the spelled name. 3. ARCHITECTURES PROPOSED In all the architectures, we use letter CHMMs with a number of states proportional to the length of each letter. The shortest model has 9 states and it was associated to the vowel letter I, the longest one ....

Ravishankar, M.K. "Efficient Algorithms for Speech Recognition". Unpublished PhD Dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 15, 1996.


Towards Personal Service Robots for the Elderly - Roy, Baltus, Fox, Gemperle.. (2000)   (10 citations)  (Correct)

....It is therefore of great importance that the robot communicates in ways familiar to elderly people. To that end, spoken interaction with the robot is absolutely essential. Flo possesses a real time speech interface. The speech recognition system is based on CMU's SPHINX II system (Lee 1989; Ravishankar 1996). This system has the principal virtues of being speaker independent, and requiring no pre-training by any user. Furthermore, most existing speech recognition systems require very small distances between the speaker and the microphone for reasonable recognition rates, however, our experience has ....

.... devices increase compliance with prescribed medication only 5 the total cost of noncompliance with medication is approximately 1 billion US per year (Dunbar Jacob 2000) The system design integrates existing systems from previous mobile robots (Thrun et al. 1999) and other research areas (Ravishankar 1996), however, several problems remain unsolved in the current robot. The dialog management problem has been addressed partially, however, we would like the system to tailor its dialogs to individual user preferences automatically. The POMDP based approach has eliminated the need for full scale ....

Ravishankar, M. 1996. Efficient Algorithms for Speech Recognition. Ph.D. Dissertation, Carnegie Mellon.


Spoken Dialogue Management Using Probabilistic Reasoning - Roy, Pineau, Thrun (2000)   (9 citations)  (Correct)

....approximates the POMDP policy for the corresponding belief p(s) 3 The Example Domain The system that was used throughout these experiments is based on a mobile robot, Florence Nightingale (Flo) developed as a prototype nursing home assistant. Flo uses the Sphinx II speech recognition system (Ravishankar, 1996), and the Festival speech synthesis system (Black et al. 1999) Figure 1 shows a picture of the robot. Figure 1: Florence Nightingale, the prototype nursing home robot used in these experiments. Since the robot is a nursing home assistant, we use task domains that are relevant to everyday life. ....

M. Ravishankar. 1996. Efficient Algorithms for Speech Recognition. Ph.D. thesis, Carnegie Mellon.


A Decoder for Finite-State Structured Search Spaces - Caseiro, Trancoso (2000)   (1 citation)  (Correct)

....early as in the AT&T approach. We plan to mitigate this problem in the future by using the unigram probability as an early approximation. Our approach does not require the use of approximations in the language model and allows the use of disk based language models to reduce the memory requirements [11]. The decoder has the following main inputs: The output of the neural network. A prior probability vector, to convert the probabilities estimated by the neural network to scaled likelihoods. An n-gram language model. A distribution to word WFST that is built outside of the system and ....

M. K. Ravishankar, "Efficient Algorithms for Speech Recognition", PhD thesis, School of Computer Science, Carnegie Mellon University, 1996.


An Empirical Study of the Effectiveness of.. - Tomokiyo, Le Wang..   (Correct)

....about Fluency as a testbed. 1. Introduction In this paper we describe an experiment that was carried out to determine whether using the Fluency system [1] really helps non native speakers of English improve their pronunciation skills. Fluency is a system that uses the SPHINX II [2] recognizer to detect pronunciation errors and then offers hints on how to correct them. We taught two sounds that are a problem for speakers of many languages when learning English: the unvoiced as in thin and the voiced h as in that . The interface design included user driven ....

Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition, Ph.D. Thesis, Carnegie Mellon University, Technical Report CMU-CS-96-143.


Finding Approximate POMDP solutions Through Belief Compression - Roy (2000)   (6 citations)  (Correct)

....interpreter can only infer this state from the (noisy) speech of the user. The particular domain that we use in the following discussion is based on a mobile robot, Florence Nightingale (Flo) developed as a prototype nursing home assistant. Flo uses the Sphinx II speech recognition system (Ravishankar, 1996), and the Festival speech synthesis system (Black et al. 1999) Figure 8 shows a picture of the robot. Since the robot is a nursing home assistant, we use task domains that are relevant to everyday life. Table 1 shows a list of the task domains the user can ask about: the time, the weather, what ....

Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon.


The Cu Communicator System - Ward, Pellom (1999)   (8 citations)  (Correct)

....of a Computerfone 1 with a serial cable connection to the host computer. The record process is pipelined to the speech recognition server and the play process is pipelined the text to speech server. 2.2 Speech Recognition We are currently using the Carnegie Mellon University SphinxII system [3] in our speech recognition server. This is a semicontinuous Hidden Markov Model recognizer with a class trigram language model. The recognition server receives the input vectors from the audio server. The recognition server produces a word lattice from which a single best hypothesis is picked and ....

Ravishankar, M.K., "Efficient Algorithms for Speech Recognition". Unpublished Dissertation CMU-CS-96-138, Carnegie Mellon University, 1996


Towards A Compact Speech Recognizer: Subspace Distribution.. - Mak (1998)   (Correct)

....laboratory recognizers so that they may be deployed on more affordable machines of lower processing power and smaller memory size without losing accuracy. Techniques exist to reduce memory requirement alone, for example, by using simpler but less accurate models, or through data compression [73]. There are also techniques to speed up computation alone: for example, by simply exercising more vigorous pruning schemes, by computing state likelihoods only from a small subset of the most relevant state probability density distributions [6, 8, 41, 63, 79] or by fast match techniques [21] ....

.... Time to Share More The most common approach to reducing the number of parameters in acoustic models is parameter tying: Similar structures are discovered among the acoustic models, and they are then tied together to share the same value. [Footnote 4: This is the recognition speed before the recent implementation of efficient search algorithms as described in [73]. The efficient search later increases the speed to 1.6 times real time.] With the (limited) amount of training data on hand, parameter tying allows more complex acoustic models to be estimated reliably while the number of model ....

[Article contains additional citation context not shown here]

M.K. Ravishankar. "Efficient Algorithms for Speech Recognition". PhD thesis, School of Computer Science, Carnegie Mellon University, 1996.


Efficient Language Model Lookahead Through Polymorphic - Linguistic Context Assignment (2002)   (Correct)

No context found.

M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996.


Mitsubishi Electric Research Laboratories - Http Www Merl (2001)   (Correct)

No context found.

M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Ph.D. thesis, School of Computer Science, Carnegie Mellon University, 1996.


Improved Estimation of Hidden Markov Model Parameters from Multiple .. - Es   (Correct)

No context found.

M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Doctor Thesis, Technical Report CMU-CS-96-143, Pittsburgh, USA, 1996


Quantization-Based Language Model Compression - Whittaker, Raj (2001)   (2 citations)  (Correct)

No context found.

M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Ph.D. thesis, School of Computer Science, Carnegie Mellon University, 1996.


The Fluency Pronunciation Trainer - Maxine Eskenazi And (1998)   (Correct)

No context found.

Ravishankar, M. (1996). Efficient Algorithms for Speech Recognition, Ph.D. Thesis, Carnegie Mellon University, Technical Report CMU-CS-96-143.


University of Colorado Dialog Systems for Travel and.. - Pellom, Ward..   (7 citations)  (Correct)

No context found.

Ravishankar, M.K., "Efficient Algorithms for Speech Recognition". Unpublished Dissertation CMU-CS-96-138, Carnegie Mellon University, 1996


Efficient Language Model Lookahead Through Polymorphic .. - Soltau, Metze, Fügen, .. (2002)   (Correct)

No context found.

M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon University, USA, 1996.


Towards Robotic Assistants in Nursing Homes: Challenges and .. - Joelle Pineau Michael (2002)   (8 citations)  (Correct)

No context found.

M. Ravishankar. Efficient algorithms for speech recognition. Ph.D. Thesis. School of Computer Science. Carnegie Mellon University. 1996.


Towards Robotic Assistants in Nursing Homes: Challenges and .. - Joelle Pineau Michael (2002)   (8 citations)  (Correct)

No context found.

M. Ravishankar. Efficient algorithms for speech recognition. 1996. Internal Report.


Experiences with a Mobile Robotic Guide for the Elderly - Montemerlo, al. (2002)   (13 citations)  (Correct)

No context found.

M. Ravishankar. Efficient algorithms for speech recognition, 1996. Internal Report.


A Schema Based Approach To Dialog Control - Constantinides, Hansma, Tchou.. (1998)   (10 citations)  (Correct)

No context found.

Ravishankar, M., Efficient Algorithms for Speech Recognition. Ph.D Thesis, Carnegie Mellon University, May 1996, Tech Report. CMU-CS-96-143.

CiteSeer.IST - Copyright Penn State and NEC