Lee, K.F., Hon, H.W., Reddy, R., "An Overview of the SPHINX Speech Recognition System", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-38, January 1990.
....Figure 1. Schematic of a Speech Recognition system. Considerable effort has been devoted in the past to the design of the Recognizer block. A traditional approach is to employ Hidden Markov Models (HMM) [1] in the design, and a well known example of this technique is the SPHINX system [2]. More recently, Time Delay Neural Networks (TDNN) [3] have been employed for realizing the Recognizer, the basic idea underlying the scheme being to use delays to turn a sequence of feature vectors into a point in the feature space. Then, recognition is achieved in the feature space domain ....
....for recognition, which is the approach taken in [5] where trajectories in a feature space with dimension as high as 20 are used to recognize a single word. Another approach is to use the obtained trajectory to generate a sequence of labels, which are then used for recognition, as is done in [2]. In the present work, we use the obtained trajectory for word recognition, but not directly. The original trajectory is processed such that a corresponding trajectory in a lower dimensional feature space is obtained. Since the processed trajectory is used as the input to the recurrent neural ....
K-F Lee, H-W Hon, and R. Reddy, "An overview of the SPHINX Speech Recognition System", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 1, January 1990.
....streams have somewhat decoupled dynamics then a factorial HMM could be a logical alternative to HMMs. Each distinct sub vector stream could be modeled by each of the layers in the FHMM. The idea of streams has already been proposed in the speech research community. Recognition engines like SPHINX [8] and HTK [11] allow similar formulations in their HMM systems. The difference between our formulation and theirs is that the streamed FHMM allows more decoupling of the streams' dynamics. Notice that in Equation 4 we show a single covariance although extending this formulation to use a different ....
K. Lee, H. Hon and D. Reddy, "An overview of the SPHINX speech recognition system", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, No. 1, pp. 35-45, 1990.
....may not be a problem, but for large ones it is something unthinkable: no user would willingly utter thousands and thousands of examples. Another approach is based on HMM, and solves the temporal and acoustic problems at once using statistical considerations. It is described by the following steps [9]: 1. Like the Template Approach, something that stores the knowledge about the possible variations of a class must be defined. The difference is that in this case it is an HMM 32 instead of a template. Consider a system that may be described at any time by means of a set of M distinct states, ....
....space, several options exist: The most common approach is to use the obtained trajectory to generate a sequence of labels, normally by means of a vector quantization (VQ) scheme [29]. This sequence is then used for recognition. As an example, Carnegie Mellon's SPHINX speech recognition system [9] fed the output of the speech coding scheme into a VQ system which translated the incoming data into a sequence of phonemes. The SPHINX system then used an HMM approach to process the sequences of labels and recognize the words. Although it cannot be said that this approach is inferior because it ....
K-F Lee, H-W Hon, and R. Reddy, "An Overview of the SPHINX Speech Recognition System," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 1, January 1990.
....avoid TIP's hint queue head-of-line blocking behavior (described in Section 2.2.3.1) the manually modified Postgres needs to (and does) issue a hint cancellation call whenever such a cache hit occurs. 7.3.5 Sphinx The Sphinx benchmark uses a modification of version 8 of the Sphinx application [30]. Sphinx is a speaker independent, continuous voice speech recognition system developed at Carnegie Mellon University (CMU). The performance of the CMU application degrades drastically when there is insufficient main memory for it to execute mostly in core. The version used as the original in the ....
K.-F. Lee, H.-W. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech and Signal Processing, 38(1):35--45, January 1990.
....among the MRAM banks in our system. Benchmarks: To evaluate the performance of our memory system, we use six SPEC2000 floating point benchmarks [35], five SPEC2000 integer programs, 4 scientific applications (3 from the NAS suite [3] and smg2000 [5]), and a speech recognition program, sphinx [26]. We chose these benchmarks because they have been previously shown to have high L1 miss rates, and simulate those instructions of the application which capture the core repetitive phase of the program [19]. Table 3 lists these benchmarks along with the number of instructions skipped to reach the ....
K.-F. Lee, H.-W. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech and Signal Processing, 38:35--44, 1990.
....the first pass may be combined with a language model score of another context in the second pass. 3. EXPERIMENTS AND RESULTS To verify the improvements of the MMI-Connectionist HMM we trained an equivalent discrete HMM system with a k-means vector quantizer comparable to the system presented in [7]. Table 1 gives the improvements in word error rates comparing the k-means system to the MMI-Connectionist system for monophones and for word-internal triphones. In the right column of Table 1 the results for the multi-frame MMI network are given. The multi-frame approach gives a further reduction ....
Lee K.-F., Hon H.-W., Reddy R.: An Overview of the SPHINX Speech Recognition System, IEEE Transactions on ASSP, Vol. 38, No. 1, January 1990
.... FO4 roughly corresponds to 360 picoseconds times the transistor's drawn gate length in microns [13] floating point benchmarks from the SPEC2000 suite [32], six SPEC2000 integer benchmarks, three scientific applications from the NAS suite [4], and one speech recognition benchmark called Sphinx [22]. For each benchmark we simulated the sequence of instructions which capture the core repetitive phase of the program. The phases were determined empirically by plotting the L2 miss rates over one execution of each benchmark, and choosing the smallest subsequence that captured the recurrent ....
K.-F. Lee, H.-W. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech and Signal Processing, 38:35--44, 1990.
....It is often referred to as text-to-speech technology. [7] The speech synthesis system used in our research was Festival [8], which offers a general framework for building speech synthesis systems. It also includes some support for Java Speech API. The speech recognition system used was Sphinx [9]. Sphinx is mainly a library, not an independent product. So a custom server application was developed during our research. 3. IMPLEMENTATION The goal of the research project was to develop a system demonstrating the possibilities of VoiceXML combined with a more traditional browser interface. ....
K-F. Lee, H-W. Hon, and R. Reddy, An Overview of the SPHINX Speech Recognition System, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 1, Jan. 1990.
....Duisburg, Germany. Publisher Item Identifier S 1063-6676(01)02738-9. In this paper, an integration of discrete HMMs into the mixture model approach is given. Instead of separating the vector quantizer (VQ) from the discrete acoustic model (which seems to be common in the traditional approaches [15]), we consider the natural combination of both as a single system. This allows for a continuous emission density modeling of the entire system that is strongly related to the well known mixture models. In [25] the concept of a maximum mutual information (MMI) neural network that is used as VQ is ....
....appropriate codebook size is always a compromise between modeling resolution and the number of parameters that can be estimated reliably with the given amount of training data. To increase the quantizer resolution, the usage of multiple VQs for multiple features is common in speech recognition [15], [9]. In this case, the original feature vector is split up into different subvectors that contain the features of (e.g. contains static features, contains features, etc. Each subvector is mapped by an individual VQ on a discrete label ( Thus, the entire feature vector is mapped on the set ....
K. F. Lee et al., "An overview of the SPHINX speech recognition system," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 35--45, Jan. 1990.
....over all of the objects used to create the model. HMMs historically have their roots in speech processing for the purpose of recognition of words given the valid phonemes of a language (Rabiner, 1989). One particular implementation of HMMs in speech is the SPHINX Speech Recognition System (Lee, et al. 1990). In SPHINX, the goal is to create a system that can accurately determine the continuous speech of an independent speaker in which a large vocabulary is used. HMM Structure in Sequence Analysis Pattern matching in speech recognition can be applied to sequence analysis in a straightforward ....
Lee, K.F., Hon, H.W., Reddy, R., (1990) "An Overview of the SPHINX Speech Recognition System." IEEE Transactions on Acoustics, Speech, and Signal Processing, 38(1), 35-44.
....on the nature of other structures to be searched for. An example of this occurs in the case of co-articulation phenomena in connected speech processing, where a phoneme at the end of one word can impose constraints on the acoustic characteristics to be expected at the beginning of the next word [Lee et al. 1990, Lowerre and Reddy, 1980]. Of course the decision to add interpretation processes to perceptual systems involved far more than the symbolic representations themselves. The inferencing processes that manipulate the representations, the control processes that schedule the inferencing and ....
Lee, K., Hon, H., and Reddy, R., "An overview of the SPHINX speech recognition system," in Readings in Speech Recognition, Morgan Kaufmann Publishers, Inc., San Mateo, CA; Waibel, A., and Lee, K., eds., pp. 600--610, 1990.
....natural language systems using commercial hardware. In addition to the theory, the paper also describes experimental results obtained from using an implementation of this theory. 1 Spoken Natural Language: The Future is Now Major advances in recent years in speech recognition technology (see [7] and [13]) have raised expectations about the development of practical spoken natural language interfaces. Such interfaces can provide user flexibility as well as allow users to have their hands and eyes busy on the task at hand. Examples of such situations include equipment repair, telephone ....
K. Lee, H. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. In A. Waibel and K. Lee, editors, Readings in Speech Recognition, pages 600--610. Morgan Kaufmann, San Mateo, CA, 1990.
....and careful. Significantly, these feats have been achieved in a speaker independent fashion. The problem of recognizing human speech has been approached from several directions, including artificial neural networks [e.g. Waibel et al. 1989] and statistical modeling techniques [Waibel and Lee, 1990; Rabiner and Juang, 1993]. Spectral coding techniques, Hidden Markov Models, and simple models of language have been the mainstay of the most recent SR systems capable of recognizing large vocabulary, real time, speaker independent, continuous speech. Also, recent increases in microprocessor ....
....probabilistic approach has served best. A properly trained probabilistic model assigns every sequence of words a non zero (possibly very small) probability. Also, probabilistic models can be used to easily rank several analyses. The Sphinx system was built during the late 1980s at CMU [Lee, 1989; Lee et al. 1990; Lee, 1990]; it perfected the art of using HMMs to recognize sentences drawn from a medium-sized vocabulary of English words. It innovatively synthesized and improved on HMM techniques for acoustic modeling in order to reach high performance recognition of speaker independent, continuous speech. ....
K. F. Lee, H. W. Hon, and R. Reddy, "An Overview of the SPHINX Speech Recognition System," IEEE Transactions on Acoustics, Speech, and Signal Processing, January 1990, Reprinted in [Waibel and Lee, 1990].
Lee, K., Hon, H., and Reddy, R. An Overview of the SPHINX Speech Recognition System. In IEEE Transactions on Acoustics, Speech, and Signal Processing, Jan 1990, pp 35-45.
Lee, K.F., Hon, H.W., Reddy, R., "An Overview of the SPHINX Speech Recognition System", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-38, January 1990.
Lee, K., Hon, H., Reddy, R. (1990). "An Overview of the SPHINX Speech Recognition System", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38 (1), pp. 35--45.
K.F. Lee, H.W. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. In Readings in Speech Recognition, pages 600-610. Morgan Kaufmann Publishers, San Mateo, CA, 1990.
K.-F. Lee, H.-W. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech and Signal Processing, 38(1):35--44, 1990.
Lee, K., Hon, H., Reddy, R. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech, and Signal Processing, 38(1). January 1990.
K. Lee, H. Hon, and R. Reddy. An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech and Signal Processing, 34:35--44, 1990.
K. Lee, H. Hon, R. Reddy, An overview of the SPHINX speech recognition system. IEEE Transactions on Acoustics, Speech, and Signal Processing, 38(1). January 1990.
K.F. Lee et al. An overview of the SPHINX speech recognition system. IEEE Trans. ASSP, 38(1):35-45, January 1990.