51 citations found.
H. Ney. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):263--271, 1984.


This paper is cited in the following contexts:


Robust Matching by Dynamic Space Warping for Accurate Face.. - Sahbi, Boujemaa (2001)

....of all possible matches, taking the most plausible one according to a distance function, is an NP-complete problem. To avoid this combinatorial explosion we resolve the matching problem by using a dynamic programming principle, which was initially used in Dynamic Time Warping for speech recognition [10]. Dynamic programming is one way to organize computation taking advantage of the underlying functional form, to have one possible solution. This heuristic considers an ordering assumption: is matched with then : is matched (if possible) only with : This ordering assumption ....
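The dynamic time warping recurrence mentioned in this excerpt can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the citing paper: the function name `dtw_distance`, the absolute-difference local distance, and the three-predecessor step pattern are all choices made for the example.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Accumulated dynamic time warping cost between sequences a and b.

    The ordering (monotonicity) assumption is enforced by only allowing
    steps that move forward in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j],      # a advances, b element is reused
                       cost[i][j - 1],      # b advances, a element is reused
                       cost[i - 1][j - 1])  # both advance
            cost[i][j] = dist(a[i - 1], b[j - 1]) + step
    return cost[n][m]
```

Because every cell depends only on cells with smaller indices, the table is filled in O(nm) time instead of enumerating all possible matches.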

H. Ney, "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Trans. ASSP, vol. 32, no. 2, pp. 1042--1062, 1984.


A Matching Technique In Example-Based Machine Translation - Lambros Cranias Harris

.... against a reference vector is, however, not straightforward, since there are generally axis fluctuations between the vectors (not necessarily aligned vectors and of most probably different length). To overcome these problems we use a two level Dynamic Programming (DP) technique [Sakoe 78], [Ney 84]. The first level treats the matches at fw level, while the second is reached only in case of a match in the first level, and is concerned with the lemmas and tags of the words within fw boundaries. Both levels utilise the same (DP) model which is next described. We have already referred to the ....

Ney H., (1984). "The use of a One-stage Dynamic Programming Algorithm for Connected Word Recognition". IEEE vol. ASSP-32, No 2.


Continuous Speech Dictation at LIMSI - Gauvain, Lamel, Adda-Decker

....For language modeling n-gram statistics are estimated on text material. To deal with phonological variability alternate pronunciations are included in the lexicon, and optional phonological rules are applied during training and recognition. The decoder uses a time synchronous graph search strategy [Ney84] for a first pass with a bigram back-off language model (LM) [Kat87]. A trigram LM is used in a second acoustic decoding pass which incorporates the word graph generated in the first pass [Gau94b]. (Footnote: This work is partially funded by the LRE project 62 058 SQALE.) Experimental results are reported on ....

....the design of an efficient search algorithm to deal with the huge search space, especially when using language models with a longer span than two successive words, such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [Ney84] which uses a dynamic programming procedure. This basic strategy has been recently extended by adding other features such as fast match [Gil90, Bah92], N-best rescoring [Sch92], progressive search [Mur93] and one pass dynamic network decoding [Ode94]. The two pass approach used in our system is ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), pp. 263-271, April 1984.


Large Vocabulary Continuous Speech Recognition: from.. - GAUVAIN, LAMEL (1996)   (1 citation)

....purposes where real time recognition is not needed there is a limit on computing resources (memory and CPU time) above which the development process becomes too costly. The most commonly used approach for small and medium vocabulary sizes is the one pass frame-synchronous Viterbi beam search [62] which uses a dynamic programming procedure. This basic strategy has been extended to deal with large vocabularies by adding features such as fast match [7, 37], word dependent phonetic trees [63], forward backward search [4], N-best rescoring [85], progressive search [29, 61] and one-pass ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-32(2), pp. 263-271, April 1984.


Discriminative Prototype-Based Methods For Speech Recognition - Mcdermott

....of samples, rather than a single well selected sample, is a refinement of this idea. This rough outline describes a large portion of research in speech recognition up to now, starting with the template matching methods based on dynamic time warping that were prevalent in the 1970s and 1980s [1, 2], and including the hidden Markov model paradigm [3] that is dominant today. The use of a class representative (or a few representatives), be it a sample or an average of samples, to classify new patterns, defines the prototype based approach. A specific implementation within this approach has ....

....is crucial in template or prototype based approaches to speech recognition. One of the crucial advances in the field of automatic speech recognition was the advent of dynamic programming (DP) algorithms that allow an efficient non linear matching between a sample utterance and a speech prototype [1, 2, 34]. Initially, most template matching methods used one reference per word, and operated in speaker dependent, isolated word recognition mode [35]. Later studies showed how DP could be used to perform connected word recognition [2] and furthermore how clustering algorithms (e.g. the modified ....

[Article contains additional citation context not shown here]

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-32(2):263-271, 1984.


Discriminative Training for Speech Recognition - McDermott (1997)   (1 citation)

....words with unknown segment boundaries. The classic solution to this central problem in speech recognition is the use of Dynamic Programming (DP) to consider multiple, competing segmentations, and choose the likeliest one to generate the classifier output, in an efficient manner (see, for example, [Ney, 1984]). There have been two main approaches to incorporate DP in ANNs: 1) using ANNs to provide local DP HMM scores, and 2) global training of an ANN in the context of DP operation. In the following, these two approaches are described briefly by mentioning typical implementations. It was suggested ....

....as closely as possible to avoid an overly coarse application of the theory. In this light, this chapter describes how the MCE framework can be applied to a recognition system that uses a finite state machine (FSM) to describe the possible categories of the problem, Ney's One Pass DP procedure [Ney, 1984] to find the best category, and Soong & Huang's A*-based N-best algorithm [Soong & Huang, 1991] in a manner consistent with the requirements of the MCE formalism. These practical aspects of implementation, though requiring a significant amount of software development time and effort, do not ....

[Article contains additional citation context not shown here]

Ney, H. (1984). The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 32, pp. 263-271.


EuTrans: Speech to Speech Translation Prototype - Pastor, Sanchis, Vidal.. (2001)

....decoding, the word acoustic models are integrated dynamically in the syntactic translation model: the transitions in the syntactic translation model are substituted by the corresponding word acoustic models (see Fig. 3). The decoding process is performed using the beam search Viterbi algorithm [1] through the integrated network. In ATROS system, by following the edges in the optimal path through the syntactic translation model, we can recover not only the optimal sequence of words in the input language but also the corresponding translation. ....

H. Ney. "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition" In IEEE Transactions on Acoustics, Speech, and Signal Processing pp 263-271, 1984


EUTRANS: a Speech-to-Speech Translator Prototype. - Pastor, Sanchis.. (2001)   (1 citation)

....the syntactic translation model by the lexical model corresponding to the associated word, and then replacing each arc in the lexical models by the acoustic model corresponding to the associated acoustic model (see Fig. 2). The decoding process is performed using the beam search Viterbi algorithm [1] through the integrated network. In order to achieve the maximum spatial efficiency, the integrated model is not fully expanded in memory; only those states which are not pruned by the beam search are expanded for each frame. In ATROS system, by following the edges in the optimal path through the ....

H. Ney. "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition" In IEEE Transactions on Acoustics, Speech, and Signal Processing pp 263-271, 1984


Lr-Parser-Driven Viterbi Search With Hypotheses Merging.. - Tomokazu Yamada Shigeki (1996)   (1 citation)

....which uses HMMs and an LR parser in phone synchronous processing. The experiments show that our algorithm runs faster than the conventional HMM LR algorithm with an equivalent recognition accuracy. 1. INTRODUCTION The combination of a Finite State Network (FSN) and one pass search algorithm [4] is suitable for real time processing of speech recognition because of its frame synchronous processing and ease of implementation. On the other hand, use of Context Free Grammar (CFG) is advantageous in representing more generalized language constraints. Generalized LR parser [7] is one of most ....

H. Ney, "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Trans. Acoustics, Speech and Signal Processing, ASSP-32(2): 263--271, 1984.


On the Development of a Dictation Machine for Spanish.. - Macías-Guarasa, ..

....function and acoustic heuristic rules to obtain the final phonetic string. 15 prototypes are used, extracted from 150 words spoken by the user in the training stage. Further details of this scheme can be found in [2]. DHMM approach: It uses a low cost, frame synchronous, one pass algorithm [3], based on discrete Hidden Markov Models. It is given the indexes of the quantified vectors from the previous stage. When the end of the word is reached, a backtracking procedure recovers the most likely sequence of phoneme-like units according to the incoming speech signal (although additional ....

H. Ney. "The use of a one-stage dynamic programming algorithm for connected word recognition". IEEE Transactions on ASSP, vol. 32, n. 2, 1984.


Initial Evaluation Of A Preselection Module For A.. -.. (1996)

....or soft quantized if semi continuous HMMs (SCHMMs) are used, with up to 2 codebooks and 256 centroids each). Phonetic String Build Up (PSBU): the resulting indexes are passed to the phonetic string build up module which generates a string of alphabet units. We have used the One Pass algorithm [4] with minor modifications. Lexical Access (LA): The phonetic string is matched against the dictionary, using a dynamic programming algorithm and alignment costs for unit substitution, insertion and deletion errors [5]. Intermediate Unit Generation Lexical Access Verification Module ....
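The lexical access step in this excerpt, matching a phonetic string against dictionary entries with substitution, insertion and deletion costs, can be sketched as a weighted edit distance. This is only an illustration: the names `alignment_cost` and `lexical_access`, the unit costs of 1.0, and the toy dictionary are assumptions, not values from the cited system.

```python
def alignment_cost(hyp, ref, sub=1.0, ins=1.0, dele=1.0):
    """Minimum cost of aligning a recognized unit string to a dictionary entry."""
    n, m = len(hyp), len(ref)
    # D[i][j] = cheapest alignment of hyp[:i] with ref[:j]
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * ins    # extra units in hyp count as insertions
    for j in range(1, m + 1):
        D[0][j] = j * dele   # units missing from hyp count as deletions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 0.0 if hyp[i - 1] == ref[j - 1] else sub
            D[i][j] = min(D[i - 1][j - 1] + match,  # match or substitution
                          D[i - 1][j] + ins,        # insertion error
                          D[i][j - 1] + dele)       # deletion error
    return D[n][m]


def lexical_access(hyp, dictionary):
    """Return the dictionary entry with the lowest alignment cost (hypothetical helper)."""
    return min(dictionary, key=lambda word: alignment_cost(hyp, word))
```

In a real system the three costs would typically depend on the units involved; constant costs are used here only to keep the sketch short.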

Ney, H. "The use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition" IEEE Trans. ASSP Vol. 32, n. 2. 1984


RECNET - the speech recognition system - Bittner (1996)

....utterance, and pick the smallest distance to give the best match sequence of phonemes for that utterance. Alternatively, a more efficient method is that of dynamic programming [24] which may be used to find the best match. This method is the basis of speech recognition using Dynamic time warping [25], Hidden Markov Models employing a Viterbi search and is also used in most hybrid HMM connectionist systems [26]. 2.6 Word recognition. 2.6.1 Markov model. The Markov model used was a simple context independent single pronunciation system with no provision for function word or cross word ....

Ney, H. (1984). "The use of a one-stage dynamic programming algorithm for connected word recognition", IEEE Transactions on Acoustics, Speech and Signal Processing, 32(2):263-271.


Sottosistema di interazione vocale di MAIA: versione.. - DalZotto, Fiutem, Gretter

....Artificiale, Quella vicino alla biblioteca, La segretaria del gruppo voce) and finally one for recognizing "sì" or "no". 3.3 Segmentation and recognition strategies. In the system it is possible to choose which recognizer to use: for isolated words with Dynamic Time Warping (DTW, [Ney84]), for isolated words with Hidden Markov Models (HMM, see [Rab89] for an introduction), or for continuous speech with HMMs. In all cases the sequence of words to be recognized is constrained by a regular grammar. In the case of continuous speech, the Viterbi algorithm is simply used ....

H. Ney. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 32:263--271, 1984.


Development And Improvement Of A Real-Time ASR.. - De Cordoba..

....better results than those obtained with automatic end pointing. The performance is not dependent on the number of neurons and the number of examples used in training (results not shown for this) Almost same results as manual, but now automatically. 3.3. One pass decoding with noise models [4][5] A completely different approach to override the wrong automatic end pointing is to use noise models. We train two different noise models with three states each using noise frames (out of boundaries of manual end pointing) and, in recognition, given the automatic end pointing, we consider 15 ....

H. Ney, "The use of a One-stage dynamic programming algorithm for connected word recognition", IEEE Transactions on ASSP, April 1984.


Comparison of Three Approaches to Phonetic String.. -..

....frame shift is reduced to 6.25 milliseconds, and vector quantization is performed prior to recognition using a 128 centroids codebook. Mahalanobis distance with diagonal covariance matrix is used. The generation of a phonetic string is done using a low cost, frame synchronous, one pass algorithm [4], based on discrete Hidden Markov Models (DHMM) and modified in order to allow the use of phoneme durations, phoneme pair statistics and beam search. A backtracking procedure recovers the most likely sequence of recognized phonemes at the end of the utterance, but a module based on partial ....

....for both codebook design and HMM parameters estimation. III.3. Neural Network Based approach For this approach a vector composed of 17 parameters (16 log energy in mel frequency scale plus log energy over the whole frame) is computed every 10 milliseconds. A Time State Neural Network (TSNN) [4] is trained to discriminate 28 allophones plus silence using a hand labeled 500 words database. A Time Delay Neural Network was also studied and it proved to be more expensive in training time because it has to handle more neurons on the 1st layer to have competitive results. In the 1st layer for ....

H. Ney. "The use of a one-stage dynamic programming algorithm for connected word recognition". IEEE Transactions on ASSP, vol. 32, n. 2, 1984


A Syntax-Directed Level Building Algorithm for.. - Koerich, Sabourin, .. (2000)

....that gives the best likelihood at level (l) will be considered. On the other hand, with Viterbi, the decision is taken only after the likelihood of the last character model of the word is computed. Therefore, in the SDLBA we have a local decision while in the Viterbi scheme we have a global decision [8]. Moreover, the baseline system requires double the memory space since it keeps the likelihoods of both uppercase and lowercase characters until the final decision is taken. Figure 3 shows the behavior of both algorithms when dealing with contextual information. Fig. 2. A simplified overview ....

Hermann Ney. The Use of a One--Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32 (2): 263--271, April 1984.


Precise n-gram Probabilities from Stochastic Context-free.. - Stolcke, Segal (1994)   (7 citations)

....is reported later in the paper. On the other hand, even if vastly more sophisticated language models give better results, n-grams will most likely still be important in applications such as speech recognition. The standard speech decoding technique of frame synchronous dynamic programming (Ney 1984) is based on a first order Markov assumption, which is satisfied by bigram models (as well as by Hidden Markov Models) but not by more complex models incorporating non local or higher order constraints (including SCFGs). A standard approach is therefore to use simple language models to generate ....

Ney, Hermann. 1984. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 32.263--271.


Realtime Personal Positioning System for Wearable Computers - Aoki, Schiele, Pentland (1999)   (12 citations)

....and label audio video sequences from a worn camera and microphone [12]. In this paper, we propose a realtime personal positioning system that only uses a small wearable camera as shown in Figure 1, and a standalone PC. In previous work [13] we demonstrated that a dynamic programming algorithm [14] can be used to recognize not only the user's location but also the approaching trajectory. In [13] we tested the accuracy of the system with about 100 second video sequences that were manually chosen and segmented. The images were handled off line on an SGI workstation. In this paper, ....

H. Ney. The use of a one-stage dynamic programming algorithm for connected word recognition. Readings in Speech Recognition: 188-196. 1990.


Selecting Good Speech Features for Recognition - Lee, Hwang (1996)

....measure. 30 Youngjik Lee and Kyu Woong Hwang ETRI Journal, volume 18, number 1, April 1996 I. INTRODUCTION For three decades since the 1960s, various speech recognition methods have been developed to recognize isolated words and continuous speech. They are dynamic time warping method [1], [2], [3], and hidden Markov model [4], [5], [6], [7] respectively. These methods have found their own application such as voice controlled computers and automatic response systems. Recently, spontaneous speech recognition has become one of the major research areas since it handles really natural human ....

H. Ney, "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Trans. Acoust., Speech, Signal Proc., vol. ASSP-32, pp. 263-271, 1984.


A Matching Technique In Example-Based Machine Translation - Cranias, Papageorgiou.. (1995)   (4 citations)

.... vector against a reference vector is, however, not straightforward, since there are generally axis fluctuations between the vectors (not necessarily aligned vectors and of most probably different length). To overcome these problems we use a two level Dynamic Programming (DP) technique [Sakoe 78], [Ney 84]. The first level treats the matches at fw level, while the second is reached only in case of a match in the first level, and is concerned with the lemmas and tags of the words within fw boundaries. Both levels utilise the same (DP) model which is next described. We have already referred to the ....

Ney H., (1984). "The use of a One-stage Dynamic Programming Algorithm for Connected Word Recognition". IEEE vol. ASSP-32, No 2.


ISADORA - a Speech Modelling Network Based on Hidden.. - Schukat-Talamazzini..

....developed probabilistic technique offering solutions to many subproblems in the speech recognition field. For HMMs we have available efficient Dynamic Programming algorithms for optimal decoding (Viterbi decoding, Viterbi (1967)) which serve for the recognition of isolated or connected words (Ney, 1984) and may even be sped up by a beam search (Greer, Lowerre & Wilcox, 1982). From speech samples along with a textual transcription, we can estimate the probabilistic parameters of HMMs with respect to the maximum likelihood criterion (Baum, 1972) or to entropy based objective functions (Bahl, ....

H. Ney. The Use of a One-stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Trans. on Acoustics, Speech and Signal Processing, 32:263--271, 1984.


The LIMSI Nov93 WSJ System - Gauvain, Lamel, Adda, Adda-Decker (1994)   (4 citations)

....vocabulary speech recognizer is the design of an efficient search algorithm to deal with the huge search space, especially when using long span language models such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [15] which uses a dynamic programming procedure. This basic strategy has been recently extended by adding other features such as fast match [8, 2], N-best rescoring [19] and progressive search [14]. The two pass approach used in our system is based on the idea of progressive search [14] where the ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), pp. 263-271, April 1984.


Likelihood Normalization Using An Ergodic Hmm For Continuous.. - Kazuhiko Ozeki (1996)

....the proposed one is also carried out. 2. HYPOTHESIS GENERATION AND SCORING 2.1. Baseline Method In order to compare various normalization methods on continuous speech, some kind of hypothesis generation technique is necessary. To that end, an HMM version of a connected word recognition algorithm [7] was employed. Let $W = \{W_1, \ldots, W_M\}$ be a set of vocabulary words, and $O = O_1 \cdots O_t \cdots O_T$ an acoustic observation, $O_t$ being the observation vector at the $t$-th frame. A vocabulary word is modeled with a concatenation of phone HMMs. For each frame $t$, each word $W_m$, and each state $i$ of ....

H. Ney, "The use of one-stage dynamic programming algorithm for connected word recognition", IEEE Trans. Vol.ASSP-32, No.2, 1984.


The LIMSI Continuous Speech Dictation System - Gauvain, Lamel, Adda, Adda-Decker   (1 citation)

....is partially funded by the LRE project 62 058 SQALE. text material. To deal with phonological variability alternate pronunciations are included in the lexicon, and optional phonological rules are applied during training and recognition. The recognizer uses a time synchronous graph search strategy [16] for a first pass with a bigram back-off language model (LM) [10]. A trigram LM is used in a second acoustic decoding pass which makes use of the word graph generated using the bigram LM [6]. Experimental results are reported on the ARPA Wall Street Journal (WSJ) [19] and BREF [14] corpora, using for ....

....the design of an efficient search algorithm to deal with the huge search space, especially when using language models with a longer span than two successive words, such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [16] which uses a dynamic programming procedure. This basic strategy has been recently extended by adding other features such as fast match [9, 1], N-best rescoring [21] and progressive search [15]. The two pass approach used in our system is based on the idea of progressive search where the ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), pp. 263-271, April 1984.


Generation Of Multiple Hypothesis In Connected Phonetic-Unit.. - Jos Mari

....capacities to the connected phonetic unit recognition algorithm. Perhaps, the most important of these capacities is the ability of multiple hypothesis generation. A typical algorithm applied to connected word (phonetic unit) recognition is the so called one stage dynamic programming algorithm [1]. After reviewing this algorithm, this paper introduces a modified version designed to afford n different strings of phonetic units that match the utterance (or a part of it) with the n best scores (n being a parameter). (Footnote: This research was supported by CICYT under contract TIC88 0305 C03 02.) ....

....match the utterance (or a part of it) with the n best scores (n being a parameter). II. ONE STAGE DYNAMIC PROGRAMMING ALGORITHM. The one stage dynamic programming algorithm is well known: an excellent description of it can be found in reference [1]. In this section a brief summary is provided in order to review its basic principles and introduce the nomenclature used along this paper. Let us consider a test pattern represented by i = 1, N time frames, where each frame is characterized by a vector of features. This test is supposed to ....

[Article contains additional citation context not shown here]

H. Ney, "The use of an one-stage dynamic programming algorithm for connected word recognition", IEEE Trans. ASSP-32, pp. 263-271, April, 1984.


Automatic Recognition and Analysis of Birdsong.. - Anderson, Dave.. (1995)

.... (e.g. level building: see Myers & Rabiner 1981; phrase level matching: see Kato 1980). The one stage algorithm performs a parallel search over all templates and is particularly attractive because it is the most computationally efficient (Ney 1984; Silverman & Morgan 1990). In the one-stage algorithm, recognition of continuous input is recast into the problem of specifying a sequence of template patterns and nonuniform compression/dilation of the templates' time axes to achieve the best match to the input pattern. We now formalize how the ....

....nonuniform compression/dilation of the templates' time axes to achieve the best match to the input pattern. We now formalize how the best match is found. The time frames of the input pattern and time frames of the template patterns can be used to define a set of grid points (i, j, k) as in Fig. 1 (Ney 1984). Here i indexes the input pattern time frame, j indexes a template's time frame, and k indexes the templates. Continuous paths among points of the grid that begin at the beginning of the input pattern and finish at the end of the input pattern determine a potential alignment between the input ....
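The search over grid points (i, j, k) described in this excerpt can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm exactly as published: the name `one_stage_dp`, the scalar frames, the absolute-difference local distance, and the restricted step set (stay on the same template frame or advance by one, with one input frame consumed per step) are choices made for the example.

```python
def one_stage_dp(inp, templates, dist=lambda x, y: abs(x - y)):
    """Best template sequence for a non-empty input; returns (cost, [k, ...])."""
    inf = float("inf")
    # D[k][j]: best accumulated cost of a path ending at frame j of template k
    # for the current input frame; seq[k][j]: template indices along that path.
    D = [[inf] * len(t) for t in templates]
    seq = [[None] * len(t) for t in templates]
    for k, t in enumerate(templates):
        D[k][0] = dist(inp[0], t[0])   # a path must start at frame 0 of a template
        seq[k][0] = (k,)
    for i in range(1, len(inp)):
        # best path that reached the last frame of any template at input frame i-1
        k_end = min(range(len(templates)), key=lambda k: D[k][-1])
        end_cost, end_seq = D[k_end][-1], seq[k_end][-1]
        new_D = [[inf] * len(t) for t in templates]
        new_seq = [[None] * len(t) for t in templates]
        for k, t in enumerate(templates):
            for j in range(len(t)):
                cands = [(D[k][j], seq[k][j])]              # stay on frame j
                if j > 0:
                    cands.append((D[k][j - 1], seq[k][j - 1]))  # advance within template
                elif end_seq is not None:
                    cands.append((end_cost, end_seq + (k,)))    # enter template k
                cost, path = min(cands, key=lambda c: c[0])
                if cost < inf:
                    new_D[k][j] = cost + dist(inp[i], t[j])
                    new_seq[k][j] = path
        D, seq = new_D, new_seq
    # the path must finish at the last frame of some template
    k_end = min(range(len(templates)), key=lambda k: D[k][-1])
    best_path = seq[k_end][-1]
    return D[k_end][-1], (list(best_path) if best_path is not None else [])
```

For instance, `one_stage_dp([1, 2, 5, 6, 1, 2], [[1, 2], [5, 6]])` recovers the template sequence 0, 1, 0 in a single left-to-right pass over the input, without fixing the number of templates in advance.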

[Article contains additional citation context not shown here]

Ney, H. 1984. The use of a one-stage dynamic programming algorithm for connected word recognition.


Bayesian Learning of Probabilistic Language Models - Stolcke (1994)   (54 citations)

....boundary) The number of parameters in n-gram models grows exponentially with n, and only the cases n = 2 (bigram models) and n = 3 (trigram models) are of practical importance. Bigram and trigram models are popular for various applications, especially speech decoding (Ney 1984), to approximate the true distributions of language elements (characters, words, etc.), which are known to violate the independence assumption embodied in (2.1). Because (2.1) is essentially a truncated version of the true joint probability given by (2.2), n-grams are in some sense a natural ....

....illustrating this approach is reported below. On the other hand, even if more sophisticated language models give better results, n-grams will most likely still be important in applications such as speech recognition. The standard speech decoding technique of frame synchronous dynamic programming (Ney 1984) is based on a first order Markov assumption, which is satisfied by bigram models (as well as by Hidden Markov Models) but not by more complex models incorporating non local or higher order constraints (including SCFGs). A standard approach is therefore to use simple language models to generate ....

NEY, HERMANN. 1984. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 32.263--271.


Connectionist Probability Estimators in HMM Speech.. - Renals, Morgan.. (1994)   (16 citations)

.... equivalently, by an A* search [18]. Recognition may be performed using the Viterbi criterion, by computing the state sequence, $Q_1^N$, that maximizes the posterior $P(Q_1^N \mid X)$. The Viterbi algorithm essentially traces the minimum cost (or maximum probability) path through a time-state lattice [19] subject to the constraints imposed by the acoustic and language models. C. Acoustic Data Modeling. Density Estimation. The usual HMM training approach is to construct a density estimator that maximizes the likelihood $P(X \mid M)$ (or $P(X \mid Q_1^N)$ if the Viterbi criterion is used). In the course of ....
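The lattice search described in this excerpt can be sketched as follows. This is a minimal illustration, not code from the citing paper: the function name `viterbi`, the dict-based model layout, and the toy two-state model with its probabilities are all assumptions made for the example.

```python
import math


def viterbi(obs, states, log_init, log_trans, log_emit):
    """Most probable state sequence for obs under a first-order HMM.

    All model terms are log probabilities stored in (nested) dicts.
    Returns (path, log probability of that path).
    """
    # score[s] = best log probability of any path ending in state s so far
    score = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    back = []  # back[t][s] = best predecessor of state s at time t + 1
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            ptr[s] = best
        back.append(ptr)
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):  # trace the back-pointers to recover the path
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


# Hypothetical toy model; the numbers are chosen only for illustration.
states = ("Healthy", "Fever")
init = {"Healthy": 0.6, "Fever": 0.4}
trans = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
         "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}


def to_log(table):
    return {key: math.log(value) for key, value in table.items()}


path, logp = viterbi(("normal", "cold", "dizzy"), states, to_log(init),
                     {s: to_log(trans[s]) for s in states},
                     {s: to_log(emit[s]) for s in states})
print(path, math.exp(logp))
```

Working in log probabilities turns the maximum-probability path into a minimum-cost path when the signs are flipped, which is the equivalence the excerpt mentions.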

H. Ney, "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, pp. 263--271, 1984.


Hybrid HMM/ANN Systems for Speech Recognition: Overview and.. - Bourlard, Morgan (1998)   (2 citations)

....global decoding block is to compensate for temporal distortions that occur in normal speech. For instance, vowels are typically shortened in rapid speech, while some consonants may remain nearly the same length. The most common global decoding approach is some form of dynamic programming (DP) [26], in which time warping of the input against possible speech representations results in the most likely sequence of sound categories to match the input. There are many variations to this process, but in general the local computation consists of finding the lowest cost path through possible ....

.... $\Theta, \Theta)$ (4) $\arg\max_{\forall i}\, p(X \mid M_i, \Theta)\, P(M_i \mid \Theta)$ (5) since $\Theta$ is fixed during recognition [consequently turning $p(X \mid \Theta)$ into a constant factor independent of the model]. This is usually solved by the Viterbi algorithm, a particular case of dynamic programming (DP) [26]. During training, we want to determine the parameter sets $\Theta$ and $\Theta$ that maximize $P(M_j \mid X_j, \Theta, \Theta)$ for all training utterances $X_j$, $j = 1, \ldots, J$, associated with $M_j$ (known during training), i.e. $\arg\max_{\Theta, \Theta} \prod_{j=1}^{J} P(M_j \mid X_j, \Theta,$ ....

Ney, H., "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Trans. on Acoustics, Speech, and Signal Processing, 32:263-271, 1984.


Decoder Technology For Connectionist Large Vocabulary Speech .. - Renals, Hochberg (1995)   (20 citations)

.... of speech recognition, the Viterbi algorithm is used to find the most probable path through a probabilistically scored time-state lattice, i.e. evaluate the Viterbi criterion specified in (10). This approach was first used in speech recognition in the 1960s [36] and a tutorial is given by Ney [22]. This approach is efficient for small problems with no language model (or a simple bigram or finite state syntax) and is guaranteed to find the optimal path. However, the computational expense of an exhaustive dynamic programming search is too great for large continuous speech recognition ....

NEY, H. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 32 (1984), 263--271.


THE MUTUAL INFORMATION AS AN ACOUSTIC MATCHING MEASURE FOR.. - The University

....Pair In order to evaluate the above two methods, it is necessary to generate hypothesis-observation pairs $(w_1, o_1), (w_2, o_2), \ldots$ and calculate the approximate mutual information for each pair. For that purpose, a frame synchronous connected word recognition algorithm [8] was employed. Let $\{w_1, \ldots, w_M\}$ be a set of vocabulary words, and $o = o_1 \cdots o_t \cdots o_T$ an acoustic observation, $o_t$ being the observation vector at the $t$-th frame. A vocabulary word is modeled with a concatenation of phone HMMs. For each frame $t$, each word $w_m$, and each state $i$ of ....

H. Ney, "The use of one-stage dynamic programming algorithm for connected word recognition", IEEE Trans. Vol.ASSP-32, No.2, 1984.


Continuous Speech Dictation in French - Gauvain, Lamel, Adda, Adda-Decker   (Correct)

....For language modeling n-gram statistics are estimated on text material. The recognizer uses a time synchronous graph search strategy [12] for a first pass with a bigram back-off language model (LM) [7]. A trigram LM is used in a second acoustic decoding pass which makes use of the word graph generated using the bigram LM [4]. Experimental results are reported for vocabularies of 5k and 20k words and for two training conditions. [Footnotes: This work is partially funded by the LRE project 62 058 SQALE. Most of our LVCSR research in English focuses on the ARPA Wall Street Journal task [5, 4].] ....

....the design of an efficient search algorithm to deal with the huge search space, especially when using language models with a longer span than two successive words, such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [12], which uses a dynamic programming procedure. This basic strategy has been recently extended by adding other features such as fast match [6, 1], N-best rescoring [14] and progressive search [11]. The two pass approach used in our system is based on the idea of progressive search where the ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), April 1984.


Improving Connected Letter . . . - Bregler, al.   (Correct)

....into two front end TDNNs [14] respectively. Each TDNN consists of an input layer, one hidden layer and the phone state layer. Backpropagation was applied to train the networks in a bootstrapping phase, to fit phoneme targets. Above the two phone state layers, the Dynamic Time Warping algorithm [8] is applied (in the DTW layer) to find the optimal path of phone hypotheses for the word models (German alphabet). In the letter layer the activations of the phone state units along the optimal paths are accumulated. [Figure 1: Typical AOIs] The highest score of the letter units represents the ....

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing, April 1984.
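
The context above uses Dynamic Time Warping to find an optimal path through frame-level hypotheses. As a rough illustration only, the following Python sketch aligns two short sequences with plain two-sequence DTW. It is not Ney's one-stage connected-word algorithm, which additionally allows transitions across word boundaries. The function name `dtw`, the default distance, and the three allowed steps (diagonal, vertical, horizontal) are assumptions chosen for this sketch, and both sequences are assumed to be non-empty.

```python
def dtw(ref, test, dist=lambda a, b: abs(a - b)):
    """Return (total_cost, path) of the cheapest monotonic alignment.

    ref, test : non-empty sequences of comparable items (assumption)
    dist      : local distance between one ref item and one test item
    path      : list of (ref_index, test_index) pairs, 0-based
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1][j - 1],  # diagonal step
                            cost[i - 1][j],      # advance ref only
                            cost[i][j - 1])      # advance test only
            cost[i][j] = dist(ref[i - 1], test[j - 1]) + best_prev

    # Trace the optimal path back from the end cell to the start cell.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path
```

For example, `dtw([1, 2, 3], [1, 2, 2, 3])` stretches the middle reference item over two test items. The table is filled in O(n·m) time, which is the cost that pruning strategies such as beam search aim to reduce.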


A Connectionist Recognizer For On-Line Cursive Handwriting.. - Manke, Bodenhausen   (7 citations)  (Correct)

.... [5] with its time shift invariant architecture has been applied successfully [6]. The Multi State Time Delay Neural Network (MS TDNN), an extension of the TDNN, combines the high accuracy character recognition capabilities of a TDNN with a non linear time alignment procedure (Dynamic Time Warping) [7] for finding an optimal alignment between strokes and characters in handwritten continuous words. The following section describes the basic network architecture and training method of the MS TDNN, followed by a description of the input features used in this paper (section 3). Section 4 presents ....

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, March 1984.


The LIMSI Continuous Speech Dictation System.. - Gauvain, Lamel..   (11 citations)  (Correct)

....vocabulary speech recognizer is the design of an efficient search algorithm to deal with the huge search space, especially when using long span language models such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [10], which uses a dynamic programming procedure. This basic strategy has been recently extended by adding other features such as fast match [14, 15], N-best rescoring [13] and progressive search [9]. The two pass approach used in our system is based on the idea of progressive search [9] where the ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), pp. 263-271, April 1984.


Modeling and Interpreting Multimodal Inputs: A Semantic.. - Vo, Waibel (1997)   (3 citations)  (Correct)

....in the mutual information network, hence the sum can be interpreted as the score of a path that goes through the segment labels $c_1 c_2 \ldots c_k$ in order, as illustrated in Figure 3. Using a dynamic programming algorithm similar to the Viterbi search or Dynamic Time Warping in speech recognizers [11], we can find an input segmentation and a corresponding label assignment that maximize the path score. Multiple input modalities are accommodated by implementing the path score maximization algorithm over more than one input dimension, where each dimension extends along one input stream. Figure 4 ....

Ney, H., "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 32, No. 2, 1984, pp. 263-271.


A Spoken Language System For Information Retrieval - Bennacef, Bonneau-Maynard..   (14 citations)  (Correct)

.... The acoustic models, which have been trained on 15,200 sentences from 80 speakers taken from the BREF corpus of read newspaper text [5], are the same as are used for our research in large vocabulary, speaker independent dictation [4]. The recognizer uses a time synchronous graph search strategy [6] which includes the intra- and inter-word context-dependent phone models, phonological rules, and a bigram language model [3]. The HMM based word recognizer graph is built by putting together word models according to the grammar in one large HMM. Each word model is obtained by concatenation of the ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), 1984.


Language Models For A Spelled Letter Recognizer - Betz, Hild (1995)   (1 citation)  (Correct)

....example, in the minFSG graph, the letter E occurs in over 5,800 transitions. Since the full left context of a string is considered during the search, each transition may have a different individual accumulated search score. Therefore, if the conventional one stage dynamic time warping (DTW) search[6] is employed, one individual word model is needed in the DTW search matrix for every transition in the minFSG graph. With a total of 57,713 transitions, this results in a prohibitively time and memory consuming search process. To remedy the problem, we use a technique similar to the Two Level ....

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. In Transactions on Acoustics, Speech, and Signal Processing, pages 263--271. IEEE, April 1984.


The JANUS Speech Recognizer - Rogina, Waibel (1995)   (7 citations)  (Correct)

....which, together with the lack of noise modeling, resulted in hypotheses that had very long TH phones to cover noise sequences or breathing noises. 3 THE DECODER IN JANUS The decoder is a Viterbi style two pass decoder: the first pass is a standard Viterbi search implemented roughly as described in [11]. The second pass is a word dependent N-best search [12] using the backtrace information from the first pass for efficient pruning [13]. First and second pass use a bigram language model. The output of the second pass is not a list of hypotheses but a wordgraph from which the hypothesis with the ....

Ney, H.: "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition". IASSP 1984, vol. 2, pp 263-271


Connectionist Probability Estimation in HMM Speech Recognition - Renals, Morgan (1992)   (3 citations)  (Correct)

....sequence of models to have generated the speech signal. An efficient algorithm for computing this state sequence is a dynamic programming algorithm known as Viterbi decoding. The Viterbi algorithm essentially traces the minimum cost (or maximum probability) path through a time state lattice (Ney, 1984) subject to the constraints imposed by the acoustic and language models. The Viterbi algorithm may also be used in training. In this case a Viterbi alignment is performed for a known word model sequence to obtain the optimal state segmentation. Given this optimal segmentation the output pdf ....

Ney, H. (1984). The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 32, 263--271.
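
The context above describes Viterbi decoding as tracing the maximum probability path through a time state lattice. A minimal Python sketch of that idea follows, working in log probabilities so that the best path maximizes a sum. The function name `viterbi` and the argument names `log_init`, `log_trans` and `log_emit` are illustrative assumptions, not taken from any of the cited papers, and all probabilities are assumed to be non-zero so that their logarithms exist.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) through a time-state lattice.

    log_init[s]     : log P(state s at frame 0)
    log_trans[r][s] : log P(state s at frame t | state r at frame t-1)
    log_emit[t][s]  : log p(observation at frame t | state s)
    """
    n_frames, n_states = len(log_emit), len(log_init)
    # score[s] = best log probability of any path ending in s at this frame
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptrs = []  # backptrs[t-1][s] = best predecessor of s at frame t
    for t in range(1, n_frames):
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
        score = new_score
        backptrs.append(ptr)

    # Pick the best final state, then follow back-pointers to frame 0.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path
```

The search visits every state at every frame, so it is exhaustive and returns the optimal path. For a known model sequence the same routine yields a state segmentation, which is how the Viterbi alignment mentioned in the context can be obtained.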


Bimodal Sensor Integration on the Example of "Speech-Reading" - Bregler, al. (1993)   (4 citations)  (Correct)

.... [Figure 2: Neural Network Architecture] In a MS TDNN the hierarchy continues above the phone state layer with the Multi State (MS) units [17] or better the DTW layer and word layer. In the forward pass of the network the DTW layer performs the Dynamic Time Warping algorithm [8] with the phoneme hypotheses as input to find the optimal path for the word models (German alphabet). The activations of the phone state units along the optimal paths are accumulated in the word layer. The word unit with the highest score represents the recognized letter. In a second learning phase ....

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing, April 1984.


Multimodal Interfaces - Waibel, Vo, Duchnowski, Manke (1995)   (10 citations)  (Correct)

....is the weighted sum of the activations of the corresponding phoneme state and viseme state units. Some visemes will therefore influence more than one of the combined layer units. In the final layer (which copies the activations from the combined layer) a one stage Dynamic Time Warping algorithm [18] is applied to find the optimal path through the phone hypotheses that corresponds to a sequence of letter models. Network training is done in two phases. First, the acoustic and visual sub nets are trained separately to fit phoneme viseme targets. Second, the complete network is trained to fit ....

....of words into a single network architecture. The MSTDNN, which was originally proposed for continuous speech recognition tasks [13, 6], combines shift invariant high accuracy pattern recognition capabilities of a TDNN [33, 9] with a non linear time alignment procedure (dynamic time warping) [18] for aligning strokes into character sequences. Figure 3a shows the basic architecture of our on line handwriting recognition system. This recognition system is integrated into the example application, which is shown in Figure 3b. The following sections describe the preprocessing techniques, the ....

H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. In Proc. ICASSP'84.


Developments in Large Vocabulary Dictation: The LIMSI.. - Gauvain, Lamel.. (1995)   (4 citations)  (Correct)

....the design of an efficient search algorithm to deal with the huge search space, especially when using language models with a longer span than two successive words, such as trigrams. The most commonly used approach for small and medium vocabulary sizes is the one pass frame synchronous beam search [12], which uses a dynamic programming procedure. This basic strategy has been extended by adding other features such as fast match [8, 1], N-best rescoring [16], progressive search [11] and one pass dynamic network decoding [13]. The two-step approach used in our system is based on the idea of ....

H. Ney, "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, 32(2), pp. 263-271, April 1984.


Large Vocabulary Off-Line Handwritten Word Recognition - Koerich (2002)   (Correct)

No context found.

H. Ney. The use of a one--stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(3):263--271, 1984.


A Time-Length Constrained Level Building Algorithm for.. - Koerich, Sabourin, Suen (2001)   (Correct)

No context found.

Ney H. The use of a one--stage dynamic programming algorithm for connected word recognition. IEEE Trans on ASSP 1984; 32: 263--271.


Bayesian Learning of Probabilistic Language Models - Stolcke (1994)   (54 citations)  (Correct)

No context found.

NEY,HERMANN. 1984. The use of a one-stage dynamic programming algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 32.263--271.


Morphological Earley-based Chart Parsing in Connected Word.. - Pampel (1996)   (Correct)

No context found.

Hermann Ney. 1984. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-32, No.2, pages 263-271, April 1984.


Speaker-Independent Continuous Speech Dictation - Gauvain Lamel (1994)   (13 citations)  (Correct)

No context found.

H.Ney (1984), "The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition," IEEE Trans. ASSP, Vol. 32, No. 2.


Speech Processing with Linear and Neural Network Models - Burrows (1996)   (1 citation)  (Correct)

No context found.

Ney, H. (1984), `The use of one-stage dynamic programming algorithm for connected word recognition', IEEE Transactions on Acoustics, Speech, and Signal Processing 32, 263--272.
