Mohri, M., 1996. On Some Applications of Finite-State Automata Theory to Natural Language Processing. Natural Language Engineering, (2):1--20.

This paper is cited in the following contexts:
Towards a Unified Framework for Sub-lexical and Supra-lexical.. - Mou (2002)   (Correct)

....of FSTs requires only local state knowledge. During the last decade, there has been a substantial surge in the use of finite state methods in linguistic processing both below and above the word level. Kaplan and Kay [49] contributed to the theory of finite state linguistic modeling. Mohri [68] [69] has recently addressed various aspects of natural language processing problems using a finite state modeling approach. Many researchers, including Koskenniemi [53] and Karttunen et al. [50], have successfully used finite state devices in computational morphology and other aspects of ....

M. Mohri, "On some applications of finite-state automata theory to natural language processing," Natural Language Engineering, vol. 2, pp. 1--20, 1996.


Turkish To Crimean Tatar Machine Translation System - Altintas (2001)   (Correct)

....generator and a post processor respectively, and they claim that they get successful results. 1.5. Finite State Techniques in Machine Translation Successful applications of finite state techniques in various areas of natural language processing have already been done [8]. Among the machine translation methods mentioned above, morphological analysis and the translation mechanisms are interesting to us. Finite state transducers read their input symbol by symbol and each time they read a symbol, they give a corresponding output and move to a new state. This ....

.... Swedish, Russian, English, Swahili, Turkish and Arabic have been developed [9]. For more compact information on the finite state morphological analysis process, see [9, 10, 11]. Apart from the morphological analysis process, large dictionaries can successfully be stored in finite state transducers [8]. Mohri gives experimental results for large finite state dictionaries and claims that the approach is efficient in both time and space. Since many words have their first few characters in common, they share the same path in the automaton. As a result, the storage required for a dictionary ....

Mehryar Mohri. On Some Applications of Finite-State Automata Theory to Natural Language Processing . Natural Language Engineering, 2:1-20, 1996
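
The prefix sharing described in the Altintas excerpt above can be illustrated with a small sketch. It assumes a plain trie built from nested dictionaries, which is simpler than the minimized automata and transducers of the cited paper; the names build_trie, count_states, contains and the word list are hypothetical.

```python
def build_trie(words):
    """Build a trie as nested dicts; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            # Reuse the existing child when the prefix is already stored.
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def count_states(node):
    """Count trie nodes (states), ignoring the end-of-word markers."""
    return 1 + sum(
        count_states(child) for key, child in node.items() if key != "$"
    )


def contains(root, word):
    """Follow one transition per character; accept only at a marked node."""
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node


# Hypothetical word list: all four words share the path for "sta".
WORDS = ["start", "stars", "state", "stated"]
TRIE = build_trie(WORDS)
TOTAL_CHARS = sum(len(w) for w in WORDS)
```

Here the four words contain 21 characters in total, but the trie needs fewer states because the common prefixes are stored once. A trie shares prefixes only; sharing common suffixes as well would require minimization, which this sketch does not attempt.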


Tokenization using DCG Rules - Covington (2000)   (Correct)

.... for example Deransart et al. 1991, Jensen and Wirth 1974, or practically any official language specification). However, the actual work of tokenization is normally done with finite state transition networks equivalent to regular expressions (Aho and Ullman 1977, Beesley and Karttunen forthcoming, Mohri 1996). Until recently, it was taken for granted that phrase structure rules were not, in themselves, executable. However, Prolog makes them executable in the form of definite clause grammar (DCG) rules (O'Keefe 1990, Shalfield 1999). 2 For example, the rule A → B C can go into Prolog as a --> b, ....

Mohri, Mehryar (1996) On some applications of finite-state automata theory to natural language processing. Online at http://www.research.att.com/#mohri/jnle.ps.gz.


Competing Patterns for Language Engineering - Methods to Handle.. - Sojka (2000)   (Correct)

....pattern database (dictionary problem) is blindingly fast linear with respect to the length of searching word as with other finite state approaches. 1 Introduction There is a need to store empirical language data in almost all areas on natural language engineering (LE). Finite state methods [21,13,16,17,10] have found their revival in the last decade. The theory of finite state automata (FSA) and transducers (FST) is a well developed part of theoretical computer science (for an overview, see e.g. [6,2]). As the finite state machines (FSM) needed tend to grow with increased demand for quality of ....

Mehryar Mohri. On some applications of finite-state automata theory to natural language processing. Natural Language Engineering, 2(1):61--80, 1996.


Compiling Regular Formalisms with Rule Features into Finite-State.. - Kiraz (1997)   (1 citation)  (Correct)

....transducers. When interpreted as acceptors with n-tuples of symbols on each transition, they can be determinized using standard algorithms (Hopcroft and Ullman, 1979). When interpreted as a transduction that maps an input to an output, they cannot always be turned into a deterministic form (see (Mohri, 1994; Roche and Schabes, 1995)). 5 Compilation with Rule Features This section shows how feature structures which are associated with rules and lexical entries can be incorporated into FSAs. 12 A special case can be added for epenthetic rules. Entry Feature Structure abcd f 1 ef f 2 ghi f 3 ....

Mohri, M. 1994. On some applications of finite-state automata theory to natural language processing.


Treatment of epsilon-Moves in Subset Construction - van Noord (1998)   (2 citations)  (Correct)

.... Three different minimisation algorithms are supported: Hopcroft's algorithm (Hopcroft, 1971), Hopcroft and Ullman's algorithm (Hopcroft and Ullman, 1979), and Brzozowski's algorithm (Brzozowski, 1962). Determinisation and minimisation of string to string and string to weight transducers (Mohri, 1996; Mohri, 1997). Visualisation. Support includes built in visualisation (Tcl/Tk, TeX PicTeX, TeX PsTricks, Postscript) and interfaces to third party graph visualisation software (Graphviz (dot), VCG, daVinci). Random generation of finite automata (an extension of the algorithm in Leslie ....

Mohri, Mehryar. 1996. On some applications of finite-state automata theory to natural language processing.


Minimization Algorithms for Sequential Transducers - Mohri (1997)   (17 citations)  (Correct)

....be used to minimize weighted transducers, transducers with both output string and weight, when combined with the automata minimization. The weighted minimization algorithms can be used to reduce the size of transducers with output numbers encountered in speech recognition [23], text indexation [21], arithmetic [18] or image processing [11]. As previously mentioned, in the case of automata with output weights, the quasi-determinization stage can be performed using classical single source shortest paths algorithms. The complexity of the whole minimization algorithm is therefore linear in the ....

....effective in reducing the size of large transducers. As an example, using that implementation, we could compile a large French dictionary of more than 800,000 entries (21 Mb) into a compact p-subsequential transducer of about 1.3 Mb in less than 20 minutes (including I/O's) on an HP 9000 755 [21]. When the input transducer is not deterministic, though equivalent to a p-subsequential transducer, a transducer determinization algorithm close to the classical powerset determinization can be used prior to the application of the minimization [21]. Minimization can further be used for ....

[Article contains additional citation context not shown here]

Mohri, Mehryar. 1996. On Some Applications of Finite-State Automata Theory to Natural Language Processing. Journal of Natural Language Engineering, 2.


Generalized Optimization Algorithm For Speech Recognition Transducers - Allauzen, Mohri (2003)   Self-citation (Mohri)   (Correct)

No context found.

M. Mohri. On some Applications of Finite-State Automata Theory to Natural Language Processing. Journal of Natural Language Engineering, 2:1--20, 1996.


Finitely Subsequential Transducers - Allauzen, Mohri (2003)   Self-citation (Mohri)   (Correct)

....of the size of the transducer, provided that the appropriate data structure is used for the representation of the transducer. Subsequential transducers can be generalized to finitely subsequential transducers, which are deterministic transducers augmented with a finite number of final output strings [13]. This generalization is necessary in many applications such as language processing to account for finite ambiguities [14]. Another advantage of the use of finitely subsequential transducers is that a general minimization algorithm is available for these machines [15], which can help reduce their ....

....of finitely subsequential transducers is that a general minimization algorithm is available for these machines [15], which can help reduce their size. There exists a general determinization algorithm that takes as input a nondeterministic transducer and outputs a finitely subsequential transducer [13]. That algorithm does not apply to all transducers. In fact, not all transducers admit equivalent finitely subsequential transducers. We present the first characterization of finitely subsequentiable transducers, i.e. transducers that are equivalent to finitely subsequential transducers. Our ....

[Article contains additional citation context not shown here]

M. Mohri. On some Applications of Finite-State Automata Theory to Natural Language Processing. Journal of Natural Language Engineering, 2:1-20, 1996.


p-Subsequentiable Transducers - Allauzen, Mohri   Self-citation (Mohri)   (Correct)

....systems is substantially increased when subsequential transducers [15], i.e. finite state transducers with deterministic input, are used. Subsequential machines can be generalized to p-subsequential transducers, which are transducers with deterministic input with p (p ≥ 1) final output strings [10]. This generalization is necessary in many applications such as language processing to account for finite ambiguities [11]. Not all transducers admit equivalent p-subsequential transducers however. We present the first characterization of p-subsequentiable transducers, i.e. transducers that admit ....

....report experimental results showing that these algorithms are practical in large vocabulary speech recognition applications. We first introduce the notation used in the rest of this paper, then briefly describe a generalized determinization algorithm for p-subsequential transducers introduced by [10], present a fundamental characterization theorem, and describe our experimental results. 2 Preliminaries Definition 1. A finite state transducer T = (Σ, Δ, Q, I, F, E, λ, ρ) is an 8-tuple where Σ is a finite input alphabet, Δ a finite output alphabet, Q a finite set of states, I ⊆ Q the set of ....

[Article contains additional citation context not shown here]

Mehryar Mohri. On some Applications of Finite-State Automata Theory to Natural Language Processing. Journal of Natural Language Engineering, 2:1-20, 1996.
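
The notion quoted above, a transducer with deterministic input and up to p final output strings, can be sketched as follows. This is a minimal illustration under assumed data structures (a dictionary of transitions and a dictionary of final output lists), not the representation or algorithm of the cited papers; apply_transducer and the toy machine are hypothetical.

```python
def apply_transducer(transitions, final_outputs, start, text):
    """Apply a p-subsequential transducer to an input string.

    transitions:   {(state, symbol): (next_state, output_string)}
                   Deterministic input: at most one entry per (state, symbol).
    final_outputs: {final_state: [up to p final output strings]}
    Returns the list of output strings, or [] if the input is rejected.
    """
    state = start
    pieces = []
    for symbol in text:
        if (state, symbol) not in transitions:
            return []  # no transition: input not accepted
        state, piece = transitions[(state, symbol)]
        pieces.append(piece)
    prefix = "".join(pieces)
    # Each final output string yields one result; a non-final state yields none.
    return [prefix + tail for tail in final_outputs.get(state, [])]


# Hypothetical 2-subsequential machine: "ab" maps to both "xy" and "xz".
TRANSITIONS = {(0, "a"): (1, "x"), (1, "b"): (2, "")}
FINAL_OUTPUTS = {2: ["y", "z"]}
RESULT = apply_transducer(TRANSITIONS, FINAL_OUTPUTS, 0, "ab")
```

Because the input side is deterministic, applying the machine takes one dictionary lookup per input symbol; the bounded ambiguity is confined to the final output strings.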


p-Subsequentiable Transducers - Allauzen, Mohri   Self-citation (Mohri)   (Correct)

....systems is substantially increased when subsequential transducers [14], i.e. finite state transducers with deterministic input, are used. Subsequential machines can be generalized to p-subsequential transducers, which are transducers with deterministic input with p (p ≥ 1) final output strings [10]. This generalization is necessary in many applications such as language processing to account for finite ambiguities [11]. Not all transducers admit equivalent p-subsequential transducers however. We present the first characterization of p-subsequentiable transducers, i.e. transducers that admit ....

....report experimental results showing that these algorithms are practical in large vocabulary speech recognition applications. We first introduce the notation used in the rest of this paper, then briefly describe a generalized determinization algorithm for p-subsequential transducers introduced by [10], present a fundamental characterization theorem, and describe our experimental results. 2 Preliminaries Definition 1. A finite state transducer T = (Σ, Δ, Q, I, F, E, λ, ρ) is an 8-tuple where Σ is a finite input alphabet, Δ a finite output alphabet, Q a finite set of states, I ⊆ Q the set of ....

[Article contains additional citation context not shown here]

Mehryar Mohri. On some Applications of Finite-State Automata Theory to Natural Language Processing. Journal of Natural Language Engineering, 2:1-20, 1996.


The Design Principles of a Weighted Finite-State Transducer .. - Mohri, Pereira, Riley (2000)   (9 citations)  Self-citation (Mohri)   (Correct)

.... composition, ε-removal, determinization and minimization, work without change over different semirings because of their foundation in the theory of rational power series [18]. For example, the same power series determinization algorithm and code [18] can be used to determinize transducers [17], weighted transducers, weighted automata encountered in speech processing [24] and weighted automata using the probability operations. To do so, one just needs to use the algorithm with the string semiring (Σ* ∪ {∞}, ∧, ·, ∞, ε) [21] in the case of transducers, with the semirings (R; ....

.... union, concatenation, Kleene closure, reversal, inversion and projection; Composition: transducer composition [22] and acceptor intersection, as well as taking the difference between a weighted acceptor and an unweighted DFA; Equivalence transformations: ε-elimination, determinization [17,18] and minimization for unweighted (both the general case [1] and the more efficient acyclic case [29]) and weighted acceptors and transducers [15,18], removal of inaccessible states and transitions; Search: best path [20], n-best paths, pruning (remove all states and transitions that occur only on ....

M. Mohri. On some applications of finite-state automata theory to natural language processing. Journal of Natural Language Engineering, 2:1--20, 1996.


Weighted Automata in Text and Speech Processing - Mohri, Pereira, Riley (1996)   (15 citations)  Self-citation (Mohri)   (Correct)

....is deterministic. 6 Conclusion We sketched the application of weighted automata in speech recognition and some of the main algorithms that support it. Weighted finite state automata can also be used in text based applications such as the segmentation of Chinese text [13] and text indexation [9]. Acknowledgments Hiyan Alshawi, Adam Buchsbaum, Emerald Chung, Don Hindle, Andrej Ljolje, Steven Phillips and Richard Sproat have commented extensively on these ideas, tested many versions of our algorithms, and contributed a variety of improvements. Details of our joint work and their own ....

Mehryar Mohri, `On some applications of finite-state automata theory to natural language processing: Representation of morphological dictionaries, compaction, and indexation ', Technical Report IGM 94-22, Institut Gaspard Monge, Noisy-le-Grand, (1994).


Network Optimizations for Large Vocabulary Speech Recognition - Mohri, Riley (1998)   (3 citations)  Self-citation (Mohri)   (Correct)

....sections, we briefly illustrate the determinization of weighted acceptors, the determinization of weighted transducers, and the minimization of weighted acceptors. We have given elsewhere a full description of these algorithms, including their mathematical basis and proofs of their soundness [9,10,12,13]. 2.1 Determinization of Weighted Acceptors A weighted acceptor or transducer A is said to be deterministic 2 iff at each state of A there exists at most one transition labeled with any given element of the input alphabet. Figure 1 gives an example of a non deterministic weighted acceptor: at ....

M. Mohri. On some applications of finite-state automata theory to natural language processing. Journal of Natural Language Engineering, 2:1--20, 1996.
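
The definition of determinism quoted in the excerpt above (at most one transition per input label at each state) can be checked mechanically. The sketch below assumes transitions are given as (source, label, weight, destination) tuples; the function names and the example automaton are hypothetical and do not reproduce Figure 1 of the citing paper.

```python
from collections import defaultdict


def nondeterministic_states(transitions):
    """Return {state: {label: count}} for states that repeat an input label.

    transitions: iterable of (source, label, weight, destination) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for source, label, _weight, _destination in transitions:
        counts[source][label] += 1
    clashes = {}
    for state, labels in counts.items():
        repeated = {label: n for label, n in labels.items() if n > 1}
        if repeated:
            clashes[state] = repeated
    return clashes


def is_deterministic(transitions):
    """True iff every state has at most one transition per input label."""
    return not nondeterministic_states(transitions)


# Hypothetical weighted acceptor: state 0 has two transitions labeled "a".
EXAMPLE = [
    (0, "a", 1.0, 1),
    (0, "a", 2.0, 2),
    (0, "b", 3.0, 1),
    (1, "b", 1.0, 3),
    (2, "b", 3.0, 3),
]
```

The check only detects non-determinism; it does not determinize the automaton. Weights are ignored on purpose, since the quoted definition depends on input labels alone.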


Weighted Determinization and Minimization for Large Vocabulary .. - Mohri, Riley (1997)   (11 citations)  Self-citation (Mohri)   (Correct)

....give a detailed description of these algorithms here. In the following sections, we briefly illustrate these algorithms in the particular case of weighted automata. We have given elsewhere a detailed description of these algorithms, including their mathematical basis and proofs of their soundness [3, 4, 5, 6]. 2.1. WEIGHTED DETERMINIZATION A weighted automaton A is said to be deterministic 2 iff at each state of A there exists at most one transition labeled with any given element of the input alphabet. Figure 1 gives an example of a non deterministic weighted automaton: at state 0, for instance, ....

M. Mohri. On some applications of finite-state automata theory to natural language processing. Journal of Natural Language Engineering, 2:1--20, 1996.


Full Expansion Of Context-Dependent Networks In.. - Mohri, Riley.. (1998)   (11 citations)  Self-citation (Mohri)   (Correct)

....of general optimization techniques for weighted automata and do not require any (possibly difficult) changes to the decoder. 3. ALGORITHM Building the fully expanded C ∘ L ∘ G network uses several novel algorithms: efficient transducer composition [9], weighted transducer determinization [7, 8] and ε-removal for weighted automata. Weighted transducer determinization ensures that distinct arcs leaving a state have distinct input labels. Clearly, a necessary condition for transducer determinization is that the initial transducer maps each input sequence to at most one output sequence. ....

....determinization ensures that distinct arcs leaving a state have distinct input labels. Clearly, a necessary condition for transducer determinization is that the initial transducer maps each input sequence to at most one output sequence. But this is not sufficient: the mapping must be sequential [1, 7]. These conditions may be somewhat relaxed to mappings with bounded ambiguity (or p-subsequential [7]). The purpose of applying determinization to the model network is to decrease the number of alternative arcs that need to be considered during decoding. In many cases, the size of the model is ....

[Article contains additional citation context not shown here]

M. Mohri. On some applications of finite-state automata theory to natural language processing. Journal of Natural Language Engineering, 2:1--20, 1996.


A Rational Design for a Weighted Finite-State Transducer.. - Mohri, Pereira, Riley (1998)   (21 citations)  Self-citation (Mohri)   (Correct)

.... union, concatenation, Kleene closure, reversal, inversion and projection; Composition: transducer composition [17] and acceptor intersection, as well as taking the difference between a weighted acceptor and an unweighted DFA; Equivalence transformations: ε-elimination, determinization [14, 15] and minimization for unweighted (both the general case [1] and the more efficient acyclic case [20]) and weighted acceptors and transducers [12, 15], removal of inaccessible states and transitions; Search: best path, n-best paths, pruning (remove all states and transitions that occur only on ....

M. Mohri. On some applications of finite-state automata theory to natural language processing. Journal of Natural Language Engineering, 2:1--20, 1996.


A Web-based Text Corpora Development System - Bohus, Boldea (2000)   (Correct)

No context found.

Mohri, M., 1996. On Some Applications of Finite-State Automata Theory to Natural Language Processing. Natural Language Engineering, (2):1--20.


Unknown -   (Correct)

No context found.

M. Mohri, On Some Applications of Finite-State Automata Theory to Natural Language Processing, Natural Language Engineering, Vol. 2(1), 1-20, 1996.


Directed Replacement - Karttunen (1996)   (20 citations)  (Correct)

No context found.

Mehryar Mohri. 1994. On Some Applications of Finite-State Automata Theory to Natural Language Processing. Technical Report 94-22. L'Institute Gaspard Monge. Université de Marne-la-Vallée. Noisy Le Grand.


Dependency Parsing with an Extended Finite State Approach - Oflazer (1999)   (6 citations)  (Correct)

No context found.

Mehryar Mohri. 1996. On some applications of finite-state automata theory to natural language processing.


Introduction to Finite-State Devices in Natural Language.. - Roche, Schabes (1996)   (1 citation)  (Correct)

No context found.

Mohri, Mehryar. 1994b. On some applications of finite-state automata theory to natural language processing.

CiteSeer.IST - Copyright Penn State and NEC