Yves Schabes and R.C. Waters. Stochastic lexicalized context-free grammars. Technical Report 93-12, Mitsubishi Electric Research Laboratories, 201 Broadway, Cambridge, MA 02139, 1993.
....seen It seems likely that the probabilities for the grammar must be revised in such a way as to reduce the probability of some sentences that were previously parsable. In that case, however, the reported probabilities do not make sense, as they may sum to more than 1.0. Like Hindle, Schabes [4,23] reports work on lexicalized grammars, in this case lexicalized tree adjoining grammars. As with Hindle, he faces a sparse data problem, as each word may be associated with several syntactic rules. His tack on this problem is to use more informative data. Rather than training his grammar with raw ....
Y. Schabes & R. Waters, "Stochastic lexicalized context-free grammar," in Proceedings of the August 1993 International Workshop on Parsing Technologies, 1993.
....2 [Bod, 1993b] STSGs are a limited form of Stochastic Tree Adjoining Grammars (STAGs) [Resnik, 1992], whereby only substitution is allowed. Polynomial parsing of TAGs has been considered in [Vijay-Shanker and Weir, 1993], and studies of STAGs and limited forms of it (e.g. [Schabes, 1992], [Schabes and Waters, 1993]) lead to polynomial algorithms for computing both the MPD and the probability of a sentence. However, as far as we know, there exist polynomial algorithms neither for computing the probability of a given parse, nor for finding the MPP of an input sentence. Moreover, there are no optimizations of ....
....to finding the MPD for a sentence if every P is exchanged with the operator Maximum on sets of reals. The polynomiality of its computation follows from that of the CYK and from the fact that the sets DF are all bounded in size by a constant. Algorithm 6 shows much similarity to the algorithm of [Schabes and Waters, 1993] for SLCFGs. The main difference is in our RB STSG representation which results in the derivation forest discussed above. Such a forest is essential for the algorithms and optimizations discussed in the next section. The most probable parse In this section we study the problem of finding the MPP ....
Schabes, Y. and Waters, R. (1993). Stochastic Lexicalized Context-Free Grammar. In Proceedings Third International Workshop on Parsing Technologies, Tilburg/Durbuy.
....in) These counts are normalized to make probabilities and the probabilities are used to make decisions. In this paper we take this idea further and apply word statistics at all points in the parse. While this seems like an important thing to do, there are relatively few papers in this area [1,10,11,16]. Section 6 compares our work to earlier research. 2 The Model First some notation. A sequence of words in a sentence from the j-th word to the k-th is denoted by w_{j,k}. A constituent, say an np, which dominates these words is denoted by np_{j,k}. If we wish to make the constituent type a ....
....of our PCFG is in line with these systems. Our PCFG results are also in line with those reported in [15], though their method of reporting results does not allow them to be shown in a graph like that of Figure 7. The work most similar to ours is that of Black et al. [1] and Schabes and Waters [16]. These papers, like ours, are concerned with using lexical information together with probabilities on rules to improve parsing performance. Furthermore, these papers also start by defining a language model in terms of the sum over the probabilities for the parses of a sentence, and selecting the ....
Schabes, Y. and Waters, R. C. Stochastic Lexicalized Context-Free Grammar. In Proceedings of the Third International Workshop on Parsing Technologies. 1993, 257--266.
....into each other if the adjunction would result in a wrapping auxiliary tree. The resulting system is strongly equivalent to CFGs, yet is fully lexicalized and still O(n^3) parsable, as shown by Schabes and Waters (1994). Furthermore, LTIGs can be parameterized to form probabilistic models (Schabes and Waters, 1993). Informally speaking, a parameter is associated with each possible adjunction or substitution operation between a tree and a node. For instance, suppose there are V left auxiliary trees that might adjoin into node j. Then there are V + 1 parameters associated with node j. 1 The best theoretical ....
Y. Schabes and R. Waters. 1993. Stochastic lexicalized context-free grammar. In Proceedings of the Third International Workshop on Parsing Technologies, pages 257--266.
....Tree Substitution Grammars are a limited form of Stochastic Tree Adjoining Grammars (STAGs) [Resnik, 1992], where only substitution is allowed. Polynomial parsing of TAGs has been considered in [Vijay-Shanker and Weir, 1993], and studies of STAGs and limited forms of it (e.g. [Schabes, 1992], [Schabes and Waters, 1993]) lead to polynomial algorithms for computing both the MPD and the probability of a sentence. However, as far as we know, there exist polynomial algorithms neither for computing the probability of a given parse, nor for finding the MPP of an input sentence. Moreover, there are no optimizations of ....
....to finding the MPD for a sentence if every P is exchanged with the operator Maximum on sets of reals. The polynomiality of its computation follows from that of the CYK and from the fact that the sets DF are all bounded in size by a constant. Algorithm 6 shows much similarity to the algorithm of [Schabes and Waters, 1993] for SLCFGs. The main difference is in our RB STSG representation which results in the derivation forest discussed above. This should reduce the effect of the grammar size on execution time as shown below. Moreover, such a forest is essential for the algorithms and optimizations discussed in the ....
Schabes, Y. and Waters, R. (1993). Stochastic Lexicalized Context-Free Grammar. In Proceedings Third International Workshop on Parsing Technologies, Tilburg/Durbuy.
....It attempts to seed the grammar in a favorable search space by first training it with data from an existing corpus. Section 4 discusses the induction strategies in more detail. A third factor that affects the learning process is the complexity of the data. In their study of parsing the WSJ, Schabes et al. (1993) have shown that a grammar trained on the Inside-Outside re-estimation algorithm can perform quite well on short simple sentences but falters as the sentence length increases. To take this factor into account, we perform our experiments ....
.... To induce a grammar from the sparsely bracketed training data previously described, we use a variant of the Inside-Outside re-estimation algorithm proposed by Pereira and Schabes (1992). The inferred grammars are represented in the Probabilistic Lexicalized Tree Insertion Grammar (PLTIG) formalism (Schabes and Waters, 1993; Hwa, 1998a), which is lexicalized and context-free equivalent. We favor the PLTIG representation for two reasons. First, it is amenable to the Inside-Outside re-estimation algorithm (the equations calculating the inside and outside probabilities for PLTIGs can be found in Hwa (1998b)). Second, ....
Y. Schabes and R. Waters. 1993. Stochastic lexicalized context-free grammar. In Proceedings of the Third International Workshop on Parsing Technologies, pages 257--266.
....Abstract This paper presents an optimization of a syntactic disambiguation algorithm for Data Oriented Parsing (DOP) (Bod 93) in particular, and for Stochastic Tree Substitution Grammars (STSGs) in general. The main advantage of this algorithm over existing alternatives (Bod 93; Schabes and Waters 93; Sima'an et al. 94) is that its time complexity is linear, instead of square, in grammar size (and cubic in sentence length). It is particularly suitable for natural language STSGs which have many deep elementary trees and a small underlying Context Free Grammar (CFG). A first implementation ....
....seems infeasible (a proof of this is still necessary, however). And secondly, we presented an algorithm, of time complexity cubic in sentence length, for computing the MPD. A similar algorithm for computing the MPD for Stochastic Lexicalized Context Free Grammars (SLCFGs) is presented in (Schabes and Waters 93). Both algorithms have time complexity square in grammar size. For natural language DOP grammars, these algorithms become unattractive as soon as the grammar takes realistic sizes. In this paper the algorithm for computing the MPD (Sima'an et al. 94) is refined to achieve time complexity of order ....
Y. Schabes and R.C. Waters. Stochastic Lexicalized Context-Free Grammar. In Proceedings Third International Workshop on Parsing Technologies, Tilburg/Durbuy, 1993.
....2 [Bod, 1993b] STSGs are a limited form of Stochastic Tree Adjoining Grammars (STAGs) [Resnik, 1992], whereby only substitution is allowed. Polynomial parsing of TAGs has been considered in [Vijay-Shanker and Weir, 1993], and studies of STAGs and limited forms of it (e.g. [Schabes, 1992], [Schabes and Waters, 1993]) lead to polynomial algorithms for computing both the MPD and the probability of a sentence. However, as far as we know, there exist polynomial algorithms neither for computing the probability of a given parse, nor for finding the MPP of an input sentence. Moreover, there are no optimizations of ....
....to finding the MPD for a sentence if every P is exchanged with the operator Maximum on sets of reals. The polynomiality of its computation follows from that of the CYK and from the fact that the sets DF are all bounded in size by a constant. Algorithm 6 shows much similarity to the algorithm of [Schabes and Waters, 1993] for SLCFGs. The main difference is in our RB STSG representation which results in the derivation forest discussed above. Such a forest is essential for the algorithms and optimizations discussed in the next section. The most probable parse In this section we study the problem of finding the MPP ....
Schabes, Y. and Waters, R. (1993). Stochastic Lexicalized Context-Free Grammar. In Proceedings Third International Workshop on Parsing Technologies, Tilburg/Durbuy.
....the probabilities of partial strings and substrings of sentences derived from a PCFG. They also describe how such probabilities can be used in conjunction with an island-driven probabilistic parser to score alternative acoustic hypotheses produced by a speech recognition system. Schabes and Waters [84] introduce stochastic lexicalized CFGs (SLCFGs), a context-free version of stochastic lexicalized tree adjoining grammars, and present algorithms for parsing, training of the probabilities and recovering the most probable parse of a given input for these kinds of grammars. The main advantage of ....
Y. Schabes and R. C. Waters. Stochastic Lexicalized Context-Free Grammar. In Proceedings of the 3rd International Workshop on Parsing Technologies (IWPT'93), pages 257--266, Tilburg, The Netherlands, 1993.
No context found.
Yves Schabes and R.C. Waters. Stochastic lexicalized context-free grammars. Technical Report 93-12, Mitsubishi Electric Research Laboratories, 201 Broadway, Cambridge, MA 02139, 1993.
No context found.
Yves Schabes and R.C. Waters. Stochastic lexicalized context-free grammars. In Proceedings of the Third International Workshop on Parsing Technologies, pages 257--266, Tilburg (the Netherlands), Durbuy (Belgium), August 1993.
No context found.
Yves Schabes and Richard C. Waters. Stochastic lexicalized context-free grammars. In Proceedings of the Third International Workshop on Parsing Technologies, 1993.
No context found.
Yves Schabes and Richard Waters. 1993. Stochastic lexicalized context-free grammar. In Proceedings of the Third International Workshop on Parsing Technologies, pages 257--266.
No context found.
Yves Schabes and Richard Waters. 1993. Stochastic Lexicalized Context-Free Grammar. In Proceedings Third IWPT, Tilburg/Durbuy.