Goodman, J. (1996). Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 96), pages 143-152.
....The methods we propose show that the score for a parse can be calculated in polynomial time in spite of an exponentially large number of subtrees, and that efficient parameter estimation techniques exist which optimize discriminative criteria that have been well studied theoretically. Goodman [9] gives an ingenious conversion of the model in [2] to an equivalent PCFG whose number of rules is linear in the size of the training data, thus solving many of the computational issues. An exact implementation of Bod's parsing method is still infeasible, but Goodman gives an approximation that can ....
....at $n_1$/$n_2$, together with a choice at each child of simply taking the non-terminal at that child, or any one of the common sub-trees at that child. Thus there are $(1 + C(\mathrm{child}(n_1, i), \mathrm{child}(n_2, i)))$ possible choices at the $i$-th child. Note that a similar recursion is described by Goodman [9], Goodman's application being the conversion of Bod's model [2] to an equivalent PCFG. It is clear from the identity $h(T_1) \cdot h(T_2) = \sum_{n_1, n_2} C(n_1, n_2)$ and the recursive definition of $C(n_1, n_2)$ that $h(T_1) \cdot h(T_2)$ can be calculated in $O(|N_1||N_2|)$ time: the matrix of C(n_1, ....
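The recursion quoted above can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation. The `Node` class and the function names are hypothetical. The base cases are an assumption taken from the usual tree-kernel formulation, since the excerpt shows only the product step: the count is 0 when the productions at the two nodes differ, and words (leaves) root no subtree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical parse-tree node; a word is a Node with no children."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def production(self) -> Tuple[str, Tuple[str, ...]]:
        # The rule applied at this node, e.g. ("NP", ("D", "N")).
        return (self.label, tuple(c.label for c in self.children))


def nodes(tree: Node) -> List[Node]:
    """All nodes of the tree in pre-order."""
    out = [tree]
    for child in tree.children:
        out.extend(nodes(child))
    return out


def common_subtrees(n1: Node, n2: Node, memo: Dict[Tuple[int, int], int]) -> int:
    """C(n1, n2): number of common subtrees rooted at both n1 and n2."""
    key = (id(n1), id(n2))
    if key in memo:
        return memo[key]
    if not n1.children or not n2.children or n1.production() != n2.production():
        result = 0  # assumed base case: leaves or differing productions
    else:
        result = 1
        # Product over children of (1 + C(child(n1, i), child(n2, i))).
        for c1, c2 in zip(n1.children, n2.children):
            result *= 1 + common_subtrees(c1, c2, memo)
    memo[key] = result
    return result


def tree_kernel(t1: Node, t2: Node) -> int:
    """h(T1) . h(T2) as the sum of C(n1, n2) over all node pairs."""
    memo: Dict[Tuple[int, int], int] = {}
    return sum(common_subtrees(a, b, memo) for a in nodes(t1) for b in nodes(t2))
```

Because each pair of nodes is filled in at most once in `memo`, the number of table entries is bounded by the product of the two node counts, which is the source of the quadratic bound mentioned in the excerpt.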
Goodman, J. (1996). Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 96), pages 143-152.
....random derivations that should be sampled to reliably estimate the most probable parse increases exponentially with the sentence length (see Goodman 1998). It is therefore questionable whether Bod's sampling technique can be scaled to larger domains such as the WSJ portion in the Penn Treebank. Goodman (1996, 1998) showed how DOP1 can be reduced to a compact stochastic context-free grammar (SCFG) which contains exactly eight SCFG rules for each node in the training set trees. Although Goodman's method does still not allow for an efficient computation of the most probable parse (in fact, the problem of ....
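The "eight rules for each node" claim can be made concrete with a small sketch. The excerpt gives only the count, so the rule shapes and weights below are an assumption based on the reduction as it is commonly presented for binary trees: an internal node `A@j` with children `B@k` and `C@l` roots `a_j = (b_k + 1)(c_l + 1)` subtrees, and `a` is the total for all nodes labelled `A`. The function name and the `@` naming of node-specific non-terminals are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple

Rule = Tuple[str, Tuple[str, str], Fraction]


def eight_rules(A: str, j: int, B: str, k: int, C: str, l: int,
                b_k: int, c_l: int, a: int) -> List[Rule]:
    """Rules contributed by one binary node A@j -> B@k C@l (assumed scheme).

    b_k, c_l: number of subtrees rooted at the two children.
    a: number of subtrees rooted at any node labelled A in the treebank.
    """
    a_j = (b_k + 1) * (c_l + 1)  # subtrees rooted at this particular node
    Aj, Bk, Cl = f"{A}@{j}", f"{B}@{k}", f"{C}@{l}"
    rules: List[Rule] = []
    # Four right-hand sides, once for the node-specific symbol A@j
    # and once for the plain non-terminal A.
    for lhs, denom in ((Aj, a_j), (A, a)):
        rules.append((lhs, (B, C), Fraction(1, denom)))
        rules.append((lhs, (Bk, C), Fraction(b_k, denom)))
        rules.append((lhs, (B, Cl), Fraction(c_l, denom)))
        rules.append((lhs, (Bk, Cl), Fraction(b_k * c_l, denom)))
    return rules
```

Under this scheme the four weights for `A@j` sum to one, since `1 + b_k + c_l + b_k * c_l` equals `(b_k + 1)(c_l + 1)`. The grammar grows by a constant number of rules per treebank node, which is what makes its size linear in the training data.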
J. Goodman, 1996. Efficient Algorithms for Parsing the DOP Model, Proceedings Empirical Methods in Natural Language Processing, Philadelphia, PA.
....parsing has been investigated by several researchers over the past years. However, in the most general formulation of DOP, finding the most probable parse tree (MPP) has proven to be an NP-hard problem (Sima'an 96a). Therefore various approximated MPP search have been developed (Bod 92; Goodman 96; Chappelier & Rajman 00). However, another alternative consists in restricting the set of elementary trees used in the DOP grammar in such a way that finding the MPP is no longer NP-hard. The purpose of this contribution is to present and evaluate such an approach. The paper first provides a ....
....principle for DOP, we have tested it on two different corpora: the ATIS corpus (Hemphill et al. 90) and the Susanne #3 corpus (Sampson 94). As illustrated in table 2, these two corpora have quite different characteristics. Contrary to most of the experiments performed so far (Sima'an 96b; Goodman 96) we did not turn the trees into a binary form (Chomsky Normal Form) but tried instead to keep the corpora as close to the original annotated data as possible. In the same perspective, we did not restrict ourselves to parse Part-of-Speech tag sequences but worked on the original real word ....
J. Goodman. Efficient algorithms for parsing the DOP model. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 143--152, May 1996.
....size] can be very large, as it comprises all distinct subtrees seen in the corpus (Collins 1999: p. 442). Apart from the fact that Collins uses again his narrow definition of DOP, it is surprising that he does not mention Joshua Goodman's work on reducing the DOP1 model to an efficient PCFG (Goodman 1996, 98); this is surprising because in his conclusion Collins strongly recommends Goodman (1998) to the reader. Goodman (1996, 98) found a reduction of my so-called DOP1 model to an isomorphic probabilistic context-free grammar (PCFG) which contains exactly eight PCFG rules for each node in the ....
....fact that Collins uses again his narrow definition of DOP, it is surprising that he does not mention Joshua Goodman's work on reducing the DOP1 model to an efficient PCFG (Goodman 1996, 98); this is surprising because in his conclusion Collins strongly recommends Goodman (1998) to the reader. Goodman (1996, 98) found a reduction of my so-called DOP1 model to an isomorphic probabilistic context-free grammar (PCFG) which contains exactly eight PCFG rules for each node in the training set trees. Thus, the complexity of DOP1 is not exponential in the grammar size G, as Collins suggests, but linear. Or ....
Goodman, J. 1996. "Efficient Algorithms for Parsing the DOP Model", Proceedings Empirical Methods in Natural Language Processing, Philadelphia, PA.
.... is NP-hard (Sima'an 1996). The most probable parse can be estimated by iterative Monte Carlo sampling (Bod 1995) but efficient algorithms exist only for sub-optimal solutions such as the most likely derivation of a sentence (Bod 1995, Sima'an 1995) or the labelled recall parse of a sentence (Goodman 1996). So far, the syntactic DOP model has been tested on the ATIS corpus and the Wall Street Journal corpus, obtaining significantly better test results than other stochastic parsers (Charniak 1996). For example, Goodman (1998) compares the results of his DOP parser to a replication of Pereira ....
J. Goodman, 1996. "Efficient Algorithms for Parsing the DOP Model", Proceedings Empirical Methods in Natural Language Processing, Philadelphia (PA).
.... years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et al. 1994; Bod 1992 96; Bod et al. 1996a b; Bonnema 1996; Charniak 1996a b; Goodman 1996; Kaplan 1996; Rajman 1995a b; Scha 1990 92; Sekine & Grishman 1995; Sima'an et al. 1994; Sima'an 1995 96; Tugwell 1995). This approach, which we will call data-oriented processing or DOP, embodies the assumption that human language perception and production works with representations of concrete ....
....(Bod, 1992 95). Then we will give an overview of the other models that instantiate the DOP approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting subtrees from the corpus or for computing probabilities (Bod 1996b; Charniak 1996a b; Goodman 1996; Rajman 1995a b; Sekine & Grishman 1995; Sima'an 1995 96); other models use richer formalisms for their corpus annotations (van den Berg et al. 1994; Bod et al. 1996a b; Bonnema 1996; Kaplan 1996; Tugwell 1995). 2. A First Data-Oriented Processing System: DOP1. We now define an instance of the ....
J. Goodman, 1996. "Efficient Algorithms for Parsing the DOP Model", Proceedings Empirical Methods in Natural Language Processing, Philadelphia, PA.
....a parse forest for a sentence in DOP1. To select the most probable parse from a forest, Bod (1993 95) and Rajman (1995a,b) give Monte Carlo approximation algorithms. Sima'an (1995) gives an efficient polynomial algorithm for selecting the parse corresponding to the most probable derivation. In Goodman (1996), an efficient parsing strategy is given that maximizes the expected number of correct constituents. 1 The DOP1 model, and some variations of it, have been tested by Bod (1993 1995), Sima'an (1995 1996), Sekine & Grishman (1995), Goodman (1996) and Charniak (1996). 3 How does DOP perform on ....
....corresponding to the most probable derivation. In Goodman (1996) an efficient parsing strategy is given that maximizes the expected number of correct constituents. 1 The DOP1 model, and some variations of it, have been tested by Bod (1993 1995), Sima'an (1995 1996), Sekine & Grishman (1995), Goodman (1996), and Charniak (1996). 3 How does DOP perform on unedited data? Our first question is concerned with the performance of DOP1 on unedited data. To deal with this question, we use ATIS p o s trees as found in the Penn Treebank (Marcus et al. 1993). This paper contains the first published results ....
J. Goodman, 1996. "Efficient Algorithms for Parsing the DOP Model", Proceedings Empirical Methods in Natural Language Processing, Philadelphia.
....Thus, these algorithms improve performance not only on the measures that they were designed for, but also on related criteria. Furthermore, in some cases these techniques can make parsing fast when it was previously impractical. We have used the technique outlined in this paper in other work (Goodman, 1996) to efficiently parse the DOP model; in that model, the only previously known algorithm which summed over all the possible derivations was a slow Monte Carlo algorithm (Bod, 1993). However, by maximizing the .... [Figure: axes labelled Iteration Number and Labelled Tree ....]
Goodman, Joshua. 1996. Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. To appear.
.... Inside Out Joshua Goodman TR 07 98 June 1998 Computer Science Group Harvard University Cambridge, Massachusetts Parsing Inside Out A thesis presented by Joshua T. Goodman to The Division of Engineering and Applied Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the ....
Joshua Goodman. 1996a. Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 143--152, May.
Goodman, J. (1996). Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 96), pages 143-152.
Joshua Goodman[1996b], "Efficient algorithms for parsing the DOP model," in Proceedings of the Conference on Empirical Methods in Natural Language Processing , 143--152.
Goodman, J. (1996a). Efficient algorithms for parsing the DOP model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 143--152 Philadelphia, PA.