Non-Projective Dependency Parsing in Expected Linear Time
Cited by 21 (6 self)
We present a novel transition system for dependency parsing, which constructs arcs only between adjacent words but can parse arbitrary non-projective trees by swapping the order of words in the input. Adding the swapping operation changes the time complexity for deterministic parsing from linear to quadratic in the worst case, but empirical estimates based on treebank data show that the expected running time is in fact linear for the range of data attested in the corpora. Evaluation on data from five languages shows state-of-the-art accuracy, with especially good results for the labeled exact match score.
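The idea of building arcs only between adjacent words while reordering the input can be illustrated with a minimal sketch. It assumes an arc-standard-style system over a stack and a buffer with four unlabeled transitions; the names `shift`, `left_arc`, `right_arc`, `swap`, and `run` are illustrative and are not taken from the abstract. Token 0 is assumed to be an artificial root.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    """Parser configuration: stack, buffer, and the set of (head, dependent) arcs."""
    stack: list
    buffer: list
    arcs: set = field(default_factory=set)


def shift(c):
    # Move the first buffer item onto the stack.
    c.stack.append(c.buffer.pop(0))


def left_arc(c):
    # Top two stack items are i (below) and j (top): add arc j -> i, remove i.
    j = c.stack.pop()
    i = c.stack.pop()
    assert i != 0, "the artificial root cannot be a dependent"
    c.arcs.add((j, i))
    c.stack.append(j)


def right_arc(c):
    # Top two stack items are i (below) and j (top): add arc i -> j, remove j.
    j = c.stack.pop()
    i = c.stack[-1]
    c.arcs.add((i, j))


def swap(c):
    # Reorder the top two stack items by moving the lower one back to the buffer.
    j = c.stack.pop()
    i = c.stack.pop()
    assert 0 < i < j, "only swap real tokens that are still in input order"
    c.stack.append(j)
    c.buffer.insert(0, i)


def run(n, transitions):
    """Apply a given transition sequence to a sentence with tokens 1..n."""
    c = Config(stack=[0], buffer=list(range(1, n + 1)))
    for t in transitions:
        t(c)
    return c


# Non-projective tree over three tokens: 0 -> 2, 2 -> 3, 3 -> 1.
# The arc 3 -> 1 crosses token 2, so tokens 1 and 2 are swapped first.
config = run(3, [shift, shift, swap, shift, shift, left_arc, right_arc, right_arc])
print(sorted(config.arcs))  # [(0, 2), (2, 3), (3, 1)]
```

The transition sequence here is supplied by hand; a real parser would predict each transition with a trained classifier.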
Optimal reduction of rule length in Linear Context-Free Rewriting Systems
- In Proceedings of the 2009 Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-09), 2009
Generalized Higher-Order Dependency Parsing with Cube Pruning
- 2012
Cited by 13 (2 self)
State-of-the-art graph-based parsers use features over higher-order dependencies that rely on decoding algorithms that are slow and difficult to generalize. On the other hand, transition-based dependency parsers can easily utilize such features without increasing the linear complexity of the shift-reduce system beyond a constant. In this paper, we attempt to address this imbalance for graph-based parsing by generalizing the Eisner (1996) algorithm to handle arbitrary features over higher-order dependencies. The generalization is at the cost of asymptotic efficiency. To account for this, cube pruning for decoding is utilized (Chiang, 2007). For the first time, label tuple and structural features such as valencies can be scored efficiently with third-order features in a graph-based parser. Our parser achieves the state-of-the-art unlabeled accuracy of 93.06% and labeled accuracy of 91.86% on the standard test set for English, at a faster speed than a reimplementation of the third-order model of Koo et al. (2010).
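The core step behind cube pruning is a lazy, heap-driven enumeration of the best combinations of two sorted candidate lists. The sketch below shows only that step and is not the parser described in the abstract: the function name `k_best_combinations` and the additive `combine` default are assumptions for illustration. With a purely additive score the enumeration is exact; cube pruning becomes approximate once `combine` adds non-monotonic feature scores.

```python
import heapq


def k_best_combinations(left, right, k, combine=lambda a, b: a + b):
    """Return up to k (score, i, j) triples, best first.

    `left` and `right` are score lists sorted in descending order.
    The grid of pairs is explored lazily from the corner (0, 0).
    """
    if not left or not right:
        return []
    seen = {(0, 0)}
    # heapq is a min-heap, so scores are negated to pop the largest first.
    heap = [(-combine(left[0], right[0]), 0, 0)]
    best = []
    while heap and len(best) < k:
        neg_score, i, j = heapq.heappop(heap)
        best.append((-neg_score, i, j))
        # Push the two grid neighbours of the popped cell.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-combine(left[ni], right[nj]), ni, nj))
    return best


top3 = k_best_combinations([5, 3, 1], [4, 2], k=3)
print([score for score, _, _ in top3])  # [9, 7, 7]
```

Only the cells adjacent to already-popped cells are ever scored, which is what keeps the number of expensive feature evaluations small.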
Data-driven parsing with probabilistic linear context-free rewriting systems
- In Proceedings of the 23rd International Conference on Computational Linguistics, 2010
Cited by 11 (5 self)
This paper presents a first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS), together with context-summary estimates for parse items used to speed up parsing. LCFRS, an extension of CFG, can describe discontinuities both in constituency and dependency structures in a straightforward way and is therefore a natural candidate to be used for data-driven parsing. We evaluate our parser with a grammar extracted from the German NeGra treebank. Our experiments show that data-driven LCFRS parsing is feasible with a reasonable speed and yields output of competitive quality.
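To make the notion of discontinuity concrete: a constituent is discontinuous when the word positions it covers fall into more than one contiguous block, and in LCFRS terms the number of blocks is usually called the fan-out. The sketch below is only an illustration of that count; the helper name `yield_blocks` is hypothetical and does not come from the paper.

```python
def yield_blocks(positions):
    """Group word positions into maximal runs of consecutive integers.

    Returns a list of (start, end) pairs, both inclusive.
    """
    blocks = []
    for p in sorted(set(positions)):
        if blocks and p == blocks[-1][1] + 1:
            # Position extends the current block.
            blocks[-1] = (blocks[-1][0], p)
        else:
            # A gap was found, so a new block starts here.
            blocks.append((p, p))
    return blocks


# A constituent covering words 1-2 and 5-7 has two blocks (fan-out 2).
blocks = yield_blocks([1, 2, 5, 6, 7])
print(blocks, len(blocks))  # [(1, 2), (5, 7)] 2
```

A context-free constituent always yields exactly one block, which is why a plain CFG cannot represent such structures directly.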
Direct parsing of discontinuous constituents in German
- In Proceedings of the SPMRL workshop at NAACL HLT 2010, 2010
Cited by 4 (1 self)
Discontinuities occur especially frequently in languages with a relatively free word order, such as German. Generally, due to the long-distance dependencies they induce, they lie beyond the expressivity of Probabilistic CFG, i.e., they cannot be directly reconstructed by a PCFG parser. In this paper, we use a parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS), a formalism with high expressivity, to directly parse the German NeGra and TIGER treebanks. In both treebanks, discontinuities are annotated with crossing branches. Based on an evaluation using different metrics, we show that an output quality can be achieved which is comparable to the output quality of PCFG-based systems. In most constituency treebanks, sentence annotation is restricted to having the shape of trees without crossing branches, and the non-local dependencies induced by the discontinuities are modeled by an additional mechanism. In the Penn Treebank (PTB) (Marcus et al., 1994), e.g., this mechanism is a combination of special labels and empty nodes, establishing implicit additional edges. In the German TüBa-D/Z (Telljohann et al., 2006), additional edges are established by a combination of topological field annotation and special edge labels. As an example, Fig. 1 shows a tree from TüBa-D/Z with the annotation of (1). Note here the edge label ON-MOD on the relative clause which indicates that the subject of the sentence (alle Attribute) is modified.
Discontinuity and non-projectivity: Using mildly context-sensitive formalisms for data-driven parsing
- In Proceedings of TAG, 2010
Cited by 3 (1 self)
We present a parser for probabilistic Linear Context-Free Rewriting Systems and use it for constituency and dependency treebank parsing. The choice of LCFRS, a formalism with an extended domain of locality, enables us to model discontinuous constituents and non-projective dependencies in a straightforward way. The parsing results show that, firstly, our parser is efficient enough to be used for data-driven parsing and, secondly, its result quality for constituency parsing is comparable to the output quality of other state-of-the-art results, all while yielding structures that display discontinuous dependencies.
Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Cited by 2 (1 self)
We study the problem of finding the best head-driven parsing strategy for Linear Context-Free Rewriting System productions. A head-driven strategy must begin with a specified right-hand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
An Incremental Earley Parser for Simple Range Concatenation Grammar
Cited by 2 (0 self)
We present an Earley-style parser for simple range concatenation grammar, a formalism strongly equivalent to linear context-free rewriting systems. Furthermore, we present different filters which reduce the number of items in the parsing chart. An implementation shows that parses can be obtained in a reasonable time.
Efficient Parsing with Linear Context-Free Rewriting Systems
Previous work on treebank parsing with discontinuous constituents using Linear Context-Free Rewriting Systems (LCFRS) has been limited to sentences of up to 30 words, for reasons of computational complexity. There have been some results on binarizing an LCFRS in a manner that minimizes parsing complexity, but the present work shows that parsing long sentences with such an optimally binarized grammar remains infeasible. Instead, we introduce a technique which removes this length restriction, while maintaining a respectable accuracy. The resulting parser has been applied to a discontinuous treebank with favorable results.
PLCFRS Parsing of English Discontinuous Constituents
This paper proposes a direct parsing of non-local dependencies in English. To this end, we use probabilistic linear context-free rewriting systems for data-driven parsing, following recent work on parsing German. In order to do so, we first perform a transformation of the Penn Treebank annotation of non-local dependencies into an annotation using crossing branches. The resulting treebank can be used for PLCFRS-based parsing. Our evaluation shows that, compared to PCFG parsing with the same techniques, PLCFRS parsing yields slightly better results. In particular when evaluating only the parsing results concerning long-distance dependencies, the PLCFRS approach with discontinuous constituents is able to recognize about 88% of the dependencies of type *T* and *T*-PRN encoded in the Penn Treebank. Even the evaluation results concerning local dependencies, which can in principle be captured by a PCFG-based model, are better with our PLCFRS model. This demonstrates that by discarding information on non-local dependencies the PCFG model loses important information on syntactic dependencies in general.