Introduction and overview
- COVEN Platform, Presence: Teleoperators and Virtual Environments, 2001
"... motor vehicle accidents ..."
Punctuation: Making a Point in Unsupervised Dependency Parsing
Abstract - Cited by 5 (4 self)
We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include punctuation marks in the grammar (as if they were words) do not perform well with Klein and Manning’s Dependency Model with Valence (DMV). Instead, we split a sentence at punctuation and impose parsing restrictions over its fragments. Our grammar inducer is trained on the Wall Street Journal (WSJ) and achieves 59.5% accuracy out-of-domain (Brown sentences with 100 or fewer words), more than 6% higher than the previous best results. Further evaluation, using the 2006/7 CoNLL sets, reveals that punctuation aids grammar induction in 17 of 18 languages, for an overall average net gain of 1.3%. Some of this improvement is from training, but more than half is from parsing with induced constraints, in inference. Punctuation-aware decoding works with existing (even already-trained) parsing models and always increased accuracy in our experiments.
Designing agreement features for realization ranking
- In Proc. Coling 2010: Posters, 2010
Abstract - Cited by 3 (3 self)
This paper shows that incorporating linguistically motivated features to ensure correct animacy and number agreement in an averaged perceptron ranking model for CCG realization helps improve a state-of-the-art baseline even further. Traditionally, these features have been modelled using hard constraints in the grammar. However, given the graded nature of grammaticality judgements in the case of animacy, we argue a case for the use of a statistical model to rank competing preferences. Though subject-verb agreement is generally viewed to be syntactic in nature, a perusal of relevant examples discussed in the theoretical linguistics literature (Kathol, 1999; Pollard and Sag, 1994) points toward the heterogeneous nature of English agreement. Compared to writing grammar rules, our method is more robust and allows incorporating information from diverse sources in realization. We also show that the perceptron model can reduce balanced punctuation errors that would otherwise require a post-filter. The full model yields significant improvements in BLEU scores on Section 23 of the CCGbank and makes many fewer agreement errors.
Perceptron Reranking for CCG Realization
Abstract - Cited by 1 (1 self)
This paper shows that discriminative reranking with an averaged perceptron model yields substantial improvements in realization quality with CCG. The paper confirms the utility of including language model log probabilities as features in the model, which prior work on discriminative training with log-linear models for HPSG realization had called into question. The perceptron model allows the combination of multiple n-gram models to be optimized and then augmented with both syntactic features and discriminative n-gram features. The full model yields a state-of-the-art BLEU score of 0.8506 on Section 23 of the CCGbank, to our knowledge the best score reported to date using a reversible, corpus-engineered grammar.
Corpus Conversion
Abstract
Implementation
• Transform grammar engineering from a one-shot task to an evolving, iterable process.
• Augment the CCGbank (Hockenmaier and Steedman (2007)) with deeper linguistic insights:
  ◦ Propbank roles (Boxwell and White (2008))
  ◦ Derivational restructuring for punctuation analysis (White and Rajkumar (2008))
  ◦ Head lexicalization for case-marking prepositions, named entity annotation, lemmatization

