Results 1-8 of 8
Unsupervised induction of labeled parse trees by clustering with syntactic features. COLING ’08, 2008
Cited by 10 (4 self)
"... We present an algorithm for unsupervised induction of labeled parse trees. The algorithm has three stages: bracketing, initial labeling, and label clustering. Bracketing is done from raw text using an unsupervised incremental parser. Initial labeling is done using a merging model that aims at minimizing the grammar description length. Finally, labels are clustered to a desired number of labels using syntactic features extracted from the initially labeled trees. The algorithm obtains 59% labeled f-score on the WSJ10 corpus, as compared to 35% in previous work, and substantial error reduction over ..."
Upper Bounds for Unsupervised Parsing with Unambiguous Non-Terminally Separated Grammars
"... Unambiguous Non-Terminally Separated (UNTS) grammars have properties that make them attractive for grammatical inference. However, these properties do not state the maximal performance they can achieve when they are evaluated against a gold treebank that is not produced by an UNTS grammar. In this paper ... that optimizes a metric we define. We show a way to translate this score into an upper bound for the F1. In particular, we show that the F1 parsing score of any UNTS grammar cannot be beyond 82.2% when the gold treebank is the WSJ10 corpus. ..."
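The 82.2% figure above is an upper bound on bracketing F1, the harmonic mean of precision and recall over constituent spans. As a minimal sketch of how that score is computed (the span sets below are made up for illustration, not taken from the paper):

```python
# Hypothetical sketch: unlabeled bracketing F1 over constituent spans,
# the metric the UNTS upper-bound result is stated in terms of.
# A span is a (start, end) pair of token positions.

def bracket_f1(gold, predicted):
    """Return the F1 score between a gold and a predicted set of spans."""
    gold, predicted = set(gold), set(predicted)
    if not gold or not predicted:
        return 0.0
    matched = len(gold & predicted)
    precision = matched / len(predicted)
    recall = matched / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 2 of 3 predicted spans match 2 of 4 gold spans,
# so precision = 2/3, recall = 1/2, F1 = 4/7.
gold = [(0, 5), (0, 2), (2, 5), (3, 5)]
pred = [(0, 5), (0, 2), (1, 4)]
print(round(bracket_f1(gold, pred), 3))  # prints 0.571
```

Evaluation tools such as EVALB follow the same precision/recall scheme, with additional conventions about punctuation and unary chains.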
PCFG Induction for Unsupervised Parsing and Language Modelling
"... The task of unsupervised induction of probabilistic context-free grammars (PCFGs) has attracted a lot of attention in the field of computational linguistics. Although it is a difficult task, work in this area is still very much in demand, since it can contribute to the advancement of language parsing ... and infers correctly even from small samples. Our analysis shows that the type of grammars induced by our algorithm is, in theory, capable of modelling natural language. One of our experiments shows that our algorithm can potentially outperform the state-of-the-art in unsupervised parsing on the WSJ10. ..."
Reranking and self-training for parser adaptation. ACL-COLING, 2006
Cited by 89 (2 self)
"... Statistical parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have shown vast improvements over the last 10 years. Much of this improvement, however, is based upon an ever-increasing number of features to be trained on (typically) the WSJ treebank data. This has led to concerns ..."
A Supervised Algorithm for Verb Disambiguation into VerbNet Classes
Cited by 10 (0 self)
"... VerbNet (VN) is a major large-scale English verb lexicon. Mapping verb instances to their VN classes has been proven useful for several NLP tasks. However, verbs are polysemous with respect to their VN classes. We introduce a novel supervised learning model for mapping verb instances to VN classes, using rich syntactic features and class membership constraints. We evaluate the algorithm in both in-domain and corpus adaptation scenarios. In both cases, we use the manually tagged Semlink WSJ corpus as training data. For in-domain evaluation (testing on Semlink WSJ data), we achieve 95.9% accuracy, 35.1% error ..."
Semantic Role Labeling via Instance-Based Learning
Cited by 1 (1 self)
"... This paper demonstrates two methods to improve the performance of instance-based learning (IBL) algorithms for the problem of Semantic Role Labeling (SRL). Two IBL algorithms are utilized: k-Nearest Neighbor (kNN) and Priority Maximum Likelihood (PML) with a modified back-off combination method. The experimental data are the WSJ23 and Brown Corpus test sets from the CoNLL-2005 Shared Task. It is shown that applying the Tree-Based Predicate-Argument Recognition Algorithm (PARA) to the data as a preprocessing stage allows kNN and PML to deliver F1 scores of 68.61 and 71.02 respectively on the WSJ23, and an F1 of 56 ..."
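The kNN component of such an instance-based learner can be sketched in a few lines: classify a query by majority vote among its k closest training instances. The feature vectors and role labels below are hypothetical stand-ins, not the paper's actual SRL features.

```python
# Minimal sketch of the k-nearest-neighbor (kNN) classification step that
# instance-based learning systems build on. Features and labels are
# invented for illustration; a real SRL system would use rich syntactic
# features extracted from parse trees.
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k instances nearest to query,
    using Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [((0.0, 0.1), "ARG0"), ((0.2, 0.0), "ARG0"),
         ((0.9, 1.0), "ARG1"), ((1.0, 0.8), "ARG1"), ((0.8, 0.9), "ARG1")]
print(knn_classify(train, (0.85, 0.95)))  # prints "ARG1"
```

The back-off combination used by PML in the paper is a separate mechanism; this sketch covers only the kNN vote.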
Segmental Neural Net Optimization for Continuous Speech Recognition
In Advances in Neural Information Processing Systems 6, 1994
Cited by 1 (1 self)
"... Previously, we had developed the concept of a Segmental Neural Net (SNN) for phonetic modeling in continuous speech recognition (CSR). This kind of neural network technology advanced the state of the art of large-vocabulary CSR, which employs Hidden Markov Models (HMMs), for the ARPA 1000-word Resource Management corpus. More recently, we started porting the neural net system to a larger, more challenging corpus: the ARPA 20,000-word Wall Street Journal (WSJ) corpus. During the porting, we explored the following research directions to refine the system: i) training context-dependent models with a reg ..."
Continuous Speech Dictation at LIMSI
"... One of our major research activities at LIMSI is multilingual, speaker-independent, large-vocabulary speech dictation. The multilingual aspect of this work is of particular importance in Europe, where each country has its own national language. Speaker-independence and large vocabulary are c ... ,000-word lexicon with an unrestricted vocabulary test, the word error for WSJ is 10% and for BREF is 16%."
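The word-error figures quoted in the last two entries are word error rate (WER): the number of substitutions, insertions, and deletions needed to turn the recognizer's output into the reference transcript, divided by the number of reference words. A minimal sketch (the example sentences are invented, not from either paper):

```python
# Hypothetical sketch: word error rate (WER), the standard metric for the
# speech recognition results above. Computed as edit distance between the
# reference and hypothesis word sequences, normalized by reference length.

def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word ("the") out of six reference words: WER = 1/6.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 100% when the hypothesis contains many insertions.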