
CiteSeerX
Results 1 - 8 of 8

Unsupervised induction of labeled parse trees by clustering with syntactic features. COLING ’08

by Roi Reichart, 2008
"... We present an algorithm for unsupervised induction of labeled parse trees. The algorithm has three stages: bracketing, initial labeling, and label clustering. Bracketing is done from raw text using an unsupervised incremental parser. Initial labeling is done using a merging model that aims at minimizing the grammar description length. Finally, labels are clustered to a desired number of labels using syntactic features extracted from the initially labeled trees. The algorithm obtains 59% labeled f-score on the WSJ10 corpus, compared to 35% in previous work, and a substantial error reduction over ..."
Abstract - Cited by 10 (4 self)
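The three-stage pipeline described in this abstract ends with clustering the initial labels down to a desired number using syntactic features. As a rough illustration of that last stage only, here is a greedy agglomerative sketch; the feature names, cosine similarity, and merging strategy below are our own simplifications, not the paper's exact method:

```python
from collections import Counter
import math

def cosine(u, v):
    # Cosine similarity between two sparse count vectors (dicts/Counters).
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_labels(label_features, k):
    """Greedily merge the two most similar labels until k clusters remain.

    label_features: {label: counts of syntactic features observed with it}
    Returns {original_label: cluster_id}.
    """
    clusters = {lab: [lab] for lab in label_features}
    feats = {lab: Counter(c) for lab, c in label_features.items()}
    while len(clusters) > k:
        a, b = max(
            ((x, y) for x in clusters for y in clusters if x < y),
            key=lambda p: cosine(feats[p[0]], feats[p[1]]),
        )
        clusters[a].extend(clusters.pop(b))
        feats[a].update(feats.pop(b))
    return {lab: cid
            for cid, members in enumerate(sorted(clusters.values()))
            for lab in members}
```

For example, two induced labels that mostly occur under S with verbal heads would be merged before either is merged with a noun-headed label.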

Upper Bounds for Unsupervised Parsing with Unambiguous Non-Terminally Separated Grammars

by Franco M. Luque, Gabriel Infante-lopez
"... Unambiguous Non-Terminally Separated (UNTS) grammars have properties that make them attractive for grammatical inference. However, these properties do not state the maximal performance they can achieve when they are evaluated against a gold treebank that is not produced by an UNTS grammar. In this paper ..."
Abstract
... that optimizes a metric we define. We show a way to translate this score into an upper bound for F1. In particular, we show that the F1 parsing score of any UNTS grammar cannot exceed 82.2% when the gold treebank is the WSJ10 corpus. ...

PCFG Induction for Unsupervised Parsing and Language Modelling

by James Scicluna, Colin De La Higuera
"... The task of unsupervised induction of probabilistic context-free grammars (PCFGs) has attracted a lot of attention in the field of computational linguistics. Although it is a difficult task, work in this area is still very much in demand since it can contribute to the advancement of language parsing ..."
Abstract
... and infers correctly even from small samples. Our analysis shows that the types of grammars induced by our algorithm are, in theory, capable of modelling natural language. One of our experiments shows that our algorithm can potentially outperform the state of the art in unsupervised parsing on the WSJ10 ...
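A toy illustration of the PCFG formalism this abstract refers to (the grammar and probabilities below are invented for illustration, not induced by the paper's algorithm): each rule carries a probability, and the probability of a derivation is the product of the probabilities of the rules it uses.

```python
import math

# Toy PCFG: (lhs, rhs) -> probability. Probabilities of the rules
# expanding each nonterminal sum to 1.
pcfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.4,
    ("NP", ("fish",)): 0.6,
    ("VP", ("eats", "NP")): 1.0,
}

def derivation_prob(rules):
    # Probability of a derivation = product of its rule probabilities.
    return math.prod(pcfg[r] for r in rules)

# Derivation of "she eats fish":
p = derivation_prob([
    ("S", ("NP", "VP")),
    ("NP", ("she",)),
    ("VP", ("eats", "NP")),
    ("NP", ("fish",)),
])
# p == 1.0 * 0.4 * 1.0 * 0.6 == 0.24
```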

Reranking and self-training for parser adaptation

by David McClosky, Eugene Charniak, Mark Johnson - ACL-COLING, 2006
"... Statistical parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have shown vast improvements over the last 10 years. Much of this improvement, however, is based upon an ever-increasing number of features to be trained on (typically) the WSJ treebank data. This has led to concerns ..."
Abstract - Cited by 89 (2 self)
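The self-training idea named in the title can be sketched generically: train a model on labeled data, label an unlabeled pool with it, then retrain on the union. The loop below is a schematic classification version of that idea, not the authors' parser-and-reranker setup:

```python
def self_train(train_fn, labeled, unlabeled, rounds=1):
    """Generic self-training loop.

    train_fn: callable(list of (x, y)) -> model, where model(x) -> y.
    labeled:  list of (x, y) gold examples.
    unlabeled: list of x to be pseudo-labeled by the current model.
    """
    model = train_fn(labeled)
    for _ in range(rounds):
        # Label the unlabeled pool with the current model, then
        # retrain on gold + pseudo-labeled data.
        pseudo = [(x, model(x)) for x in unlabeled]
        model = train_fn(labeled + pseudo)
    return model
```

With a trivial nearest-labeled-neighbor `train_fn`, pseudo-labeled points densify the regions around each gold example, which is the effect self-training relies on.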

A Supervised Algorithm for Verb Disambiguation into VerbNet Classes

by Omri Abend, Roi Reichart
"... VerbNet (VN) is a major large-scale English verb lexicon. Mapping verb instances to their VN classes has been proven useful for several NLP tasks. However, verbs are polysemous with respect to their VN classes. We introduce a novel supervised learning model for mapping verb instances to VN classes, using rich syntactic features and class membership constraints. We evaluate the algorithm in both in-domain and corpus adaptation scenarios. In both cases, we use the manually tagged SemLink WSJ corpus as training data. For in-domain evaluation (testing on SemLink WSJ data), we achieve 95.9% accuracy, a 35.1% error ..."
Abstract - Cited by 10 (0 self)

Semantic Role Labeling via Instance-Based Learning

by Chi-san Althon Lin, Tony C. Smith
"... This paper demonstrates two methods to improve the performance of instance-based learning (IBL) algorithms for the problem of Semantic Role Labeling (SRL). Two IBL algorithms are utilized: k-Nearest Neighbor (kNN) and Priority Maximum Likelihood (PML) with a modified back-off combination method. The experimental data are the WSJ23 and Brown Corpus test sets from the CoNLL-2005 Shared Task. It is shown that applying the Tree-Based Predicate-Argument Recognition Algorithm (PARA) to the data as a preprocessing stage allows kNN and PML to deliver F1 scores of 68.61 and 71.02 respectively on WSJ23, and an F1 of 56 ..."
Abstract - Cited by 1 (1 self)
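A minimal sketch of the kNN component described here: label a candidate argument by majority vote among its k nearest training instances. The categorical features and mismatch-count distance below are illustrative placeholders, not the paper's feature set:

```python
from collections import Counter

def knn_label(train, query, k=3):
    """Majority-vote kNN over categorical feature dicts.

    train: list of (features, role) pairs, e.g. features like phrase
           type, position relative to the predicate, and voice.
    query: features dict for the candidate argument to label.
    Distance is the number of features on which two instances differ.
    """
    def dist(a, b):
        keys = set(a) | set(b)
        return sum(a.get(f) != b.get(f) for f in keys)

    nearest = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    votes = Counter(role for _, role in nearest)
    return votes.most_common(1)[0][0]
```

A pre-parser stage such as PARA would sit upstream of this, proposing the candidate constituents whose features are classified here.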

Segmental Neural Net Optimization for Continuous Speech Recognition

by Ying Zhao, Richard Schwartz, John Makhoul, George Zavaliagkos - In Advances in Neural Information Processing Systems 6, 1994
"... Previously, we had developed the concept of a Segmental Neural Net (SNN) for phonetic modeling in continuous speech recognition (CSR). This kind of neural network technology advanced the state of the art of large-vocabulary CSR, which employs Hidden Markov Models (HMM), for the ARPA 1000-word Resource Management corpus. More recently, we started porting the neural net system to a larger, more challenging corpus - the ARPA 20,000-word Wall Street Journal (WSJ) corpus. During the porting, we explored the following research directions to refine the system: i) training context-dependent models with a reg ..."
Abstract - Cited by 1 (1 self)

Continuous Speech Dictation at LIMSI

by J.L. Gauvain, L. F. Lamel, M. Adda-decker
"... One of our major research activities at LIMSI is multilingual, speaker-independent, large-vocabulary speech dictation. The multilingual aspect of this work is of particular importance in Europe, where each country has its own national language. Speaker independence and large vocabulary are c ..."
Abstract
... ,000 word lexicon with an unrestricted vocabulary test, the word error for WSJ is 10% and for BREF is 16%.
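The word-error figures quoted in this entry follow the standard definition of word error rate: the word-level Levenshtein distance (substitutions + insertions + deletions) between hypothesis and reference, divided by the reference length. A minimal implementation:

```python
def wer(reference, hypothesis):
    """Word error rate via dynamic-programming edit distance over words."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i  # delete all of r[:i]
    for j in range(len(h) + 1):
        dp[0][j] = j  # insert all of h[:j]
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / len(r)
```

For example, one substitution in a ten-word reference gives a 10% WER; note that WER can exceed 100% when the hypothesis has many insertions.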

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University