Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling (2005)

by J. R. Finkel, T. Grenager, C. D. Manning
Venue: In ACL
Results 1 - 10 of 730

Distant supervision for relation extraction without labeled data

by Mike Mintz, Steven Bills, Rion Snow, Dan Jurafsky
Abstract - Cited by 239 (3 self)
Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting large numbers of relations from large corpora of any domain). Our model is able to extract 10,000 instances of 102 relations at a precision of 67.6%. We also analyze feature performance, showing that syntactic parse features are particularly helpful for relations that are ambiguous or lexically distant in their expression.
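The labeling recipe this abstract describes can be sketched in a few lines: any sentence mentioning a known related entity pair becomes a (noisy) training example for that relation. The knowledge base and corpus below are illustrative stand-ins, not the paper's data.

```python
# Toy sketch of distant supervision: a sentence that mentions both entities
# of a known KB relation is treated as a noisy training example for it.
# kb_pairs and corpus are hypothetical stand-ins for Freebase and a web corpus.

kb_pairs = {
    ("Steve Jobs", "Apple"): "founder-of",
    ("Barack Obama", "Hawaii"): "born-in",
}

corpus = [
    "Steve Jobs co-founded Apple in 1976 .",
    "Barack Obama was born in Hawaii .",
    "Apple released a new phone .",
]

def label_corpus(corpus, kb_pairs):
    """Pair each sentence with every KB relation whose entities it mentions."""
    examples = []
    for sentence in corpus:
        for (e1, e2), relation in kb_pairs.items():
            if e1 in sentence and e2 in sentence:
                # A real system would extract textual features here
                # (words between the entities, dependency path, NE tags).
                examples.append((sentence, e1, e2, relation))
    return examples

training_data = label_corpus(corpus, kb_pairs)
```

Note the built-in noise: every co-occurring sentence is labeled, whether or not it actually expresses the relation; the paper's classifier is what absorbs that noise.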

Citation Context

...features Every feature contains, in addition to the content described above, named entity tags for the two entities. We perform named entity tagging using the Stanford four-class named entity tagger (Finkel et al., 2005). The tagger provides each word with a label from {person, location, organization, miscellaneous, none}. 5.4 Feature conjunction Rather than use each of the above features in the classifier independe...

Named entity recognition in tweets: An experimental study.

by Alan Ritter, Sam Clark, Mausam, Oren Etzioni - In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2011

Abstract - Cited by 143 (11 self)
People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp

Citation Context

...outputs of T-POS, T-CHUNK and T-CAP in generating features. We report results at segmenting named entities in Table 6. Compared with the state-of-the-art news-trained Stanford Named Entity Recognizer (Finkel et al., 2005), T-SEG obtains a 52% increase in F1 score. 3.2 Classifying Named Entities Because Twitter contains many distinctive, and infrequent entity types, gathering sufficient training data for named entity ...

Design challenges and misconceptions in named entity recognition

by Lev Ratinov, Dan Roth - In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL), 2009

Abstract - Cited by 142 (8 self)
We analyze some of the fundamental design challenges and misconceptions that underlie the development of an efficient and robust NER system. In particular, we address issues such as the representation of text chunks, the inference approach needed to combine local NER decisions, the sources of prior knowledge and how to use them within an NER system. In the process of comparing several solutions to these challenges we reach some surprising conclusions, as well as develop an NER system that achieves 90.8 F1 score on the CoNLL-2003 NER shared task, the best reported result for this dataset.
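One of the design choices this paper examines is the representation of text chunks, comparing the common BIO scheme against the richer BILOU scheme. A minimal sketch of the conversion, assuming well-formed BIO input:

```python
def bio_to_bilou(tags):
    """Convert BIO chunk tags to BILOU: a chunk's last token becomes L-,
    and single-token chunks become U-. Assumes well-formed BIO input."""
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append("O")
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        ends = nxt != "I-" + label  # chunk ends unless the same label continues
        if prefix == "B":
            bilou.append(("U-" if ends else "B-") + label)
        else:  # "I"
            bilou.append(("L-" if ends else "I-") + label)
    return bilou
```

For example, `["B-PER", "I-PER", "O", "B-LOC"]` maps to `["B-PER", "L-PER", "O", "U-LOC"]`, giving the classifier explicit chunk-boundary classes to predict.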

Citation Context

...r and used the strongest provided model trained on the CoNLL03 data with distributional similarity features. The results we obtained on the CoNLL03 test set were consistent with what was reported in (Finkel et al., 2005). Our goal was to compare the performance of the taggers across several datasets. For the most realistic comparison, we have presented each system with a raw text, and relied on the system’s sentence...

The Stanford CoreNLP Natural Language Processing Toolkit

by Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, David McClosky - In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014

Abstract - Cited by 124 (9 self)
We describe the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. This toolkit is quite widely used, both in the research NLP community and also among commercial and government users of open source NLP technology. We suggest that this follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good quality analysis components, and not requiring use of a large amount of associated baggage.

Citation Context

... in text (that is, their likely case in well-edited text), where this information was lost, e.g., for all upper case text. This is implemented with a discriminative model using a CRF sequence tagger (Finkel et al., 2005). pos Labels tokens with their part-of-speech (POS) tag, using a maximum entropy POS tagger (Toutanova et al., 2003). lemma Generates the lemmas (base forms) for all tokens in the annotation. gender ...
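The annotator-pipeline design the toolkit paper describes (each stage reads a shared annotation object and adds its own layer) can be sketched as follows; the two toy stages are illustrative stand-ins, not CoreNLP's actual annotators.

```python
# Minimal sketch of an annotator pipeline: each stage receives the shared
# annotation dict and enriches it. The stage implementations are toy
# placeholders, not the toolkit's real tokenizer or POS tagger.

def tokenize(ann):
    ann["tokens"] = ann["text"].split()
    return ann

def pos_tag(ann):
    # Placeholder heuristic; a real pipeline runs a trained max-ent tagger.
    ann["pos"] = ["NNP" if t[0].isupper() else "NN" for t in ann["tokens"]]
    return ann

def run_pipeline(text, annotators):
    ann = {"text": text}
    for stage in annotators:  # stages run in dependency order
        ann = stage(ann)
    return ann

result = run_pipeline("Stanford CoreNLP toolkit", [tokenize, pos_tag])
```

The appeal of this design is that later annotators (lemma, ner, parse) only see a uniform annotation object, so stages can be added or swapped without touching the rest of the pipeline.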

Deriving a Large Scale Taxonomy from Wikipedia

by Simone Paolo Ponzetto, Michael Strube - In Proceedings of AAAI, 2007

Abstract - Cited by 112 (6 self)
We take the category system in Wikipedia as a conceptual network. We label the semantic relations between categories using methods based on connectivity in the network and lexicosyntactic matching. As a result we are able to derive a large scale taxonomy containing a large amount of subsumption, i.e. isa, relations. We evaluate the quality of the created resource by comparing it with ResearchCyc, one of the largest manually annotated ontologies, as well as computing semantic similarity between words in benchmarking datasets.

Citation Context

...ying these patterns, we use only the lexical head of the categories which were not identified as named entities. That is, if the lexical head of a category is identified by a Named Entity Recognizer (Finkel et al., 2005) as belonging to a named entity, e.g. Brands in YUM!BRANDS, we use the full category name, else we simply use the head, e.g. albums in MILES DAVIS ALBUMS. In order to ensure precision in applying the...
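The back-off rule in this citation context reduces to one conditional; the sketch below mirrors the quoted examples, with the NER decision passed in as a flag rather than computed.

```python
def category_label(category, head, head_is_named_entity):
    """Back-off rule from the quoted passage: if the lexical head is part of
    a named entity, keep the full category name; otherwise use the head."""
    return category if head_is_named_entity else head

# Hypothetical calls mirroring the examples in the citation context.
full = category_label("YUM! BRANDS", "Brands", True)      # head is in an NE
head = category_label("MILES DAVIS ALBUMS", "albums", False)
```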

Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

by Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, Daniel S. Weld
Abstract - Cited by 104 (15 self)
Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web’s natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint; for example, they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Freebase. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.

Citation Context

..., both relation-independent and relation-specific. 6.1 Data Generation We used the same data sets as Riedel et al. (2010) for weak supervision. The data was first tagged with the Stanford NER system (Finkel et al., 2005) and then entity mentions were found by collecting each continuous phrase where words were tagged identically (i.e., as a person, location, or organization). Finally, these phrases were matched to th...
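The mention-collection step in this citation context (merge maximal runs of identically tagged tokens into phrases) can be sketched as below; the token and tag lists are illustrative.

```python
def collect_mentions(tokens, tags):
    """Group maximal runs of identically tagged tokens into entity mentions,
    as in the preprocessing step quoted above. "NONE" tokens break runs."""
    mentions, current, current_tag = [], [], None
    for token, tag in zip(tokens, tags):
        if tag == current_tag and tag != "NONE":
            current.append(token)  # same tag: extend the current mention
        else:
            if current:  # close off the previous run
                mentions.append((" ".join(current), current_tag))
            current = [token] if tag != "NONE" else []
            current_tag = tag if tag != "NONE" else None
    if current:  # flush a mention that ends the sentence
        mentions.append((" ".join(current), current_tag))
    return mentions

tokens = ["Steve", "Jobs", "founded", "Apple", "."]
tags = ["PERSON", "PERSON", "NONE", "ORGANIZATION", "NONE"]
mentions = collect_mentions(tokens, tags)
```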

Robust Disambiguation of Named Entities in Text

by Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau
Abstract - Cited by 100 (10 self)
Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.
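Setting aside the joint dense-subgraph step, the three measures the abstract names combine into a per-candidate score; a minimal sketch, with weights and toy numbers that are purely illustrative:

```python
# Toy sketch of scoring one mention's candidate entities by combining the
# three measures named in the abstract. Weights and numbers are invented
# for illustration; the paper's actual method is a joint graph computation.

def score(prior, context_sim, coherence, w_prior=0.3, w_sim=0.3, w_coh=0.4):
    """Weighted combination of mention-entity popularity prior, context
    similarity, and coherence with the other mentions' candidates."""
    return w_prior * prior + w_sim * context_sim + w_coh * coherence

# Hypothetical candidates for the mention "Jordan" in an NLP paper.
candidates = {
    "Michael_Jordan_(basketball)": score(0.9, 0.2, 0.1),
    "Michael_I._Jordan_(scientist)": score(0.1, 0.8, 0.9),
}
best = max(candidates, key=candidates.get)
```

The example shows why the combination matters: the popularity prior alone would pick the basketball player, but context similarity and coherence override it.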

A Multi-Pass Sieve for Coreference Resolution

by Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, Christopher Manning
Abstract - Cited by 93 (7 self)
Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier’s entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
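The sieve architecture described above is essentially a fold over precision-ordered passes, each refining the previous pass's clusters. A minimal sketch; the two toy passes stand in for the paper's actual tiers.

```python
# Sketch of the multi-pass sieve: deterministic passes run from highest to
# lowest precision, each taking and returning a clustering of the mentions.
# The two passes below are illustrative stand-ins for the paper's tiers.

def exact_match_pass(mentions, clusters):
    """Highest precision: merge mentions with identical surface strings."""
    merged = {}
    for m in mentions:
        merged.setdefault(m.lower(), []).append(m)
    return list(merged.values())

def head_match_pass(mentions, clusters):
    """Lower precision: merge clusters whose head (here: last) words match."""
    merged = {}
    for cluster in clusters:
        head = cluster[0].lower().split()[-1]
        merged.setdefault(head, []).extend(cluster)
    return list(merged.values())

def run_sieve(mentions, passes):
    clusters = [[m] for m in mentions]  # start with singleton clusters
    for tier in passes:  # ordered highest-to-lowest precision
        clusters = tier(mentions, clusters)
    return clusters

mentions = ["Barack Obama", "Obama", "barack obama"]
clusters = run_sieve(mentions, [exact_match_pass, head_match_pass])
```

Because each pass only sees the previous pass's output, a high-precision merge can never be undone by a noisier later pass, which is the core guarantee the abstract claims.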

Citation Context

... next section). For a fair comparison with previous work, we do not use gold named entity labels or mention types but, instead, take the labels provided by the Stanford named entity recognizer (NER) (Finkel et al., 2005). 3.2 Evaluation Metrics We use three evaluation metrics widely used in the literature: (a) pairwise F1 (Ghosh, 2003) – computed over mention pairs in the same entity cluster; (b) MUC (Vilain et al.,...

Modeling relations and their mentions without labeled text

by Sebastian Riedel, Limin Yao, Andrew McCallum - In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part III, 2010

Abstract - Cited by 75 (3 self)
Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as a source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relation. Here we argue that this leads to noisy patterns that hurt precision, in particular if the knowledge base is not directly related to the text we are working with. We present a novel approach to distant supervision that can alleviate this problem based on the following two ideas: First, we use a factor graph to explicitly model the decision whether two entities are related, and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus and use Freebase as knowledge base. When compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve 31% error reduction.

Citation Context

...mentioned in the same sentence: again for the year 2007 we find about 170,000 such cases. 6.2 Preprocessing In order to find entity mentions in text we first used the Stanford named entity recognizer [12]. The NER tagger segments each document into sentences and classifies each token into four categories: PERSON, ORGANIZATION, LOCATION and NONE. We treat consecutive tokens which share the same categor...

Recognizing named entities in tweets

by Xiaohua Liu, Shaodian Zhang, Furu Wei, Ming Zhou - In Proc. of ACL, 2011

Abstract - Cited by 73 (1 self)

Citation Context

...smatch, current systems trained on non-tweets perform poorly on tweets, a new genre of text, which are short, informal, ungrammatical and noise prone. For example, the average F1 of the Stanford NER (Finkel et al., 2005), which is trained on the CoNLL03 shared task data set and achieves state-of-the-art performance on that task, drops from 90.8% (Ratinov and Roth, 2009) to 45.8% on tweets. Thus, building a domain s...
