Shallow Parsing with Conditional Random Fields
, 2003
Cited by 581 (8 self)
Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model. Improved training methods based on modern optimization algorithms were critical in achieving these results. We present extensive comparisons between models and training methods that confirm and strengthen previous results on shallow parsing and training methods for maximum-entropy models.
Maximum entropy Markov models for information extraction and segmentation
, 2000
Cited by 561 (18 self)
Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We present positive experimental results on the segmentation of FAQs.
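As a rough illustration of the per-state exponential model described in the abstract above, the sketch below computes the probability of a state given an observation and the previous state as a normalized exponential of weighted binary features. The weight layout, the feature names, the state names, and the function name `memm_transition_probs` are assumptions made for this example and are not taken from the paper.

```python
import math


def memm_transition_probs(weights, features, prev_state, states):
    """Return P(state | prev_state, observation) for every candidate state.

    weights:  dict mapping (prev_state, feature, state) -> float
              (a hypothetical layout chosen for this sketch)
    features: names of the binary features active for the current observation
    """
    # Unnormalized log-score of each candidate next state.
    scores = {
        s: sum(weights.get((prev_state, f, s), 0.0) for f in features)
        for s in states
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    exp_scores = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(exp_scores.values())  # normalizer for this (prev_state, observation)
    return {s: v / z for s, v in exp_scores.items()}


# Toy usage: two states and hand-set weights (illustrative values only).
states = ["question", "answer"]
weights = {
    ("answer", "ends_with_question_mark", "question"): 2.0,
    ("answer", "indented", "answer"): 1.0,
}
probs = memm_transition_probs(
    weights, ["ends_with_question_mark"], "answer", states
)
```

Because the distribution is normalized separately for each previous state, overlapping features can be added freely without having to model how the observations themselves are generated.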
A Machine Learning Approach to Coreference Resolution of Noun Phrases
, 2001
Cited by 270 (3 self)
In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are of "organization," "person," or other types. We evaluate our approach on common data sets (namely, the MUC-6 and MUC-7 coreference corpora) and obtain encouraging results, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches. Our system is the first learning-based system that offers performance comparable to that of state-of-the-art nonlearning systems on these data sets.
Kernel Methods for Relation Extraction
, 2002
Cited by 219 (0 self)
We present an application of kernel methods to extracting relations from unstructured natural language sources.
Shallow semantic parsing using Support Vector Machines
, 2004
Cited by 176 (9 self)
In this paper, we propose a machine learning algorithm for shallow semantic parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines which we show give an improvement in performance over earlier classifiers. We show performance improvements through a number of new features and measure their ability to generalize to a new test set drawn from the AQUAINT corpus.
Multimodal Video Indexing: A Review of the State-of-the-art
- Multimedia Tools and Applications
, 2003
Cited by 173 (19 self)
Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. In this paper we survey several methods aiming at automating this time- and resource-consuming process. Good reviews on single-modality-based video indexing have appeared in the literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in a collaborative fashion. Therefore, instead of separately treating the different information sources involved, and their specific algorithms, we focus on the similarities and differences between the modalities. To that end we put forward a unifying and multimodal framework, which views a video document from the perspective of its author. This framework forms the guiding principle for identifying index types, for which automatic methods are found in the literature. It furthermore forms the basis for categorizing these different methods.
Coupled Semi-Supervised Learning for Information Extraction
Cited by 137 (6 self)
We consider the problem of semi-supervised learning to extract categories (e.g., academic fields, athletes) and relations (e.g., PlaysSport(athlete, sport)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised training using only a few labeled examples is typically unreliable because the learning task is underconstrained. This paper pursues the thesis that much greater accuracy can be achieved by further constraining the learning task, by coupling the semi-supervised training of many extractors for different categories and relations. We characterize several ways in which the training of category and relation extractors can be coupled, and present experimental results demonstrating significantly improved accuracy as a result.
Named Entity Recognition using an HMM-based Chunk Tagger
, 2002
Cited by 111 (7 self)
This paper proposes an HMM-based chunk tagger, from which a named entity recognition system is built to combine four kinds of internal and external evidence: 1) simple internal features such as capitalization and digitalization; 2) internal semantic features of important triggers; 3) internal gazetteer features; 4) external macro context features.
Support Vector Learning for Semantic Argument Classification
, 2005
Cited by 106 (12 self)
The natural language processing community has recently experienced a growth of interest in domain-independent shallow semantic parsing, the process of assigning a WHO did WHAT to WHOM, WHEN, WHERE, WHY, HOW etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks like Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines which we show give a large improvement in performance over earlier classifiers. We show performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies examining which features are most important, comparing our classifier to other machine learning algorithms in the literature, and testing its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture which does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
Exploiting dictionaries in named entity extraction: Combining semi-Markov extraction processes and data integration methods
- In Proceedings of the ACM SIGKDD Conference
, 2004
Cited by 98 (6 self)
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries—more specifically, the problem of extending state-of-the-art NER systems by incorporating information about the similarity of extracted entities to entities in an external dictionary. This is difficult because most high-performance named entity recognition systems operate by sequentially classifying words as to whether or not they participate in an entity name; however, the most useful similarity measures score entire candidate names. To correct this mismatch we formalize a semi-Markov extraction process which relaxes the usual Markov assumptions. This process is based on sequentially classifying segments of several adjacent words, rather than single words. In addition to allowing a natural way of coupling NER and high-performance record linkage methods, this formalism also allows the direct use of other useful entity-level features, and provides a more natural formulation of the NER problem than sequential word classification. Experiments in multiple domains show that the new model can substantially improve extraction performance, relative to previously published methods for using external dictionaries in NER.
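To make the idea of classifying multi-word segments rather than single words more concrete, here is a minimal sketch of segment-level dynamic programming in the spirit of semi-Markov decoding. It is a simplification under stated assumptions: the score of a segment does not depend on the previous segment's label, the two labels `ENTITY` and `OTHER` are invented for the example, and the toy scorer uses exact dictionary lookup as a stand-in for the similarity measures the abstract mentions. None of the names below come from the paper.

```python
def semi_markov_decode(tokens, score_segment, max_len=4):
    """Return the highest-scoring segmentation as (start, end, label) triples.

    score_segment(tokens, start, end, label) scores the span tokens[start:end].
    Simplification: scores ignore the label of the preceding segment.
    """
    n = len(tokens)
    labels = ("ENTITY", "OTHER")  # hypothetical label set
    best = [float("-inf")] * (n + 1)  # best[i]: best score for tokens[:i]
    back = [None] * (n + 1)           # back[i]: (start, label) of last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        # Consider every segment of length <= max_len that finishes at `end`.
        for start in range(max(0, end - max_len), end):
            for label in labels:
                s = best[start] + score_segment(tokens, start, end, label)
                if s > best[end]:
                    best[end] = s
                    back[end] = (start, label)
    # Follow back-pointers from the end of the sentence to recover segments.
    segments, end = [], n
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    return segments[::-1]


# Toy external dictionary and scorer (illustrative values only).
DICTIONARY = {("new", "york", "university")}


def toy_score(tokens, start, end, label):
    phrase = tuple(t.lower() for t in tokens[start:end])
    if label == "ENTITY":
        return 3.0 if phrase in DICTIONARY else -1.0
    return 0.5 if end - start == 1 else -1.0


tokens = "She joined New York University yesterday".split()
result = semi_markov_decode(tokens, toy_score)
```

Because each candidate is a whole span, the scorer sees the full candidate name at once, which is what allows entity-level features such as dictionary similarity to be used directly.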