Results 1 - 10 of 36
Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data
- In ICML, 2004
Cited by 171 (13 self)
In sequence modeling, we often wish to represent complex interactions between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when long-range dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linear-chain conditional random fields (CRFs) in which each time slice contains a set of state variables and edges---a distributed state representation as in dynamic Bayesian networks (DBNs)---and parameters are tied across slices. Since exact …
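The factorized time-slice structure this abstract describes can be sketched as a scoring function over two coupled label chains with tied parameters. This is a hypothetical illustration, not the authors' code: the feature names, the weight dictionary `w`, and the choice of POS/chunk chains are invented for the example.

```python
# Hypothetical DCRF-style scoring sketch: each time slice t carries two
# label variables (here pos[t] and chunk[t]) with observation factors,
# a within-slice edge, and chain edges whose weights are tied across t.
def dcrf_score(obs, pos, chunk, w):
    """Unnormalized log-score of a joint (pos, chunk) labeling."""
    s = 0.0
    for t in range(len(obs)):
        s += w.get(("obs-pos", obs[t], pos[t]), 0.0)
        s += w.get(("obs-chunk", obs[t], chunk[t]), 0.0)
        s += w.get(("pos-chunk", pos[t], chunk[t]), 0.0)  # within-slice edge
        if t > 0:
            # tied across-slice chain edges, as in a DBN
            s += w.get(("pos-pos", pos[t - 1], pos[t]), 0.0)
            s += w.get(("chunk-chunk", chunk[t - 1], chunk[t]), 0.0)
    return s
```

Normalizing this score over all joint labelings (the partition function) is what makes exact inference expensive, which is where the truncated sentence above is heading.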
Collective segmentation and labeling of distant entities in information extraction.
- 2004
Cited by 91 (17 self)
In information extraction, we often wish to identify all mentions of an entity, such as a person or organization. Traditionally, a group of words is labeled as an entity based only on local information. But information from throughout a document can be useful; for example, if the same word is used multiple times, it is likely to have the same label each time. We present a CRF that explicitly represents dependencies between the labels of pairs of similar words in a document. On a standard information extraction data set, we show that learning these dependencies leads to a 13.7% reduction in error on the field that had caused the most repetition errors.
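The "dependencies between the labels of pairs of similar words" are typically realized as skip edges added to the linear chain. A minimal sketch of how such edges might be collected, assuming (as is common but not stated here) that only repeated capitalized tokens are linked:

```python
from collections import defaultdict

def skip_edges(tokens):
    """Return index pairs of repeated capitalized tokens to connect with
    skip edges, so their labels can be encouraged to agree during joint
    inference. Illustrative sketch only, not the paper's code."""
    seen = defaultdict(list)
    edges = []
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            for j in seen[tok]:
                edges.append((j, i))  # link this mention to earlier ones
            seen[tok].append(i)
    return edges
```

Each returned pair would carry a pairwise factor in the CRF, which is what turns independent local decisions into collective labeling.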
Multi-level Boundary Classification for Information Extraction
- In ECML, 2004
Cited by 37 (1 self)
We investigate the application of classification techniques to the problem of information extraction (IE). In particular we use support vector machines and several different feature-sets to build a set of classifiers for IE. We show that this approach is competitive with current state-of-the-art IE algorithms based on specialized learning algorithms. We also introduce a new technique for improving the recall of our IE algorithm. This approach uses a two-level ensemble of classifiers to improve the recall of the extracted fragments while maintaining high precision. We show that this approach outperforms current state-of-the-art IE algorithms on several benchmark IE tasks.
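Boundary-classification IE reduces extraction to two token-level decisions: is this token the start of a field, and is it the end? The fragments are then read off by pairing the two sets of predictions. A simplified sketch of that pairing step, with an invented `max_len` window standing in for the system's actual pairing rules:

```python
def pair_boundaries(starts, ends, max_len=5):
    """Pair start-boundary and end-boundary predictions into fragments:
    each predicted start takes the nearest predicted end within max_len
    tokens. A hypothetical simplification of boundary-classifier IE."""
    fragments = []
    for s in sorted(starts):
        for e in sorted(ends):
            if s <= e < s + max_len:
                fragments.append((s, e))
                break  # nearest admissible end wins
    return fragments
```

The two-level ensemble described above would sit in front of this step, relaxing the boundary classifiers' thresholds to recover fragments the first level missed.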
Composition of Conditional Random Fields for Transfer Learning
- In Proceedings of HLT/EMNLP, 2005
Cited by 28 (2 self)
Many learning tasks have subtasks for which much training data exists. Therefore, we want to transfer learning from the old, general-purpose subtask to a more specific new task, for which there is often less data. While work in transfer learning often considers how the old task should affect learning on the new task, in this paper we show that it helps to take into account how the new task affects the old. Specifically, we perform joint decoding of separately trained sequence models, preserving uncertainty between the tasks and allowing information from the new task to affect predictions on the old task. On two standard text data sets, we show that joint decoding outperforms cascaded decoding.
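Joint decoding of two chain models amounts to running Viterbi over the product of their label spaces instead of decoding one model and feeding its single best output to the other. A minimal stand-in for that idea (the interface and coupling are invented; the paper's models and features are richer):

```python
import itertools

def joint_viterbi(obs, labels1, labels2, score):
    """Exact decoding over the product of two label sets. score(t, prev,
    cur, obs) returns a local log-score for assigning the label pair
    `cur` at position t (prev is None at t = 0); the caller supplies
    both models' scores plus any cross-task coupling."""
    states = list(itertools.product(labels1, labels2))
    best = {s: (score(0, None, s, obs), [s]) for s in states}
    for t in range(1, len(obs)):
        step = {}
        for s in states:
            q = max(states, key=lambda q: best[q][0] + score(t, q, s, obs))
            step[s] = (best[q][0] + score(t, q, s, obs), best[q][1] + [s])
        best = step
    return max(best.values(), key=lambda v: v[0])[1]  # best joint path
```

Because both tasks' uncertainties survive until the final argmax, evidence from the new task can override a locally attractive but globally poor labeling of the old task, which a one-way cascade cannot do.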
Adaptive Information Extraction
Cited by 16 (1 self)
The growing availability of on-line textual sources and the potential number of applications of knowledge acquisition from textual data have led to an increase in Information Extraction (IE) research. Some examples of these applications are the generation of databases from documents, as well as the acquisition of knowledge useful for emerging technologies like question answering, information integration, and others related to text mining. However, one of the main drawbacks of the application of IE is its intrinsic domain dependence. To reduce the high cost of manually adapting IE applications to new domains, experiments with different Machine Learning (ML) techniques have been carried out by the research community. This survey describes and compares the main approaches to IE and the different ML techniques used to achieve Adaptive IE technology.
IE evaluation: Criticisms and recommendations
- 2004
Cited by 9 (3 self)
We survey the evaluation methodology adopted in Information Extraction (IE), as defined in the MUC conferences and in later independent efforts applying machine learning to IE. We point out a number of problematic issues that may hamper the comparison between results obtained by different researchers. Some of them are common to other NLP tasks: e.g., the difficulty of exactly identifying the effects on performance of the data (sample selection and sample size), of the domain theory (features selected), and of algorithm parameter settings.
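The quantities whose cross-paper comparability this survey questions are usually precision, recall, and F1 computed MUC-style from matched and unmatched fillers. For concreteness, the standard definitions:

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts, as typically reported in IE evaluation.
    Zero denominators are mapped to 0.0."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Note that these numbers say nothing about what the survey identifies as the real confounds: how tp/fp/fn were counted (exact vs. partial match), how the test sample was selected, and which parameter settings were tuned.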
Information Extraction by Convergent Boundary Classification
- In AAAI-2004 Workshop on Adaptive Text Extraction and Mining, 2004
Cited by 7 (0 self)
We investigate the application of classification techniques to the problem of information extraction (IE). In particular we use support vector machines and several different feature-sets to build a set of classifiers for information extraction. We show that this approach is competitive with current state-of-the-art information extraction algorithms based on specialized learning algorithms. We also introduce a new technique for improving the recall of IE systems called convergent boundary classification. We show that this can give significant improvement in the performance of our IE system and gives a system with both high precision and high recall.
Robust Approach to Abbreviating Terms: A Discriminative Latent Variable Model with Global Information
Cited by 6 (4 self)
The present paper describes a robust approach for abbreviating terms. First, in order to incorporate non-local information into abbreviation generation tasks, we present both implicit and explicit solutions: the latent variable model, or alternatively, the label encoding approach with global information. Although the two approaches compete with one another, we demonstrate that they are also complementary. By combining the two approaches, experiments revealed that the proposed abbreviation generator achieved the best results for both Chinese and English. Moreover, we directly apply our generator to a very different task: abbreviation recognition. Experiments revealed that the proposed model worked robustly, and outperformed five out of six state-of-the-art abbreviation recognizers.
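The "label encoding" view mentioned above casts abbreviation generation as sequence labeling: tag each character of the full term keep or skip, then read the abbreviation off the kept characters. A sketch of that decoding step only (the label names K/S are invented; the paper's contribution is the discriminative model that predicts such labels):

```python
def apply_labels(term, labels):
    """Read an abbreviation off per-character labels: K keeps the
    character, S skips it. Decoding step only; predicting the labels
    is the hard part this sketch leaves out."""
    return "".join(ch for ch, lab in zip(term, labels) if lab == "K")
```

Non-local (global) information matters here because whether a character should be kept depends on the whole term, not just its neighbors.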
An Overview and Classification of Adaptive Approaches to Information Extraction
- Journal on Data Semantics, IV:172–212, LNCS 3730, 2005
Cited by 5 (1 self)
Most of the information stored in digital form is hidden in natural language texts. Extracting and storing it in a formal representation (e.g. in the form of relations in databases) allows efficient querying, easy administration and further automatic processing of the extracted data. The area of information extraction (IE) comprises techniques, algorithms and methods performing two important tasks: finding (identifying) the desired, relevant data and storing it in an appropriate form for future use. The rapidly increasing number and diversity of IE systems are evidence of continuous activity and growing attention in this field. At the same time it is becoming more and more difficult to survey the scope of IE, and to see the advantages of certain approaches and their differences from others. In this paper we identify and describe promising approaches to IE. Our focus is adaptive systems that can be customized for new domains through training or the use of external knowledge sources. Based on the observed origins and requirements of the examined IE techniques, a classification of different types of adaptive IE systems is established.
Segment-based Hidden Markov Models for Information Extraction
Cited by 4 (0 self)
Hidden Markov models (HMMs) are powerful statistical models that have found successful applications in Information Extraction (IE). In current approaches to applying HMMs to IE, an HMM is used to model text at the document level. This modelling might cause undesired redundancy in extraction, in the sense that more than one filler is identified and extracted. We propose to use HMMs to model text at the segment level, in which the extraction process consists of two steps: a segment retrieval step followed by an extraction step. In order to retrieve extraction-relevant segments from documents, we introduce a method to use HMMs to model and retrieve segments. Our experimental results show that the resulting segment HMM IE system not only achieves near-zero extraction redundancy, but also has better overall extraction performance than traditional document HMM IE systems.
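The first of the two steps, segment retrieval, can be caricatured as scoring each candidate segment under a generative model and keeping the best one, which is why at most one filler region survives to the extraction step. A deliberately simplified stand-in, using a bag-of-tokens emission model in place of the paper's segment HMM (`emit` and `floor` are invented names):

```python
import math

def best_segment(segments, emit, floor=1e-6):
    """Return the candidate segment with the highest average per-token
    log-probability. `emit` maps tokens to emission probabilities;
    unseen tokens get the smoothing value `floor`. A toy stand-in for
    HMM-based segment retrieval, not the paper's model."""
    def avg_log_score(seg):
        return sum(math.log(emit.get(tok, floor)) for tok in seg) / len(seg)
    return max(segments, key=avg_log_score)
```

Running extraction only inside the retrieved segment is what drives the near-zero redundancy reported above: the document-level alternative can emit a filler in every region that looks plausible.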