Results 1 - 10
of
4,246
A Maximum-Entropy-Inspired Parser
, 1999
"... We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "stan- dard" se ..."
Abstract
-
Cited by 971 (19 self)
- Add to MetaCart
" sections of the Wall Street Journal tree- bank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innova- tion is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that let us successfully to test
A Maximum Entropy approach to Natural Language Processing
- COMPUTATIONAL LINGUISTICS
, 1996
"... The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we des ..."
Abstract
-
Cited by 1366 (5 self)
- Add to MetaCart
The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we
Maximum entropy markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Abstract
-
Cited by 561 (18 self)
- Add to MetaCart
, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation
, 2002
"... We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source -channel approach as a special case. All knowledge sources are treated as feature functions, which depend on the source language senten ..."
Abstract
-
Cited by 508 (30 self)
- Add to MetaCart
We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source -channel approach as a special case. All knowledge sources are treated as feature functions, which depend on the source language
A Maximum Entropy Model for Part-Of-Speech Tagging
, 1996
"... This paper presents a statistical model which trains from a corpus annotated with Part-OfSpeech tags and assigns them to previously unseen text with state-of-the-art accuracy(96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "features" t ..."
Abstract
-
Cited by 580 (1 self)
- Add to MetaCart
This paper presents a statistical model which trains from a corpus annotated with Part-OfSpeech tags and assigns them to previously unseen text with state-of-the-art accuracy(96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms
, 2002
"... We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a modific ..."
Abstract
-
Cited by 660 (13 self)
- Add to MetaCart
We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a
Fast and robust fixed-point algorithms for independent component analysis
- IEEE TRANS. NEURAL NETW
, 1999
"... Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. In this paper, we use a combination of two different approaches for linear ICA: Comon’s informat ..."
Abstract
-
Cited by 884 (34 self)
- Add to MetaCart
information-theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast (objective) functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information
TnT - A Statistical Part-Of-Speech Tagger
, 2000
"... Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison h ..."
Abstract
-
Cited by 540 (5 self)
- Add to MetaCart
Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison
Thumbs up? Sentiment Classification using Machine Learning Techniques
- IN PROCEEDINGS OF EMNLP
, 2002
"... We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three mac ..."
Abstract
-
Cited by 1101 (7 self)
- Add to MetaCart
machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging
Shallow Parsing with Conditional Random Fields
, 2003
"... Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluati ..."
Abstract
-
Cited by 581 (8 self)
- Add to MetaCart
optimization algorithms were critical in achieving these results. We present extensive comparisons between models and training methods that confirm and strengthen previous results on shallow parsing and training methods for maximum-entropy models.
Results 1 - 10
of
4,246