Results 11–20 of 6,048
Using mutual information for selecting features in supervised neural net learning
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994
"... This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods based on linear relations (like the correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless ..."
Cited by 358 (1 self)
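The criterion described here amounts to ranking candidate features by their estimated mutual information with the class label. A minimal sketch of the idea (a plug-in estimate on discrete data; the variable names, noise rate, and sample size are invented for the example):

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px = np.mean(x == xv)
            py = np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=5000)
informative = labels ^ (rng.random(5000) < 0.1)   # tracks the label 90% of the time
noise = rng.integers(0, 2, size=5000)             # independent of the label

scores = {"informative": mutual_information(informative, labels),
          "noise": mutual_information(noise, labels)}
```

The informative feature scores far higher than the noise feature, even though both are binary with the same marginal distribution, which is the property that makes the criterion useful for subset selection.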
Minimum phone error and I-smoothing for improved discriminative training
In: Proc. ICASSP, 2002
"... In this paper we introduce the Minimum Phone Error (MPE) and Minimum Word Error (MWE) criteria for the discriminative training of HMM systems. The MPE/MWE criteria are smoothed approximations to the phone or word error rate respectively. We also discuss I-smoothing, which is a novel technique for smoothing discriminative training criteria using statistics for maximum likelihood estimation (MLE). Experiments have been performed on the Switchboard/Call Home corpora of telephone conversations with up to 265 hours of training data. It is shown that for the maximum mutual information estimation (MMIE ..."
Cited by 250 (13 self)
Using Maximum Entropy for Text Classification, 1999
"... This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle ..."
Cited by 326 (6 self)
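A conditional maximum entropy classifier over binary word features is equivalent to logistic regression fit by maximizing conditional log-likelihood. A tiny sketch of that equivalence (the toy bag-of-words vectors, labels, learning rate, and iteration count are invented for the example):

```python
import numpy as np

# Four toy "documents" as binary bag-of-words vectors over 3 words,
# with a binary class label; feature 0 happens to predict the class.
X = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 1, 0]], float)
y = np.array([1, 1, 0, 0])

w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ w))   # model P(class 1 | doc)
    w += 0.5 * X.T @ (y - p)       # gradient ascent on the log-likelihood

preds = (1 / (1 + np.exp(-X @ w)) > 0.5).astype(int)
```

At the optimum the model's expected feature counts match the empirical ones, which is exactly the maximum entropy constraint condition described in the abstract.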
Strictly Proper Scoring Rules, Prediction, and Estimation, 2007
"... Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ≠ F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide ..."
Cited by 373 (28 self)
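Propriety can be checked directly for a concrete rule. A small sketch for the (negatively oriented) Brier score on a Bernoulli outcome, showing that the expected score is maximized exactly when the issued forecast q equals the true probability p (the value p = 0.7 and the grid are invented for the example):

```python
import numpy as np

def expected_brier(p_true, q):
    """Expected Brier score of forecast q when the outcome is Bernoulli(p_true).
    Negatively oriented: larger is better."""
    return p_true * (-(1 - q) ** 2) + (1 - p_true) * (-(0 - q) ** 2)

p = 0.7
qs = np.linspace(0, 1, 1001)
best_q = qs[np.argmax([expected_brier(p, q) for q in qs])]
```

Because the maximizer is unique (the expected score is strictly concave in q), the Brier score is strictly proper in the paper's sense.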
Statistical shape influence in geodesic active contours
In Proc. 2000 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Hilton Head, SC, 2000
"... A novel method of incorporating shape information into the image segmentation process is presented. We introduce a representation for deformable shapes and define a probability distribution over the variances of a set of training shapes. The segmentation process embeds an initial curve as the zero level set of a higher dimensional surface, and evolves the surface such that the zero level set converges on the boundary of the object to be segmented. At each step of the surface evolution, we estimate the maximum a posteriori (MAP) position and shape of the object in the image, based on the prior ..."
Cited by 396 (4 self)
Mutual information and minimum mean-square error in Gaussian channels
IEEE TRANS. INFORM. THEORY, 2005
"... This paper deals with arbitrarily distributed finite-power input signals observed through an additive Gaussian noise channel. It shows a new formula that connects the input-output mutual information and the minimum mean-square error (MMSE) achievable by optimal estimation of the input given the out ..."
Cited by 288 (34 self)
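For the special case of standard Gaussian input, both sides of the paper's formula have closed forms, so the relation dI/dsnr = mmse(snr)/2 (in nats) can be checked numerically. A sketch under that special case only; the paper's result holds for arbitrary input distributions:

```python
import numpy as np

# Channel y = sqrt(snr) * x + n with x, n standard Gaussian:
# mutual information I(snr) = (1/2) ln(1 + snr) nats,
# and mmse(snr) = 1 / (1 + snr).
def mi(snr):
    return 0.5 * np.log1p(snr)

def mmse(snr):
    return 1.0 / (1.0 + snr)

snr, h = 2.0, 1e-6
dI_dsnr = (mi(snr + h) - mi(snr - h)) / (2 * h)   # central difference
# I-MMSE relation: dI/dsnr == 0.5 * mmse(snr)
```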
Maximum Conditional Mutual Information Projection for Speech Recognition
In Proc. Eurospeech, 2003
"... Linear discriminant analysis (LDA) in its original model-free formulation is best suited to classification problems with equal-covariance classes. Heteroscedastic discriminant analysis (HDA) removes this equal covariance constraint, and therefore is more suitable for automatic speech recognition (ASR ..."
"... of the LDA projection matrix is a maximum mutual information estimation problem in the lower-dimensional space with some constraints on the model of the joint conditional and unconditional probability density functions (PDF) of the features, and then, by relaxing these constraints, we develop a ..."
Cited by 5 (1 self)
A hidden Markov model for predicting transmembrane helices in protein sequences
In Proceedings of the 6th International Conference on Intelligent Systems for Molecular Biology (ISMB), 1998
"... A novel method to model and predict the location and orientation of alpha helices in membrane-spanning proteins is presented. It is based on a hidden Markov model (HMM) with an architecture that corresponds closely to the biological system. The model is cyclic with 7 types of states for helix core, ..."
"... and constraints involved. Models were estimated both by maximum likelihood and a discriminative method, and a method for reassignment of the membrane helix boundaries was developed. In a cross-validated test on single sequences, our transmembrane HMM, TMHMM, correctly predicts the entire topology for 77 ..."
Cited by 373 (9 self)
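The core machinery behind such a predictor is the HMM forward recursion, which scores a sequence under the model. A minimal two-state sketch (all probabilities invented for illustration; TMHMM's real architecture has many more states with tied parameters):

```python
import numpy as np

# Toy 2-state HMM: "sticky" states with biased emissions over symbols {0, 1}.
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
emit = np.array([[0.8, 0.2],    # state 0 mostly emits symbol 0
                 [0.3, 0.7]])   # state 1 mostly emits symbol 1

def forward_likelihood(obs):
    """P(obs) under the HMM, summing over all state paths (forward algorithm)."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

p_runs = forward_likelihood([0, 0, 0, 1, 1, 1])        # long runs, as the model prefers
p_alternating = forward_likelihood([0, 1, 0, 1, 0, 1])  # rapid switching
```

Because the states are sticky, sequences with long runs of one symbol (analogous to contiguous helix segments) get higher likelihood than rapidly alternating ones.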
A Maximum Entropy Approach to Adaptive Statistical Language Modeling
Computer, Speech and Language, 1996
"... An adaptive statistical language model is described, which successfully integrates long-distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's h ..."
"... , but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all ..."
Cited by 293 (12 self)
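The ME recipe described here — impose each knowledge source as an expectation constraint and pick the maximum entropy distribution satisfying all of them — can be illustrated on a toy problem with a single constraint. The ME solution is an exponential family p_i ∝ exp(λ·i), with λ solved by bisection (the die faces, target mean, and bisection bounds are invented for the example):

```python
import numpy as np

# Maximum entropy over die faces {1..6} subject to E[X] = 4.5.
faces = np.arange(1, 7)

def mean_for(lam):
    """Mean of the exponential-family distribution p_i ∝ exp(lam * i)."""
    w = np.exp(lam * faces)
    return (w / w.sum() * faces).sum()

lo, hi = -10.0, 10.0            # mean_for is increasing in lam, so bisect
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_for(mid) < 4.5:
        lo = mid
    else:
        hi = mid
lam = (lo + hi) / 2
p = np.exp(lam * faces)
p /= p.sum()                    # the ME distribution meeting the constraint
```

With more constraints (one per knowledge source, as in the paper), the same exponential form holds with one λ per constraint, fit by iterative scaling instead of bisection.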