Feature Extraction by Non-Parametric Mutual Information Maximization
Kari Torkkola, Motorola Labs 7700...
Journal of Machine Learning Research 3 (2003) 1415-1438. Submitted 5/02; Published 3/03.

Abstract: We present a method for learning discriminative feature transforms using as criterion the mutual information between class labels and transformed features. Instead of a commonly used mutual information measure based on Kullback-Leibler divergence, we use a quadratic divergence measure, which allows us to make an efficient non-parametric implementation and requires no prior assumptions about class densities. In addition to linear transforms, we also discuss nonlinear transforms that are...
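
As a rough illustration of the criterion described in the abstract, the sketch below estimates a quadratic mutual information between class labels and features using Gaussian Parzen windows. It assumes the quadratic measure is the integrated squared difference between the joint density p(c, y) and the product P(c)p(y); the paper's exact estimator may differ. The function names, the kernel width `sigma`, and the `project` helper are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_kernel(diff, var):
    """Isotropic Gaussian density N(diff; 0, var * I) evaluated at the vector diff."""
    dim = len(diff)
    sq_norm = sum(x * x for x in diff)
    return math.exp(-sq_norm / (2.0 * var)) / ((2.0 * math.pi * var) ** (dim / 2.0))


def quadratic_mutual_information(features, labels, sigma=1.0):
    """Parzen-window estimate of sum_c integral (p(c, y) - P(c) p(y))^2 dy.

    Assumption: each density is a Parzen estimate with Gaussian kernels of
    variance sigma^2, so every integral of a product of two kernels reduces
    to a single Gaussian of variance 2 * sigma^2 evaluated at a pairwise
    difference of samples.
    """
    n = len(features)
    var = 2.0 * sigma ** 2
    kernel = [
        [gaussian_kernel([a - b for a, b in zip(yi, yj)], var) for yj in features]
        for yi in features
    ]
    priors = {c: labels.count(c) / n for c in set(labels)}

    # Within-class interactions: integral of p(c, y)^2, summed over classes.
    v_in = sum(
        kernel[i][j] for i in range(n) for j in range(n) if labels[i] == labels[j]
    ) / n ** 2
    # All-pairs interactions weighted by the squared class priors.
    v_all = sum(p * p for p in priors.values()) * sum(map(sum, kernel)) / n ** 2
    # Cross term: each sample against all samples, weighted by its class prior.
    v_btw = sum(priors[labels[i]] * sum(kernel[i]) for i in range(n)) / n ** 2

    return v_in + v_all - 2.0 * v_btw


def project(points, direction):
    """Hypothetical helper: linear transform of each point onto one direction."""
    return [(sum(a * b for a, b in zip(p, direction)),) for p in points]
```

A linear feature transform can then be scored by projecting the data and evaluating the criterion on the result. A direction along which the classes separate should score higher than one along which they overlap, and a gradient-based search over the transform would aim to increase this value.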

Active bibliography (related documents):
1.1:   Linear Discriminant Analysis in Document Classification - Torkkola (2001)
0.9:   SVM Decision Boundary Based Discriminative Subspace Induction - Zhang, Liu (2002)
0.5:   Feature Selection by Maximum Marginal Diversity - Vasconcelos (2003)

BibTeX entry:

@article{ torkkola-journal,
  author = "Kari Torkkola",
  title = "Feature Extraction by Non-Parametric Mutual Information Maximization",
  journal = "Journal of Machine Learning Research",
  volume = "3",
  pages = "1415--1438",
  year = "2003",
  url = "citeseer.ist.psu.edu/766099.html" }

Citations (may not include all citations):
1662   Neural Networks for Pattern Recognition - Bishop - 1995
1447   A mathematical theory of communication - Shannon - 1948
897   Introduction to statistical pattern recognition - Fukunaga - 1990
428   IEEE Transactions on Information Theory - Hellman, Raviv et al. - 1970
239   Pattern recognition: A statistical approach - Devijver, Kittler - 1982
209   the estimation of probability density function and the mode - Parzen - 1962
187   Nonlinear component analysis as a kernel eigenvalue problem - Schölkopf, Smola et al. - 1998
88   Statistical mechanics and phase transitions in clustering - Rose, Gurewitz et al. - 1990
67   LVQ PAK: A program package for the correct application of Le.. - Kohonen, Kangas et al. - 1996
59   Entropy optimization principles with applications - Kapur, Kesavan - 1992
52   Using mutual information for selecting features in supervise.. - Battiti - 1994
44   SVMTorch: Support Vector Machines for Large-Scale Regression.. - Collobert, Bengio - 2001
38   On measures of entropy and information - Rényi - 1961
35   Feature selection for SVMs - Weston, Mukherjee et al. - 2001
33   Experiments with random projection - Dasgupta - 2000
33   Transmission of Information: A Statistical theory of Communi.. - Fano - 1961
29   Selecting input variables using mutual information and nonpa.. - Bonnlander, Weigend - 1994
25   Mutual information in learning feature transformations - Torkkola, Campbell - 2000
20   Information theoretic learning - Principe, Xu - 2000
18   Heteroscedastic discriminant analysis and reduced rank HMMs .. - Kumar, Andreou - 1998
16   Measures of information and their applications - Kapur - 1994
11   An optimal orthonormal system for discriminant analysis - Okada, Tomita - 1985
7   Minimum Bayes error feature selection for continuous speech .. - Saon, Padmanabhan - 2001
5   Linear feature extractors based on mutual information - Bollacker, Ghosh - 1996
4   Data visualization and feature selection: New algorithms for.. - Yang, Moody - 2000
3   Lower bounds for bayes error estimation - Antos, Devroye et al. - 1999
3   Nonparametric discriminant analysis via recursive optimizati.. - Aladjem - 1998
3   Some inequalities for information divergence and related mea.. - Topsøe - 2000
2   Communications and Signal Processing Laboratory - Hero, Ma et al. - 2001
2   A common neural network model for unsupervised exploratory d.. - Girolami, Cichocki et al. - 1998
2   IEEE Transactions on Information Theory - Patrick, Nonparametric - 1969
1   Bhattacharyya distance feature selection - Guorong, Peiqi et al. - 1996
1   A non-parametric approach to linear feature extraction; Appl.. - Hillion, Masson et al. - 1988