  Agnostic classification of Markovian sequences (1997)

by Ran El-Yaniv, Shai Fine, Naftali Tishby
In Advances in Neural Information Processing (NIPS-97)
http://www.cs.technion.ac.il/~rani/papers/agnostic-clustering.ps

Abstract:

Classification of finite sequences without explicit knowledge of their statistical nature is a fundamental problem with many important applications. We propose a new information-theoretic approach to this problem, based on the following ingredients: (i) sequences are similar when they are likely to be generated by the same source; (ii) cross entropies can be estimated via "universal compression"; (iii) Markovian sequences can be optimally merged. With these ingredients we can classify discrete sequences whenever they can be compressed. We introduce the method and illustrate its application to hierarchical clustering of languages and to estimating similarities of protein sequences.
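
To make ingredient (ii) of the abstract concrete, here is a minimal Python sketch of scoring sequence similarity with an off-the-shelf compressor. It is an illustration under stated assumptions, not the authors' algorithm. zlib stands in for a universal compressor, the function names `compressed_bits`, `cross_cost_bits`, and `dissimilarity` are hypothetical, and the merging of Markovian sequences from ingredient (iii) is not reproduced.

```python
import zlib


def compressed_bits(data: bytes) -> int:
    """Size in bits of the zlib-compressed form of `data` (level 9)."""
    return 8 * len(zlib.compress(data, 9))


def cross_cost_bits(x: bytes, y: bytes) -> int:
    """Extra bits needed to compress x once the compressor has seen y.

    Assumption: this is only a crude proxy for a cross-entropy estimate.
    A small value suggests that the statistics of y explain x well.
    """
    return compressed_bits(y + x) - compressed_bits(y)


def dissimilarity(x: bytes, y: bytes) -> float:
    """Symmetrized extra cost in bits per input byte (lower = more similar)."""
    total = cross_cost_bits(x, y) + cross_cost_bits(y, x)
    return total / (len(x) + len(y))


if __name__ == "__main__":
    # Two English-like sequences and one DNA-like sequence (made-up data).
    english_a = b"the quick brown fox jumps over the lazy dog " * 20
    english_b = b"the lazy dog jumps over the quick brown fox " * 20
    dna_like = b"ACGTTGCAAGCTTACGGATCCGATTAGCCTAGGCTAACGTTAGC" * 20

    print("english_a vs english_b:", dissimilarity(english_a, english_b))
    print("english_a vs dna_like: ", dissimilarity(english_a, dna_like))
```

Pairwise scores of this kind could feed a standard hierarchical clustering routine. Note that zlib matches only within a 32 KB window, so the proxy degrades on longer inputs; a compressor with a larger context would be needed there.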
