A Spectral Algorithm for Learning Hidden Markov Models

by Daniel Hsu, Sham M. Kakade, Tong Zhang
Citations: 27 (3 self)

BibTeX

@MISC{Hsu_aspectral,
    author = {Daniel Hsu and Sham M. Kakade and Tong Zhang},
    title = {A Spectral Algorithm for Learning Hidden Markov Models},
    year = {}
}


Abstract

Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard; practitioners typically resort to search heuristics (such as the Baum-Welch / EM algorithm), which suffer from the usual local optima issues. We prove that under a natural separation condition (roughly analogous to those considered for learning mixture models), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations; it depends on this number implicitly, through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing, where the observation space is sometimes the set of words in a language. The algorithm is also simple: it employs only a singular value decomposition and matrix multiplications.
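To make the closing claim of the abstract concrete, here is a minimal sketch of how an observable-operator representation of an HMM can be recovered with one singular value decomposition and a few matrix products. It is an illustration under stated assumptions, not the authors' reference implementation: the function and variable names are hypothetical, the toy parameters `T`, `O`, and `pi` are invented for the example, and exact low-order moments are computed from those known parameters so the result is deterministic. In practice the moments would be estimated from observed unigram, bigram, and trigram counts.

```python
import numpy as np

def exact_moments(T, O, pi):
    """Low-order observation moments of an HMM (assumed known here for illustration).

    T[i, j] = Pr[next state i | state j], O[x, j] = Pr[obs x | state j], pi = initial dist.
    """
    n = O.shape[0]
    P1 = O @ pi                                   # P1[i]    = Pr[x1 = i]
    P21 = O @ T @ np.diag(pi) @ O.T               # P21[i,j] = Pr[x2 = i, x1 = j]
    # P3x1[x][i,j] = Pr[x3 = i, x2 = x, x1 = j]
    P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T for x in range(n)]
    return P1, P21, P3x1

def spectral_hmm(P1, P21, P3x1, m):
    """Build the observable representation (b1, binf, B_x) from the moments."""
    U = np.linalg.svd(P21)[0][:, :m]              # top-m left singular vectors
    b1 = U.T @ P1
    binf = np.linalg.pinv(P21.T @ U) @ P1
    pinv_UP21 = np.linalg.pinv(U.T @ P21)
    B = [U.T @ P3 @ pinv_UP21 for P3 in P3x1]     # one m-by-m operator per symbol
    return b1, binf, B

def spectral_joint(b1, binf, B, seq):
    """Pr[x1, ..., xt] computed as binf^T B_xt ... B_x1 b1."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def forward_joint(T, O, pi, seq):
    """Reference value from the standard forward recursion on the true parameters."""
    a = pi
    for x in seq:
        a = T @ (O[x] * a)
    return float(a.sum())

# Toy HMM (hypothetical values): 2 hidden states, 3 observation symbols.
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])
O = np.array([[0.6, 0.1],
              [0.3, 0.3],
              [0.1, 0.6]])
pi = np.array([0.5, 0.5])

P1, P21, P3x1 = exact_moments(T, O, pi)
b1, binf, B = spectral_hmm(P1, P21, P3x1, m=2)
```

With exact moments and full-rank `T` and `O`, `spectral_joint(b1, binf, B, seq)` should agree with `forward_joint(T, O, pi, seq)` up to floating-point error. With moments estimated from samples, the agreement is only approximate.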
