We introduce a statistical model for time series data with nonlinear dynamics that iteratively segments the data into regimes with approximately linear dynamics and learns the parameters of each regime. This model combines and generalizes two of the most widely used stochastic time series models, the hidden Markov model and the linear dynamical system, and is related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network model (Jacobs et al., 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact Expectation Maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log likelihood and makes use of both the forward-backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems.
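To make the combination of a hidden Markov model with linear dynamics concrete, the following is a minimal generative sketch of a toy switching linear system. It is an illustration under stated assumptions, not the exact model of the paper: the state is scalar, a single continuous state is shared across regimes, the initial regime is drawn uniformly, and the function name `sample_switching_lds` and all parameter names are hypothetical. It covers only sampling from such a model; the variational inference described above is not shown.

```python
import random


def sample_switching_lds(T, trans, a, c, q_std, r_std, seed=0):
    """Sample T steps from a toy scalar switching linear dynamical system.

    trans : row-stochastic transition matrix over regimes (list of lists)
    a, c  : per-regime dynamics and observation coefficients
    q_std, r_std : per-regime state and observation noise standard deviations
    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    n_regimes = len(trans)
    s = rng.randrange(n_regimes)  # initial regime, uniform (an assumption)
    x = rng.gauss(0.0, 1.0)       # initial continuous state
    regimes, observations = [], []
    for _ in range(T):
        # Discrete switch evolves as a Markov chain (the HMM component).
        s = rng.choices(range(n_regimes), weights=trans[s])[0]
        # Continuous state follows the active regime's linear-Gaussian
        # dynamics (the linear dynamical system component).
        x = a[s] * x + rng.gauss(0.0, q_std[s])
        # Observation is a noisy linear function of the continuous state.
        y = c[s] * x + rng.gauss(0.0, r_std[s])
        regimes.append(s)
        observations.append(y)
    return regimes, observations


# Two regimes: a slowly varying one and a noisy, oscillating one.
regimes, ys = sample_switching_lds(
    T=200,
    trans=[[0.95, 0.05], [0.10, 0.90]],
    a=[0.99, -0.5],
    c=[1.0, 1.0],
    q_std=[0.1, 1.0],
    r_std=[0.1, 0.1],
)
```

In a learning setting only `ys` would be observed; recovering `regimes` and the per-regime parameters is the segmentation and estimation problem the abstract refers to.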
|
4387
|
Probabilistic Reasoning in Intelligent Systems
– Pearl
- 1988
|
|
4363
|
Elements of Information Theory
– Cover, Thomas
- 1991
|
|
4344
|
Maximum likelihood from incomplete data via the EM algorithm
– Dempster, Laird, et al.
- 1977
|
|
903
|
Local computations with probabilities on graphical structures and their applications to expert systems
– Lauritzen, Spiegelhalter
- 1988
|
|
593
|
Hierarchical Mixtures of Experts and the EM algorithm
– Jordan, Jacobs
- 1993
|
|
569
|
Adaptive Mixture of Local Experts
– Jacobs, Jordan, et al.
- 1991
|
|
544
|
An introduction to hidden Markov models
– Rabiner, Juang
- 1986
|
|
447
|
Contour tracking by stochastic propagation of conditional density
– Isard, Blake
- 1996
|
|
415
|
A maximization technique occurring in statistical analysis of probabilistic functions of Markov chains
– Baum, Petrie, et al.
- 1970
|
|
410
|
An introduction to variational methods for graphical models
– Jordan, Ghahramani, et al.
- 1997
|
|
410
|
A New View of the EM Algorithm that Justifies Incremental and Other Variants, in Learning in Graphical Models
– Neal, Hinton
- 1993
|
|
337
|
Optimal Filtering
– Anderson, Moore
- 1979
|
|
259
|
Theory and Practice of Recursive Identification
– Ljung, Söderström
- 1983
|
|
254
|
Factorial hidden Markov models
– Ghahramani, Jordan
- 1997
|
|
194
|
A New Approach to the Economic Analysis of Nonstationary Time
– Hamilton
- 1989
|
|
166
|
Adaptive Filtering Prediction and Control
– Goodwin, Sin
- 1984
|
|
153
|
The EM algorithm for mixtures of factor analyzers
– Ghahramani, Hinton
- 1996
|
|
152
|
On Gibbs Sampling for State Space Models
– Carter, Kohn
- 1994
|
|
146
|
Probabilistic independence networks for hidden Markov probability models
– Smyth, Heckerman, et al.
|
|
134
|
Stochastic simulation algorithms for dynamic probabilistic networks
– Kanazawa, Koller, et al.
- 1995
|
|
117
|
Modeling the manifolds of images of handwritten digits
– Hinton, Dayan, et al.
- 1996
|
|
111
|
Hidden Markov models of biological primary sequence information
– Baldi, Chauvin, et al.
- 1994
|
|
108
|
Statistical field theory
– Parisi
- 1988
|
|
108
|
An approach to time series smoothing and forecasting using the EM algorithm, Journal of Time Series Analysis
– Shumway, Stoffer
- 1982
|
|
99
|
Multitarget-Multisensor Tracking
– Bar-Shalom, Li
- 1995
|
|
94
|
An Introduction to Latent Variable Models
– Everitt
- 1984
|
|
91
|
An input/output HMM architecture
– Bengio, Frasconi
- 1996
|
|
77
|
Learning to track the visual motion of contours
– Blake, Isard, et al.
- 1995
|
|
77
|
Exploiting tractable substructures in intractable networks
– Saul, Jordan
- 1996
|
|
61
|
Dynamic Linear Models with Markov-Switching
– Kim
- 1994
|
|
60
|
Parameter estimation for linear dynamical systems
– Ghahramani, Hinton
- 1996
|
|
56
|
Mixtures of controllers for jump linear and non-linear plants
– Cacciatore, Nowlan
- 1994
|
|
51
|
ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition
– Digalakis, Rohlicek, et al.
- 1993
|
|
44
|
Hidden Markov models for speech recognition
– Juang, Rabiner
- 1991
|
|
39
|
Forecasting probability densities by using hidden Markov models with mixed states
– Fraser, Dimitriadis
- 1993
|
|
38
|
Dynamic linear models with switching
– Shumway, Stoffer
- 1991
|
|
34
|
Solutions to the linear smoothing problem
– Rauch
- 1963
|
|
31
|
Hidden Markov models for fault detection in dynamic systems
– Smyth
- 1994
|
|
29
|
Bayesian forecasting (with discussion)
– Harrison, Stevens
- 1976
|
|
25
|
On state estimation in switching environments
– Ackerson, Fu
- 1970
|
|
25
|
A new view of the EM algorithm that justifies incremental and other variants
– Neal, Hinton
- 1993
|
|
20
|
Deterministic annealing variant of the EM algorithm
– Ueda, Nakano
- 1995
|
|
17
|
New results in linear filtering and prediction
– Kalman, Bucy
- 1961
|
|
15
|
A mixture-of-experts framework for adaptive Kalman filtering
– Chaer, Bishop, et al.
- 1997
|
|
15
|
Learning fine motion by Markov mixtures of experts
– Meila, Jordan
- 1996
|
|
13
|
Multi-channel physiological data: description and analysis
– Rigney, Goldberger, et al.
- 1993
|
|
12
|
State estimation for discrete systems with switching parameters
– Chang, Athans
- 1978
|
|
10
|
Theory and Practice of Recursive Identification
– Ljung, Söderström
- 1983
|
|
8
|
Recursive estimation of dynamic modular RBF networks
– Kadirkamanathan, Kadirkamanathan
- 1995
|
|
7
|
A stochastic model of speech incorporating hierarchical nonstationarity
– Deng
- 1993
|