An EM based training algorithm for recurrent neural networks
Abstract. Recurrent neural networks serve as black-box models for nonlinear dynamical systems identification and time series prediction. Training of recurrent networks typically minimizes the quadratic difference between the network output and an observed time series. This implicitly assumes that the dynamics of the underlying system are deterministic, which is not a realistic assumption in many cases. In contrast, state space models allow for noise in both the internal state transitions and the mapping from internal states to observations. Here, we consider recurrent networks as nonlinear state space models and suggest a training algorithm based on Expectation-Maximization. A nonlinear transfer function for the hidden neurons leads to an intractable inference problem. We investigate the use of a Particle Smoother to approximate the E-step and simultaneously estimate the expectations required in the M-step. The method is demonstrated on a synthetic data set and on a time series prediction task arising in radiation therapy, where the goal is to predict the motion of a lung tumor during respiration.
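The abstract above does not spell out its particle smoother, so the following is only a minimal sketch of a simpler building block: a bootstrap particle filter for a scalar nonlinear state space model with a tanh transfer function. The model form, the parameter names (`w`, `c`, `q`, `r`) and the function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def simulate(T, w, c, q, r, rng):
    """Simulate the assumed model x_t = tanh(w*x_{t-1}) + N(0,q), y_t = c*x_t + N(0,r)."""
    x = np.zeros(T)
    y = np.zeros(T)
    prev = 0.0  # assumed known initial state x_0 = 0
    for t in range(T):
        prev = np.tanh(w * prev) + rng.normal(0.0, np.sqrt(q))
        x[t] = prev
        y[t] = c * prev + rng.normal(0.0, np.sqrt(r))
    return x, y

def bootstrap_filter(y, w, c, q, r, n_particles, rng):
    """Bootstrap particle filter: propagate, weight by the likelihood, resample.

    Returns the filtered state means and a log-likelihood estimate.
    This is a forward filter only, not the smoother the abstract refers to.
    """
    particles = np.zeros(n_particles)  # all particles start at x_0 = 0
    means = np.zeros(len(y))
    loglik = 0.0
    for t, obs in enumerate(y):
        # Propagate each particle through the noisy nonlinear transition.
        noise = rng.normal(0.0, np.sqrt(q), size=n_particles)
        particles = np.tanh(w * particles) + noise
        # Gaussian observation log-likelihood as unnormalised log-weights.
        logw = -0.5 * (obs - c * particles) ** 2 / r - 0.5 * np.log(2.0 * np.pi * r)
        shift = logw.max()  # subtract the max for numerical stability
        weights = np.exp(logw - shift)
        loglik += shift + np.log(weights.mean())
        weights /= weights.sum()
        means[t] = np.sum(weights * particles)
        # Multinomial resampling according to the normalised weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return means, loglik

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_true, y_obs = simulate(200, 1.5, 1.0, 0.1, 0.1, rng)
    x_filt, ll = bootstrap_filter(y_obs, 1.5, 1.0, 0.1, 0.1, 500, rng)
    print("filter MSE:", np.mean((x_filt - x_true) ** 2), "log-likelihood estimate:", ll)
```

In an EM setting, weighted particles of this kind would be used to approximate the expectations needed in the M-step; a smoother would additionally use future observations, which this sketch omits.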
Errata for “An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards”
, 2009
In an earlier version of this paper we made the claim that (Toussaint and Storkey, 2006) incorrectly uses the inverse dynamics to obtain the backwards transition model $p(x_n \mid x_{n+1})$. While using the inverse dynamics in this manner is indeed incorrect (see Klaas et al., 2006 for more details), after discussions with the authors we have determined that (Toussaint and Storkey, 2006) do not use this inverse dynamics and compute the Unscented Transform correctly. Because of this confusion, and in order to clarify this point, we give a brief overview of their method below. Assuming a transition model of the form

$$p(x_{n+1} \mid x_n) = \mathcal{N}(x_{n+1}; \phi(x_n), Q), \quad (1)$$

where we have dropped dependence on the actions $u$, we want to compute the backwards messages

$$\beta(x_n) = \int p(x_{n+1} \mid x_n)\, \beta(x_{n+1})\, dx_{n+1}. \quad (2)$$

The key here is that we are trying to compute this integral and are not trying to approximate $p(x_n \mid x_{n+1})$. If the messages are normally distributed $\beta(x_{n+1}) =$
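To make the integral in (2) concrete, here is a minimal one-dimensional sketch. It assumes a Gaussian message $\beta(x_{n+1}) = \mathcal{N}(x_{n+1}; m, s)$, with `m` and `s` chosen only for illustration, and a hypothetical nonlinear `phi`. For any fixed $x_n$ the integral of a product of two Gaussians in $x_{n+1}$ has the closed form $\mathcal{N}(m; \phi(x_n), Q + s)$, which the code checks against plain numerical quadrature. This is not the Unscented Transform discussed in the text, only a check of the quantity being computed.

```python
import numpy as np

def gauss_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def phi(x):
    """Hypothetical nonlinear dynamics, chosen only for illustration."""
    return np.sin(x) + 0.5 * x

def backward_message_quadrature(x_n, m, s, q, grid):
    """Evaluate beta(x_n) = integral of N(x'; phi(x_n), q) N(x'; m, s) dx' by the trapezoid rule."""
    integrand = gauss_pdf(grid, phi(x_n), q) * gauss_pdf(grid, m, s)
    dx = grid[1] - grid[0]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx)

def backward_message_closed_form(x_n, m, s, q):
    """Closed form of the same integral: N(m; phi(x_n), q + s)."""
    return float(gauss_pdf(m, phi(x_n), q + s))

if __name__ == "__main__":
    grid = np.linspace(-20.0, 20.0, 40001)
    for x_n in (-2.0, 0.0, 1.3):
        numeric = backward_message_quadrature(x_n, 0.4, 0.5, 0.3, grid)
        exact = backward_message_closed_form(x_n, 0.4, 0.5, 0.3)
        print(x_n, numeric, exact)
```

Note that the result is Gaussian in $\phi(x_n)$ but generally not in $x_n$ itself when `phi` is nonlinear, which is why an approximation step is needed to keep the messages Gaussian.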
Extended Abstract Probabilistic inference for solving structured MDPs
Inference on structured domains has made considerable advances in recent years: variational inference has been proposed as a basic method to decompose the inference process (as on factored HMMs, Ghahramani & Jordan 1995), and message-passing algorithms (such as loopy belief propagation, expectation propagation, or exact inference via the Junction Tree Algorithm, Minka 2001) can efficiently handle a very broad range of structured models. Extensions of particle filters allow for more versatile belief representations in continuous domains (Klaas et al. 2006), and interesting techniques for efficient inference in relational models are currently being developed (Chavira et al. 2006). To scale up to more realistic scenarios, planning (or model-based Reinforcement Learning) in stochastic environments equally needs to cope with structured descriptions of the environment, e.g., factored, hierarchical, and mixed (continuous/discrete) state representations.
Keywords Sequential Monte Carlo · Two-filter smoothing · State–space models ·
Abstract. Two-filter smoothing is a principled approach for performing optimal smoothing in non-linear non-Gaussian state–space models where the smoothing distributions are computed through the combination of ‘forward’ and ‘backward’ time filters. The ‘forward’ filter is the standard Bayesian filter but the ‘backward’ filter, generally referred to as the backward information filter, is not a probability measure on the space of the hidden Markov process. In cases where the backward information filter can be computed in closed form, this technical point is not important. However, for general state–space models where there is no closed form expression, this prohibits the use of flexible numerical techniques such as Sequential Monte Carlo (SMC) to approximate the two-filter smoothing formula. We propose here a generalised two-filter smoothing formula which only requires approximating probability distributions and applies to any state–space model, removing the need to make restrictive assumptions used in previous approaches to this problem. SMC algorithms are developed to implement this generalised recursion and we illustrate their performance on various problems.
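The abstract does not state its formulas, so the following is a sketch in assumed notation: $f$ is the transition density, $g$ the observation density, and $\gamma_t$ an artificial distribution over $x_t$ (all three symbols are introduced here for illustration). The first two lines give the standard two-filter identity and the backward information filter recursion; the last two show one way to turn the backward quantity into a proper probability distribution, which appears to be the kind of construction the abstract describes, assuming $\int \gamma_t(x_t)\, p(y_{t:T} \mid x_t)\, dx_t$ is finite.

```latex
\begin{align}
% Standard two-filter smoothing identity
p(x_t \mid y_{1:T}) &\propto p(x_t \mid y_{1:t-1})\, p(y_{t:T} \mid x_t), \\
% Backward information filter recursion (not a density in x_t in general)
p(y_{t:T} \mid x_t) &= g(y_t \mid x_t) \int f(x_{t+1} \mid x_t)\, p(y_{t+1:T} \mid x_{t+1})\, dx_{t+1}, \\
% Auxiliary probability distribution built from an artificial prior gamma_t
\tilde{p}(x_t \mid y_{t:T}) &\propto \gamma_t(x_t)\, p(y_{t:T} \mid x_t), \\
% Smoothing distribution recovered by dividing the artificial prior back out
p(x_t \mid y_{1:T}) &\propto \frac{p(x_t \mid y_{1:t-1})\, \tilde{p}(x_t \mid y_{t:T})}{\gamma_t(x_t)}.
\end{align}
```

Because $\tilde{p}(x_t \mid y_{t:T})$ is a probability distribution in $x_t$, it can be targeted by a particle approximation, whereas $p(y_{t:T} \mid x_t)$ on its own may not be integrable in $x_t$.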
Technical Report Hard Wall Stochastic Control based on Hallucination-EM and Power-EP
We study stochastic control problems in the presence of hard wall constraints. Walls are incorporated in the dynamics of the agent by restricting its domain and hence perturbing the noise process close to the walls. A novel penalty term is introduced for bouncing off a wall. To efficiently search for a good policy we propose the “hallucination expectation maximization” algorithm which iteratively maps the problem onto a non-Gaussian dynamical system. Hallucination weights anaesthetize the agent to render its local decisions optimal for the global planning problem. The E-step of HEM is solved using power-EP.
Online Empirical Evaluation of Tracking Algorithms
, 2009
Evaluation of tracking algorithms in the absence of ground truth is a challenging problem. There exist a variety of approaches for this problem, ranging from formal model validation techniques to heuristics that look for mismatches between track properties and the observed data. However, few of these methods scale up to the task of visual tracking where the models are usually non-linear and complex, and typically lie in a high dimensional space. Further, scenarios that cause track failures and/or poor tracking performance are also quite diverse for the visual tracking problem. In this paper, we propose an online performance evaluation strategy for tracking systems based on particle filters using a time-reversed Markov chain. The key intuition of our proposed methodology relies on the time-reversible nature of physical motion exhibited by most objects, which in turn should be possessed by a good tracker. In the presence of tracking failures due to occlusion, low SNR or modeling errors, this reversible nature of the tracker is violated. We use this property for detection of track failures. To evaluate the performance of the tracker at time instant t, we use the posterior of the tracking algorithm to initialize a time-reversed Markov chain. We compute the posterior density of track parameters at the starting time t = 0 by filtering back in time to the initial time instant. The distance between the

