Results 1–10 of 30
Compositionality of optimal control laws
 In Advances in Neural Information Processing Systems
Cited by 21 (7 self)
We present a theory of compositionality in stochastic optimal control, showing how task-optimal controllers can be constructed from certain primitives. The primitives are themselves feedback controllers pursuing their own agendas. They are mixed in proportion to how much progress they are making towards their agendas and how compatible their agendas are with the present task. The resulting composite control law is provably optimal when the problem belongs to a certain class. This class is rather general and yet has a number of unique properties – one of which is that the Bellman equation can be made linear even for nonlinear or discrete dynamics. This gives rise to the compositionality developed here. In the special case of linear dynamics and Gaussian noise our framework yields analytical solutions (i.e. nonlinear mixtures of LQG controllers) without requiring the final cost to be quadratic. More generally, a natural set of control primitives can be constructed by applying SVD to Green’s function of the Bellman equation. We illustrate the theory in the context of human arm movements. The ideas of optimality and compositionality are both very prominent in the field of motor control, yet they have been difficult to reconcile. Our work makes this possible.
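The linear Bellman equation behind this compositionality can be sketched in a few lines. The toy problem below is our own illustration, not the paper's code: a hypothetical 5-state chain with two terminal goal states, where the desirability z = exp(-v) of each primitive solves a linear fixed-point equation, and the composite task's z is exactly the corresponding weighted sum of the primitives' z.

```python
import numpy as np

# Hypothetical toy setup (our own): a 5-state chain; states 0 and 4 terminal.
# In a linearly-solvable MDP the desirability z = exp(-v) satisfies a LINEAR
# fixed point on interior states: z(x) = exp(-q(x)) * sum_x' p(x'|x) z(x').
n = 5
P = np.zeros((n, n))                      # passive (uncontrolled) dynamics
for x in (1, 2, 3):
    P[x, x - 1] = P[x, x + 1] = 0.5       # unbiased random walk
q = 0.1 * np.ones(n)                      # running state cost (assumed)

def solve_z(qf):
    """Solve the linear Bellman equation by fixed-point iteration."""
    z = np.exp(-qf).astype(float)         # terminal desirability
    for _ in range(500):
        z_new = z.copy()
        z_new[1:4] = np.exp(-q[1:4]) * (P[1:4] @ z)
        z = z_new
    return z

# Two primitives: reach the left goal cheaply vs. the right goal.
qf_left  = np.array([0.0, 0, 0, 0, 5.0])  # final cost per state
qf_right = np.array([5.0, 0, 0, 0, 0.0])
z1, z2 = solve_z(qf_left), solve_z(qf_right)

# Composite task whose exp(-final cost) is a mixture of the primitives'.
w1, w2 = 0.3, 0.7
qf_mix = -np.log(w1 * np.exp(-qf_left) + w2 * np.exp(-qf_right))
z_mix = solve_z(qf_mix)

# Optimality of the composition: z_mix equals w1*z1 + w2*z2 exactly,
# because the Bellman equation is linear in z.
assert np.allclose(z_mix, w1 * z1 + w2 * z2)

# The optimal control law reweights passive dynamics by desirability:
# u*(x'|x) ∝ p(x'|x) * z(x').
u_star = P * z_mix                        # unnormalized
u_star[1:4] /= u_star[1:4].sum(axis=1, keepdims=True)
```

The same linearity is what makes the mixture of LQG controllers in the abstract exact rather than heuristic.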
Relative Entropy and Free Energy Dualities: Connections to Path Integral and KL control
Cited by 8 (4 self)
Abstract — This paper integrates recent work on Path Integral
Finding the Most Likely Trajectories of Optimally-Controlled Stochastic Systems
In World Congress of the International Federation of Automatic Control (IFAC)
Cited by 7 (2 self)
Abstract: Optimal trajectories of deterministic systems satisfy Pontryagin’s maximum principle and can be computed efficiently. Related results for stochastic systems exist but they lack the simplicity and computational efficiency of the deterministic case. Here we show that a certain class of both discrete-time and continuous-time nonlinear stochastic control problems obey a classic maximum principle, in the sense that the most likely trajectory of the optimally-controlled stochastic system is the solution to a deterministic optimal control problem. Apart from their theoretical significance, our results yield new numerical methods for stochastic control.
Multivariable Feedback Particle Filter
Cited by 6 (4 self)
Abstract — In recent work it is shown that importance sampling can be avoided in the particle filter through an innovation structure inspired by traditional nonlinear filtering combined with Mean-Field Game formalisms [9], [19]. The resulting feedback particle filter (FPF) offers significant variance improvements; in particular, the algorithm can be applied to systems that are not stable. The filter comes with an upfront computational cost to obtain the filter gain. This paper describes new representations and algorithms to compute the gain in the general multivariable setting. The main contributions are: (i) Theory surrounding the FPF is improved: consistency is established in the multivariate setting, as well as well-posedness of the associated PDE to obtain the filter gain. (ii) The gain can be expressed as the gradient of a function, which is precisely the solution to Poisson’s equation for a related MCMC diffusion (the Smoluchowski equation). This provides a bridge to MCMC as well as to approximate optimal filtering approaches such as TD-learning, which can in turn be used to approximate the gain. (iii) Motivated by a weak formulation of Poisson’s equation, a Galerkin finite-element algorithm is proposed for approximation of the gain. Its performance is illustrated in numerical experiments.
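As a rough illustration of the filter structure (not the paper's algorithm: this is the simplest constant-gain variant, i.e. the Galerkin solution of Poisson's equation restricted to the linear basis, on an assumed scalar linear model with unit observation noise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: dX = -X dt + dB,  dZ = h(X) dt + dW,  h(x) = x.
# Each particle is corrected by FEEDBACK through an innovation error;
# no importance weights and no resampling are needed.
N, dt, T = 500, 0.01, 500
h = lambda x: x

x_true, particles = 1.0, rng.normal(0.0, 1.0, N)

for _ in range(T):
    # simulate the true state and the observation increment
    x_true += -x_true * dt + np.sqrt(dt) * rng.normal()
    dz = h(x_true) * dt + np.sqrt(dt) * rng.normal()

    hp = h(particles)
    h_bar = hp.mean()
    # constant-gain approximation of the gain (scalar case, unit noise):
    # K = Cov(x, h), the linear-basis Galerkin solution of Poisson's eq.
    K = np.mean((particles - particles.mean()) * (hp - h_bar))
    # FPF innovation: dI_i = dZ - (h(X_i) + h_bar)/2 dt
    dI = dz - 0.5 * (hp + h_bar) * dt
    particles += -particles * dt + np.sqrt(dt) * rng.normal(size=N) + K * dI

estimate = particles.mean()   # posterior mean estimate of x_true
```

For this linear-Gaussian toy the constant gain recovers the Kalman gain; the paper's contribution concerns the genuinely multivariable, nonlinear setting where richer gain representations are required.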
Control-theoretic approach to communication with feedback.
 IEEE Transactions on Automatic Control
, 2012
Cited by 6 (3 self)
Abstract — Feedback communication is studied from a control-theoretic perspective, mapping the communication problem to a control problem in which the control signal is received through the same noisy channel as in the communication problem, and the (nonlinear and time-varying) dynamics of the system determine a subclass of encoders available at the transmitter. The MSE exponent is defined to be the exponential decay rate of the mean square decoding error and is used for analysis of the reliable rate of communication. A sufficient condition is provided under which the MMSE capacity, the supremum achievable MSE exponent, is equal to the information-theoretic capacity, the supremum achievable rate. For the special class of stationary Gaussian channels and linear time-invariant systems, a simple application of Bode's integral formula shows that the feedback capacity, recently characterized by Kim, is equal to the maximum instability that can be tolerated by any linear controller under a given power constraint. Finally, the control mapping is generalized to the N-sender AWGN multiple access channel. It is shown that Kramer's code for this channel, which is known to be sum-rate optimal in the class of generalized linear feedback codes, can be obtained by solving a linear quadratic Gaussian control problem.
A UNIFIED THEORY OF LINEARLY SOLVABLE OPTIMAL CONTROL
Krishnamurthy Dvijotham
Cited by 5 (0 self)
Abstract. We present a unified theory of Linearly Solvable Optimal Control, that is, a class of optimal control problems whose solution reduces to solving a linear equation (for finite state spaces) or a linear integral equation (for continuous state spaces). The framework presented includes all previous work on linearly solvable optimal control as special cases. It includes both standard control problems and risk-sensitive control problems. The degree of risk sensitivity is a parameter of the optimal control problem and can be tuned to achieve the desired tradeoff between performance and robustness (to noise/modeling errors). Linearly Solvable Optimal Control problems also possess a number of attractive properties that we explore in this paper. We show that it is possible to construct optimal control laws for new problems by combining the control laws of previously solved optimal control problems. This leads to analytical solutions for a class of non-LQG control problems. Another property is the existence of a path integral representation of the solution to the optimal control problem, which allows us to leverage approximate probabilistic inference techniques to compute optimal control laws. Further, we show that the Inverse Optimal Control problem, that is, the problem of inferring the cost function given trajectories sampled from the optimal control law, can be posed as a convex optimization problem and solved efficiently for problems in this class. We show that the risk-sensitive problems can also be viewed as zero-sum stochastic games, where the degree of risk averseness grows as the adversary becomes stronger. Under this interpretation, we derive a stochastic maximum principle that characterizes the most likely trajectory of the optimally controlled closed-loop system (that includes both the controller and the adversary).
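The path-integral representation mentioned above can be sketched with plain Monte Carlo. Everything in this snippet (the running cost q, final cost qf, horizon, and step size) is an assumed toy setup of our own: the desirability z(x) = exp(-v(x)) is an expectation of exp(-accumulated cost) under the passive dynamics, so simple rollouts with no inner-loop optimization estimate the value function.

```python
import numpy as np

rng = np.random.default_rng(1)

# Path-integral estimate of the desirability for a 1-D diffusion:
#   z(x0) = E[ exp( -sum_t q(x_t) dt - qf(x_T) ) | x_0 = x0 ]
# with the expectation taken over the PASSIVE (uncontrolled) dynamics.
def desirability(x0, n_rollouts=2000, T=50, dt=0.02):
    q = lambda x: 0.5 * x**2          # running state cost (assumed)
    qf = lambda x: 2.0 * x**2         # final cost (assumed; need not be quadratic)
    x = np.full(n_rollouts, x0, dtype=float)
    cost = np.zeros(n_rollouts)
    for _ in range(T):
        cost += q(x) * dt
        x += np.sqrt(dt) * rng.normal(size=n_rollouts)   # passive diffusion
    cost += qf(x)
    return np.exp(-cost).mean()

# Value function estimate v(x) = -log z(x); larger |x| should cost more.
v0 = -np.log(desirability(0.0))
v2 = -np.log(desirability(2.0))
```

This is the hook for the approximate-inference techniques the abstract refers to: estimating z is an averaging problem, not an optimization problem.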
Parallels between sensory and motor information processing
Cited by 5 (0 self)
The computational problems solved by the sensory and motor systems appear very different: one has to do with inferring the state of the world given sensory data, the other with generating motor commands appropriate for given task goals. However, recent mathematical developments summarized in this chapter show that these two problems are in many ways related. Therefore information processing in the sensory and motor systems may be more similar than previously thought – not only in terms of computations but also in terms of algorithms and neural representations. Here we explore these similarities as well as clarify some differences between the two systems.

Similarity between inference and control: an intuitive introduction

Consider a control problem where we want to achieve a certain goal at some point in time in the future – say, grasp a coffee cup within 1 sec. To achieve this goal, the motor system has to generate a sequence of muscle activations which result in joint torques which act on the musculoskeletal plant in such a way that the fingers end up curled around the cup. Actually the motor system does not have to compute the entire
Approximate inference and stochastic optimal control
, 2013
Cited by 4 (0 self)
We propose a novel reformulation of the stochastic optimal control problem as an approximate inference problem, demonstrating that such an interpretation leads to new practical methods for the original problem. In particular we characterise a novel class of iterative solutions to the stochastic optimal control problem based on a natural relaxation of the exact dual formulation. These theoretical insights are applied to the Reinforcement Learning problem, where they lead to new model-free, off-policy methods.
A DUALITY RELATIONSHIP FOR REGULAR CONDITIONAL RELATIVE ENTROPY
Cited by 3 (2 self)
Abstract: In this paper, we present a duality relationship between regular conditional free energy and regular conditional relative entropy given a sub-σ-algebra. This is achieved by using a relation between the Radon–Nikodym derivative of probability measures and that of regular conditional probability measures. Some properties of the regular conditional relative entropy under consideration are also given. The duality relation can be applied in a finite-horizon robust state estimation problem for finite-alphabet hidden Markov models. Copyright © 2005 IFAC
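For orientation, the unconditional form of this duality is the classical variational formula relating free energy and relative entropy; the paper's contribution is its regular conditional, sub-σ-algebra version, while the display below is only the standard unconditional fact. For a bounded measurable f,

```latex
\log \mathbb{E}_P\!\left[e^{f}\right]
  \;=\; \sup_{Q \ll P}\Bigl\{\, \mathbb{E}_Q[f] \;-\; D(Q \,\|\, P) \,\Bigr\},
\qquad
D(Q \,\|\, P) \;=\; \mathbb{E}_Q\!\left[\log \tfrac{dQ}{dP}\right],
```

with the supremum attained at the tilted measure \( dQ^{*}/dP = e^{f} / \mathbb{E}_P[e^{f}] \). This is the same duality that underlies the KL-control and path-integral results listed above.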
Optimal limit-cycle control recast as Bayesian inference
Cited by 3 (2 self)
Abstract: We introduce an algorithm that generates an optimal controller for stochastic nonlinear problems with a periodic solution, e.g. locomotion. Uniquely, the quantity we approximate is neither the Value nor Policy functions, but rather the stationary state-distribution of the optimally-controlled process. We recast the control problem as Bayesian inference over a graphical model with a ring topology. The posterior approximates the controlled stationary distribution with local Gaussians along the optimal limit-cycle. Linear-feedback gains and open-loop controls are extracted from the covariances and the means, respectively. Complexity scales linearly or quadratically with the state dimension, depending on the dynamics approximation. We demonstrate our algorithm on a toy 2-dimensional problem and then on a challenging 23-dimensional simulated walking robot.