From information-theoretic dualities to Path Integral and Kullback-Leibler control: Continuous and Discrete Time formulations
Abstract — This paper presents a unified view of stochastic optimal control theory as developed within the machine learning and control theory communities. In particular, we show the mathematical connection between recent work on Path Integral (PI) and Kullback-Leibler (KL) divergence stochastic optimal control and earlier work on risk sensitivity and the fundamental dualities between free energy and relative entropy. We discuss applications of the relationship between free energy and relative entropy to two classes of systems: nonlinear stochastic dynamics affine in noise, and nonlinear stochastic dynamics affine in both control and noise. For the latter class, we provide the PI optimal control and its iterative formulation. In addition, we connect the PI control derived via Dynamic Programming with the information-theoretic dualities. Finally, we provide links to KL stochastic optimal control and discuss generalizations and future work.