Stochastic policy gradient reinforcement learning on a simple 3D biped (2004)

by R Tedrake, T W Zhang
Venue: In Intelligent Robots and Systems, IEEE/RSJ International Conference on