Policy gradient methods for reinforcement learning with function approximation. (1999)

by Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour
Venue: In NIPS