Infinite-horizon gradient-based policy search (2001)

by J Baxter, P L Bartlett
Venue: Journal of Artificial Intelligence Research