Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search. (2011)

by J Asmuth, M L Littman
Venue: In Uncertainty in Artificial Intelligence