On-line Policy Improvement using Monte-Carlo Search (1997)

by G Tesauro, G R Galperin
Venue: Advances in Neural Information Processing Systems