Finite-time analysis of the multiarmed bandit problem (2002)

by Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer
Venue: Machine Learning