Online regret bounds for Markov decision processes with deterministic transitions (2010)

by Ronald Ortner
Venue: ISSN 0304-3975. doi: 10.1016/j.tcs.2010.04.005. URL http://linkinghub.elsevier.com/retrieve/pii/S0304397510002008