
BL-WoLF: A framework for loss-bounded learnability in zero-sum games (2003)

by Vincent Conitzer, Tuomas Sandholm
In International Conference on Machine Learning (ICML)
http://www.cs.cmu.edu/~conitzer/blwolfICML03.pdf

Abstract:

We present BL-WoLF, a framework for learnability in repeated zero-sum games where the cost of learning is measured by the losses the learning agent accrues (rather than the number of rounds). The game is adversarially chosen from some family that the learner knows. The opponent knows the game and the learner’s learning strategy. The learner tries to either not accrue losses, or to quickly learn about the game so as to avoid future losses (this is consistent with the Win or Learn Fast (WoLF) principle; BL stands for “bounded loss”). Our framework allows for both probabilistic and approximate learning. The resultant notion of BL-WoLF-learnability can be applied to any class of games, and allows us to measure the inherent disadvantage to a player that does not know which game in the class it is in. We present guaranteed BL-WoLF-learnability results for families of games with deterministic payoffs and families of games with stochastic payoffs. We demonstrate that these families are guaranteed approximately BL-WoLF-learnable with lower cost. We then demonstrate families of games (both stochastic and deterministic) that are not guaranteed BL-WoLF-learnable. We show that those families, nevertheless, are BL-WoLF-learnable. To prove these results, we use a key lemma which we derive.
