Results 11 - 20 of 1,730
Multiplicative Updates Outperform Generic No-Regret . . .
, 2009
"... We study the outcome of natural learning algorithms in atomic congestion games. Atomic congestion games have a wide variety of equilibria, often with vastly differing social costs. We show that in almost all such games, the well-known multiplicative-weights learning algorithm results in convergence to ..."
Abstract

Cited by 28 (8 self)
We study the outcome of natural learning algorithms in atomic congestion games. Atomic congestion games have a wide variety of equilibria, often with vastly differing social costs. We show that in almost all such games, the well-known multiplicative-weights learning algorithm results in convergence to pure equilibria. Our results show that natural learning behavior can avoid bad outcomes predicted by the price of anarchy in atomic congestion games such as the load-balancing game introduced by Koutsoupias and Papadimitriou, which has super-constant price of anarchy and has correlated equilibria that are exponentially worse than any mixed Nash equilibrium. Our results identify a set of mixed Nash equilibria that we call weakly stable equilibria. Our notion of weakly stable is defined game-theoretically, but we show that this property holds whenever a stability criterion from the theory of dynamical systems is satisfied. This allows us to show that in every congestion game, the distribution of play converges to the set of weakly stable equilibria. Pure Nash equilibria are weakly stable, and we show using techniques from algebraic geometry that the converse is true with probability 1 when congestion costs are selected at random independently on each edge (from any monotonically parametrized distribution). We further extend our results to show that players can use algorithms with different (sufficiently small) learning rates, i.e., they can trade off convergence speed and long-term average regret differently.
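For readers unfamiliar with the multiplicative-weights algorithm named in this abstract, the following is a minimal single-player sketch, not the paper's implementation. It assumes the exponential (Hedge) form of the update and costs in [0, 1]; the function name `multiplicative_weights`, the parameter `eta` (the learning rate), and the input format are hypothetical choices made for illustration.

```python
import math


def multiplicative_weights(cost_rounds, eta=0.1):
    """Run the exponential multiplicative-weights (Hedge) update.

    cost_rounds: list of rounds, each a list with one cost in [0, 1] per action
                 (assumed input format, chosen for this sketch).
    eta: learning rate (assumed small and positive).
    Returns (final_distribution, expected_cumulative_cost).
    """
    num_actions = len(cost_rounds[0])
    weights = [1.0] * num_actions  # start from uniform weights
    expected_cost = 0.0

    for costs in cost_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected cost of playing the current mixed strategy this round.
        expected_cost += sum(p * c for p, c in zip(probs, costs))
        # Shrink each action's weight exponentially in the cost it incurred.
        weights = [w * math.exp(-eta * c) for w, c in zip(weights, costs)]

    total = sum(weights)
    return [w / total for w in weights], expected_cost


# Usage: action 0 always costs 0, action 1 always costs 1.
rounds = [[0.0, 1.0] for _ in range(200)]
final_probs, total_cost = multiplicative_weights(rounds, eta=0.1)
```

In this toy run the distribution concentrates on the cheaper action, and a smaller `eta` slows that concentration, which is the trade-off between convergence speed and regret that the abstract mentions.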
No-regret algorithms for structured prediction problems
, 2005
"... No-regret algorithms are a popular class of online learning rules. Unfortunately, most no-regret algorithms assume that the set Y of allowable hypotheses is small and discrete. We consider instead prediction problems where Y has internal structure: Y might be the set of strategies in a game like pok ..."
Abstract

Cited by 15 (1 self)
No-regret algorithms are a popular class of online learning rules. Unfortunately, most no-regret algorithms assume that the set Y of allowable hypotheses is small and discrete. We consider instead prediction problems where Y has internal structure: Y might be the set of strategies in a game like
No-regret algorithms for online convex programs
 In Neural Information Processing Systems 19
, 2007
"... Online convex programming has recently emerged as a powerful primitive for designing machine learning algorithms. For example, OCP can be used for learning a linear classifier, dynamically rebalancing a binary search tree, finding the shortest path in a graph with unknown edge lengths, solving a str ..."
Abstract

Cited by 16 (2 self)
structured classification problem, or finding a good strategy in an extensive-form game. Several researchers have designed no-regret algorithms for OCP. But, compared to algorithms for special cases of OCP such as learning from expert advice, these algorithms are not very numerous or flexible. In learning
No-Regret Algorithms for Unconstrained Online Convex Optimization
"... Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is R^n. Existing algorithms fail to achieve sublinear regret in this setting unless constraints on the comparator point x̊ are known in ..."
Abstract

Cited by 4 (3 self)
Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is R^n. Existing algorithms fail to achieve sublinear regret in this setting unless constraints on the comparator point x̊ are known
No-Regret Learning in Oligopolies: Cournot vs Bertrand
 In Preparation
, 2009
"... Cournot and Bertrand oligopolies constitute the two most prevalent models of firm competition. The analysis of Nash equilibria in each model reveals a unique prediction about the stable state of the system. Quite alarmingly, despite the similarities of the two models, their projections expose a star ..."
Abstract

Cited by 7 (4 self)
in disequilibrium under minimal behavioral hypotheses. Specifically, we assume that firms adapt their strategies over time, so that in hindsight their average payoffs are not exceeded by any single deviating strategy. Given this no-regret guarantee, we show that in the case of Cournot oligopolies, the unique Nash
Computational Equivalence of Fixed Points and No-Regret Algorithms, and Convergence to Equilibria
"... We study the relation between notions of game-theoretic equilibria which are based on stability under a set of deviations, and empirical equilibria which are reached by rational players. Rational players are modeled by players using no-regret algorithms, which guarantee that their payoff in t ..."
Abstract
We study the relation between notions of game-theoretic equilibria which are based on stability under a set of deviations, and empirical equilibria which are reached by rational players. Rational players are modeled by players using no-regret algorithms, which guarantee that their payoff
Blackwell Approachability and No-Regret Learning are Equivalent
"... We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger, ..."
Abstract

Cited by 4 (2 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger
Unifying Convergence and No-regret in Multiagent Learning
"... Abstract. We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilib ..."
Abstract
dependent on time can overcome both of these assumptions. Consequently, RVσ(t) theoretically achieves (a’) convergence to near-best response against eventually stationary opponents, (b’) no-regret payoff against arbitrary opponents and (c’) convergence to some Nash equilibrium policy in some classes of games
On the convergence of no-regret learning in selfish routing
"... We study the repeated, nonatomic routing game, in which selfish players make a sequence of routing decisions. We consider a model in which players use regret-minimizing algorithms as the learning mechanism, and study the resulting dynamics. We are concerned in particular with the convergence to t ..."
Abstract
to the set of Nash equilibria of the routing game. No-regret learning algorithms are known to guarantee convergence of a subsequence of population strategies. We are concerned with convergence of the actual sequence. We show that convergence holds for a large class of online learning algorithms, inspired