No-regret learning in convex games
2007
Cited by 13 (3 self)
Quite a bit is known about minimizing different kinds of regret in experts problems, and how these regret types relate to types of equilibria in the multiagent setting of repeated matrix games. Much less is known about the possible kinds of regret in online convex programming problems (OCPs), or about equilibria in the analogous multiagent setting of repeated convex games. This gap is unfortunate, since convex games are much more expressive than matrix games, and since many important machine learning problems can be expressed as OCPs. In this paper, we work to close this gap: we analyze a spectrum of regret types which lie between external and swap regret, along with their corresponding equilibria, which lie between coarse correlated and correlated equilibrium. We also analyze algorithms for minimizing these regret types. As examples of our framework, we derive algorithms for learning correlated equilibria in polyhedral convex games and extensive-form correlated equilibria in extensive-form games. The former is exponentially more efficient than previous algorithms, and the latter is the first of its type.
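As a rough illustration of external regret in an online convex programming problem, the sketch below runs projected online gradient descent against linear losses on the Euclidean unit ball. This is a standard textbook algorithm chosen for illustration, not the method of the paper above; the function names, the choice of domain, and the step size are all assumptions.

```python
import math


def project_to_unit_ball(x):
    """Euclidean projection onto the unit ball (the assumed feasible set)."""
    norm = math.sqrt(sum(v * v for v in x))
    return x if norm <= 1.0 else [v / norm for v in x]


def ogd_external_regret(gradients, eta):
    """Play projected online gradient descent against linear losses
    f_t(x) = g_t . x and return the external regret, i.e. the learner's
    cumulative loss minus that of the best fixed point in hindsight."""
    dim = len(gradients[0])
    x = [0.0] * dim  # start at the center of the ball
    learner_loss = 0.0
    grad_sum = [0.0] * dim
    for g in gradients:
        # Suffer the loss at the current point, then take a gradient step.
        learner_loss += sum(gi * xi for gi, xi in zip(g, x))
        grad_sum = [s + gi for s, gi in zip(grad_sum, g)]
        x = project_to_unit_ball([xi - eta * gi for xi, gi in zip(x, g)])
    # For linear losses on the unit ball, the best fixed point is
    # -grad_sum / ||grad_sum||, with cumulative loss -||grad_sum||.
    best_fixed_loss = -math.sqrt(sum(s * s for s in grad_sum))
    return learner_loss - best_fixed_loss
```

With gradient norms at most 1 and step size `eta = 1 / sqrt(T)`, the usual analysis of this algorithm bounds the external regret by `sqrt(T)` after `T` rounds, so the average regret per round goes to zero.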
Global Nash convergence of Foster and Young’s regret testing
Games and Economic Behavior, 2007
Cited by 12 (0 self)
We construct an uncoupled randomized strategy of repeated play such that, if every player plays according to it, mixed action profiles converge almost surely to a Nash equilibrium of the stage game. The strategy requires very little in terms of information about the game, as players’ actions are based only on their own past payoffs. Moreover, in a variant of the procedure, players need not know that there are other players in the game and that payoffs are determined through other players’ actions. The procedure works for finite generic games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young [12].
Keywords: Regret testing; Regret-based learning; Random search; Stochastic dynamics; Uncoupled dynamics; Global convergence to
The communication complexity of uncoupled Nash equilibrium procedures
Games and Economic Behavior, 2006
Cited by 11 (1 self)
We study the question of how long it takes players to reach a Nash equilibrium in uncoupled setups, where each player initially knows only his own payoff function. We derive lower bounds on the communication complexity of reaching a Nash equilibrium, i.e., on the number of bits that need to be transmitted, and thus also on the required number of steps. Specifically, we show lower bounds that are exponential in the number of players in each one of the following cases: (1) reaching a pure Nash equilibrium; (2) reaching a pure Nash equilibrium in a Bayesian setting; and (3) reaching a mixed Nash equilibrium. We then show that, in contrast, the communication complexity of reaching a correlated equilibrium is polynomial in the number of players.
No-Regret Learning and a Mechanism for Distributed Multiagent Planning
2008
Cited by 7 (5 self)
We develop a novel mechanism for coordinated, distributed multiagent planning. We consider problems stated as a collection of single-agent planning problems coupled by common soft constraints on resource consumption. (Resources may be real or fictitious, the latter introduced as a tool for factoring the problem.) A key idea is to recast the distributed planning problem as learning in a repeated game between the original agents and a newly introduced group of adversarial agents who influence prices for the resources. The adversarial agents benefit from arbitrage: that is, their incentive is to uncover violations of the resource usage constraints and, by selfishly adjusting prices, encourage the original agents to avoid plans that cause such violations. If all agents employ no-regret learning algorithms in the course of this repeated interaction, we are able to show that our mechanism can achieve design goals such as social optimality (efficiency), budget balance, and Nash-equilibrium convergence to within an error which approaches zero as the agents gain experience. In particular, the agents’ average plans converge to a socially optimal solution for the original planning task. We present experiments in a simulated network routing domain demonstrating our method’s ability to reliably generate sound plans.
Characterization and Computation of Correlated Equilibria in Infinite Games
2007
Cited by 3 (2 self)
Motivated by recent work on computing Nash equilibria in two-player zero-sum games with polynomial payoffs by semidefinite programming and in arbitrary polynomial-like games by discretization techniques, we consider the problems of characterizing and computing correlated equilibria in games with infinite strategy sets. We prove several characterizations of correlated equilibria in continuous games which are more analytically tractable than the standard definition and may be of independent interest. Then we use these to construct algorithms for approximating correlated equilibria of polynomial games with arbitrary accuracy, including a sequence of semidefinite programming relaxation algorithms and discretization algorithms.
Correlated Equilibria in Continuous Games: Characterization and Computation
2008
Cited by 2 (1 self)
We present several new characterizations of correlated equilibria in games with continuous utility functions. These have the advantage of being more computationally and analytically tractable than the standard definition in terms of departure functions. We use these characterizations to construct effective algorithms for approximating a single correlated equilibrium or the entire set of correlated equilibria of a game with polynomial utility functions. We then exhibit the rich structure of the set of correlated equilibria by analyzing the simplest of polynomial games, the mixed extension of matching pennies. We show that while the correlated equilibrium set is convex, the structure of its extreme points can be quite complicated. In finite games there can be a superexponential separation between the number of extreme Nash and extreme correlated equilibria. In polynomial games there can exist extreme correlated equilibria which are not finitely supported; we construct a large family of examples using techniques from ergodic theory. These examples show that in general the set of correlated equilibrium distributions of a polynomial game cannot be described by conditions on finitely many joint moments, in marked contrast to the set of Nash equilibria which is always expressible in terms of finitely many moments.
Hybrid Stochastic-Adversarial On-line Learning
Cited by 1 (0 self)
Most of the research in online learning has focused either on the problem of adversarial classification (i.e., both inputs and labels are arbitrarily chosen by an adversary) or on the traditional supervised learning problem in which samples are independently generated from a fixed probability distribution. Nonetheless, in a number of domains the relationship between inputs and labels may be adversarial, whereas input instances are generated according to a constant distribution. This scenario can be formalized as a hybrid classification problem in which inputs are stochastic, while labels are adversarial. In this paper, we introduce this hybrid stochastic-adversarial classification problem, we propose an online learning algorithm for its solution, and we analyze its performance. In particular, we show that, given a hypothesis space H with finite VC dimension, it is possible to incrementally build a suitable finite set of hypotheses that can be used as input for an exponentially weighted forecaster achieving a cumulative regret of order $O(\sqrt{n\,\mathrm{VC}(H)\log n})$ with overwhelming probability. Finally, we discuss extensions to multi-label classification, learning from experts and bandit settings with stochastic side information, and application to games.
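The exponentially weighted forecaster mentioned in this abstract can be sketched as follows. This is a minimal version over a finite set of experts that is assumed to be given in advance; it does not reproduce the paper's incremental construction of the hypothesis set. The function names and the learning rate `eta` are hypothetical, and the bound computed by `regret_bound` is the standard one for losses in [0, 1], not the bound stated in the abstract.

```python
import math


def exponentially_weighted_forecaster(loss_rounds, eta):
    """Run the exponentially weighted forecaster over a finite expert set.

    loss_rounds: list of rounds, each a list with one loss in [0, 1] per expert.
    eta: learning rate (an assumed tuning parameter).
    Returns (forecaster's cumulative expected loss, best expert's cumulative loss).
    """
    n_experts = len(loss_rounds[0])
    cumulative = [0.0] * n_experts
    forecaster_loss = 0.0
    for losses in loss_rounds:
        # Weight each expert by exp(-eta * past loss); shifting by the
        # minimum leaves the probabilities unchanged and avoids underflow.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of sampling an expert from the weight distribution.
        forecaster_loss += sum(p * l for p, l in zip(probs, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return forecaster_loss, min(cumulative)


def regret_bound(n_rounds, n_experts, eta):
    """Standard regret bound ln(N)/eta + eta*n/8 for losses in [0, 1]."""
    return math.log(n_experts) / eta + eta * n_rounds / 8.0
```

Choosing `eta = sqrt(8 ln(N) / n)` balances the two terms of the standard bound and gives regret of order `sqrt(n ln N)` for `N` experts over `n` rounds.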
A Practical No-Linear-Regret Algorithm for Convex Games
For convex games, connections between playing by no-regret algorithms and playing equilibrium strategies have previously been made for Φ-regret, a generalization of external regret [5]. In particular, Gordon et al. present a no-Φ-regret algorithm for several different classes of transformations Φ [4]. In this paper, we instantiate the algorithm for the class of linear transformations using a variety of optimization techniques and give experimental results on several games including Indian poker, a simple but substantially large-scale variant of poker. Our results show that both no-external-regret and no-linear-regret algorithms can achieve better regret performance than what the current theory guarantees. To the best of our knowledge, this is the first work empirically demonstrating the benefits of a no-Φ-regret algorithm for general convex games where Φ is stronger than external.
How Bad are Selfish Investments in Network Security?
Internet security depends not only on the security-related investments of individual users, but also on how these users affect each other. In a non-cooperative environment, each user chooses a level of investment to minimize his own security risk plus the cost of investment. Not surprisingly, this selfish behavior often results in undesirable security degradation of the overall system. In this paper, (1) we first characterize the price of anarchy (POA) of network security under two models: an “Effective-investment” model, and a “Bad-traffic” model. We give insight on how the POA depends on the network topology, individual users’ cost functions, and their mutual influence. We also introduce the concept of “weighted POA” to bound the region of all feasible payoffs. (2) In a repeated game, on the other hand, users have more incentive to cooperate for their long term interests. We consider the socially best outcome that can be supported by the repeated game, and give a ratio between this outcome and the social optimum. (3) Next, we compare the benefits of improving security technology or improving incentives, and show that improving technology alone may not offset the efficiency loss due to the lack of incentives. (4) Finally, we characterize the performance of correlated equilibrium (CE) in the security game. Although the paper focuses on Internet security, many results are generally applicable to games with positive externalities.

