Discrete MDL Predicts in Total Variation
2009
Cited by 6 (5 self)
The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general: no independence, ergodicity, stationarity, identifiability, or other assumption on the model class needs to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance. Implications for non-i.i.d. domains like time-series forecasting, discriminative learning, and reinforcement learning are discussed.
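The two-part selection rule this abstract describes can be sketched in a few lines: pick the model minimizing codelength of the model plus codelength of the data given the model. The Bernoulli candidate grid, uniform model prior, and data sequence below are illustrative assumptions, not taken from the paper:

```python
import math

# Two-part MDL over a (here finite, illustratively) countable model class:
# total codelength = L(model) + L(data | model), measured in nats.
def mdl_select(data, models, prior_codelengths):
    """Return the Bernoulli parameter minimizing model-plus-data codelength."""
    best, best_len = None, float("inf")
    for p, l_model in zip(models, prior_codelengths):
        # -log likelihood of the binary sequence under Bernoulli(p)
        l_data = -sum(math.log(p if x == 1 else 1.0 - p) for x in data)
        if l_model + l_data < best_len:
            best, best_len = p, l_model + l_data
    return best

data = [1, 1, 0, 1, 1, 1, 0, 1]      # observed binary sequence (invented)
models = [0.25, 0.5, 0.75]           # candidate parameters (invented)
codelengths = [math.log(3)] * 3      # uniform prior over the three models
print(mdl_select(data, models, codelengths))  # -> 0.75
```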
Cooperation through communication in decentralized Markov games
in IEEE International Conference on Advances in Intelligent Systems: Theory and Applications (AISTA’2004), Kirchberg, Luxembourg, IEEE
2004
Cited by 3 (1 self)
In this paper, we present a communication-integrated reinforcement-learning algorithm for a general-sum Markov game (MG) played by independent, cooperative agents. The algorithm assumes that agents can communicate but do not know the purpose (the semantics) of doing so. We model agents that have different tasks, some of which may be commonly beneficial. The objective of the agents is to determine which tasks are commonly beneficial, and to learn a sequence of actions that achieves those common tasks. In other words, the agents play a multi-stage coordination game, of which they know neither the stage-wise payoff matrix nor the stage transition matrix. Our principal interest is in imposing realistic conditions of learning on the agents. Towards this end, we assume that they operate in a strictly imperfect monitoring setting, wherein they do not observe one another’s actions or rewards. To our knowledge, a learning algorithm for a Markov game under this stricter condition of learning has not yet been proposed. We describe this Markov game with individual reward functions as a new formalism, the decentralized Markov game or Dec-MG, borrowed from the Dec-MDP (decentralized Markov decision process). For the communicatory aspect of the learning conditions, we propose a series of communication frameworks graduated in terms of how much information exchange they facilitate amongst the agents. We present results of testing our algorithm on a toy-problem MG called a total guessing game.
Synthetic Social Construction for Autonomous Characters
in AAAI Workshop on Modular Construction of Human-Like Intelligence
2005
Cited by 3 (3 self)
Borrowing ideas from the notion of the social construction of self, this paper puts forth the idea of synthetic social construction: multiagent systems in which agents socially construct each others’ roles and behaviors via their interactions with one another. An example implementation of synthetic social construction is presented, demonstrating one use of this method to facilitate behavior akin to social learning. Synthetic social construction represents a novel approach to adaptive behavior in multiagent systems informed by human behaviors.
Learning to Negotiate Optimally in Non-Stationary Environments
Cited by 2 (0 self)
We adopt the Markov chain framework to model bilateral negotiations among agents in dynamic environments and use Bayesian learning to enable them to learn an optimal strategy in incomplete-information settings. Specifically, an agent learns the optimal strategy to play against an opponent whose strategy varies with time, assuming no prior information about its negotiation parameters. In so doing, we present a new framework for adaptive negotiation in such non-stationary environments and develop a novel learning algorithm, which is guaranteed to converge, that an agent can use to negotiate optimally over time. We have implemented our algorithm and shown that it converges quickly in a wide range of cases.
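As a rough illustration of the Bayesian-learning step described above, the sketch below maintains a posterior over hypothesized opponent types and updates it from observed offers. The two types, their likelihood functions, and the offer values are all invented for the example; the paper's actual negotiation model and parameterization differ:

```python
# Minimal Bayesian opponent-model update (illustrative sketch).
def bayes_update(prior, likelihoods, observation):
    """prior: dict type -> probability; likelihoods: dict type -> (obs -> P(obs|type))."""
    posterior = {t: prior[t] * likelihoods[t](observation) for t in prior}
    z = sum(posterior.values())
    return {t: p / z for t, p in posterior.items()}

# Two hypothesized opponent types: "tough" tends to make low offers, "soft" high ones.
likelihoods = {
    "tough": lambda offer: 0.8 if offer < 0.3 else 0.2,
    "soft":  lambda offer: 0.3 if offer < 0.3 else 0.7,
}
belief = {"tough": 0.5, "soft": 0.5}
for offer in [0.1, 0.15, 0.2]:        # three observed low offers
    belief = bayes_update(belief, likelihoods, offer)
print(belief)                          # belief concentrates on "tough"
```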
Learning to Identify Winning Coalitions in the PAC Model
Cited by 1 (1 self)
Agents situated in a real-world setting may frequently lack complete knowledge of their environment and of the capabilities of other agents. Researchers have addressed this problem by exploiting tools from learning theory. In particular, advances in reinforcement learning have yielded learning algorithms that converge to various solution concepts for stochastic games. However, few studies have attempted to tackle learning in coalition formation, and even fewer have applied Probably Approximately Correct (PAC) theory to multiagent systems. In this paper, we consider PAC learning of simple cooperative games, in which the coalitions are partitioned into “winning” and “losing” coalitions. We analyze the sample complexity of a suitable concept class by calculating its Vapnik-Chervonenkis (VC) dimension, and provide an algorithm that efficiently learns this class. Furthermore, we study constrained simple games; we demonstrate that the VC dimension can be dramatically reduced when there exists only a single minimum winning coalition (even more so when this coalition has cardinality 1), while other interesting constraints do not significantly lower the dimension.
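One way to see why a single minimum winning coalition makes the class easy to learn: every winning sample must contain that coalition, so intersecting the observed winning coalitions yields a consistent hypothesis. The sketch below illustrates this idea under that single-minimum assumption; the coalitions and labels are invented, and this is not presented as the paper's actual algorithm:

```python
# Sketch: learn a simple game that has a unique minimal winning coalition S*.
# Since every winning coalition contains S*, the intersection of winning
# samples is a consistent hypothesis for S*.
def learn_min_winning(samples):
    """samples: list of (coalition frozenset, is_winning bool). Returns a predictor."""
    hypothesis = None
    for coalition, winning in samples:
        if winning:
            hypothesis = coalition if hypothesis is None else hypothesis & coalition
    # predicted rule: a coalition wins iff it contains the hypothesized minimal set
    return lambda c: hypothesis is not None and hypothesis <= c

samples = [
    (frozenset({1, 2, 3}), True),    # invented labeled coalitions
    (frozenset({1, 3, 4}), True),
    (frozenset({2, 4}), False),
]
predict = learn_min_winning(samples)          # hypothesis becomes {1, 3}
print(predict(frozenset({1, 3, 5})))          # -> True
print(predict(frozenset({2, 3})))             # -> False
```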
Multiagent Reinforcement Learning Algorithm to Handle Beliefs of Other Agents’ Policies and Embedded Beliefs
Cited by 1 (0 self)
We have developed a new series of multiagent reinforcement learning algorithms that choose a policy based on beliefs about co-players’ policies. The algorithms are applicable to situations where the state is fully observable by the agents, but there is no limit on the number of players. Some of the algorithms employ embedded beliefs to handle cases in which co-players are also choosing a policy based on their beliefs about others’ policies. Simulation experiments on Iterated Prisoners’ Dilemma games show that the algorithms using policy-based belief converge to highly mutually cooperative behavior, unlike the existing algorithms based on action-based belief.
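The gap between action-based and policy-based belief can be seen in a toy best-response computation for the Iterated Prisoners' Dilemma: against an opponent believed to play Tit-for-Tat, always-cooperate outscores always-defect in the long run, which a one-shot action-based belief cannot express. The long-run per-round payoffs below are the standard values for these policy pairs; the policy set and belief numbers are illustrative, not the paper's algorithm:

```python
# Choosing a policy from a belief over co-players' POLICIES (not next actions).
# Long-run per-round payoffs in the Iterated Prisoners' Dilemma for each
# (my_policy, opponent_policy) pair, using the standard 3/0/5/1 payoffs.
AVG_PAYOFF = {
    ("AllC", "TFT"): 3, ("AllC", "AllD"): 0,
    ("AllD", "TFT"): 1, ("AllD", "AllD"): 1,
}

def policy_from_belief(belief):
    """belief: dict opponent_policy -> probability. Return the policy
    maximizing expected long-run payoff under that belief."""
    def expected(mine):
        return sum(p * AVG_PAYOFF[(mine, theirs)] for theirs, p in belief.items())
    return max(["AllC", "AllD"], key=expected)

print(policy_from_belief({"TFT": 0.8, "AllD": 0.2}))  # -> "AllC"
```

Note that a belief over the opponent's *next action* alone would always recommend defection, since defection dominates in the one-shot game; cooperation only emerges from the belief over policies.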
Poutré, ‘Learning from induced changes in opponent (re)actions in multiagent games’
in AAMAS’06
2006
Cited by 1 (0 self)
Multiagent learning is a growing area of research. An important topic is how an agent can learn a good policy in the face of adaptive, competitive opponents. Most research has focused on extensions of single-agent learning techniques originally designed for agents in more static environments. These techniques, however, fail to incorporate a notion of the effect of the agent’s own previous actions on the development of the policies of the other agents in the system. We argue that incorporating this property is beneficial in competitive settings. In this paper, we present a novel algorithm to capture this notion, and present experimental results to validate our claims.
A Comprehensive Survey of Multiagent Reinforcement Learning
Cited by 1 (0 self)
Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents’ learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described, along with some of the problem domains where MARL techniques have been applied. Finally, an outlook for the field is provided. Index Terms: distributed control, game theory, multiagent systems, reinforcement learning.
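Most of the algorithms this survey covers extend the single-agent Q-learning update to the multiagent setting. A minimal sketch of that shared building block, run on an invented one-state problem, might look like:

```python
import random

# Standard temporal-difference Q-learning update: move Q(s, a) toward
# r + gamma * max_a' Q(s', a'). This is the single-agent rule that MARL
# algorithms extend; the one-state, two-action environment is invented.
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

Q = {0: {"left": 0.0, "right": 0.0}}
random.seed(0)
for _ in range(200):
    a = "right" if random.random() < 0.5 else "left"
    r = 1.0 if a == "right" else 0.0     # "right" is the rewarding action
    q_update(Q, 0, a, r, 0)              # single state: s_next == s
print(Q[0]["right"] > Q[0]["left"])      # -> True
```

In the multiagent versions surveyed, the `max` over next-state values is replaced by game-theoretic operators (e.g. a minimax or equilibrium value over the joint action space), which is where the stability/adaptation trade-off discussed above enters.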