Results 1 - 10 of 38
Multiagent Planning with Factored MDPs
- In NIPS-14, 2001
"... We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication between the agents is not imposed, but derived directly from the system dynamics and function approximation architecture ..."
Cited by 176 (15 self)
Abstract:
We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication between the agents are not imposed, but derived directly from the system dynamics and function approximation architecture. We view the entire multiagent system as a single, large Markov decision process (MDP), which we assume can be represented in a factored way using a dynamic Bayesian network (DBN). The action space of the resulting MDP is the joint action space of the entire set of agents. Our approach is based on the use of factored linear value functions as an approximation to the joint value function. This factorization of the value function allows the agents to coordinate their actions at runtime using a natural message passing scheme. We provide a simple and efficient method for computing such an approximate value function by solving a single linear program, whose size is determined by the interaction between the value function structure and the DBN. We thereby avoid the exponential blowup in the state and action space. We show that our approach compares favorably with approaches based on reward sharing. We also show that our algorithm is an efficient alternative to more complicated algorithms even in the single agent case.
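The coordination step the abstract alludes to can be pictured with a small sketch (mine, not the authors' code): local payoff tables defined over small subsets of agents are combined by variable elimination, so the best joint value is found without enumerating the exponential joint action space. The function name, the table representation, and the fixed elimination order are illustrative assumptions.

from itertools import product

def max_joint_value(agents, payoffs, actions):
    """Variable elimination over a coordination graph (illustrative sketch).
    payoffs: list of (scope, table); scope is a tuple of agent names and
    table maps a tuple of those agents' actions (in scope order) to a value.
    Returns the best achievable sum of local payoffs over all joint actions."""
    payoffs = list(payoffs)
    for agent in agents:                               # assumed elimination order
        involved = [p for p in payoffs if agent in p[0]]
        payoffs = [p for p in payoffs if agent not in p[0]]
        scope = tuple(sorted({a for s, _ in involved for a in s} - {agent}))
        table = {}
        for rest in product(actions, repeat=len(scope)):
            ctx = dict(zip(scope, rest))
            # "Message" to the remaining agents: best response of this agent.
            table[rest] = max(
                sum(t[tuple(ai if a == agent else ctx[a] for a in s)]
                    for s, t in involved)
                for ai in actions
            )
        payoffs.append((scope, table))
    return sum(t[()] for _, t in payoffs)              # only constants remain

# Two pairwise payoffs: q12 rewards agreement, q23 rewards disagreement.
payoffs = [
    (("a1", "a2"), {(x, y): float(x == y) for x in (0, 1) for y in (0, 1)}),
    (("a2", "a3"), {(x, y): float(x != y) for x in (0, 1) for y in (0, 1)}),
]
print(max_joint_value(["a1", "a2", "a3"], payoffs, (0, 1)))  # prints 2.0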
An introduction to collective intelligence
- Handbook of Agent Technology, AAAI, 1999
"... ..."
(Show Context)
Coordinated Reinforcement Learning
- In Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002), 2002
"... We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value function. This structure is leveraged in an approach we call coordinated reinforcement learning, by which agents coordinate ..."
Cited by 113 (6 self)
Abstract:
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value function. This structure is leveraged in an approach we call coordinated reinforcement learning, by which agents coordinate both their action selection activities and their parameter updates. Within the limits of our parametric representations, the agents will determine a jointly optimal action without explicitly considering every possible action in their exponentially large joint action space. Our methods differ from many previous reinforcement learning approaches to multiagent coordination in that structured communication and coordination between agents appear at the core of both the learning algorithm and the execution architecture. Our experimental results, comparing our approach to other RL methods, illustrate both the quality of the policies obtained and the additional benefits of coordination.
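As a rough illustration of the coordinated parameter updates the abstract describes (a sketch under my own assumptions, not the paper's algorithm): the joint value is a sum of local components with small action scopes, a single TD error is computed from the shared reward, and each component absorbs the same correction locally. The component scopes, the use of the next coordinated joint action (SARSA-style), the global state key, and the hyperparameters are all assumptions.

def coordinated_td_update(components, state, joint_action, reward,
                          next_state, next_joint_action,
                          alpha=0.1, gamma=0.95):
    """components: list of (scope, table); scope is a tuple of agent indices,
    and table maps (state, local_actions) -> value, where local_actions is
    the joint action restricted to the scope. The global state is used here
    for simplicity; the paper's structured representations are richer."""
    def total(s, a):
        return sum(t.get((s, tuple(a[i] for i in scope)), 0.0)
                   for scope, t in components)

    # One shared TD error computed from the team reward ...
    td_error = (reward + gamma * total(next_state, next_joint_action)
                - total(state, joint_action))
    # ... drives a purely local update of every component.
    for scope, t in components:
        key = (state, tuple(joint_action[i] for i in scope))
        t[key] = t.get(key, 0.0) + alpha * td_error
    return td_error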
Solving transition independent decentralized Markov decision processes
- JAIR, 2004
"... Formal treatment of collaborative multi-agent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of thes ..."
Cited by 107 (13 self)
Abstract:
Formal treatment of collaborative multi-agent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of these models remains a serious obstacle. To overcome this complexity barrier, we identify a specific class of decentralized MDPs in which the agents' transitions are independent. The class consists of independent collaborating agents that are tied together through a structured global reward function that depends on all of their histories of states and actions. We present a novel algorithm for solving this class of problems and examine its properties, both as an optimal algorithm and as an anytime algorithm. To the best of our knowledge, this is the first algorithm to optimally solve a non-trivial subclass of decentralized MDPs. It lays the foundation for further work in this area on both exact and approximate algorithms.
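Read as a formula, the independence assumption described in the abstract amounts to the joint transition model factoring into per-agent terms while the reward stays coupled; the notation below is mine, not the paper's (agent i has local state s_i, action a_i, and history h_i):

% Notation assumed for illustration, not taken from the paper.
\[
  P\bigl(s'_1,\dots,s'_n \mid s_1,\dots,s_n,\; a_1,\dots,a_n\bigr)
    \;=\; \prod_{i=1}^{n} P_i\bigl(s'_i \mid s_i, a_i\bigr),
  \qquad
  R \;=\; R\bigl(h_1,\dots,h_n\bigr),
\]

so the dynamics decompose per agent, but the structured global reward over the agents' state-action histories is what ties the problem together.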
Decentralized control of cooperative systems: Categorization and complexity analysis
- Journal of Artificial Intelligence Research, 2004
"... Decentralized control of cooperative systems captures the operation of a group of decision-makers that share a single global objective. The difficulty in solving optimally such problems arises when the agents lack full observability of the global state of the system when they operate. The general pr ..."
Cited by 89 (9 self)
Abstract:
Decentralized control of cooperative systems captures the operation of a group of decision-makers that share a single global objective. The difficulty in solving such problems optimally arises when the agents lack full observability of the global state of the system while they operate. The general problem has been shown to be NEXP-complete. In this paper, we identify classes of decentralized control problems whose complexity ranges between NEXP and P. In particular, we study problems characterized by independent transitions, independent observations, and goal-oriented objective functions. Two algorithms are shown to solve useful classes of goal-oriented decentralized processes optimally in polynomial time. This paper also studies information sharing among the decision-makers, which can improve their performance. We distinguish between three ways in which agents can exchange information: indirect communication, direct communication, and sharing state features that are not controlled by the agents. Our analysis shows that for every class of problems we consider, introducing direct or indirect communication does not change the worst-case complexity. The results provide a better understanding of the complexity of decentralized control problems that arise in practice and facilitate the development of planning algorithms for these problems.
A survey of collectives
- In Collectives and the Design of Complex Systems, 2004
"... Due to the increasing sophistication and miniaturization of computational components, complex, distributed systems of interacting agents are becoming ubiquitous. Such systems, where each agent aims to optimize its own performance, but where there is a welldefined set of system-level performance cr ..."
Cited by 28 (12 self)
Abstract:
Due to the increasing sophistication and miniaturization of computational components, complex, distributed systems of interacting agents are becoming ubiquitous. Such systems, where each agent aims to optimize its own performance, but where there is a well-defined set of system-level performance criteria, are called collectives. The fundamental problem in analyzing/designing such systems is in determining how the combined actions of a large number of agents lead to “coordinated” behavior on the global scale. Examples of artificial systems which exhibit such behavior include packet routing across a data network, control of an array of communication satellites, coordination of multiple rovers, and dynamic job scheduling across a distributed computer grid. Examples of natural systems include ecosystems, economies, and the organelles within a living cell. No current scientific discipline provides a thorough understanding of the relation between the structure of collectives and how well they meet their overall performance criteria. Although still very young, research on collectives has resulted in successes both in understanding and designing such systems. It is expected that as it matures and draws upon other disciplines related to collectives, this field will greatly expand the range of computationally addressable tasks. Moreover, in addition to drawing on them, such a fully developed field of collective intelligence may provide insight into already established scientific fields, such as mechanism design, economics, game theory, and population biology. This chapter provides a survey of the emerging science of collectives.
Wireless sensor networks for commercial lighting control: Decision making with multi-agent systems
- In AAAI Workshop on Sensor Networks, 2004
"... The application of wireless sensor networks to commercial lighting control provides a practical application that can benefit directly from artificial intelligence techniques. This application requires decision making in the face of ..."
Cited by 24 (1 self)
Abstract:
The application of wireless sensor networks to commercial lighting control provides a practical application that can benefit directly from artificial intelligence techniques. This application requires decision making in the face of ...
Multi-agent systems by incremental gradient reinforcement learning
- In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), 2001
"... A new reinforcement learning (RL) methodology is proposed to design multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordinated system is of crucial importance. We use simple reactive agents which learn their own behavior i ..."
Cited by 21 (8 self)
Abstract:
A new reinforcement learning (RL) methodology is proposed to design multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordinated system is of crucial importance. We use simple reactive agents which learn their own behavior in a decentralized way. To cope with the difficulties inherent in applying RL in this framework, we have developed an incremental learning algorithm where agents face more and more complex tasks. We illustrate this general framework on a computer experiment where agents have to coordinate to reach a global goal.
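A loose sketch of the incremental schedule the abstract describes (the task generator, the per-episode trainer, and the promotion criterion are placeholders of mine, not the authors' setup): agents keep their learned parameters while the task is made progressively harder, and a level is repeated until performance clears a threshold.

def incremental_training(agents, make_task, levels, train_episode,
                         episodes_per_level=500, success_threshold=0.9):
    """Train the (decentralized) learners on tasks of increasing difficulty,
    carrying their learned parameters from one level to the next."""
    for level in range(1, levels + 1):
        task = make_task(level)                      # harder task at each level
        while True:
            outcomes = [train_episode(agents, task)  # 1 if the goal was reached
                        for _ in range(episodes_per_level)]
            if sum(outcomes) / episodes_per_level >= success_threshold:
                break                                # promote to the next level
    return agents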
Adaptivity in agent-based routing for data networks
- In Proceedings of the Fourth International Conference on Autonomous Agents (Agents 2000), 2000
"... Adaptivity, both of the individual agents and of the interaction structure among the agents, seems indispensable for scaling up multi-agent systems (MAS’s) in noisy environments. One important consideration in designing adaptive agents is choosing their action spaces to be as amenable as possible to ..."
Cited by 15 (5 self)
Abstract:
Adaptivity, both of the individual agents and of the interaction structure among the agents, seems indispensable for scaling up multi-agent systems (MAS’s) in noisy environments. One important consideration in designing adaptive agents is choosing their action spaces to be as amenable as possible to machine learning techniques, especially to reinforcement learning (RL) techniques [22]. One important way to have the interaction structure connecting agents itself be adaptive is to have the intentions and/or actions of the agents be in the input spaces of the other agents, much as in Stackelberg games [2, 16, 15, 18]. We consider both kinds of adaptivity in the design of a MAS to control network packet routing [21, 6, 17, 12]. We demonstrate on the OPNET ...
Adaptive, distributed control of constrained multi-agent systems
- In Proceedings of AAMAS-04, 2004
"... Product Distribution (PD) theory was recently developed as a broad framework for analyzing and optimizing distributed systems. Here we demonstrate its use for adaptive distributed control of Multi-Agent Systems (MAS’s), i.e., for distributed stochastic optimization using MAS’s. First we review one m ..."
Cited by 15 (11 self)
Abstract:
Product Distribution (PD) theory was recently developed as a broad framework for analyzing and optimizing distributed systems. Here we demonstrate its use for adaptive distributed control of Multi-Agent Systems (MAS’s), i.e., for distributed stochastic optimization using MAS’s. First we review one motivation of PD theory, as the information-theoretic extension of conventional full-rationality game theory to the case of bounded rational agents. In this extension the equilibrium of the game is the optimizer of a Lagrangian of the (probability distribution of) the joint state of the agents. When the game in question is a team game with constraints, that equilibrium optimizes the expected value of the team game utility, subject to those constraints. One common way to find that equilibrium is to have each agent run a Reinforcement Learning (RL) algorithm. PD theory reveals this to be a particular type of search algorithm for minimizing the Lagrangian. Typically that algorithm is quite inefficient. A more principled alternative is to use a variant of Newton’s method to minimize the Lagrangian. Here we compare this alternative to RL-based search in three sets of computer experiments. These are the N-Queens problem and bin-packing problem from the optimization literature, and the Bar problem from the distributed RL literature. Our results confirm that the PD-theory-based approach outperforms the RL-based scheme in all three domains.
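For concreteness, one common way to write the maxent Lagrangian that PD theory works with, reconstructed from the abstract's description rather than from the paper itself (G is the shared team cost, T a temperature controlling the agents' rationality, and q is constrained to a product of per-agent distributions):

% Symbols q_i, G, T are my notation for illustration, not the paper's.
\[
  q(x) \;=\; \prod_{i} q_i(x_i),
  \qquad
  \mathcal{L}(q) \;=\; \mathbb{E}_{q}\!\bigl[\,G(x)\,\bigr] \;-\; T\,S(q),
  \qquad
  S(q) \;=\; -\sum_{x} q(x)\,\ln q(x).
\]

Minimizing \(\mathcal{L}\) over product distributions gives the bounded-rational equilibrium the abstract refers to; the RL-based and Newton-style searches it compares are two ways of carrying out that minimization.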