Task Decomposition, Dynamic Role Assignment, and Low-Bandwidth Communication for Real-Time Strategic Teamwork
- Artificial Intelligence, 1999
Cited by 220 (20 self)
Multi-agent domains consisting of teams of agents that need to collaborate in an adversarial environment offer challenging research opportunities. In this article, we introduce periodic team synchronization (PTS) domains as time-critical environments in which agents act autonomously with low communication, but in which they can periodically synchronize in a full-communication setting. The two main contributions of this article are a flexible team agent structure and a method for inter-agent communication in domains with unreliable, single-channel, low-bandwidth communication. First, the novel team agent structure allows agents to capture and reason about team agreements. We achieve collaboration between agents through the introduction of formations. A formation decomposes the task space, defining a set of roles. Homogeneous agents can flexibly switch roles within formations, and agents can change formations dynamically, according to pre-defined triggers to be evaluated at run-time. This flexibility increases the performance of the overall team. Our teamwork structure further includes pre-planning for frequent situations. Second, the novel communication method is designed for use during the low-communication periods in PTS domains. It overcomes the obstacles to inter-agent communication in multi-agent environments with unreliable, high-cost, low-bandwidth communication. We fully implemented both the flexible teamwork structure and the communication method in the domain of simulated robotic soccer, and conducted controlled empirical experiments to verify their effectiveness. In addition, our simulator team made it to the semi-finals of the RoboCup-97 competition, in which 29 teams participated.
Algorithms for Sequential Decision Making
- 1996
Cited by 213 (8 self)
Sequential decision making is a fundamental task faced by any intelligent agent in an extended interaction with its environment; it is the act of answering the question "What should I do now?" In this thesis, I show how to answer this question when "now" is one of a finite set of states, "do" is one of a finite set of actions, "should" is maximize a long-run measure of reward, and "I" is an automated planning or learning system (agent). In particular,
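The setting described in this abstract (a finite set of states, a finite set of actions, and a long-run reward to maximize) can be illustrated with value iteration. This is a minimal sketch of that standard technique, not necessarily the method developed in the thesis; the toy transition table `TOY_MDP`, the discount factor, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical format: transitions[state][action] -> list of (probability, next_state, reward).
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical two-state problem: moving from "start" to "goal" pays 1 once.
TOY_MDP: Transitions = {
    "start": {"stay": [(1.0, "start", 0.0)], "go": [(1.0, "goal", 1.0)]},
    "goal": {"stay": [(1.0, "goal", 0.0)]},
}


def action_value(outcomes, values, gamma):
    """Expected discounted return of one action, given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Sweep over all states until the largest value change falls below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # "What should I do now?" is answered by acting greedily on the converged values.
    policy = {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy
```

Calling `value_iteration(TOY_MDP)` returns a value table and a greedy policy; in this toy problem the policy chooses `"go"` in `"start"`.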
Cooperative Multi-Agent Learning: The State of the Art
- Autonomous Agents and Multi-Agent Systems, 2005
Cited by 182 (8 self)
Cooperative multi-agent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multi-agent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains, and a list of multi-agent learning resources.
An introduction to collective intelligence
- Handbook of Agent Technology. AAAI, 1999
Evolutionary function approximation for reinforcement learning
- Journal of Machine Learning Research, 2006
Cited by 110 (17 self)
...tional decisions. This thesis investigates evolutionary function approximation, a novel approach to automatically selecting function approximator representations that enable efficient individual learning. This method evolves individuals that are better able to learn. I present a fully implemented instantiation of evolutionary function approximation which combines NEAT, a neuroevolutionary optimization technique, with Q-learning, a popular TD method. ... In many machine learning problems, an agent must learn ... reinforcement learning problems are the subset of these tasks in which the agent never sees examples of correct behavior. Instead it receives only positive and negative rewards ... problems such as robot control, game playing, and system optimization ...
An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems
- In Proceedings of the Seventeenth International Conference on Machine Learning, 2000
Cited by 99 (11 self)
The article focuses on distributed reinforcement learning in cooperative multi-agent decision processes, where an ensemble of simultaneously and independently acting agents tries to maximize a discounted sum of rewards. We assume that each agent has no information about its teammates' behaviour. Thus, in contrast to single-agent reinforcement learning, each agent has to consider its teammates' behaviour and to find a cooperative policy. We propose a model-free distributed Q-learning algorithm for cooperative multi-agent decision processes. It can be proved to find optimal policies in deterministic environments. No additional expense is needed in comparison to the non-distributed case. Further, there is no need for additional communication between the agents. 1. Introduction Reinforcement learning has originally been discussed for Markov Decision Processes (MDPs): a single agent has to learn a policy that maximizes the discounted sum of rewards in a stochastic environment...
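The abstract does not spell out the update rule, so the following is only a sketch of one way independent learners can cooperate without communication in a deterministic setting: each agent keeps a table over its own actions and only ever raises an entry, so that a good joint outcome is not overwritten by samples where a teammate acted poorly. The payoff table `PAYOFF`, the one-step game, the zero initialisation, and the assumption of non-negative rewards are all illustrative choices, not details taken from the article.

```python
from itertools import product

# Hypothetical shared payoff for a one-step cooperative game, keyed by (a1, a2).
# Rewards are assumed non-negative so that a zero-initialised table is a lower bound.
PAYOFF = {(0, 0): 10.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}


def optimistic_update(q, state, action, reward, next_state, actions, gamma=0.9):
    """Raise q[(state, action)] only when the new sample beats the stored value."""
    if next_state is None:  # terminal transition: no future return
        future = 0.0
    else:
        future = max(q.get((next_state, a), 0.0) for a in actions)
    key = (state, action)
    q[key] = max(q.get(key, 0.0), reward + gamma * future)


def train(actions=(0, 1)):
    """Each agent sees only its own action and the shared reward."""
    q1, q2 = {}, {}
    # Enumerate joint actions instead of sampling so the sketch is deterministic.
    for a1, a2 in product(actions, repeat=2):
        reward = PAYOFF[(a1, a2)]
        optimistic_update(q1, "s", a1, reward, None, actions)
        optimistic_update(q2, "s", a2, reward, None, actions)
    return q1, q2


def greedy(q, state, actions=(0, 1)):
    """Pick the agent's own action with the highest stored value."""
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```

After `train()`, each agent's table holds the best reward ever observed for each of its own actions, and both agents' greedy choices select the jointly best pair in this toy game. The sketch does not address how agents would agree when several joint actions are equally good.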
Ants and Reinforcement Learning: A Case Study in Routing in Dynamic Networks
- In IJCAI (2, 1998
Cited by 89 (0 self)
We investigate two new distributed routing algorithms for data networks based on simple biological "ants" that explore the network and rapidly learn good routes, using a novel variation of reinforcement learning. These two algorithms are fully adaptive to topology changes and changes in link costs in the network, and have space and computational overheads that are competitive with traditional packet routing algorithms: although they can generate more routing traffic when the rate of failures in a network is low, they perform much better under higher failure rates. Both algorithms are more resilient than traditional algorithms, in the sense that random corruption of routing state has limited impact on the computation of paths. We present convergence theorems for both of our algorithms drawing on the theory of non-stationary and stationary discrete-time Markov chains over the reals. We present an extensive empirical evaluation of our algorithms on a simulator that is widely used in the c...
Distributed Value Functions
- In Proceedings of the Sixteenth International Conference on Machine Learning, 1999
Cited by 74 (1 self)
Many interesting problems, such as power grids, network switches, and traffic flow, that are candidates for solving with reinforcement learning (RL), also have properties that make distributed solutions desirable. We propose an algorithm for distributed reinforcement learning based on distributing the representation of the value function across nodes. Each node in the system only has the ability to sense state locally, choose actions locally, and receive reward locally (the goal of the system is to maximize the sum of the rewards over all nodes and over all time). However each node is allowed to give its neighbors the current estimate of its value function for the states it passes through. We present a value function learning rule, using that information, that allows each node to learn a value function that is an estimate of a weighted sum of future rewards for all the nodes in the network. With this representation, each node can choose actions to improve the performance of the overall...
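The learning rule itself is not given in the abstract. As a rough sketch of the idea it describes, a node could blend its local reward with a weighted sum of the value estimates its neighbors report for their own next states. The weighting table `weights`, the dictionary layout, and the parameter names below are assumptions for illustration and may differ from the rule in the paper.

```python
def dvf_update(values, node, state, reward, next_states, weights, alpha=0.1, gamma=0.9):
    """One temporal-difference style update for a single node.

    values[j][x]  -- node j's current estimate for its local state x (hypothetical layout)
    next_states   -- the local state each node moves to, keyed by node name
    weights[node] -- how much this node weighs each neighbor's estimate (may include itself)
    """
    # Weighted sum of the estimates shared by neighbors for the states they pass through.
    shared = sum(
        w * values[j].get(next_states[j], 0.0) for j, w in weights[node].items()
    )
    old = values[node].get(state, 0.0)
    values[node][state] = (1 - alpha) * old + alpha * (reward + gamma * shared)
    return values[node][state]
```

For instance, with `values = {"a": {"x": 0.0}, "b": {"y": 2.0}}`, equal weights of 0.5 on nodes `"a"` and `"b"`, a local reward of 1.0, `alpha=0.5`, and `gamma=1.0`, node `"a"` moves its estimate for state `"x"` halfway toward the target `1.0 + 1.0`.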
Using Collective Intelligence To Route Internet Traffic
- In Advances in Neural Information Processing Systems, 1999
Cited by 65 (24 self)
A COllective INtelligence (COIN) is a set of interacting reinforcement learning (RL) algorithms designed in an automated fashion so that their collective behavior optimizes a global utility function. We summarize the theory of COINs, then present experiments using that theory to design COINs to control internet traffic routing. These experiments indicate that COINs outperform all previously investigated RL-based, shortest path routing algorithms.