Results 1 - 10 of 11
Online Planning for Ad Hoc Autonomous Agent Teams
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
Cited by 19 (2 self)
Abstract:
We propose a novel online planning algorithm for ad hoc team settings—challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.
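The UCT estimation step this abstract mentions is built around the UCB1 action-selection rule of Monte-Carlo tree search. The following minimal sketch shows that rule in isolation; the function name, the `(mean_value, visit_count)` representation, and the exploration constant are illustrative choices, not details from the paper:

```python
import math

def ucb1_select(children, c=1.414):
    # children: list of (mean_value, visit_count) pairs, one per action.
    # Returns the index of the action maximizing the UCB1 score; c trades
    # off exploration against exploitation.
    total = sum(visits for _, visits in children)
    best_i, best_score = 0, float("-inf")
    for i, (value, visits) in enumerate(children):
        if visits == 0:
            return i  # always try an untested action first
        score = value + c * math.sqrt(math.log(total) / visits)
        if score > best_score:
            best_i, best_score = i, score
    return best_i
```

Within UCT, a rule of this form is applied at every tree node during the selection phase, and the values backed up through the tree would play the role of the stage-game utilities described above.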
Leading Ad Hoc Agents in Joint Action Settings with Multiple Teammates
2012
Cited by 18 (4 self)
Abstract:
The growing use of autonomous agents in practice may require agents to cooperate as a team in situations where they have limited prior knowledge about one another, cannot communicate directly, or do not share the same world models. These situations raise the need to design ad hoc team members, i.e., agents that are able to cooperate without coordination in order to reach an optimal team behavior. This paper considers the problem of an agent leading an N-agent team toward its optimal joint utility, where the teammates compute their next actions based only on their most recent observations of one another's actions. We show that, in contrast to previous results for two-agent teams, in larger teams the agent might not be able to lead the team to the action with maximal joint utility; its optimal strategy is instead to lead the team to the best reachable cycle of joint actions. We describe a graphical model of the problem and a polynomial-time algorithm for solving it. We then consider other variations of the problem, including leading teams of agents that base their actions on a longer history of past observations, leading a team with more than one ad hoc agent, and leading a teammate while being uncertain of its behavior.
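The idea of steering teammates toward the best reachable cycle of joint actions can be illustrated with a toy brute-force search over a small directed graph of joint actions. This is emphatically not the paper's polynomial-time graphical-model algorithm, just a sketch of the objective; all names and the graph encoding are assumptions:

```python
def best_reachable_cycle(succ, util, start):
    # succ: dict mapping each joint action to its possible successors.
    # util: dict mapping each joint action to a team utility.
    # Enumerates simple cycles reachable from `start` by DFS (fine for
    # toy graphs; exponential in the worst case) and returns the cycle
    # with the highest mean utility.
    best_mean, best_cycle = float("-inf"), None
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in succ[node]:
            if nxt in path:
                cycle = path[path.index(nxt):]
                mean = sum(util[a] for a in cycle) / len(cycle)
                if mean > best_mean:
                    best_mean, best_cycle = mean, cycle
            else:
                stack.append((nxt, path + [nxt]))
    return best_mean, best_cycle
```

For example, if the action with maximal utility sits on a self-loop the team cannot reach, the search settles for the best cycle among those actually reachable from the current joint action.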
Ad Hoc Teamwork for Leading a Flock
Cited by 9 (5 self)
Abstract:
Designing agents that can cooperate with other agents as a team, without prior coordination or explicit communication, is becoming more desirable as autonomous agents become more prevalent. In this paper we examine an aspect of the problem of leading teammates in an ad hoc teamwork setting, where the designed ad hoc agents lead the other teammates to a desired behavior that maximizes team utility. Specifically, we consider the problem of leading a flock of agents to a desired orientation using a subset of ad hoc agents. We examine the problem theoretically, and set bounds on the extent of influence the ad hoc agents can have on the team when the agents are stationary. We use these results to examine the complicated problem of orienting a stationary team to a desired orientation using a set of nonstationary ad hoc agents. We then provide an empirical evaluation of the suggested solution using our custom-designed simulator FlockSim.
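A standard way to model the flock dynamics discussed here is heading averaging: each ordinary agent turns toward the mean heading of its neighbors while the ad hoc agents hold the desired orientation. The sketch below is illustrative only; FlockSim's actual update rule is not specified in the abstract, and all names are assumptions:

```python
import math

def step_headings(headings, neighbors, influencers, target):
    # headings: dict agent -> heading in radians.
    # neighbors: dict agent -> list of neighboring agents.
    # influencers: set of ad hoc agents that always face `target`.
    # One synchronous update: ordinary agents adopt the circular mean
    # of their neighbors' headings (via vector averaging).
    new = {}
    for agent, heading in headings.items():
        if agent in influencers:
            new[agent] = target
        else:
            xs = sum(math.cos(headings[n]) for n in neighbors[agent])
            ys = sum(math.sin(headings[n]) for n in neighbors[agent])
            new[agent] = math.atan2(ys, xs)
    return new
```

Iterating this update shows the qualitative phenomenon the paper bounds: the fraction and placement of influencer agents determines how far the flock's orientation can be driven toward the target.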
Comparative Evaluation of MAL Algorithms in a Diverse Set of Ad Hoc Team Problems
In Proc. of AAMAS, 2012
Cited by 5 (3 self)
Abstract:
This paper is concerned with evaluating different multiagent learning (MAL) algorithms in problems where individual agents may be heterogeneous, in the sense of utilizing different learning strategies, without the opportunity for prior agreements or information regarding coordination. Such a situation arises in ad hoc team problems, a model of many practical multiagent systems applications. Prior work in multiagent learning has often focussed on homogeneous groups of agents, meaning that all agents were identical and a priori aware of this fact. Also, those algorithms that are specifically designed for ad hoc team problems are typically evaluated in teams of agents with fixed behaviours, as opposed to agents which adapt their behaviours. In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems. All teams consist of agents which are continuously adapting their behaviours. The algorithms are evaluated with respect to a comprehensive characterisation of repeated matrix games, using performance criteria that include considerations such as attainment of equilibrium, social welfare and fairness. Our main conclusion is that there is no clear winner. However, the comparative evaluation also highlights the relative strengths of different algorithms with respect to the type of performance criterion, e.g., social welfare vs. attainment of equilibrium.
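The social-welfare and fairness criteria named in this abstract have simple formal counterparts for a single joint action of a matrix game. The definitions below are common textbook choices, assumed here for illustration rather than taken from the paper:

```python
def social_welfare(payoffs):
    # Sum of all players' payoffs for one joint action.
    return sum(payoffs)

def fairness(payoffs):
    # Ratio of the worst-off to the best-off player's payoff;
    # 1.0 means perfectly equal (assumes strictly positive payoffs).
    return min(payoffs) / max(payoffs)
```

Criteria like these can pull in opposite directions: a joint action maximizing welfare may be highly unequal, which is one reason such evaluations rarely produce a single clear winner.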
Teaching and Leading an Ad Hoc Teammate: Collaboration without Pre-Coordination
2013
Cited by 5 (2 self)
Abstract:
As autonomous agents proliferate in the real world, both in software and robotic settings, they will increasingly need to band together for cooperative activities with previously unfamiliar teammates. In such ad hoc team settings, team strategies cannot be developed a priori. Rather, an agent must be prepared to cooperate with many types of teammates: it must collaborate without pre-coordination. This article defines two aspects of collaboration in two-player teams, involving either simultaneous or sequential decision making. In both cases, the ad hoc agent is more knowledgeable of the environment and attempts to influence the behavior of its teammate so that together they attain the best possible joint utility.
Ad Hoc Teamwork Modeled with Multiarmed Bandits: An Extension to Discounted Infinite Rewards
In Proc. Int. Conf. Autonomous Agents and Multiagent Systems, Adaptive Learning Agents Workshop, 2011
Cited by 4 (1 self)
Abstract:
Before deployment, agents designed for multiagent team settings are commonly developed together or are given standardized communication and coordination protocols. However, in many cases this pre-coordination is not possible because the agents do not know what agents they will encounter, resulting in ad hoc team settings. In these problems, the agents must learn to adapt and cooperate with each other on the fly. We extend existing research on ad hoc teams, providing theoretical results for handling cooperative multi-armed bandit problems with infinite discounted rewards.
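Discounting makes the infinite-horizon setting tractable because the value of repeatedly pulling an arm has a closed form. The sketch below illustrates the kind of teach-or-exploit trade-off that arises in cooperative bandits; the decision rule and all names are illustrative assumptions, not the paper's actual analysis:

```python
def discounted_value(mean_reward, gamma):
    # Closed-form infinite-horizon discounted return of repeatedly
    # pulling an arm with the given expected reward:
    # sum over t >= 0 of gamma**t * mean_reward.
    return mean_reward / (1.0 - gamma)

def worth_teaching(teach_cost, learner_gain, delay, gamma):
    # Toy decision rule: teach if the discounted stream of the
    # teammate's improved per-round reward, starting after `delay`
    # rounds, outweighs the immediate cost of demonstrating.
    future_gain = gamma ** delay * learner_gain / (1.0 - gamma)
    return future_gain > teach_cost
```

Note how the answer flips with the discount factor: patient agents (gamma near 1) find teaching worthwhile far more often than myopic ones.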
Weighted synergy graphs for effective team formation with heterogeneous ad hoc agents
Artif. Intell., 2014
Cited by 4 (1 self)
Abstract:
Previous approaches to select agents to form a team rely on single-agent capabilities, and team performance is treated as a sum of such known capabilities. Motivated by complex team formation situations, we address the problem where both single-agent capabilities may not be known upfront, e.g., as in ad hoc teams, and where team performance goes beyond single-agent capabilities and depends on the specific synergy among agents. We formally introduce a novel weighted synergy graph model to capture new interactions among agents. Agents are represented as vertices in the graph, and their capabilities are represented as Normally-distributed variables. The edges of the weighted graph represent how well the agents work together, i.e., their synergy in a team. We contribute a learning algorithm that learns the weighted synergy graph using observations of performance of teams of only two and three agents. Further, we contribute two team formation algorithms, one that finds the optimal team in exponential time, and one that approximates the optimal team in polynomial time. We extensively evaluate our learning algorithm, and demonstrate the expressiveness of the weighted synergy graph in a variety of problems. We show our approach in a rich ad hoc team formation problem capturing a rescue domain, namely the RoboCup Rescue domain, where simulated robots rescue civilians and put out fires in a simulated urban disaster. We show that the weighted synergy graph outperforms a competing algorithm, thus illustrating the efficacy of our model and algorithms.
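To make the synergy idea concrete, here is a toy scoring function in the spirit of the model: each agent has a mean capability, and every pair's contribution is their combined capability scaled down by their graph distance. The 1/distance scaling, the use of means only (ignoring the variances of the capability distributions), and all names are simplifying assumptions for illustration, not the paper's exact definition:

```python
import itertools

def team_synergy(means, dist, team):
    # means: dict agent -> mean capability.
    # dist: dict mapping ordered agent pairs (as produced by
    #       itertools.combinations over `team`) to graph distance.
    # Average over all pairs of (combined capability) / (distance),
    # so well-connected, capable pairs raise the team's score.
    total, count = 0.0, 0
    for a, b in itertools.combinations(team, 2):
        total += (means[a] + means[b]) / dist[(a, b)]
        count += 1
    return total / count
```

Under a model like this, the best team is not simply the set of individually strongest agents: a weaker agent close (in graph distance) to its teammates can beat a stronger but poorly connected one.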
Representation, Planning, and Learning of Dynamic Ad Hoc Robot Teams
2011
Cited by 1 (0 self)
Abstract:
Task allocation involves the division of tasks among a team of robots, such that each robot is responsible for a subset of the tasks. Similarly, in role assignment, roles are typically defined to be performed by a single robot whose performance is independent of the composition of its team. Complex tasks that cannot be subdivided and require multiple cooperating robots call for the formation of a coalition. We are interested in forming an effective ad hoc team to solve a task, through observations of the robots' performance on the task and modeling of the synergistic effects among robots in the team. Ad hoc teams are common in sports such as soccer, where human players without prior interactions form a team and are capable of playing the game. Currently, while robots within a team can play soccer well, they are unable to form ad hoc teams with robots developed by multiple research groups. This general problem is also seen in urban search-and-rescue (USAR), where large groups of heterogeneous robots would be deployed to solve complex tasks such as putting out fires, rescuing people, and clearing road blockages. This thesis represents team performance as a function of the individual capabilities of the
Ad hoc coordination in multiagent systems with applications to human-machine interaction
In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013
Cited by 1 (0 self)
Abstract:
This thesis is concerned with the ad hoc coordination problem, in which the goal is to design an autonomous agent which is able to achieve optimal flexibility and efficiency in a multiagent system with no mechanisms for prior behavioural coordination. The thesis is primarily motivated by human-machine interaction problems, which can often be formulated in this setting. This paper gives a brief account of the current state of the thesis and future milestones.
Comparison of Multiagent Learning Algorithms in Ad Hoc Teams
2011
Abstract:
Multiagent Learning (MAL) is the algorithmic study of learning in a group of two or more agents. If the agents are based on different algorithms, and if there is no form of prior coordination between the agents, then this is called an ad hoc team problem [48]. Following a literature review [2] and a research proposal [1], the work at hand compares the performance of five MAL algorithms in ad hoc teams. These include the Joint Action Learner [12], the Conditional Joint Action Learner [3], Win or Learn Fast with Policy Hill Climbing [6], Modified Regret-Matching [21], and the Nash Q-Learner [24]. The algorithms are evaluated in a range of strategic games, including no-conflict games in which the players agree on what is most preferred, and conflict games in which the players disagree on what is most preferred [40]. In addition, we use an evaluation procedure proposed by Stone et al. [48]. Our performance criteria include the convergence rate, the final expected payoff, social welfare and fairness, and the rates of different solution types. From the results we conclude that (a) all algorithms perform well in some sense (i.e., there is no clear winner), and (b) the performance of an algorithm ultimately