Results 21 - 30 of 58
A Mechanism for Emotion Signalling in Multiple Intelligent Virtual Agents
- University of Salford, Salford, United Kingdom, 2004
Cited by 5 (2 self)
Contents: Abstract; Declaration; Copyright; Declaration of honesty
Automatic Generation of an Agent's Basic Behaviors
Cited by 5 (2 self)
The agent approach, as seen by [9], intends to design "intelligent" behaviors. Yet, Reinforcement Learning (RL) methods often fail when confronted with complex tasks. We are therefore trying to develop a methodology for the automated design of agents (in the framework of Markov Decision Processes) in the case where the global task can be decomposed into simpler (possibly concurrent) sub-tasks. Our main idea is to automatically combine basic behaviors using RL methods. This led us to propose two complementary mechanisms presented in the current paper. The first mechanism builds a global policy using a weighted combination of basic policies (which are reusable), the weights being learned by the agent (using Simulated Annealing in our case). An agent designed this way is highly scalable as, without further refinement of the global behavior, it can automatically combine several instances of the same basic behavior to take into account concurrent occurrences of the same subtask. The second mechanism aims at creating new basic behaviors for combination. It is based on an incremental learning method that builds on the approximate solution obtained through the combination of older behaviors.
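As a rough illustration of the first mechanism described in this abstract, the sketch below mixes several basic policies into one global policy through a normalized weighted sum. It is only a minimal reading of "weighted combination of basic policies": the function name `combine_policies`, the dictionary representation of a policy, and the example weights are all assumptions, and the weights are fixed by hand here rather than learned with Simulated Annealing as in the paper.

```python
from typing import Dict, List

# Hypothetical representation: a policy maps each action to a probability
# for one fixed state.
Policy = Dict[str, float]


def combine_policies(policies: List[Policy], weights: List[float]) -> Policy:
    """Return the normalized weighted sum of several basic policies."""
    if len(policies) != len(weights):
        raise ValueError("need exactly one weight per basic policy")
    # Collect every action that appears in at least one basic policy.
    actions = sorted({action for policy in policies for action in policy})
    scores = {
        action: sum(w * p.get(action, 0.0) for p, w in zip(policies, weights))
        for action in actions
    }
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("weights must give positive total mass")
    return {action: score / total for action, score in scores.items()}


# Two illustrative basic behaviors (assumed names and numbers).
avoid_obstacle = {"left": 0.8, "right": 0.2}
seek_goal = {"left": 0.1, "right": 0.9}

# Weights are set by hand here; in the paper they are learned by the agent.
combined = combine_policies([avoid_obstacle, seek_goal], [3.0, 1.0])
```

Because the combination only needs a list of policies, several instances of the same basic behavior can be passed in with their own weights, which is one way to picture the scalability claim in the abstract.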
Learning To Do Without Cognition
- In [57
Cited by 5 (0 self)
In this paper we show that a phenomenon in animal learning theory (the outcome devaluation effect), for which there is dispute over whether explicit representations and symbolic reasoning are required for its performance, does not require such things. This is done using a reactive motivational model, previously inspired by ethological thought, to which some simple reinforcement learning rules are attached. An instantiation of the model is used as the control system of an animat in a spatial computer simulation, and it succeeds in learning the necessary parameters to allow the behaviour sequencing system to exhibit the phenomenon. 1 Introduction How complex can a reactive animat's behaviours get before some begin to appeal for a return to the well-established rational techniques in classical artificial intelligence? This paper offers an analysis and performance of a phenomenon in animal learning theory that provokes controversy about the type and complexity of the cognitive machinery ...
From SAB94 to SAB2000: What's New, Animat?
- In Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior, 2000
Cited by 4 (0 self)
This paper is complementary to a previous review of significant research on adaptive behavior in animats. It summarizes the current state of the art and outlines directions for possible progress. 1. Introduction In the proceedings of SAB94, we published a review of significant research on adaptive behavior in animats since the first SAB conference, held in 1990 (MEYE94). This review summarized the state of the art, insofar as the proceedings of three dedicated conferences could help delineate it. Now that three other SAB conferences have been held, we considered that it would be useful to update that earlier review, in order to assess the corresponding progress, to infer the directions in which interesting developments are likely to be expected, and to stress needs for specific additional research efforts. As in the preceding review, this one makes reference only to SAB conference proceedings (SAB96, SAB98, SAB00), on the premise that this perspective, although voluntarily limited, d...
Learning Decision Trees for Action Selection in Soccer Agents
- In Proc. of Workshop on Agents in dynamic and real-time environments, 2004
Cited by 4 (1 self)
In highly dynamic domains such as robotic soccer it is important for agents to take action rapidly, often on the order of a fraction of a second. This requires, a possible longer-term planning component notwithstanding, some form of reactive action selection mechanism. In this paper we report on results employing decision-tree learning to provide a ball-possessing soccer agent in the Simulation League with such a mechanism. The approach has paid off in at least two ways. For one, the resulting decision tree applies to a much larger set of game situations than those previously reported and performs well in practice. For another, the learning method yielded a set of qualitative features to classify game situations, which are useful beyond reactive decision making.
Learning to Weigh Basic Behaviors in Scalable Agents
Cited by 4 (0 self)
Agents, especially in the context of Multi-Agent Systems, are confronted with complex tasks. We propose a methodology for the automated design of such agents in the case where the global task can be decomposed into simpler sub-tasks that can be concurrent. This is accomplished by automatically combining basic behaviors using Reinforcement Learning methods. Basic behaviors are either learned or reused from previous tasks as they do not need to be tuned to the specific task being learned. Furthermore, the agents designed by our methodology are highly scalable as, without further refinement of the global behavior, they can automatically combine several instances of the same basic behavior to take into account concurrent occurrences of the same subtask.
Evolving symbolic controllers
- Applications of Evolutionary Computing, LNCS 2611, 2003
Cited by 4 (3 self)
The idea of symbolic controllers tries to bridge the gap between the top-down manual design of the controller architecture, as advocated in Brooks' subsumption architecture, and the bottom-up designer-free approach that is now standard within the Evolutionary Robotics community. The designer provides a set of elementary behaviors, and evolution is given the goal of assembling them to solve complex tasks. Two experiments are presented, demonstrating the efficiency and showing the recursiveness of this approach. In particular, the sensitivity with respect to the proposed elementary behaviors and the robustness with respect to generalization of the resulting controllers are studied in detail.
On the Difficulty of Modular Reinforcement Learning for Real-World Partial Programming
- 2006
Cited by 4 (2 self)
In recent years there has been a great deal of interest in “modular reinforcement learning” (MRL). Typically, problems are decomposed into concurrent subgoals, allowing increased scalability and state abstraction. An arbitrator combines the subagents’ preferences to select an action. In this work, we contrast treating an MRL agent as a set of subagents with the same goal with treating an MRL agent as a set of subagents who may have different, possibly conflicting goals. We argue that the latter is a more realistic description of real-world problems, especially when building partial programs. We address a range of algorithms for single-goal MRL, and leveraging social choice theory, we present an impossibility result for applications of such algorithms to multigoal MRL. We suggest an alternative formulation of arbitration as scheduling that avoids the assumptions of comparability of preference that are implicit in single-goal MRL. A notable feature of this formulation is the explicit codification of the tradeoffs between the subproblems. Finally, we introduce A²BL, a language that encapsulates many of these ideas.
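To make the arbitration step in this abstract concrete, the sketch below shows one common single-goal scheme, in which the arbitrator sums each subagent's action values and picks the action with the largest total. This is an illustrative assumption, not the formulation the paper proposes: the function name `arbitrate_by_sum`, the module names, and all numbers are hypothetical. The second call rescales one module's values to show why such schemes depend on the subagents' preferences being comparable.

```python
from typing import Dict, List


def arbitrate_by_sum(module_values: List[Dict[str, float]],
                     actions: List[str]) -> str:
    """Pick the action whose summed value across all modules is largest."""
    totals = {a: sum(values[a] for values in module_values) for a in actions}
    return max(actions, key=lambda a: totals[a])


actions = ["north", "south"]

# Hypothetical subagents with different goals and their action values.
find_food = {"north": 1.0, "south": 0.2}
avoid_predator = {"north": -5.0, "south": 0.0}

# With these scales the predator module dominates the sum.
choice = arbitrate_by_sum([find_food, avoid_predator], actions)

# Rescaling one module's values (same preference ordering inside the module)
# changes the arbitrator's decision.
rescaled_predator = {a: 0.1 * v for a, v in avoid_predator.items()}
choice_rescaled = arbitrate_by_sum([find_food, rescaled_predator], actions)
```

The rescaled module still prefers "south" to "north", yet the selected action flips, which is the kind of sensitivity to reward scale that the comparability assumption hides.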
An adaptive robot motivational system
- In Animals to Animats 9: Proceedings of the 9th International Conference on Simulation of Adaptive Behavior (SAB-06), 2006
Cited by 4 (0 self)
We present a robot motivational system design framework. The framework represents the underlying (possibly conflicting) goals of the robot as a set of drives, while ensuring comparable drive levels and providing a mechanism for drive priority adaptation during the robot's lifetime. The resulting drive reward signals are compatible with existing reinforcement learning methods for balancing multiple reward functions. We illustrate the framework with an experiment that demonstrates some of its benefits.
Optimistic Initial Q-values and the max Operator
- University of Edinburgh Printing Services, 2001
Cited by 3 (2 self)
This paper provides a surprising new insight into the role of the max operator used by reinforcement learning algorithms to estimate the future return available to an agent. It is shown how optimistic Q-value estimates prevent learning updates from being effective at quickly minimising the error in the predicted available maximum future return. Experimental results show that, when the effect of optimism on the agent's exploration strategy is accounted for, learning generally proceeds more quickly if non-optimistic initial Q-values are provided. In existing work, optimistic Q-values are frequently used when agents need to manage a trade-off between exploration and exploitation. This paper presents a simple way to avoid the learning problems this can cause.
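For readers unfamiliar with where the max operator and the initial Q-values enter, the sketch below shows the standard one-step tabular Q-learning update applied once to an optimistically initialised table and once to a zero-initialised one. It is a generic illustration, not the paper's experiment or its proposed remedy: the helper names `make_q` and `q_update`, the two-state setup, and the values of the step size, discount, and reward are all assumptions.

```python
from typing import Dict, List, Tuple

QTable = Dict[Tuple[str, str], float]


def make_q(states: List[str], actions: List[str], initial_value: float) -> QTable:
    """Build a Q-table in which every entry starts at initial_value."""
    return {(s, a): initial_value for s in states for a in actions}


def q_update(q: QTable, state: str, action: str, reward: float,
             next_state: str, actions: List[str],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one Q-learning update and return the new Q(state, action)."""
    # The max operator estimates the best return available from next_state,
    # so the initial values of unvisited entries feed directly into the target.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


states = ["s0", "s1"]
actions = ["a", "b"]

optimistic_q = make_q(states, actions, initial_value=10.0)
zero_q = make_q(states, actions, initial_value=0.0)

# Same observed transition (s0, a, reward 1, s1) under both initialisations.
after_optimistic = q_update(optimistic_q, "s0", "a", 1.0, "s1", actions)
after_zero = q_update(zero_q, "s0", "a", 1.0, "s1", actions)
```

In this toy case the optimistic target is built almost entirely from the untouched initial value of the next state, while the zero-initialised target is driven by the observed reward alone.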

