Machine Learning: Reinforcement Learning

This directory is created automatically and some papers may be mislabeled. Only documents within the CiteSeer database are listed. The directory is intended to provide entry points for browsing the database and is not intended to be authoritative. Papers may not appear in all relevant categories. For example, papers in a sub-category may not appear in higher-level categories.

Selection of Behavior in Social Situations: Application to the.. - Delepoulle, Preux, Darcheville (2037)   (Correct)
The law of effect is a very simple law which relates the probability of emission of a behavior by a living being to the consequences of the emission of this behavior by this living being in the past.... /

Multiagent Learning Using a Variable Learning Rate - Bowling, Veloso (2002)   (Correct)
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the o... / We then contribute a new reinforcement learning technique using a br Multiagent learning reinforcement learning game theory

Information Retrieval on the World Wide Web and Active Logic: A.. - Barfourosh, Nezhad, Anderson, Perlis (2002)   (Correct)
As more information becomes available on the World Wide Web (there are currently over 4 billion pages covering most areas of human endeavor), it becomes more difficult to provide effective search tool... / documents . . Reinforcement Learning . . Case Based br . . we will review reinforcement learning from this category of

Pixel-based Behavior Learning - Hugues, Drogoul (2002)   (Correct)
In this paper we address the problem of learning behaviors for autonomous mobile robots. We particularly focus on methods which enable a human user to train a robot in its real destination environment... / methods such as reinforcement learning or genetic br behavior-based robots using reinforcement learning. Artificial

Algorithm-Directed Exploration for Model-Based Reinforcement Learning .. - Guestrin, Patrascu, Schuurmans (2002)   (Correct)
One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement learning has been... /

A Comparison between ATNoSFERES and XCSM - Landau, Picault, Sigaud, al. (2002)   (Correct)
In this paper we present ATNoSFERES, a new framework based on an indirect encoding Genetic Algorithm which builds finite-state automata controllers able to deal with perceptual aliasing. We compare it... /

Existence of Multiagent Equilibria with Limited Agents - Bowling, Veloso (2002)   (Correct)
Multiagent learning is a necessary yet challenging problem as multiagent systems become more prevalent and environments become more dynamic. Much of the groundbreaking work in this area draws on nota... / multiagent learning reinforcement learning multiagent systems br of attention as a way for reinforcement learning to scale to large

Coordinated Reinforcement Learning - Guestrin, Lagoudakis, Parr (2002)   (Correct)
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value function. This struc... / Coordinated Reinforcement Learning Carlos Guestrin br algorithms for multiagent reinforcement learning. A common feature of

A Multiagent Reinforcement Learning Algorithm by Dynamically Merging.. - Ghavamzadeh, Mahadevan (2002)   (Correct)
One general strategy for accelerating the learning of cooperative multiagent tasks is to reuse (good or optimal) solutions to the task when each agent is acting alone. In this paper, we formalize this... /

MySpiders : Evolve your own intelligent Web crawlers - Pant, Menczer (2002)   (Correct)
The dynamic nature of the World Wide Web makes it a challenge to find information that is both relevant and recent. Intelligent agents can complement the power of search engines to meet this challenge... / by evolutionary and reinforcement learning. The goal is to maintain

Experience Stack Reinforcement Learning for Off-Policy Control - Reynolds (2002)   (Correct)
This paper introduces a novel method for allowing backwards replay to be applied as an online learning algorithm. The general technique can be adapted to provide analogues of most existing eligibili... /

Differential Join Prices for Parallel Queues: Social Optimality.. - Parijat Dube Vivek (2002)   (Correct)
We consider a system of identical parallel queues served by a single server and distinguished only by the price charged at entry. A Poisson stream of customers joins the queue by a greedy policy that ... / programming equation and a reinforcement learning based online pricing br methods based on reinforcement learning see e.g.

Accelerated Focused Crawling through Online Relevance Feedback - Chakrabarti, Punera, Subramanyam (2002)   (Correct)
The organization of HTML into a tag tree structure, which is rendered by browsers as roughly rectangular regions with embedded text and HREF links, greatly helps surfers locate and click on links that... / Document object model Reinforcement learning. Introduction br paradigm is related to reinforcement learning and AI programs that

Genetic Programming and Multi-Agent Layered Learning by Reinforcements - Hsu, Gustafson (2002)   (Correct)
We present an adaptation of the standard genetic program (GP) to hierarchically decomposable, multi-agent learning problems. To break down a problem that requires cooperation of multiple agents, we us... /

Two Views of Classifier Systems - Kovacs (2002)   (Correct)
This work suggests two ways of looking at Michigan classifier systems; as Genetic Algorithm-based systems, and as Reinforcement Learning-based systems, and argues that the former is more suitable for ... /

Co-Evolutionary Auction Mechanism Design: A Preliminary Report - Phelps, McBurney, Parsons, Sklar (2002)   (Correct)
Auctions can be thought of as resource allocation mechanisms. The economic theory behind such systems is mechanism design. Traditionally, economists have approached design problems by studying... /

Learning with Deictic Representation - Finney, Gardiol, Kaelbling, Oates (2002)   (Correct)
Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic representatio... /

A Reinforcement-Learning Approach to Power Management - Steinbach (2002)   (Correct)
We describe an adaptive, mid-level approach to the wireless device power management problem. Our approach is based on reinforcement learning, a machine learning framework for autonomous agents. We desc... /

Adaptive Spatio-Temporal Organization in Groups of Robots - Dahl, Mataric, al. (2002)   (Correct)
The complex dynamics of multi-robot systems make it difficult to design general control algorithms for groups of robots. By general we mean algorithms that are able to do similar tasks well over a ran... /

Adaptive Combination of Behaviors in an Agent - Buffet, Dutech, Charpillet (2002)   (Correct)
Agents are of interest mainly when confronted with complex tasks. We propose a methodology for the automated design of such agents (in the framework of Markov Decision Processes) in the case where the... / basic behaviors using Reinforcement Learning methods. The main idea is br only local perceptions. Reinforcement Learning RL can be applied in

A Comparison of Decision Making Criteria and Optimization Methods for .. - Mihaylova, Lefebvre, Bruyninckx.. (2002)   (Correct)
This work presents a comparison of decision making criteria and optimization methods for active sensing in robotics. Active sensing incorporates the following aspects: (i) where to position sensors,... / procedures reinforcement learning the robot needs to br . R. Sutton and A. Barto Reinforcement Learning An introduction. MIT

Model Minimization in Hierarchical Reinforcement Learning - Ravindran, Barto (2002)   (Correct)
When applied to real-world problems, Markov Decision Processes (MDPs) often exhibit considerable implicit redundancy, especially when there are symmetries in the problem. In this article we present... /

Efficient Learning of Reactive Robot Behaviors with a Neural-Q.. - Carreras, Ridao, Batlle, Nicosevici (2002)   (Correct)
The purpose of this paper is to propose a Neural-Q_learning approach designed for online learning of simple and reactive robot behaviors. In this approach, the Q_function is generalized by a multi-laye... /

Learning Reactive Robot Behaviors with a Neural-Q Learning Approach - Carreras, Ridao, Garcia, Ursulovici (2002)   (Correct)
The purpose of this paper is to propose a Neural-Q_learning approach designed for online learning of simple and reactive robot behaviors. In this approach, the Q_function is generalized by a multi-laye... /

Selection of Behavioral Parameters: Integration of Discontinuous.. - Lee, Likhachev, Arkin (2002)   (Correct)
This paper studies the effects of the integration of two learning algorithms, Case-Based Reasoning (CBR) and Learning Momentum (LM), for the selection of behavioral parameters in real-time for robotic... /

Fixed vs Dynamic Sub-transfer in Reinforcement Learning - Carroll (2002)   (Correct)
We survey various transfer methods in Q-learning, a type of reinforcement learning, and present a variation on fixed sub-transfer which we call dynamic sub-transfer. We describe the pros and cons of d... /

Connectionist Learning Classifier System - Vasilyev (2002)   (Correct)
Impetuous development of artificial neural networks makes it possible to transfer many ideas from this area into adjacent areas. This work investigates an opportunity of mapping learning classifier sy... /

Adaptive Behaviour, Autonomy and Value systems. Normative function in .. - Barandiaran (2002)   (Correct)
Computational functionalism [5] fails to understand the embodied and situated nature of behaviour by taking steady state functions as theoretical primitives, and by interpreting cognitive behaviour ... /

Multi-Robot Task-Allocation through Vacancy Chains - Dahl, Mataric, Sukhatme (2002)   (Correct)
This paper presents an algorithm for task allocation in groups of homogeneous robots. The algorithm is based on vacancy chains, a resource distribution strategy common in human and animal societies. W... / and demonstrate how Reinforcement Learning can be used to make br local task selection and Reinforcement Learning RL for estimation of

IEEE-TTTC International Conference on Automation, Quality and Testing, - Robotics May Cluj-Napoca (2002)   (Correct)
The purpose of this paper is to propose a Neural-Q_learning approach designed for online learning of simple and reactive robot behaviors. In this approach, the Q_function is generalized by a multi-lay... /

Qualitative Velocity and Ball Interception - Stolzenburg, Obst, Murray (2002)   (Correct)
In many approaches for qualitative spatial reasoning, navigation of an agent in a more or less static environment is considered (e.g. in the double-cross calculus [13]). However, in general, the env... /

Performance and Population State Metrics for Rule-Based Learning Systems (2002)   (Correct)
We distinguish two types of metric for the evaluation of rule-based learning systems: performance metrics are derived from the feedback to the learning agent from its teacher or environment, while pop... /

The Role of Expressiveness and Attention in Human-Robot Interaction - Bruce, Nourbakhsh, Simmons (2002)   (Correct)
This paper presents the results of an experiment in human-robot social interaction. Its purpose was to measure the impact of certain features and behaviors on people's willingness to engage in a short... /

Decision-Theoretic Robotic Surveillance - Massios (2002)   (Correct)
Acknowledgments: First and foremost I would like to express my gratitude to Leo Dorst and Frans Voorbraak. This thesis is mainly the result of the weekly interaction I had with them and the expert ... /

Autonomous Agent Control Using Connectionist XCS Classifier System - Vasilyev (2002)   (Correct)
In this paper a new connectionist classifier system CXCS is proposed, which uses a layer of competitive artificial neurons for decision making. New algorithms of the CXCS learning in multi-step proble... /

Asr System Modeling For Automatic Evaluation And Optimization Of.. - Pietquin, Renals (2002)   (Correct)
Though the field of spoken dialogue systems has developed quickly in the last decade, rapid design of dialogue strategies remains uneasy. Several approaches to the problem of automatic strategy learni... / proposed and the use of Reinforcement Learning introduced by Levin and br Processes MDPs and Reinforcement Learning RL was proposed by

Optimizing Dialogue Management with Reinforcement Learning.. - Satinder Singh Diane (2002)   (Correct)
Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, whic... / Dialogue Management with Reinforcement Learning Experiments with the br This paper presents a reinforcement learning approach for automatically

Reinforcement Learning with Long Short-Term Memory - Bakker (2002)   (Correct)
This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Mark... / Reinforcement Learning with Long Short-Term br This paper presents reinforcement learning with a Long Short-Term

Numerical Optimization with Neuroevolution - Greer (2002)   (Correct)
Neuroevolution techniques have been successful in many sequential decision tasks such as robot control and game playing. This paper aims at establishing whether they can be useful in numerical optimiz... / more powerful than standard reinforcement learning techniques in many br tasks faster than standard reinforcement learning methods and other

Synthesis of Robot's Behaviors from few Examples - Hugues, Drogoul (2002)   (Correct)
This paper addresses the problem of acquiring robot's behaviors for real environments. It insists on the interest of learning behaviors during robot's interaction with the environment under the contro... / methods such as reinforcement learning Mahadevan and Connell br behavior-based robots using reinforcement learning. Artificial

Sparse Coding In The Primate Cortex - Földiak (2002)   (Correct)
INTRODUCTION Brain function can be seen as computation, i.e. the manipulation of information necessary for survival. Computation itself is an abstract process but it must be performed or implemented ... /

Co-Evolution of Auction Mechanisms and Trading Strategies: Towards a.. - Phelps, Parsons, McBurney, Sklar (2002)   (Correct)
Mechanism design is the economic theory of the design of effective resource allocation mechanisms, such as auctions. Traditionally, economists have approached design problems by studying the analytic ... /

XCS with Average Reward Criterion in Multi-step Environment - Tharakunnel, Goldberg (2002)   (Correct)
In multi-step environment, the XCS prediction parameter is an estimate of the discounted sum of successive rewards. Thus, XCS is designed to address sequential (multi-step) decision problems with the ... /

jfipa - an Architecture for Agent-based Grid Computing - Tveit (2002)   (Correct)
With the increasing focus on grid development, there is a need for proper abstractions for modelling grid applications. Viewed from a distributed AI perspective the most suitable abstraction is the co... /

Incremental Learning of Factorial Markov Decision Processes - Rohanimanesh, Mahadevan (2002)   (Correct)
We investigate a general approach to approximately learning a compact and structured representation of the transition model for Factorial Markov Decision Processes (FMDPs). FMDPs are based on mixed me... /

A neural model for multi-expert architectures - Toussaint (2002)   (Correct)
We present a generalization of conventional artificial neural networks that allows for a functional equivalence to multi-expert systems. The new model provides an architectural freedom going beyond ex... /

Theory of Generalization and Learning in XCS - Butz, Kovacs, Lanzi, Wilson (2002)   (Correct)
The XCS classifier system evolves accurate, maximally general solutions to a wide variety of machine learning, data, and robotics problems, but a theoretical basis for these properties has not been pr... /

High-Level Control Of Autonomous Robots Using A Behavior based Scheme .. - Carreras, Yuh, Batlle (2002)   (Correct)
This paper proposes a behavior-based scheme for high-level control of autonomous robots. Two main characteristics can be highlighted in the control scheme... /

Value Function Approximation in Zero-Sum Markov Games - Lagoudakis, Parr (2002)   (Correct)
This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agen... /

Collective Intelligence, Data Routing and Braess' Paradox - Wolpert, Tumer (2002)   (Correct)
We consider the problem of designing the utility functions of the utility-maximizing agents in a multi-agent system (MAS) so that they work synergistically to maximize a global utility. The part... /

Learning to Autonomously Select Landmarks for Navigation And.. - Marsland (2002)   (Correct)
Selecting landmarks for use by a navigating mobile robot is important for map-building systems. However, it can also provide a way by which robots can communicate route information, so that one robot ... /

Learning behavior-selection in a multi-goal robot task - Gadanho, Custódio (2002)   (Correct)
The purpose of the work reported here is the development of an autonomous robot controller which learns to perform a multi-goal and multi-step task when faced with real world problems such as continuo... /

An Algorithmic Description of ACS2 - Butz, Stolzmann (2002)   (Correct)
The various modifications and extensions of the anticipatory classifier system (ACS) recently led to the introduction of ACS2, an enhanced and modified version of ACS. This chapter provides an overv... /

Tournament Selection in XCS - Butz, Sastry, Goldberg (2002)   (Correct)
Selection in the accuracy-based learning classifier system XCS, introduced by Wilson in 1995, has always been done by the means of proportionate selection. Although it is known from GA literature that... /

Bornholm Web Mining Techniques - Nielsen (2002)   (Correct)
...the classifier for the document relevance with downloaded documents. Some interesting document might be separated by non-relevant documents, e.g., Reinforcement learning (Rennie and McCallum, 1999... / documents e.g. Reinforcement learning Rennie and McCallum br McCallum A. Using reinforcement learning to spider the web

Model-Based Reinforcement Learning in Dynamic Environments - Wiering (2002)   (Correct)
We study using reinforcement learning in particular dynamic environments. Our environments can contain... / Model-Based Reinforcement Learning in Dynamic Environments br Abstract We study using reinforcement learning in particular dynamic

PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in.. - Pickett, Barto (2002)   (Correct)
We present PolicyBlocks, an algorithm by which a reinforcement learning agent can extract useful macro-actions from a set of related tasks. The agent creates macro-actions by finding commonalities in s... /

Biasing Exploration in an Anticipatory Learning Classifier System - Butz (2002)   (Correct)
The chapter investigates how model and behavioral learning can be improved in an anticipatory learning classifier system by biasing exploration. First, the applied system ACS2 is explained. Next, an... /

Adaptive Dialogue Systems - Interaction with Interact - Jokinen, Kerminen, Kaipainen.. (2002)   (Correct)
Technological development has made computer interaction more common and also commercially feasible, and the number of interactive systems has grown rapidly. At the same time, the systems should be abl... / evaluators are based on reinforcement learning. Simple examples of br management comes from the reinforcement learning algorithm of this

Machine Control Using Radial Basis Value Functions and Inverse State.. - Buck, Stulp, Beetz, Schmitt (2002)   (Correct)
In this paper, we propose (1) to use radial basis functions for value function approximation in continuous space reinforcement learning and (2) the use of learned inverse projection functions for stat... / learning algorithms. Reinforcement learning has proven to be such a br solved successfully by reinforcement learning as well as tasks with

Reinforcement Learning Using Neural Networks, with Applications to.. - Coulom (2002)   (Correct)
This thesis is a study of practical methods to estimate value functions with feedforward neural networks in model-based reinforcement learning. Focus is placed on problems in continuous time and space... / . Reinforcement Learning using Neural Networks . br neural networks and reinforcement learning-can help to solve such

Neural Networks for Multi-Instance Learning - Zhou, Zhang (2002)   (Correct)
Multi-instance learning originates from the investigation on drug activity prediction, where the task is to predict whether an unseen molecule could be used to make some drug. Such a problem is diffic... /

An integrated approach to hierarchy and abstraction for POMDPs - Pineau, Thrun (2002)   (Correct)
This paper presents an algorithm for planning in structured partially observable Markov Decision Processes (POMDPs). The new algorithm, named PolCA (for Policy-Contingent Abstraction) uses an action-b... /

Robust Non-Linear Control through Neuroevolution - Gomez, Miikkulainen (2002)   (Correct)
Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what... /

Towards a Game Agent - Niederberger, Gross (2002)   (Correct)
The objective of this report is to give the reader a survey on state-of-the-art techniques and academic research in the field of artificial life where the simulation of complex and emergent behavior i... /

Event-Learning And Robust Policy Heuristics - Lörincz, Pólik, Szita (2001)   (Correct)
In this paper we introduce a novel form of reinforcement learning called event-learning or E-learning. Events are ordered pairs of consecutive states. We define the corresponding event-value functio... / introduce a novel form of reinforcement learning called event-learning or br Key words and phrases. reinforcement learning robust control event

Web Interaction and the Navigation Problem in Hypertext written for.. - Levene, Loizou (2001)   (Correct)
The web has become a ubiquitous tool, used in day-to-day work, to find information and conduct business, and it is revolutionising the role and availability of information. One of the problems encou... / method we describe is a reinforcement learning algorithm that attaches br a web view is a reinforcement learning algorithm that attaches

Scaling Reinforcement Learning toward RoboCup Soccer - Stone, Sutton (2001)   (Correct)
RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple agents, and long and variable delays in the... / Scaling Reinforcement Learning toward RoboCup Soccer br many challenges to reinforcement learning methods including a

Ant Colony Control for Autonomous Decentralized Shop Floor Routing - Cicirello, Smith (2001)   (Correct)
In this paper, we introduce a new approach to autonomous decentralized shop floor routing. Our system, which we call Ant Colony Control (AC²), applies the analogy of a colony of ants foraging for ... / of a dispatch policy using reinforcement learning techniques incorporating br and M. Riedmiller. A neural reinforcement learning approach to learn local

Hierarchical Multi Agent Reinforcement Learning - Makar, Mahadevan (2001)   (Correct)
Hierarchical reinforcement learning methods have previously been shown to speed up learning primarily in single-agent domains. In this paper we explore the use of this spatio-temporal abstraction m... / Hierarchical Multi Agent Reinforcement Learning Rajbala Makar br Abstract Hierarchical reinforcement learning methods have previously

Evolving Neural Networks through Augmenting Topologies - Stanley, Miikkulainen (2001)   (Correct)
An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT) that... / on a challenging benchmark reinforcement learning task. We claim that the br great promise in complex reinforcement learning tasks Gomez and

Variable Resolution Discretization in Optimal Control - Munos, Moore (2001)   (Correct)
The problem of state abstraction is of central importance in optimal control, reinforcement learning and Markov decision processes. This paper studies the case of variable resolution state abstraction... / in optimal control reinforcement learning and Markov decision br Keywords Optimal control reinforcement learning variable resolution

Actor-Critic Algorithms - Konda, Tsitsiklis (2001)   (Correct)
In this paper, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference (TD) learning with a linearly parameterized ... / difficult to identify. Reinforcement Learning RL and Neuro-Dynamic br approximation and reinforcement learning. SIAM Journal on Control

DEDUCTIVE VERSUS INDUCTIVE EQUILIBRIUM Selection: Experimental Results - Haruvy, Stahl (2001)   (Correct)
The debate in equilibrium selection appears to have culminated in the formation of two schools of thought: those that favor equilibrium selection based on rational coordination and those that favor ze... / race among seven action-reinforcement learning models and found that a br A's Roth-Erev reinforcement learning predicts A's and

Economic Value of EWA Lite: A Functional Theory of Learning in Games - Ho, Camerer, Chong (2001)   (Correct)
This paper describes a theory of learning in decisions and games called EWA Lite, with only one parameter. EWA Lite predicts the time path of individual behavior in any normal-form game (given initial... / best but one kind of reinforcement learning predicts well in games br versions of belief and reinforcement learning and quantal response

Hierarchical Multi-Agent Reinforcement Learning - Makar, Mahadevan, al. (2001)   (Correct)
In this paper we investigate the use of hierarchical reinforcement learning to speed up the acquisition of cooperative multi-agent tasks. We extend the MAXQ framework to the multi-agent case. Each age... / Hierarchical Multi-Agent Reinforcement Learning Rajbala Makar br the use of hierarchical reinforcement learning to speed up the

Pigs and People - Pauls (2001)   (Correct)
'Pigs and people' is a simulated environment, in which action selection mechanisms can be evaluated and compared. Action selection mechanisms attempt to solve the action selection problem faced by b... / . . Reinforcement br efforts aiming to introduce reinforcement learning see section into

Towards Bounded-Rationality in Multi-Agent Systems: A.. - Raja, Lesser (2001)   (Correct)
Sophisticated agents operating in open environments must make complex real-time control decisions on scheduling and coordination of domain activities. These decisions are made in the context of limi... / in Multi-Agent Systems A Reinforcement-Learning Based Approach Anita br made by these agents using reinforcement learning methods. Our approach is

Learning in Worlds with Objects - Kaelbling, Oates, Hernandez, Finney (2001)   (Correct)
Introduction: We are interested in building systems that learn to interact with complex real world environments, by representing the dynamics of the world with models that allow strong generalization th... / Most work on reinforcement learning assumes that the agent br expected value. Basic reinforcement-learning methods such as

Cooperative Coevolution of Multi-Agent Systems - Yong, Miikkulainen (2001)   (Correct)
In certain tasks such as pursuit and evasion, multiple agents need to coordinate their behavior to achieve a common goal. An interesting question is, how can such behavior best be evolved? When the ag... / efficient in single-agent reinforcement learning tasks is first extended br by robot teams using reinforcement learning. He found that when the

Genetic Algorithms And Reinforcement Learning For The Tactical Fixed.. - Santos, Jr., Zhong (2001)   (Correct)
...this paper, we explore... International Journal on Artificial Intelligence Tools, Vol. 10, No. 1-2 (2001), World Scientific Publishing Company. / Genetic Algorithms And Reinforcement Learning For The Tactical Fixed br kinds of problems. We use a reinforcement learning system to adaptively

Reinforcement Learning with Function Approximation Converges to a.. - Gordon (2001)   (Correct)
Many algorithms for approximate reinforcement learning are not known to converge. In fact, there are counterexamples showing that the adjustable weights in some algorithms may oscillate within a re... / Reinforcement Learning with Function br algorithms for approximate reinforcement learning are not known to

Population rule learning in symmetric normal-form games: theory and.. - Stahl (2001)   (Correct)
A model of population rule learning is formulated and estimated using experimental data. When predicting the population distribution of choices and accounting for the number of parameters, the populat... / and Ho consider both reinforcement learning and belief learning they br how people play games reinforcement learning in experimental games with

Goal Directed Adaptive Behavior in Second-Order Neural Networks: The.. - Crabbe, Dyer (2001)   (Correct)
The paper presents a neural network architecture (MAXSON) based on second-order connections that can learn a multiple goal approach/avoid task using reinforcement from the environment. It also enables... / autonomous agents reinforcement learning vicarious learning. br faster than traditional reinforcement learning approaches generates and

Improving The Performance Of Q-Learning With Locally Weighted.. - Aljibury (2001)   (Correct)
Oftentimes, the problem faced by researchers applying reinforcement learning to a nontrivial robotics problem is that they run head-on into the curse of dimensionality. This is a particular problem fo... / Background Of Reinforcement br Description of Reinforcement

Autonomous Helicopter Control using Reinforcement Learning Policy.. - Bagnell, Schneider (2001)   (Correct)
Many control problems in the robotics field can be cast as Partially Observed Markovian Decision Problems (POMDPs), an optimal control formalism. Finding optimal solutions to such problems in general,... / Helicopter Control using Reinforcement Learning Policy Search Methods br Traditional model-based reinforcement learning algorithms make a

Solving Hidden-Mode Markov Decision Problems - Choi, Zhang, al. (2001)   (Correct)
Hidden-Mode Markov decision processes (HM-MDPs) are a novel mathematical framework for a subclass of nonstationary reinforcement learning problems where environment dynamics change over time accor... / a subclass of nonstationary reinforcement learning problems where br subclass of nonstationary reinforcement learning problems. Unlike

A Multi-Agent, Policy-Gradient approach to Network Routing - Tao, Baxter, Weaver (2001)   (Correct)
Network routing is a distributed decision problem which naturally admits numerical performance measures, such as the average time for a packet to travel from source to destination. Olpomdp, a policy-g... / Olpomdp a policy-gradient reinforcement learning algorithm was br is treated as a multi-agent reinforcement learning problem. Each router is

Emotion-triggered Learning in Autonomous Robot Control - Gadanho, Hallam (2001)   (Correct)
The fact that emotions are considered to be essential to human reasoning suggests that they might play an important role in autonomous robots as well. In particular, the decision of when to interrup... / and integrated in a reinforcementlearning framework. Robot br to its environment using reinforcement learning. The work was done under

Learning Markov Processes - Murphy (2001)   (Correct)
this article, we restrict our attention to discrete time dynamical systems.) Typically we do not know the exact dynamics of the system, so instead we consider a probabilistic state transition function... / Of Our Actions as In Reinforcement Learning We Represent Our br mdp Widely Used In Reinforcement Learning. In An Mdp The State

Learning POMDP Policies with Internal State using Gradient Ascent - Aberdeen, Baxter (2001)   (Correct)
In [8, 9] we introduced GPOMDP, an algorithm for estimating the gradient of the average reward for arbitrary Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochas... /

Real-Time Statistical Learning For Robotics and Human Augmentation - Schaal, Vijayakumar, D'Souza.. (2001)   (Correct)
Real-time modeling of complex nonlinear dynamic processes has become increasingly important in various areas of robotics and human augmentation. To address such problems, we have been developing speci... /

Bounds on sample size for policy evaluation in Markov environments - Peshkin, Mukherjee (2001)   (Correct)
Reinforcement learning means finding the optimal course of action in Markovian environments without knowledge of the environment's dynamics. Stochastic optimization algorithms used in the field re... / Abstract. Reinforcement learning means finding the optimal br Introduction Research in reinforcement learning focuses on designing

Market-Based Reinforcement Learning in Partially Observable Worlds - Kwee, Hutter, Schmidhuber (2001)   (Correct)
Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an agent needs to lea... / Market-Based Reinforcement Learning in Partially Observable br Unlike traditional reinforcement learning RL market-based RL is

Automated State Abstraction for Options using the U-Tree Algorithm - Jonsson, Barto (2001)   (Correct)
Learning a complex task can be significantly facilitated by defining a hierarchy of subtasks. An agent can learn to choose between various temporally abstract actions, each solving an assigned subta... / Researchers in the field of reinforcement learning have recently focused br which extends the theory of reinforcement learning to include temporally

Learning State Grounding for Optimal Visual Servo Control of Dynamic.. - Nikovski, Nourbakhsh (2001)   (Correct)
We present an experiment in sequential visual servo control of a dynamic manipulation task with unknown equations of motion and feedback from an uncalibrated camera. Our algorithm constructs a mode... / the solution. The field of reinforcement learning is specifically concerned br on a real rig using a reinforcement learning controller. Technical

Incremental Reinforcement Learning for designing Multi-Agent Systems - Buffet, Dutech, Charpillet (2001)   (Correct)
Designing individual agents so that, when put together, they reach a given global goal is not an easy task. One solution to automatically build such large Multi-Agent Systems is to use decentralized l... / Incremental Reinforcement Learning for designing Multi-Agent br behavior. To that purpose Reinforcement Learning methods are very

Using Background Knowledge to Speed Reinforcement Learning in.. - Shapiro, Langley, Shachter (2001)   (Correct)
This paper describes Icarus, an agent architecture that embeds a hierarchical reinforcement learning algorithm within a language for specifying agent behavior. An Icarus program expresses an approxima... / Knowledge to Speed Reinforcement Learning in Physical Agents br that embeds a hierarchical reinforcement learning algorithm within a

Using the Web to Create Minority Language Corpora - Ghani, Jones, Mladenic (2001)   (Correct)
The Web is a valuable source of language specific resources but the process of collecting, organizing and utilizing these resources is difficult. We describe CorpusBuilder, an approach for automatical... / and McCallum use reinforcement learning to help a crawler br et al.s WebSail uses reinforcement learning based on feedback from

Hierarchical Memory-Based Reinforcement Learning - Hernandez-Gardiol, Mahadevan (2001)   (Correct)
A key challenge for reinforcement learning is how to scale up to large partially observable domains. In this paper, we show how a hierarchy of behaviors can be used to create and select among varia... / Hierarchical Memory-Based Reinforcement Learning Natalia br A key challenge for reinforcement learning is how to scale up to

An Architecture for Action Selection in Robotic Soccer - Stone, McAllester (2001)   (Correct)
CMUnited-99 was the 1999 RoboCup robotic soccer simulator league champion. In the RoboCup-2000 competition, CMUnited-99 was entered again and despite being publicly available for the entire year, it s... / rewards in the sense of reinforcement learning One might say for br successfully learned via reinforcement learning. . CONCLUSION

Learning rates for Q-Learning - Even-Dar, Mansour (2001)   (Correct)
In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in the Q-learning. For a polynomial learning rat... / Introduction In Reinforcement Learning an agent wanders in an br the dominating approach in Reinforcement Learning SB BT An MDP

Grounding the Unobservable in the Observable: The Role and.. - Morrison, Oates, al. (2001)   (Correct)
Introduction One of the great mysteries of human cognition is how we learn to discover meaningful and useful categories and concepts about the world based on the data flowing from our sensors. Why do... / in predicted reward in a reinforcement learning setting to refine action br McCallum A. K. . Reinforcement Learning with Selective Perception

Goal Directed Adaptive Behavior in Second-Order Neural Networks.. - Crabbe, Dyer (2001)   (Correct)
The paper presents a neural network architecture (MAXSON) based on second-order connections... / faster than traditional reinforcement learning approaches generates and br and perform cross-modal reinforcement learning. Both these thresholds

Layered Learning in Genetic Programming for a Cooperative Robot.. - Gustafson, Hsu (2001)   (Correct)
We present an alternative to standard genetic programming (GP) that applies layered learning techniques to decompose a problem. GP is applied to subproblems sequentially, where the population in the l... / It was applied with reinforcement learning for robotic soccer and the br challenges for researchers. Reinforcement learning hierarchical sensing

Multi-Layer Methods and the Optimal Optimizer - de Jong (2001)   (Correct)
Multi-Layer Methods are methods that act on several layers simultaneously. Examples of multi-layer methods are found in multi-agent systems (global and per-agent behavior), in learning (e.g. boostin... / that if the agents all use reinforcement learning to optimize their own br the world utility. The reinforcement learning agents by themselves

Personalized Web-Document Filtering Using Reinforcement Learning - Byoung-Tak Zhang And (2001)   (Correct)
Document filtering is increasingly deployed in Web environments to reduce information overload of users. We formulate online information filtering as a reinforcement learning problem, i.e. TD(0). The ... / Filtering Using Reinforcement Learning Byoung-Tak Zhang and br information filtering as a reinforcement learning problem i.e. TD The

Multi-Agent Systems by Incremental Gradient Reinforcement Learning - Dutech, Buffet, Charpillet (2001)   (Correct)
A new reinforcement learning (RL) methodology is proposed to design multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordin... / by Incremental Gradient Reinforcement Learning Alain Dutech Olivier br France Abstract A new reinforcement learning RL methodology is

Continuous-Time Hierarchical Reinforcement Learning - Ghavamzadeh, Mahadevan (2001)   (Correct)
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Prior work in hierarchic... / Hierarchical Reinforcement Learning Mohammad Ghavamzadeh br Abstract Hierarchical reinforcement learning RL is a general

Direct Policy Search using Paired Statistical Tests - Strens, Moore (2001)   (Correct)
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maximize a noisy objectiv... / a practical way to solve reinforcement learning problems involving br . Introduction Reinforcement Learning is a problem description

Convergent Reinforcement Learning with Value Function Interpolation - Szepesvári (2001)   (Correct)
We consider the convergence of a class of reinforcement learning algorithms combined with value function interpolation methods using the methods developed in (Littman & Szepesvari, 1996). As a sp... / Convergent Reinforcement Learning with Value Function br convergence of a class of reinforcement learning algorithms combined with

Towards Automatic Shaping in Robot Navigation - Peterson, Owens, Carroll (2001)   (Correct)
Shaping is a potentially powerful tool in reinforcement learning applications. Shaping often fails to function effectively because of a lack of understanding about its effects when applied in reinforcem... / powerful tool in reinforcement learning applications. Shaping br its effects when applied in reinforcement learning settings and the use of

Mining the Web to Create Minority Language Corpora - Ghani, Jones, Mladenic (2001)   (Correct)
The Web is a valuable source of language specific resources but the process of collecting, organizing and utilizing these resources is difficult. We describe CorpusBuilder, an approach for automatical... / and McCallum use reinforcement learning to help a crawler br et al.s WebSail uses reinforcement learning based on feedback from

An Improved Grid-Based Approximation Algorithm for POMDPs - Zhou, Hansen (2001)   (Correct)
Although a partially observable Markov decision process (POMDP) provides an appealing model for problems of planning under uncertainty, exact algorithms for POMDPs are intractable. This motivates ... / and related problems of reinforcement learning. A standard approach to

Keyword Spices: A New Method for Building Domain-Specific Web Search.. - Oyama, Kokubo, Ishida, Yamada.. (2001)   (Correct)
This paper presents a new method for building... / explore the web by using reinforcement learning techniques. SPIRAL Cohen

A Social Reinforcement Learning Agent - Charles Lee Isbell (2001)   (Correct)
We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot (Isbell et al., 2000) provided him wit... / A Social Reinforcement Learning Agent Charles Lee br We report on our reinforcement learning work on Cobot a software

Off-Policy Temporal-Difference Learning with Function Approximation - Precup, Sutton, Dasgupta (2001)   (Correct)
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms the basis for popul... / the basis for popular reinforcement learning methods such as br the most popular of all reinforcement learning algorithms it has been

Place Cells and Spatial Navigation based on Vision, Path Integration, .. - Arleo, Smeraldi, Hug, Gerstner (2001)   (Correct)
We model hippocampal place cells and head-direction cells by combining allothetic (visual) and idiothetic (proprioceptive) stimuli. Visual input, provided by a video camera on a miniature robot, is ... / Path Integration and Reinforcement Learning A. Arleo F. Smeraldi br as basis functions for reinforcement learning. Experimental results for

Individual Action and Collective Function: from Sociology to.. - Sun (2001)   (Correct)
How do we characterize the process and the dynamics of co-learning, conceptually, mathematically, or computationally?  How do social structures and relations interact with co-learning of multiple ... / deals with value function reinforcement learning in certain types of br The basic approach is reinforcement learning through estimating

A New Control Scheme For Combustion Processes Using Reinforcement.. - Stephan, Debes, Gross (2001)   (Correct)
Introduction Since the immediate objective of a power plant is the production of energy, the plant operator is trying to maximize the efficiency factor. Simultaneously, both the system-constraints and g... / Combustion Processes Using Reinforcement Learning Based On Neural Networks br in a power plant based on reinforcement-learning in combination with neural

Automatic Discovery of Subgoals in Reinforcement Learning using.. - McGovern, Barto (2001)   (Correct)
This paper presents a method by which a reinforcement learning agent can automatically discover certain types of subgoals online. By creating useful new subgoals while learning, the agent is able ... / Discovery of Subgoals in Reinforcement Learning using Diverse Density br a method by which a reinforcement learning agent can automatically

Direct value-approximation for factored MDPs - Schuurmans, Patrascu (2001)   (Correct)
We present a simple approach for computing near-optimal policies... / stochastic environments and reinforcement learning. Standard methods such as br T. Dietterich. Hierarchical reinforcement learning with the MAXQ value

Policy Improvement for POMDPs using Normalized Importance Sampling - Christian Shelton (2001)   (Correct)
We present a new method for estimating the... / We assume a standard reinforcement learning setup an agent interacts br before in conjunction with reinforcement learning. In particular Precup

Evolution of Reinforcement Learning in Uncertain Environments.. - Niv, Joel, Meilijson, Ruppin (2001)   (Correct)
Reinforcement learning (RL) is a fundamental process by which organisms learn to achieve a goal from interactions with the environment. We use Artificial Life techniques to derive (near-)optimal neuro... / Evolution of Reinforcement Learning in Uncertain br Abstract. Reinforcement learning RL is a fundamental

Focused Web Crawling: A Generic Framework for Specifying the User.. - Ester, Gross, Kriegel (2001)   (Correct)
Compared to the standard web search engines, focused crawlers yield good recall as well as good precision by restricting themselves to a limited domain. In this paper, we do not introduce another f... / of tunneling. RC uses reinforcement learning to train a crawler how to br J.McCallum A.Using Reinforcement Learning to Spider the Web

Lyapunov-Constrained Action Sets for Reinforcement Learning - Perkins, Barto (2001)   (Correct)
Lyapunov analysis is a standard approach to studying the stability of dynamical systems and to designing controllers. We propose to design the actions of a reinforcement learning (RL) agent to be ... / Action Sets for Reinforcement Learning Theodore J. Perkins br to design the actions of a reinforcement learning RL agent to be

Learning Preconditions for Control Policies in Reinforcement Learning - Tohgoroh Matsui (2001)   (Correct)
This paper describes a method which senses changing environment by collecting failed instances, uses concept learning for acquiring a precondition for a control policy, and modifies the policy partial... / for Control Policies in Reinforcement Learning Tohgoroh Matsui br the policy partially in reinforcement learning. The precondition of a

Rational and Convergent Learning in Stochastic Games - Bowling, Veloso (2001)   (Correct)
This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a lear... / We examine existing reinforcement learning algorithms according to br of single agent learning. Reinforcement learning Sutton and Barto

On Verifying Game Designs and Playing Strategies using Reinforcement.. - Kalles, Kanellopoulos (2001)   (Correct)
In this paper we elaborate on the application of reinforcement learning to the details of the design and the verification of a new strategy game. We deal with playability and learning issues, using a ... / Playing Strategies using Reinforcement Learning Dimitrios Kalles br Playing Strategies using Reinforcement Learning - Abstract In

Decision-Theoretic Planning with Concurrent Temporally Extended.. - Rohanimanesh, Mahadevan (2001)   (Correct)
We investigate a model for planning under... / action in the context of reinforcement learning Sutton et al. br A s in the standard reinforcement learning framework in which A s

Personalized Webdocument Filtering Using Reinforcement Learning - Zhang, Seo (2001)   (Correct)
ch as AltaVista, Yahoo, and Excite. The other is to manually follow or browse the hyperlinks of the documents by a user himself. However, these methods have some drawbacks. Since Web-index services a... / Filtering Using Reinforcement Learning Byoung-Tak Zhang And br information filtering as a reinforcement learning problem i.e.TD The

Continuous State Space Q-Learning for Control of Nonlinear Systems - Hagen (2001)   (Correct)
Contents 1 Introduction 1 1.1 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.1 Designing the state feedback controller . . . . . . . . . . . . . . . . 3 1.1.2 ... / . . Reinforcement Learning . br . Reinforcement Learning . A Discrete

Specifying Rational Agents with Statecharts and Utility Functions - Obst (2001)   (Correct)
To aid the development of the robotic soccer simulation league team RoboLog-2000, a method for the specification of multi-agent teams by statecharts has been introduced. The results in the last years... / written or are subject to reinforcement learning. The option evaluation

Creating Melodies with Evolving Recurrent Neural Networks - Chen, Miikkulainen (2001)   (Correct)
Music composition is a domain well-suited for evolutionary reinforcement learning. Instead of applying explicit composition rules, a neural network is used to generate melodies. An evolutionary algori... / for evolutionary reinforcement learning. Instead of applying br R. Efficient reinforcement learning through symbiotic

Accelerating Reinforcement Learning through the Discovery of Useful.. - McGovern, Barto (2001)   (Correct)
An ability to adjust to changing environments and unforeseen circumstances is likely to be an important component of a successful autonomous space robot. This paper shows how to augment reinforcement ... / Accelerating Reinforcement Learning through the Discovery of br Keywords Abstraction Reinforcement Learning RL Subgoals Mobile

Parameterized Logic Programs - Where Computing Meets (2001)   (Correct)
In this paper, we describe recent attempts to incorporate learning into logic programs as a step toward adaptive software that can learn from an environment. Although there are a variety of types of... / algorithm and the other for reinforcement learning by learning automatons. br learning by incorporating reinforcement learning. Reinforcement learning is

Relational Reinforcement Learning - Driessens (2001)   (Correct)
This paper presents an introduction to reinforcement learning and relational reinforcement learning at a level to be understood by students and researchers with different backgrounds. / Relational Reinforcement Learning Kurt Driessens br presents an introduction to reinforcement learning and relational

Heterogeneity in the Coevolved Behaviors of Mobile Robots: The.. - Potter, Meeden, Schultz (2001)   (Correct)
Many mobile robot tasks can be most efficiently solved when a group of robots is utilized. The type of organization, and the level of coordination and communication within a team of robots affects ... /

A Neuroevolution Method for Dynamic Resource Allocation on a Chip.. - Gomez, Burger, Miikkulainen (2001)   (Correct)
Technology-driven limitations will soon force microprocessor chips to contain multiple processing cores, as the scalability of individual cores peaks but transistor counts continue to increase. To obt... / characteristics of difficult reinforcement learning tasks a sequence of br for the application of reinforcement learning techniques such as arti

Multiple Goal Q-Learning: Issues and Functions - Crabbe (2001)   (Correct)
This paper addresses the concerns of agents using reinforcement learning to learn to achieve multiple simultaneous goals. It proves that an algorithm based on acting upon the maximal goal at any one t... / concerns of agents using reinforcement learning to learn to achieve br necessary for the agent's reinforcement learning system and concludes that

Hybrid Coordination of Reinforcement Learning based Behaviors For Auv .. - Carreras, Batlle, Ridao, Pags (2001)   (Correct)
This paper proposes a Hybrid Coordination method for Behavior-based Control Architectures. The hybrid method takes advantage of the robustness and modularity in competitive approaches as well as o... /

Model-Free Least-Squares Policy Iteration - Lagoudakis, Parr (2001)   (Correct)
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. We are motivated by t... / propose a new approach to reinforcement learning which combines least br in the context of reinforcement learning. While their ability to

Experience Stack Reinforcement Learning - Reynolds (2001)   (Correct)
Experience Stack Reinforcement Learning is a novel, off-policy, online algorithm for learning optimal policies with respect to a reward signal. From existing methods it combines TD(λ) style return ... / Experience Stack Reinforcement Learning Stuart I. Reynolds br Abstract. Experience Stack Reinforcement Learning is a novel off-policy

The Origin of the Speeches: language evolution through collaborative.. - Walshe (2001)   (Correct)
This project proposes that language evolve through reinforcement learning where agents communicate with each other and provide rewards if communication is successful. The fundamental difference betw... /

Learning Agents in a Homo Egualis Society - Nowé, Verbeeck, Lenaerts (2001)   (Correct)
Coordination is an important issue in multi-agent systems. A possible approach to tackle coordination, that recently received quite a lot of attention, is to learn the effects of interaction in the joi... / - homo egualis -reinforcement learning -periodic policies. br be achieved by a classical reinforcement learning approach provided

Speeding up Relational Reinforcement Learning Through the Use of an.. - Driessens, Ramon, Blockeel (2001)   (Correct)
Relational reinforcement learning (RRL) is a learning technique that combines standard reinforcement learning with inductive logic programming to enable the learning system to exploit structural kno... / Speeding up Relational Reinforcement Learning Through the Use of an br Abstract. Relational reinforcement learning RRL is a learning

Hybrid Coordination of Reinforcement Learning-based Behaviors For Auv .. - Carreras, Batlle, Ridao (2001)   (Correct)
This paper proposes a Hybrid Coordination method for Behavior-based Control Architectures. The hybrid method takes advantage of the robustness and modularity in competitive approaches as well as o... /

An Hybrid Methodology for RL-based Behavior Coordination In a Target.. - Carreras, Yuh, Batlle (2001)   (Correct)
This paper proposes a behavior-based scheme for high-level control of Autonomous Underwater Vehicles (AUVs). Two main characteristics can be highlighted in the control scheme. Behavior coordination is... /

Incentives for Sharing in Peer-to-Peer Networks - Golle, Leyton-Brown, Mironov.. (2001)   (Correct)
We consider the free-rider problem in peer-to-peer file sharing networks such as Napster: that individual users are provided with no incentive for adding value to the network. We examine the design ... / with a multi-agent reinforcement learning model. Introduction br we use a multi-agent reinforcement learning model to validate our

Inference Using Formal Logics - McAllester (2001)   (Correct)
Introduction Logic is fundamental to a variety of disciplines. Logic provides insight into the nature of mathematics and human mathematical reasoning. Logic provides insight into the syntax and seman... / genetic algorithms and reinforcement learning have also failed to

Reinforcement Learning for Weakly-Coupled MDPs and an Application to.. - Daniel Bernstein And (2001)   (Correct)
Weakly-coupled Markov decision processes can be decomposed into subprocesses that interact only through a small set of bottleneck states. We study a hierarchical reinforcement learning algorithm des... / Reinforcement Learning for Weakly-Coupled MDPs br We study a hierarchical reinforcement learning algorithm designed to take

Switch Packet Arbitration via Queue-Learning - Brown (2001)   (Correct)
In packet switches, packets queue at switch inputs and contend for outputs... / of the switch. We present a reinforcement learning formulation of the br Introduction Reinforcement learning RL has been applied to

Adaptive Representation Methods for Reinforcement Learning - Reynolds (2001)   (Correct)
ween Q-function and policy representations and the RL algorithms used to generate them. Recursive Partitioning One method to improve the Q-function representation is to begin learning with a coarse,... / Representation Methods for Reinforcement Learning Stuart I. Reynolds br representation methods in reinforcement learning RL Reinforcement

Neuro Fuzzy Systems: State-of-the-art Modeling Techniques - Ajith Abraham (2001)   (Correct)
Fusion of Artificial Neural Networks (ANN) and Fuzzy Inference Systems (FIS) have attracted the growing interest of researchers in various scientific and engineering areas due to the growing need of... /

A Reinforcement Learning Model of Selective Visual Attention - Silviu Minut (2001)   (Correct)
This paper proposes a model of selective attention for visual search tasks, based on a framework for sequential decision-making. The model is implemented using a fixed pan-tilt-zoom camera in a visually ... / A Reinforcement Learning Model of Selective Visual br two interacting modules. A reinforcement learning module learns a policy on

Cellular Channel Assignment: a New Localized and Distributed Strategy - Battiti, Bertossi, Brunato (2001)   (Correct)
As the use of mobile communications systems grows, the need arises for new and more efficient channel allocation techniques. The total number of available channels on a real-world network is in fact a sc... / simulated annealing or reinforcement learning. These strategies usually br and Dimitri Bertsekas. Reinforcement learning for dynamic channel

Planetary Rover Control as a Markov Decision Process - Bernstein, Zilberstein, Washington.. (2001)   (Correct)
Planetary rovers must be effective in gathering scientific data despite uncertainty and limited resources. One step toward achieving this goal is to construct a high-level mathematical model of the prob... / Markov decision process reinforcement learning Abstract Planetary br problem. We use Monte Carlo reinforcement learning techniques to obtain a

Corpus-based dialogue simulation for automatic strategy learning and.. - Scheffler, Young (2001)   (Correct)
This paper describes a method for simulating mixed initiative human-machine dialogues using data collected by a prototype dialogue system. The behaviour of the user population is modelled probabilisti... / of dialogue strategy by reinforcement learning has been proposed by br used with a model-based reinforcement learning algorithm such as dynamic

Combinatorial Optimization through Statistical Instance-Based Learning - Telelis, Stamatopoulos (2001)   (Correct)
Different successful heuristic approaches have been proposed for solving combinatorial optimization problems. Commonly, each of them is specialized to serve a different purpose or address specific dif... /

Active Learning with Adaptive Grids - Milano, Schmidhuber, Koumoutsakos (2001)   (Correct)
Given some optimization problem and a series of typically expensive trials of solution candidates taken from a search space, how can we efficiently select the next candidate? We address this fundamenta... / evolution strategies reinforcement learning algorithms tabu search

Virtual Environment For Cooperative Assistance In Teleoperation - Olivier Heguy (2001)   (Correct)
In order to help the user to accomplish a task, teleoperation systems have to integrate different tools such as visualization, diverse interaction devices, planning tools, etc. The interface must b... / rules of the dynamic and reinforcement learning the A A

Mobile Robot Learning of Delayed Response Tasks through Event.. - Linåker (2001)   (Correct)
We show how event extraction can be used for handling delayed response tasks with arbitrary delay periods between the stimulus and the cue for response. Our approach is based on a number of informa... / be based on a delayed reinforcement learning system such as

Gradient-based Reinforcement Planning in Policy-Search Methods - Kwee, Hutter, Schmidhuber (2001)   (Correct)
We introduce a learning method called "gradient-based reinforcement planning" (GREP). Unlike traditional DP methods that improve their policy backwards in time, GREP is a gradient-based method that pl... / improve convergence in reinforcement learning RL Sutton Barto br reward function In reinforcement learning RL the objective is to

AUVs' Dynamics Modeling, Position Control, And Path Planning Using.. - Sayyaadi, Ura (2001)   (Correct)
Accurate identification of nonlinear time variant MIMO systems, especially in case of AUVs is essential for implementation of control algorithms and navigation purposes. Control problems of AUVs h... / control scheme and Reinforcement Learning is used for adjusting the / problems based on the Reinforcement Learning method. This algorithm was

The Impact of Non-verbal Communication on Lexicon Formation - Paul Vogt Universiteit (2001)   (Correct)
This paper presents a series of experiments in which two mobile robots develop a shared lexicon of which the meaning is grounded in the real world. The experiments investigate the impact of non-verb... /

A Reinforcement Learning Agent for Minutiae Extraction from.. - Bazen, van Otterlo, Gerez, Poel (2001)   (Correct)
In this paper we show that reinforcement learning can be used for minutiae detection in fingerprint matching. Minutiae are characteristic features of fingerprints that determine their uniqueness. Clas... / A Reinforcement Learning Agent for Minutiae / In this paper we show that reinforcement learning can be used for minutiae

Competitive Reinforcement Learning for Combinatorial Problems - Abramson, Wechsler (2001)   (Correct)
This paper shows that the competitive learning rule found in Learning Vector Quantization (LVQ) serves as a promising function approximator to enable reinforcement learning methods to cope with a larg... /

A survey of the application of machine learning to the game of go - Ramon, Blockeel (2001)   (Correct)
Unlike other games such as chess, draughts and backgammon, computers are currently quite weak at the game of go (baduk). Brute force is difficult due to the higher branching factor and game length. Hum... /

Parallel Cortico-Basal Ganglia Mechanisms for Acquisition and.. - Hiroyuki Nakahara Kenji (2001)   (Correct)
Experimental studies have suggested that many brain areas, including the basal ganglia, contribute to procedural learning. Focusing on the basal ganglia-thalamocortical (BG-TC) system, we propose a ... / the BG-TC loops work as a reinforcement learning system for learning / functional components. Reinforcement learning actor-critic architecture

Tracking the Evolution of Learning on a Visual-Motor Task - Subramanian, Siruguri (2001)   (Correct)
How do humans acquire skills on complex visual-motor tasks? From a large sequential corpus of low-level visual-motor data gathered from human subjects, we track the evolution of their control polici... / subjects as well as for reinforcement learning algorithms. Our learning

Scalability and Portability of a Belief Network-based Dialog Model.. - Wai, Meng, Pieraccini (2001)   (Correct)
This paper describes the scalability and portability of a Belief Network (BN)-based mixed initiative dialog model across application domains. The Belief Networks (BNs) are used to automatically govern... / strategy can be obtained by reinforcement learning While the system

Optimal Camera Parameter Selection for State Estimation with.. - Denzler, Brown, Niemann (2001)   (Correct)
In this paper we introduce a formalism for optimal camera parameter selection for iterative state estimation. We consider a framework based on Shannon's information theory and select the camera par... /

ISocRob 2001 Team Description - Lima, Custódio, Damas, Lopes, .. (2001)   (Correct)
This paper describes the ISocRob team current status, new features planned to be demonstrated in RoboCup 2001, and the project long term scientific goals, as of March 2001. An evolution of the team... / and Object Location Reinforcement Learning and Stochastic Games. / left free based on reinforcement learning applied to stochastic

Two Dimensional Evaluation Reinforcement Learning - Okada, Yamakawa, Omori (2001)   (Correct)
To solve the problem of tradeoff between exploration and exploitation actions in reinforcement learning, the authors have proposed two-dimensional evaluation reinforcement learning, which distinguish... /

How XCS Evolves Accurate Classifiers - Butz, Kovacs, Lanzi, Wilson (2001)   (Correct)
Due to the accuracy-based fitness approach, the ultimate goal for XCS is the evolution of unknown How XCS Evolves Accurate Classifiers Martin V. Butz, Tim Kovacs, Pier Luca Lanzi, and Stewart W. Wil... / As in all LCSs and reinforcement learning methods the XCS acts as a / S.Barto A. G. Reinforcement Learning An Introduction.

Genetic-Based Machine Learning Systems for Classification Task - Afanasyeva (2001)   (Correct)
This paper examines the possibility of using evolutionary learning methods for classification. Great attention is paid to studying special features of credit assignment methods and genetic algorithm a... /

Stochastic Search for Signal Processing Algorithm Optimization - Singer, Veloso (2001)   (Correct)
Many difficult problems can be viewed as search problems. However, given a new task with an embedded search problem, it is challenging to state and find a truly effective search approach. In this pape... / and Littman use reinforcement learning to learn to select / Algorithm selection using reinforcement learning. In Proceedings of

An Analysis of the Dynamics of Adaptive Multiagent Systems, with an.. - Vidal (2001)   (Correct)
Introduction In the past twenty years we have seen an increasing emphasis on the study of the dynamics of complex systems such as the human immune system and the economy. Some of this work is reflect... / to be given to individual reinforcement learning agents in order to speed / Smith protocol and reinforcement learning with learning rate of

Adaptive Behavior Navigation of a Mobile Robot - Zalama, Gomez, Paul, Peran (2001)   (Correct)
This paper describes a neural network model for the reactive behavioral navigation of a mobile robot. From the information received through the sensors the robot can elicit one of several behaviors ... / and learning operation. Reinforcement learning improves the navigation of / introduces new knowledge. Reinforcement learning has been one of the

CoPS-Team Description - Lafrenz, Becht, Buchheim, Burger.. (2001)   (Correct)
The control software of the robot soccer team CoPS is designed as a multi-agent-system. The basis for a cooperation between the robots is a suitable environment model based on uncertain sensory data a... / and we plan to include reinforcement learning in our system.

Determination Of Sensory Motor Coordination Parameters For A Robot.. - Mark Edward Cambron (2001)   (Correct)
This paper proposes a method for the determination of Sensory-Motor Coordination (SMC) parameters through the teleoperation of a humanoid robot designed for human-robot interaction. It is argued that ... / One is to set up a reinforcement learning scheme which allows the

Self-Organization of Place Cells and Reward-Based Navigation for a.. - Takahashi, Tanaka, Kurita (2001)   (Correct)
We investigate a method to navigate a mobile robot by using self-organizing map and reinforcement learning. Modeling hippocampal place cells, the map consists of units activated at specified locations... / self-organizing map and reinforcement learning. Modeling hippocampal / to a specific goal by using reinforcement learning on actor-critic model.Th

A Reinforcement Learning Intelligent Agent - Serban (2001)   (Correct)
The field of Reinforcement Learning, a subfield of machine learning, represents an important direction for research in Artificial Intelligence, the way for improving an agent's behavior, given a certa... / Xlvi Number A Reinforcement Learning Intelligent Agent / Abstract. The field of Reinforcement Learning a subfield of machine

Infinite-Horizon Policy-Gradient Estimation - Baxter, Bartlett (2001)   (Correct)
Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problem... / to direct policy search in reinforcement learning have received much recent / tend to go by the name Reinforcement Learning and have been

Universal Sequential Decisions in Unknown Environments - Hutter (2001)   (Correct)
Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental probability distribution is known. Solomonoff's theory of universal induction formally solv... / Y and n are finite. Reinforcement learning for unknown environment / change if is unknown. Reinforcement learning algorithms are commonly

Latency-dependent fitness in evolutionary multithreaded Web agents - Degeratu, Pant, Menczer (2001)   (Correct)
The World Wide Web creates opportunities for search systems using adaptive distributed agents. This paper presents a threaded implementation of InfoSpiders, a client-based system that uses an evolving... / levels by evolutionary and reinforcement learning. The goal is to maintain

CiteSeer - citeseer.org - Terms of Service - Privacy Policy - Copyright © 1997-2002 NEC Research Institute