Results 1 - 10 of 3,381

Reward Functions for Accelerated Learning

by Maja J Mataric - In Proceedings of the Eleventh International Conference on Machine Learning, 1994
"... This paper discusses why traditional reinforcement learning methods, and algorithms applied to those models, result in poor performance in situated domains characterized by multiple goals, noisy state, and inconsistent reinforcement. We propose a methodology for designing reinforcement functions tha ..."
Cited by 195 (14 self)

Predictive reward signal of dopamine neurons

by Wolfram Schultz - Journal of Neurophysiology, 1998
"... Schultz, Wolfram. Predictive reward signal of dopamine neurons. J. Neurophysiol. 80: 1–27, 1998. The effects of lesions, receptor blocking, electrical self-stimulation, and drugs ... is called rewards, which elicit and reinforce approach behavior. The functions of rewards were developed further during ..."
Cited by 747 (12 self)

Automatic shaping and decomposition of reward functions

by Bhaskara Marthi, 2007
"... This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begin by describing a method that learns a shaped reward function given a set of state and temporal abstractions. Next, we co ..."
Cited by 20 (0 self)

Reward Function Learning for Dialogue Management

by Layla El Asri, Romain Laroche, Olivier Pietquin
"... This paper addresses the problem of defining, from data, a reward function in a Reinforcement Learning (RL) problem. This issue is applied to the case of Spoken Dialogue Systems (SDS), which are interfaces enabling users to interact in natural language. A new methodology which, from syste ..."

Genetic programming for reward function search

by Scott Niekum, Andrew G. Barto, Lee Spector - IEEE Trans. Autonom. Mental Develop, 2010
"... ©2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other wo ..."
Cited by 11 (2 self)

Eliciting Additive Reward Functions for Markov Decision Processes

by Kevin Regan, Craig Boutilier - Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
"... Specifying the reward function of a Markov decision process (MDP) can be demanding, requiring human assessment of the precise quality of, and tradeoffs among, various states and actions. However, reward functions often possess considerable structure which can be leveraged to streamline their specifi ..."
Cited by 7 (2 self)

Inverse Reinforcement Learning with Locally Consistent Reward Functions

by Quoc Phong Nguyen, Kian Hsiang Low, Patrick Jaillet
"... Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent rewar ..."

Insensitivity to future consequences following damage to human prefrontal cortex.

by Antoine Bechara, Antonio R Damasio, Hanna Damasio, Steven W Anderson - Cognition, 1994
"... Following damage to the ventromedial prefrontal cortex, humans develop a defect in real-life decision-making, which contrasts with otherwise normal intellectual functions. Currently, there is no neuropsychological probe to detect in the laboratory, and the cognitive and neural mechanisms r ..."
Cited by 534 (14 self)

Decision-Theoretic Planning: Structural Assumptions and Computational Leverage

by Craig Boutilier, Thomas Dean, Steve Hanks - Journal of Artificial Intelligence Research, 1999
"... Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives ..."
Cited by 515 (4 self)
or plans. Planning problems commonly possess structure in the reward and value functions used to de...

A Modified Average Reward Reinforcement Learning Based on Fuzzy Reward Function

by Zhenkun Zhai, Wei Chen, Xiong Li, Jing Guo
"... The purpose of this paper is to propose a fuzzy reward function for improving the learning efficiency of reinforcement learning. Reinforcement learning is a sort of online learning approach. During the learning process, the learning system (often called an agent) learns how to operate in the environmen ..."