Results 1  10
of
1,021,522
Reinforcement Learning I: Introduction
, 1998
"... In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection, search ..."
Abstract

Cited by 5500 (120 self)
 Add to MetaCart
In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection
Nearoptimal reinforcement learning in polynomial time
 Machine Learning
, 1998
"... We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve nearoptimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal return is lower bounded by the m ..."
Abstract

Cited by 305 (5 self)
 Add to MetaCart
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve nearoptimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal return is lower bounded
Near Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
, 2004
"... Suppose we are given a vector f in RN. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects— discrete digital signals, images, etc; how many linear m ..."
Abstract

Cited by 1513 (20 self)
 Add to MetaCart
Suppose we are given a vector f in RN. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects— discrete digital signals, images, etc; how many linear
Predicting How People Play Games: Reinforcement Learning . . .
 AMERICAN ECONOMIC REVIEW
, 1998
"... ..."
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
 Journal of Artificial Intelligence Research
, 2000
"... This paper presents a new approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. Th ..."
Abstract

Cited by 439 (6 self)
 Add to MetaCart
This paper presents a new approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs
Planning Algorithms
, 2004
"... This book presents a unified treatment of many different kinds of planning algorithms. The subject lies at the crossroads between robotics, control theory, artificial intelligence, algorithms, and computer graphics. The particular subjects covered include motion planning, discrete planning, planning ..."
Abstract

Cited by 1108 (51 self)
 Add to MetaCart
, planning under uncertainty, sensorbased planning, visibility, decisiontheoretic planning, game theory, information spaces, reinforcement learning, nonlinear systems, trajectory planning, nonholonomic planning, and kinodynamic planning.
Quantization Index Modulation: A Class of Provably Good Methods for Digital Watermarking and Information Embedding
 IEEE TRANS. ON INFORMATION THEORY
, 1999
"... We consider the problem of embedding one signal (e.g., a digital watermark), within another "host" signal to form a third, "composite" signal. The embedding is designed to achieve efficient tradeoffs among the three conflicting goals of maximizing informationembedding rate, mini ..."
Abstract

Cited by 495 (15 self)
 Add to MetaCart
distortionrobustness tradeoffs than currently popular spreadspectrum and lowbit(s) modulation methods. Furthermore, we show that for some important classes of probabilistic models, DCQIM is optimal (capacityachieving) and regular QIM is nearoptimal. These include both additive white Gaussian noise
Evolving Neural Networks through Augmenting Topologies
 Evolutionary Computation
"... An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixedtopology method on a challenging benchmark reinforcement learning task ..."
Abstract

Cited by 524 (113 self)
 Add to MetaCart
An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixedtopology method on a challenging benchmark reinforcement learning
Predictive reward signal of dopamine neurons
 Journal of Neurophysiology
, 1998
"... Schultz, Wolfram. Predictive reward signal of dopamine neurons. is called rewards, which elicit and reinforce approach behavJ. Neurophysiol. 80: 1–27, 1998. The effects of lesions, receptor ior. The functions of rewards were developed further during blocking, electrical selfstimulation, and drugs ..."
Abstract

Cited by 717 (12 self)
 Add to MetaCart
Schultz, Wolfram. Predictive reward signal of dopamine neurons. is called rewards, which elicit and reinforce approach behavJ. Neurophysiol. 80: 1–27, 1998. The effects of lesions, receptor ior. The functions of rewards were developed further during blocking, electrical selfstimulation, and drugs
Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. Technical Report 2003/235, Cryptology ePrint archive, http://eprint.iacr.org, 2006. Previous version appeared at EUROCRYPT 2004
 34 [DRS07] [DS05] [EHMS00] [FJ01] Yevgeniy Dodis, Leonid Reyzin, and Adam
, 2004
"... We provide formal definitions and efficient secure techniques for • turning noisy information into keys usable for any cryptographic application, and, in particular, • reliably and securely authenticating biometric data. Our techniques apply not just to biometric information, but to any keying mater ..."
Abstract

Cited by 532 (38 self)
 Add to MetaCart
material that, unlike traditional cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose two primitives: a fuzzy extractor reliably extracts nearly uniform randomness R from its input; the extraction is errortolerant in the sense that R will be the same even
Results 1  10
of
1,021,522