| R. Sutton and A. Barto. Reinforcement Learning. 1998. |
....URU Algorithm. The algorithm is shown in Figure 1. 1) Initialize the state utilities with some initial values; 2) Initialize the current state with the initial state (sc := si); 3) Choose a state s that is a neighbor of sc (s ∈ h(sc)) using some known action selection mechanism (ε-greedy or SoftMax [2]), following the steps: a) determine the set of successors of the current state (m = h(sc)); b) if the current state has no successors (m is empty), return to the previous state (s := sc); otherwise go to step (c); c) select from m a subset m1 containing the states that were not visited yet ....
Sutton, R., Barto, A. G.: Reinforcement Learning. The MIT Press, Cambridge, England, 1998
....criteria usually result in intractable systems of differential equations. Learning the equations of motion further complicates the solution. The field of reinforcement learning is specifically concerned with optimal sequential decision making when the dynamics of the controlled system are unknown [10]. Hard control problems such as balancing an inverted pendulum and the Acrobot task have been studied extensively and also implemented on real setups [4, 9]. The state of these systems, however, is assumed to be completely known, or (in the case of real robots) precisely measurable from joint ....
R. S. Sutton and A. G. Barto. Reinforcement learning. The MIT Press, Cambridge, Massachusetts, 1998.
....to compute classification results, we have to be able to perform a fusion of several views. A way to solve this problem using particle filters is given in section 2.1. Second, the main task, the planning of view sequences, must be properly formulated. An approach based on reinforcement learning [14] is presented in section 2.2. 2.1 Fusion of Multiple Views by Density Propagation. In active object recognition a series of observed images f_t, f_{t-1}, ..., f_0 of an object are given together with the camera movements a_{t-1}, ..., a_0 between these images. Based on these observations of ....
....for the decision process in (8). One of the demands defined in section 1 is that the selection of the most promising view should be learned without user interaction. Reinforcement learning provides many different algorithms to estimate the action value function based on a trial and error method [14]. Trial and error means that the system itself is responsible for trying certain actions in a certain state. The result of such a trial is then used to update Q and to improve its policy π. In reinforcement learning a series of episodes are performed: Each episode k consists of a sequence of ....
[Article contains additional citation context not shown here]
R.S. Sutton and A.G. Barto. Reinforcement Learning. A Bradford Book, Cambridge, London, 1998.
....the function Q which indeed informs on the long term cost of a given action, provided that future actions are selected optimally. In ant programming, as generally in reinforcement learning, the search in the space of the policies is performed through some form of generalized policy iteration [19]. Starting from some arbitrary initial policy, ant programming iteratively generates a number of paths in order to evaluate the current policy and then improves it on the basis of the result of the evaluation. At each iteration, therefore, a cohort of ants is considered, each generating a ....
....are to be related to the different update strategies in reinforcement learning. In particular, for an ant to propose values of T only for the visited transitions and on the basis of the cost of the associated solution, is equivalent to what in reinforcement learning is called Monte Carlo update [19]. On the other hand, it is equivalent to a Q-learning update [20] to propose a value of T for a visited transition on the basis of the experienced cost for the transition itself and of the minimum of the current values that T assumes on the edges departing from the node to which the considered ....
[Article contains additional citation context not shown here]
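The two update styles contrasted in the excerpt above can be sketched as follows. This is only an illustrative reading of the truncated excerpt, not the citing paper's algorithm: the dictionary `T` keyed by `(node, next_node)` transitions, the `successors` map, the default value of 0.0 for unseen transitions, and the `step_size` blending are all assumptions introduced here.

```python
def monte_carlo_update(T, path, solution_cost, step_size=0.1):
    """Monte Carlo style: every visited transition is moved toward the
    cost of the complete solution it belonged to (step_size is an assumption)."""
    for transition in path:
        old = T.get(transition, 0.0)
        T[transition] = old + step_size * (solution_cost - old)
    return T


def q_learning_update(T, transition, transition_cost, successors, step_size=0.1):
    """Q-learning style: one visited transition is moved toward its own
    experienced cost plus the minimum current value of T on the edges
    leaving the node it leads to."""
    _, next_node = transition
    outgoing = [T.get((next_node, k), 0.0) for k in successors.get(next_node, [])]
    # A node with no outgoing edges contributes no future cost (assumption).
    target = transition_cost + (min(outgoing) if outgoing else 0.0)
    old = T.get(transition, 0.0)
    T[transition] = old + step_size * (target - old)
    return T


if __name__ == "__main__":
    # Hypothetical three-node example.
    T = {("b", "c"): 4.0, ("b", "d"): 2.0}
    successors = {"a": ["b"], "b": ["c", "d"]}
    q_learning_update(T, ("a", "b"), 1.0, successors, step_size=1.0)
    print(T[("a", "b")])
```

The Monte Carlo variant needs the whole path and its final cost before it can update anything, while the Q-learning variant can update after a single transition because it bootstraps from the current values of T.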
R. S. Sutton and A. G. Barto. Reinforcement Learning. An Introduction. MIT Press, Cambridge, MA, USA, 1998.
....of ant colony optimization but which is more amenable to theoretical analysis for what concerns the concepts of representation and state. In particular, ant programming bridges the terminological gap between ant colony optimization and the fields of optimal control [3] and reinforcement learning [17]. Accordingly, the name ant programming was chosen for its assonance with dynamic programming, with which ant programming has in common the stress on the concept of state and the related idea of reformulating an optimization problem as a multi stage decision problem and then searching for a good ....
....with the function Q which indeed informs on the long term cost of a given action, provided that future actions are selected optimally. In ant programming, as generally in reinforcement learning, the search in the space of the policies is performed through some form of generalized policy iteration [17]. Starting from some arbitrary initial policy, ant programming iteratively generates a number of paths in order to evaluate the current policy and then improves it on the basis of the result of the evaluation. At each iteration, therefore, a cohort of ants is considered, each generating a solution ....
[Article contains additional citation context not shown here]
R. S. Sutton and A. G. Barto. Reinforcement Learning. An Introduction. MIT Press, Cambridge, MA, USA, 1998.
.... always defect or tit for tat, is considered as (evolutionarily) stable [Axelrod, 1984; Fudenberg and Kreps, 1993]. To build a multi-agent learning algorithm, we resort to reinforcement learning, which is an elegant mathematical framework for studying such tasks because it requires few assumptions [Sutton and Barto, 1998]. The only crucial one is associating a utility for each state that .... In zero-sum games the payoff matrices are implied from agents' own payoffs. [Payoff matrices: (a) Work-n-Shirk, Boss (A) with actions Inspect / No Inspect vs. Monkey (B) with actions Work / Shirk; Alice (A) vs. Bob (B) with actions Cooperate / Defect, R=3, T=5 ....]
R. S. Sutton and A. G. Barto, Reinforcement Learning, MIT Press, 1998.
No context found.
R. Sutton and A. Barto. Reinforcement Learning. 1998.
No context found.
R. S. Sutton and A. G. Barto, Reinforcement Learning, MIT Press, 1998.
No context found.
Sutton, R. S. and Barto, A. G. Reinforcement learning. Cambridge, MA: MIT Press, 1998.
No context found.
R. Sutton and A. G. Barto. Reinforcement Learning. The MIT Press, Cambridge, Massachusetts, 1998.
No context found.
Sutton, R. and Barto, A. G. (1998). Reinforcement learning. MIT Press, Cambridge, MA.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement Learning. MIT Press, 1998.
No context found.
Sutton, R., & Barto, A. (1998). Reinforcement learning. MIT Press, Cambridge.
No context found.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning. Cambridge, MA: MIT Press.
No context found.
Sutton, R.S., & Barto, A.G. (1998) Reinforcement learning, MIT Press.
No context found.
Sutton, R.S., and Barto, A.G. Reinforcement Learning, The MIT Press, 1998
No context found.
Sutton, R. S. and Barto, A.G.: Reinforcement learning, an introduction. The MIT Press, (1998).
No context found.
R. Sutton, A.G. Barto, Reinforcement Learning, The MIT Press, Cambridge, MA, 1998.
No context found.
R. S. Sutton and A. G. Barto, Reinforcement Learning, MIT press, Cambridge, Massachusetts, London, England, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, 1998.
No context found.
Sutton, R. S. & A. G. Barto (1998). Reinforcement Learning. MIT Press, Cambridge.
No context found.
Richard Sutton and Andrew Barto. Reinforcement Learning. MIT Press, 1998.
No context found.
R.S. Sutton and Barto, A. (1996). Reinforcement Learning. Cambridge, MA: MIT Press.
No context found.
Sutton R.S., Barto A.G. (1998). Reinforcement Learning, MIT Press.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, MA, 1998.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement Learning. MIT Press, Cambridge, MA, 1998.
No context found.
A. Barto, R. Sutton. Reinforcement Learning. MIT Press, Cambridge, Massachusetts, 1998.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement Learning. The MIT Press, 1998.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement learning. MIT Press, 1998.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement Learning. An Introduction. MIT Press/A Bradford Book, Cambridge, MA, 1998.
No context found.
R. Sutton and A. G. Barto, Reinforcement Learning. Cambridge, Massachusetts: The MIT press, 1998.
No context found.
R. Sutton and A. Barto. Reinforcement Learning. MIT Press, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, 1998.
No context found.
A.G. Barto and R.S. Sutton. Reinforcement Learning. MIT Press, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning, An introduction. BradFord Book. The MIT Press, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning, an Introduction. MIT Press / Bradford Books, Cambridge, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. An Introduction. MIT Press/A Bradford Book, Cambridge, MA, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning, An introduction. BradFord Book. The MIT Press, 1998.
No context found.
R. Sutton and A. Barto. Reinforcement Learning. An Introduction. MIT Press/A Bradford Book, Cambridge, MA, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, MA, 1998.
No context found.
R. S. Sutton, A. G. Barto (1998). Reinforcement Learning. A Bradford book. MIT press.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning, An introduction. BradFord Book. The MIT Press, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, Ma., 1998.
No context found.
Andrew G. Barto, Richard S. Sutton. Reinforcement Learning, An Introduction. MIT Press, 2000. ISBN: 0-262-19398-1.
No context found.
R. S. Sutton and A. G. Barto. Reinforcement Learning. An Introduction. MIT Press/A Bradford Book, Cambridge, MA, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. A Bradford Book, Cambridge, London, 1998.
No context found.
R. S. Sutton and A. G. Barto, Reinforcement Learning, an Introduction. Cambridge, MA: MIT Press-Bradford, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning, An introduction. BradFord Book. The MIT Press, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, 1998.
No context found.
R.S. Sutton and A.G. Barto. Reinforcement Learning. MIT Press, Cambridge, MA, 1998.
CiteSeer.IST - Copyright Penn State and NEC