70 citations found.
Sutton, R. S., McAllester, D., Singh, S., and Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In Neural Information Processing Systems-1999.

This paper is cited in the following contexts:

A Social Reinforcement Learning Agent - Charles Lee Isbell   Self-citation (Singh)

No context found.

Sutton, R. S., McAllester, D., Singh, S., and Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In Neural Information Processing Systems-1999.


Simultaneous Adversarial Multi-Robot Learning - Michael Bowling And

No context found.

Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063. MIT Press, 2000.


Journal of Artificial Intelligence Research 22 (2004).. - Michael Bowling Bowling

No context found.

Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pp. 1057--1063. MIT Press.


Existence of Multiagent Equilibria with Limited Agents - Michael Bowling Mhb

No context found.

Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pp. 1057--1063. MIT Press.


Planning In Hybrid Structured Stochastic - Domains Comenius University

No context found.

Richard Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063, 2000.


Policy Gradient in Continuous Time - Munos (2006)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. Neural Information Processing Systems. MIT Press, pages 1057--1063, 2000.


Geometric Variance Reduction in Markov Chains: Application to.. - Munos (2006)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. Neural Information Processing Systems. MIT Press, pages 1057--1063, 2000.


Journal of Artificial Intelligence Research 15 (2001).. - Jonathan Baxter Jbaxter

No context found.

Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy Gradient Methods for Reinforcement Learning with Function Approximation. In Neural Information Processing Systems 1999. MIT Press.


Journal of Artificial Intelligence Research 15 (2001).. - Jonathan Baxter Jbaxter

No context found.

Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy Gradient Methods for Reinforcement Learning with Function Approximation. In Neural Information Processing Systems 1999. MIT Press.


Covariant Policy Search - Andrew Bagnell And (2003)

No context found.

R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Neural Information Processing Systems 12, 1999.


Intensive versus non-intensive actor-critic reinforcement.. - Wawrzyński, Pacut (2004)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy Gradient Methods for Reinforcement Learning with Function Approximation," Advances in Neural Information Processing Systems 12, pp. 1057-1063, MIT Press, 2000.


Model-free off-policy reinforcement learning in continuous.. - Wawrzyński, Pacut (2004)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy Gradient Methods for Reinforcement Learning with Function Approximation," Advances in Neural Information Processing Systems 12, pp. 1057-1063, MIT Press, 2000.


Intensive Reinforcement Learning - Wawrzyński (2005)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy Gradient Methods for Reinforcement Learning with Function Approximation," Advances in Neural Information Processing Systems 12, pp. 1057-1063, MIT Press, 2000.


Scaling Reinforcement Learning Paradigms for Motor Control - Peters, Vijayakumar, Schaal (2003)

No context found.

Sutton, R.S., McAllester, D., Singh, S., and Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12. MIT Press, 2000.


Reinforcement Learning for Humanoid Robotics - Peters, Vijayakumar, Schaal (2003)

No context found.

Sutton, R.S., McAllester, D., Singh, S., and Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12. MIT Press, 2000.


Towards a Unified Theory of State Abstraction for MDPs - Li, Walsh, Littman (2006)

No context found.

Richard S. Sutton, David McAllester, Satinder P. Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063, 2000.


Reinforcement Learning for Factored Markov Decision Processes - Sallans (2002)

No context found.

Sutton, R. S., D. McAllester, S. P. Singh, and Y. Mansour (2000). Policy gradient methods for reinforcement learning with function approximation. See Solla, Leen, and Muller [2000], pp. 1057--1063.


Exploiting Multiple Secondary Reinforcers in Policy Gradient.. - Grudic, Ungar (2001)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems, volume 12, Cambridge, MA, 2000. MIT Press.


Simultaneous Adversarial Multi-Robot Learning - Bowling, Veloso (2003)   (1 citation)

No context found.

Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063. MIT Press, 2000.


Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey - Yang, Gu (2004)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Advances in Neural Information Processing Systems. MIT Press, 12, pp. 1057--1063.


The Role of Reactivity in Multiagent Learning - Bikramjit Banerjee And (2004)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063. MIT Press, 2000.


Learning to Trade via Direct Reinforcement - Moody, Saffell (2001)   (6 citations)

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Advances in Neural Information Processing Systems, T. K. Leen, S. A. Solla, and K.-R. Muller, Eds. Cambridge, MA: MIT Press, 2000, vol. 12, pp. 1057--1063.


Rates of Convergence of Performance Gradient Estimates Using.. - Grudic, Ungar

No context found.

R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, Policy gradient methods for reinforcement learning with function approximation, Advances in Neural Information Processing Systems (Cambridge, MA) (S. A. Solla, T. K. Leen, and K.-R. Müller, eds.), vol. 12, MIT Press, 2000.


Reinforcement Learning by Policy Search - Peshkin (2001)   (7 citations)

No context found.

Richard S. Sutton, David McAllester, Satinder P. Singh, and Yishay Mansour, Policy gradient methods for reinforcement learning with function approximation, Advances in Neural Information Processing Systems, vol. 12, The MIT Press, 1999, pp. 1057-63.


Reinforcement Learning for Problems with Hidden State - Hasinoff (2003)

No context found.

R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057--1063, 1999.

CiteSeer.IST - Copyright Penn State and NEC