Causal Graph Based Decomposition of Factored MDPs
Anders Jonsson, Departament de Tecnologia, Universitat...
Journal of Machine Learning Research 7 (2006) 2259-2301. Submitted 10/05; Revised 7/06; Published 11/06.

Abstract: We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesian network model of actions, and constructs a causal graph that captures relationships between state variables.
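
The idea of deriving a causal graph from per-action dynamic Bayesian networks (DBNs) can be illustrated with a minimal sketch. It is not the VISA implementation. It assumes each action's DBN is given as a mapping from a state variable to the set of variables its next value depends on, and it assumes the causal graph gets an edge u -> v whenever some action's DBN makes v depend on a different variable u. The function name, the data layout, and the toy variables (location, fuel, has_passenger) are all hypothetical.

```python
from collections import defaultdict


def build_causal_graph(dbn_parents):
    """Build a causal graph from per-action DBN parent sets.

    dbn_parents maps action -> {variable: set of variables at the
    previous time step that the variable's next value depends on}.
    This layout is an assumption made for illustration only.
    """
    edges = defaultdict(set)
    for parents_of in dbn_parents.values():
        for var, parents in parents_of.items():
            for parent in parents:
                # A variable depending on its own previous value is not
                # treated as a causal edge in this sketch.
                if parent != var:
                    edges[parent].add(var)
    # Sort successors so the result is deterministic.
    return {u: sorted(vs) for u, vs in edges.items()}


# Hypothetical toy domain, not taken from the paper.
dbn_parents = {
    "move": {
        "location": {"location"},
        "fuel": {"fuel", "location"},
    },
    "pickup": {
        "has_passenger": {"has_passenger", "location"},
    },
}

causal_graph = build_causal_graph(dbn_parents)
print(causal_graph)
```

In this toy domain the only cross-variable dependencies run from location to fuel and from location to has_passenger, so location is the sole variable with outgoing edges. A graph of this kind shows which state variables influence which others, which is the sort of structure the abstract says the decomposition relies on.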

BibTeX entry:

@article{ anders-journal,
  author = "Anders Jonsson",
  title = "Causal Graph Based Decomposition of Factored MDPs",
  journal = "Journal of Machine Learning Research",
  volume = "7",
  pages = "2259--2301",
  year = "2006",
  url = "citeseer.ist.psu.edu/761816.html" }
