37 citations found.
Boutilier, Craig; Dearden, Richard; and Goldszmidt, Moisés 2000. Stochastic dynamic programming with factored representations. Artificial Intelligence 121(1-2): 49-107.


This paper is cited in the following contexts:

First 50 documents

Logical Markov Decision Programs - Kersting, De Raedt (2003)   (Correct)

....RRL has failed to explain in theoretical terms why RRL works. It is precisely such a theory that we contribute. The relational regression trees used in RRL encode abstract policies. From a more general point of view, our approach is closely related to decision theoretic regression (DTR) [Boutilier et al. 2000]. Here, state spaces are characterized by a number of random variables and the domain is specified using logical representations of actions that capture the regularities in the effects of actions. Because existing DTR algorithms are all designed to work with propositional representations of ....

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121, 2000.


Approximate Policy Iteration with a Policy Language Bias - Fern, Yoon, Givan (2003)   (1 citation)  (Correct)

....1 Introduction. Dynamic programming approaches to finding optimal control policies in Markov decision processes (MDPs) [4, 14] using explicit (flat) state space representations break down when the state space becomes extremely large. More recent work extends these algorithms to use propositional [6, 11, 7, 12] as well as relational [8] state space representations. These extensions have not yet shown the capacity to solve large classical planning problems such as the benchmark problems used in planning competitions [2]. These methods typically calculate a sequence of cost functions. For familiar ....

Craig Boutilier, Richard Dearden, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. AIJ, 121(1-2):49--107, 2000.


Policy-contingent state abstraction for hierarchical MDPs - Pineau, Gordon (2002)   (Correct)

.... is: Steps I-III: Same as equations 3-5. Step IV: Update the value of each cluster: $V(C_j) = \max_a \left[ R(C_j, a) + \gamma \sum_{C' \in \mathcal{C}} T(C_j, a, C') \, V(C') \right], \; \forall C_j \in \mathcal{C}$ (10). For non-hierarchical MDPs, this is equivalent to Boutilier et al.'s decision tree representation of MDP value functions [2]. The value function of the final policy solution can be retrieved by looking at the value function of the top subtask: V (h 0 s). Because it fixes low-level subtask policies prior to solving higher-level subtasks, the algorithm is limited to recursive optimality (rather than ....

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. AI Journal, 2000.


Reinforcement Learning with Exploration - Reynolds (2002)   (Correct)

....set. This is compared with a sample from an adjacent region using the Kolmogorov test. Also, an alternative (less theoretically based) test was used which maintains splits if this reduces the variance in the 1-step return estimates by some threshold. Dynamically Refactoring Representations. In [35], Boutilier, Dearden and Goldszmidt use a method that seeks to increase the resolution of (decision based) binary state representations where there is evidence that the value is non-constant within an aggregate region. A Bayesian network is used to compactly represent a transition probability ....

Richard Dearden, Craig Boutilier, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence. To appear.


Efficient Approximate Inference for Online Probabilistic Plan.. - Bui (2002)   (Correct)

....over all the possible values of the cut set variables which can be intractable, only a number of representative sampled values are used. In addition, we show that the AHMM representation and the hybrid policy recognition algorithm can also utilize a factored representation of the state space [2]. The AHMM is closely related to a model for probabilistic plan recognition called the Probabilistic State Dependent Grammar (PSDG) independently proposed in [19, 21]. The PSDG can be described as the Probabilistic Context Free Grammar (PCFG) [13] augmented with a state space, and a state ....

Craig Boutilier, Richard Dearden, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 2001. to appear.


Greedy linear value-approximation for factored Markov.. - Relu Patrascu (2002)   (2 citations)  Self-citation (Boutilier)   (Correct)

No context found.

Boutilier, C.; Dearden, R.; and Goldszmidt, M. 2000. Stochastic dynamic programming with factored representations. Artificial Intelligence.


Learning and Planning in Structured Worlds - Dearden   Self-citation (Dearden)   (Correct)

....They make the representation of policies and value functions as trees, and the algorithm for simplifying trees slightly more complex. The effect of correlations is more problematic, but the algorithm can be amended in a relatively straightforward manner to handle this case. See [12] or [17] for details. [Figure 3.2: Examples of (a) a policy tree, and (b) a value tree.] How can we expect this structured representation of a problem to provide computational savings? The idea is conceptually simple and closely related to ....

Craig Boutilier, Richard Dearden, and Moisés Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 2000. To appear.


Value-directed Compression of POMDPs - Poupart, Boutilier (2002)   (8 citations)  Self-citation (Boutilier)   (Correct)

....DBN structure and context specific independence. If transition, observation and reward functions are represented using DBNs and structured CPTs (e.g. decision trees or algebraic decision diagrams) then the matrix operations required by the Krylov algorithm can be implemented effectively [1, 7]. Although this approach can offer substantial savings, the DTs or ADDs that represent the basis vectors of the Krylov subspace may still be much larger than the dimensionality of the compressed state space and the original DBN specifications. Alternatively, families of self sufficient variables ....

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121:49--107, 2000.


Equivalence Notions and Model Minimization in Markov Decision Processes   (Correct)

No context found.

Boutilier, Craig; Dearden, Richard; and Goldszmidt, Moisés 2000. Stochastic dynamic programming with factored representations. Artificial Intelligence 121(1-2): 49-107.


Game theory and AI: a unified approach to poker games - Oliehoek (2005)   (Correct)

No context found.

Craig Boutilier, Richard Dearden, and Moisés Goldszmidt. Stochastic dynamic programming with factored representations. Artif. Intell., 121(1-2):49--107, 2000.


Towards a Unified Theory of State Abstraction for MDPs - Li, Walsh, Littman (2006)   (Correct)

No context found.

Craig Boutilier, Richard Dearden, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1--2):49--107, 2000.


Reinforcement Learning for Factored Markov Decision Processes - Sallans (2002)   (Correct)

No context found.

Boutilier, C., R. Dearden, and M. Goldszmidt (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence 121, 49--107.


Reinforcement Learning for Factored Markov Decision Processes - Sallans (2002)   (Correct)

No context found.

Boutilier, C., R. Dearden, and M. Goldszmidt (1999). Stochastic dynamic programming with factored representations. Unpublished manuscript.


Graphical Models in Local, Asymmetric Multi-Agent Markov.. - Dolgov, Durfee (2004)   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49--107, 2000.


Algorithms for Partially Observable Markov Decision Processes - Zhang (2001)   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt, "Stochastic dynamic programming with factored representations," Artificial Intelligence, vol. 121, pp. 49--107, 2000.


Reinforcement Learning: a brief overview. - Jeremy Wyatt   (Correct)

No context found.

Richard Dearden, Craig Boutilier, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49--107, 2000.


Learning to Take Concurrent Actions - Rohanimanesh, Mahadevan (2002)   (1 citation)  (Correct)

No context found.

Craig Boutilier, Richard Dearden, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. To appear in Artificial Intelligence.


Learning and Approximate Dynamic Programming - Scaling.. - Si, Barto, Powell..   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1--2):49--107, 2000.


Proposed design for gR, a graphical models toolkit for R - Murphy (2003)   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 2001.


Robot Weightlifting By Direct Policy Search - Michael Rosenstein And (2001)   (1 citation)  (Correct)

No context found.

Craig Boutilier, Richard Dearden, and Moises Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49--107, 2000.


Improving MACS thanks to Comparison with 2TBNs - Sigaud, Gourdin, Wuillemin   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1):49--107, 2000.


Learning to Exploit Dynamics for Robot Motor Coordination - Rosenstein (2003)   (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49--107, 2000.


Exploiting First-Order Regression in Inductive Policy.. - Gretton, Thiébaux (2004)   (1 citation)  (Correct)

No context found.

C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49--107, 2000.



CiteSeer.IST - Copyright Penn State and NEC