Bellman, R.; Kalaba, R.; and Kotkin, B. 1963. Polynomial approximation -- a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation 17(82):155--161.
....of $\mathbb{R}^{|S|}$ spanned by the basis functions H. It is useful to define an $|S| \times k$ matrix A whose columns are the k basis functions, viewed as vectors. Our approximate value function is then represented by Aw. The idea of using linear value functions for dynamic programming was proposed, initially, by [Bellman et al., 1963] and has been further explored recently [Tsitsiklis and Van Roy, 1996; Koller and Parr, 1999; 2000; Guestrin et al., 2001]. The basic idea is as follows: in the solution algorithms, whether value iteration or policy iteration, we use only value functions within H. Whenever the algorithm takes a ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
....$\mathbb{R}^{|S|}$ spanned by the basis functions H. It is useful to define an $|S| \times k$ matrix A whose columns are the k basis functions, viewed as vectors. Our approximate value function is then represented by Aw. The idea of using linear value functions for dynamic programming was proposed, initially, by [Bellman et al., 1963] and has been further explored recently [Tsitsiklis and Van Roy, 1996; Koller and Parr, 1999; 2000; Guestrin et al., 2001]. The basic idea is as follows: in the solution algorithms, whether value iteration or policy iteration, we use only value functions within H. Whenever the algorithm takes a ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
....functions H. It is useful to define an $|S| \times k$ matrix A whose columns are the k basis functions, viewed as vectors. Our approximate value function is then represented by Aw. Linear value functions: The idea of using linear value functions for dynamic programming was proposed, initially, by [Bellman et al., 1963] and has been further explored recently [Tsitsiklis and Van Roy, 1996; Koller and Parr, 1999; 2000]. The basic idea is as follows: in the solution algorithms, whether value or policy iteration, we use only value functions within H. Whenever the algorithm takes a step that results in a value ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
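The linear representation described in the excerpts above (an |S| x k basis matrix A and a weight vector w, with the value function represented by Aw) can be illustrated with a small sketch. The excerpts are truncated before they say how a value function that leaves H is brought back into it, so the least-squares projection used here is an assumption, not the cited authors' stated method. All function names, the toy transition matrix P, the rewards R, and the discount gamma are hypothetical.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def project(A, v):
    """Weights w minimizing ||A w - v||_2, via the normal equations (assumed projection)."""
    k = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(k)] for i in range(k)]
    Atv = [sum(row[i] * v[s] for s, row in enumerate(A)) for i in range(k)]
    return solve_linear(AtA, Atv)


def apply_basis(A, w):
    """Return the value function A w as a list with one entry per state."""
    return [sum(a * wi for a, wi in zip(row, w)) for row in A]


def projected_backup(A, w, P, R, gamma):
    """One approximate DP step for a fixed policy: back up A w, then project onto span(A)."""
    v = apply_basis(A, w)
    n = len(v)
    backed_up = [R[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return project(A, backed_up)


# Toy problem: 5 states, k = 2 basis functions (a constant and the state index).
n_states = 5
A = [[1.0, float(s)] for s in range(n_states)]

# Hypothetical dynamics: every state loops to itself with reward 1.
P = [[1.0 if s == t else 0.0 for t in range(n_states)] for s in range(n_states)]
R = [1.0] * n_states
gamma = 0.5

w = [0.0, 0.0]
for _ in range(60):
    w = projected_backup(A, w, P, R, gamma)
```

In this toy case the exact value function is constant and therefore lies in the span of the basis, so the iteration stays inside H at every step; in general the projection discards whatever part of the backed-up function the basis cannot express.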
....and not the easier problem of approximating a solution that is already available. There is an extensive literature on function approximation methods and DP, such as multigrid methods and methods using splines and orthogonal polynomials (e.g., Bellman and Dreyfus [7], Bellman, Kalaba, and Kotkin [8], Daniel [18], Kushner and Dupuis [41]). However, most of this literature is devoted to off-line algorithms for cases in which there is a complete model of the decision problem. Adapting techniques from this literature to produce approximation methods for RTDP and other DP-based learning ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation---A new computational technique in dynamic programming: Allocation processes. Mathematics of Computation, 17:155--161, 1963.
....possible to run similar algorithms on an approximate representation of the solution to a decision problem. For example, Bellman discusses finding approximate value functions by quantization and low-order polynomial interpolation in [Bel61] and decomposition by orthogonal functions in [BD59, BKK63]. These approximate methods are not covered by the convergence proofs for the exact methods. But, if they do converge, they can allow us to find numerical solutions to problems which would otherwise be too large to solve. Researchers have experimented with a number of approximate algorithms for ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation --- a new computational technique in dynamic programming: allocation processes. Mathematics of Computation, 17:155--161, 1963.
....it is perfectly possible to perform temporal differencing on an approximate representation of the solution to a decision problem. Bellman discusses quantization and low-order polynomial interpolation in (Bellman, 1961) and decomposition by orthogonal functions in (Bellman and Dreyfus, 1959; Bellman et al., 1963). These fitted temporal difference methods are not covered by the above convergence proofs. But, if they do converge, they can allow us to find numerical solutions to problems which would otherwise be too large to solve. Researchers have experimented with a number of fitted temporal difference ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation --- a new computational technique in dynamic programming: allocation processes. Mathematics of Computation, 17:155--161, 1963.
....of the value function is intractable; some form of generalization is required. A natural way to incorporate generalization into DP is to use a function approximator, rather than a lookup table, to represent the value function. This approach, which dates back to uses of Legendre polynomials in DP [Bellman et al., 1963], has recently worked well on several dynamic control problems [Mahadevan and Connell, 1990; Lin, 1993] and succeeded spectacularly on the game of backgammon [Tesauro, 1992; Boyan, 1992]. On the other hand, many sensible implementations have been less successful [Bradtke, 1993, Schraudolph ....
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation---a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation, 17, 1963.
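To make the idea of replacing a lookup table with a Legendre-polynomial approximator concrete, here is a minimal sketch. It stores a function on [-1, 1] as a handful of Legendre coefficients obtained from the orthogonality relation. It is not necessarily the procedure of the cited paper: the midpoint-rule quadrature, the function names, and the stand-in "value function" x^2 are all assumptions chosen for illustration.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) by the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for m in range(1, n):
        # (m + 1) P_{m+1}(x) = (2m + 1) x P_m(x) - m P_{m-1}(x)
        p_prev, p = p, ((2 * m + 1) * x * p - m * p_prev) / (m + 1)
    return p


def legendre_coeffs(f, degree, n_points=2000):
    """Approximate c_n = (2n + 1)/2 * integral of f(x) P_n(x) over [-1, 1].

    The integral is estimated with a midpoint rule (an assumed, simple quadrature).
    """
    h = 2.0 / n_points
    xs = [-1.0 + (i + 0.5) * h for i in range(n_points)]
    return [
        (2 * n + 1) / 2.0 * h * sum(f(x) * legendre(n, x) for x in xs)
        for n in range(degree + 1)
    ]


def evaluate(coeffs, x):
    """Evaluate the truncated Legendre series at x."""
    return sum(c * legendre(n, x) for n, c in enumerate(coeffs))


# Hypothetical value function of a scalar state rescaled to [-1, 1].
def value(x):
    return x * x


# Three numbers now stand in for a table over the whole state interval.
coeffs = legendre_coeffs(value, degree=2)
approx_at_half = evaluate(coeffs, 0.5)
```

Since x^2 = (1/3) P_0(x) + (2/3) P_2(x), the computed coefficients should come out close to 1/3, 0, and 2/3, up to the quadrature error.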
No context found.
Bellman, R.; Kalaba, R.; and Kotkin, B. 1963. Polynomial approximation -- a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation 17(82):155--161.
No context found.
Richard Bellman, Robert Kalaba, and Bella Kotkin. Polynomial approximation -- a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation, 17(82):155--161, 1963.
No context found.
Bellman, R.; Kalaba, R.; and Kotkin, B. 1963. Polynomial approximation -- a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation 17(82):155--161.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
Bellman, R.; Kalaba, R.; and Kotkin, B. 1963. Polynomial approximation -- a new computational technique in dynamic programming: Allocation processes. Mathematics of Computation 17(82):155--161.
No context found.
Richard Bellman, Robert Kalaba, and Bella Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Mathematics of Computation, 17(82):155--161, 1963.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.
No context found.
Bellman, R., Kalaba, R. and Kotkin, B. "Polynomial Approximation --- A New Computational Technique in Dynamic Programming", Math. of Computation, 17, 82 (1963), 155--161.
No context found.
R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation -- a new computational technique in dynamic programming. Math. Comp., 17(82):155--161, 1963.