Results 1  10
of
108,514
Deep AutoEncoder Neural Networks in Reinforcement Learning
"... Abstract — This paper discusses the effectiveness of deep autoencoder neural networks in visual reinforcement learning (RL) tasks. We propose a framework for combining deep autoencoder neural networks (for learning compact feature spaces) with recentlyproposed batchmode RL algorithms (for learning ..."
Abstract

Cited by 15 (2 self)
 Add to MetaCart
Abstract — This paper discusses the effectiveness of deep autoencoder neural networks in visual reinforcement learning (RL) tasks. We propose a framework for combining deep autoencoder neural networks (for learning compact feature spaces) with recentlyproposed batchmode RL algorithms (for
OPTIMAL SAMPLE SELECTION FOR BATCHMODE REINFORCEMENT LEARNING
"... Abstract: We introduce the Optimal Sample Selection (OSS) metaalgorithm for solving discretetime Optimal Control problems. This metaalgorithm maps the problem of finding a nearoptimal closedloop policy to the identification of a small set of onestep system transitions, leading to highquality ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
quality policies when used as input of a batchmode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS metaalgorithm that uses treebased Fitted QIteration as a batchmode RL algorithm and Cross Entropy search as a method for navigating efficiently in the space of sample sets
Planning Algorithms
, 2004
"... This book presents a unified treatment of many different kinds of planning algorithms. The subject lies at the crossroads between robotics, control theory, artificial intelligence, algorithms, and computer graphics. The particular subjects covered include motion planning, discrete planning, planning ..."
Abstract

Cited by 1108 (51 self)
 Add to MetaCart
This book presents a unified treatment of many different kinds of planning algorithms. The subject lies at the crossroads between robotics, control theory, artificial intelligence, algorithms, and computer graphics. The particular subjects covered include motion planning, discrete planning
Optimization Flow Control, I: Basic Algorithm and Convergence
 IEEE/ACM TRANSACTIONS ON NETWORKING
, 1999
"... We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using gradient projection algorithm. In thi ..."
Abstract

Cited by 690 (64 self)
 Add to MetaCart
We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using gradient projection algorithm
SNOPT: An SQP Algorithm For LargeScale Constrained Optimization
, 2002
"... Sequential quadratic programming (SQP) methods have proved highly effective for solving constrained optimization problems with smooth nonlinear functions in the objective and constraints. Here we consider problems with general inequality constraints (linear and nonlinear). We assume that first deriv ..."
Abstract

Cited by 582 (23 self)
 Add to MetaCart
Sequential quadratic programming (SQP) methods have proved highly effective for solving constrained optimization problems with smooth nonlinear functions in the objective and constraints. Here we consider problems with general inequality constraints (linear and nonlinear). We assume that first derivatives are available, and that the constraint gradients are sparse. We discuss
Reinforcement Learning I: Introduction
, 1998
"... In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection, search ..."
Abstract

Cited by 5500 (120 self)
 Add to MetaCart
In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection
GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems
 SIAM J. SCI. STAT. COMPUT
, 1986
"... We present an iterative method for solving linear systems, which has the property ofminimizing at every step the norm of the residual vector over a Krylov subspace. The algorithm is derived from the Arnoldi process for constructing an l2orthogonal basis of Krylov subspaces. It can be considered a ..."
Abstract

Cited by 2046 (40 self)
 Add to MetaCart
We present an iterative method for solving linear systems, which has the property ofminimizing at every step the norm of the residual vector over a Krylov subspace. The algorithm is derived from the Arnoldi process for constructing an l2orthogonal basis of Krylov subspaces. It can be considered
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
, 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The ..."
Abstract

Cited by 2109 (30 self)
 Add to MetaCart
. The core of this method is a simple hillclimbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distancebased method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment
FAST VOLUME RENDERING USING A SHEARWARP FACTORIZATION OF THE VIEWING TRANSFORMATION
, 1995
"... Volume rendering is a technique for visualizing 3D arrays of sampled data. It has applications in areas such as medical imaging and scientific visualization, but its use has been limited by its high computational expense. Early implementations of volume rendering used bruteforce techniques that req ..."
Abstract

Cited by 541 (2 self)
 Add to MetaCart
that require on the order of 100 seconds to render typical data sets on a workstation. Algorithms with optimizations that exploit coherence in the data have reduced rendering times to the range of ten seconds but are still not fast enough for interactive visualization applications. In this thesis we present a
The strength of weak learnability
 Machine Learning
, 1990
"... Abstract. This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distributionfree (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a Source of examples of the unknown concept, the learner with h ..."
Abstract

Cited by 861 (24 self)
 Add to MetaCart
Abstract. This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distributionfree (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a Source of examples of the unknown concept, the learner
Results 1  10
of
108,514