Results 1–10 of 41
Efficient Natural Evolution Strategies
GECCO '09, 2009
Cited by 19 (10 self)
Efficient Natural Evolution Strategies (eNES) is a novel alternative to conventional evolutionary algorithms, using the natural gradient to adapt the mutation distribution. Unlike previous methods based on natural gradients, eNES uses a fast algorithm to calculate the inverse of the exact Fisher information matrix, thus increasing both robustness and performance of its evolution gradient estimation, even in higher dimensions. Additional novel aspects of eNES include optimal fitness baselines and importance mixing (a procedure for updating the population with very few fitness evaluations). The algorithm yields competitive results on both unimodal and multimodal benchmarks.
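As a hedged illustration of the natural-gradient idea in this abstract (not the eNES algorithm itself, which inverts the exact Fisher matrix of a full multivariate Gaussian): for an isotropic Gaussian N(mu, sigma^2 I) with fixed sigma, the Fisher matrix with respect to mu is I/sigma^2, so its exact inverse is available in closed form and the natural gradient is simply sigma^2 times the vanilla gradient estimate. All function names and constants below are illustrative.

```python
import numpy as np

def natural_gradient_step(f, mu, sigma=0.3, pop=50, lr=0.1, rng=None):
    # Illustrative sketch: for N(mu, sigma^2 I) with fixed sigma, the Fisher
    # matrix w.r.t. mu is I/sigma^2, so the natural gradient is sigma^2 times
    # the vanilla log-likelihood gradient estimate.
    if rng is None:
        rng = np.random.default_rng(0)
    eps = rng.standard_normal((pop, mu.size))               # perturbations
    fitness = np.array([f(mu + sigma * e) for e in eps])
    fitness = (fitness - fitness.mean()) / (fitness.std() + 1e-9)  # baseline
    grad = (fitness[:, None] * eps).mean(axis=0) / sigma    # vanilla gradient
    return mu + lr * sigma**2 * grad                        # natural-gradient step

# maximize f(x) = -||x||^2 starting away from the optimum
mu = np.array([2.0, -1.5])
rng = np.random.default_rng(1)
for _ in range(200):
    mu = natural_gradient_step(lambda x: -np.sum(x**2), mu, rng=rng)
print(np.linalg.norm(mu))  # approaches the optimum at 0
```

The closed-form Fisher inverse is what makes the update cheap here; eNES generalizes this to the full covariance case, where the exact inverse is no longer diagonal.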
Exploring Parameter Space in Reinforcement Learning
Paladyn Journal of Behavioral Robotics (Review), 2010
Cited by 17 (5 self)
This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages
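A minimal sketch of the contrast the abstract draws: the policy parameters are perturbed once per episode and held fixed, so the perturbed policy is deterministic within the episode, instead of adding fresh noise to every action. The toy task, the linear policy, and the symmetric-perturbation update are all hypothetical stand-ins, not the paper's method.

```python
import numpy as np

def episode_return(theta, rng, steps=20):
    # toy 1-D task: reward is -|state|; linear policy: action = theta * state
    s, total = 1.0, 0.0
    for _ in range(steps):
        a = theta * s                                  # deterministic within episode
        s = s + 0.1 * a + 0.01 * rng.standard_normal() # small environment noise
        total += -abs(s)
    return total

rng = np.random.default_rng(0)
theta = 0.0
for _ in range(100):
    # parameter-based exploration: sample one parameter perturbation per pair
    # of episodes and move toward the better-performing parameter vector
    d = 0.5 * rng.standard_normal()
    r_plus = episode_return(theta + d, rng)
    r_minus = episode_return(theta - d, rng)
    theta += 0.1 * d * np.sign(r_plus - r_minus)
print(theta)  # drifts negative, i.e. toward a policy that damps the state
```

Because the whole episode uses one fixed parameter vector, the return comparison is far less noisy than comparing trajectories with independent per-step action noise.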
Variable metric reinforcement learning methods applied to the noisy mountain car problem
European Workshop on Reinforcement Learning (EWRL 2008), Lecture Notes in Artificial Intelligence, 2008
Cited by 14 (5 self)
Two variable metric reinforcement learning methods, the natural actor-critic algorithm and the covariance matrix adaptation evolution strategy, are compared on a conceptual level and analysed experimentally on the mountain car benchmark task with and without noise.
High Dimensions and Heavy Tails for Natural Evolution Strategies
Genetic and Evolutionary Computation Conference (GECCO), 2011
Cited by 13 (10 self)
The family of natural evolution strategies (NES) offers a principled approach to real-valued evolutionary optimization by following the natural gradient of the expected fitness on the parameters of its search distribution. While general in its formulation, existing research has focused only on multivariate Gaussian search distributions. We address this shortcoming by exhibiting problem classes for which other search distributions are more appropriate, and then derive the corresponding NES variants. First, we show how simplifying NES to separable distributions reduces its complexity from O(d^3) to O(d), and apply it to problems of previously unattainable dimensionality, recovering lowest-energy structures on the Lennard-Jones atom clusters and state-of-the-art results on neuroevolution benchmarks. Second, we develop a new, equivalent formulation based on invariances, which allows us to generalize NES to heavy-tailed distributions, even if their variance is undefined. We then investigate how this variant aids in overcoming deceptive local optima.
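The O(d) separable case mentioned in the abstract can be sketched as follows: the search distribution is an axis-aligned Gaussian, so only d means and d standard deviations are adapted per generation, instead of a full covariance with an O(d^3) update. The learning rates and the simple fitness-normalization utility below are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def snes(f, mu, sigma, generations=300, pop=20, seed=0):
    # Separable-NES-style sketch: per-dimension mean and step size only.
    rng = np.random.default_rng(seed)
    eta_mu, eta_sigma = 1.0, 0.1
    for _ in range(generations):
        s = rng.standard_normal((pop, mu.size))      # standardized samples
        x = mu + sigma * s
        fit = np.array([f(xi) for xi in x])
        u = (fit - fit.mean()) / (fit.std() + 1e-9)  # simple utility shaping
        mu = mu + eta_mu * sigma * (u @ s) / pop               # O(d) mean update
        sigma = sigma * np.exp(0.5 * eta_sigma * (u @ (s**2 - 1)) / pop)  # O(d)
    return mu, sigma

# maximize the negated sphere function in 10 dimensions
mu, sigma = snes(lambda x: -np.sum(x**2), np.full(10, 3.0), np.full(10, 1.0))
print(np.linalg.norm(mu))
```

Both parameter updates touch each coordinate independently, which is exactly why the cost per generation scales linearly in the dimension.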
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
, 2011
Cited by 11 (4 self)
Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. At any given time, the novel algorithmic framework POWERPLAY searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. The new task and its corresponding task-solving skill are those first found and validated. Newly invented tasks may require making previously learned skills more efficient. The greedy search of typical POWERPLAY variants orders candidate pairs of tasks and solver modifications by their conditional computational complexity, given the stored experience so far. This biases the search towards pairs that can be described compactly and validated quickly. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; POWERPLAY’s ongoing search for novelty keeps fighting to extend beyond the generalization abilities of its present solver. The continually increasing repertoire of problem solving procedures can be exploited
Sequential Constant Size Compressors and Reinforcement Learning
Proceedings of the Fourth Conference on Artificial General Intelligence, 2011
Cited by 9 (6 self)
Traditional Reinforcement Learning methods are insufficient for AGIs, which must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Memory, trained through standard backpropagation. Results illustrate the feasibility of this approach: this system learns to deal with high-dimensional visual observations (up to 640 pixels) in partially observable environments where there are long time lags (up to 12 steps) between relevant sensory information and necessary action.
Compressed Network Complexity Search
, 2012
Cited by 6 (2 self)
Indirect encoding schemes for neural network phenotypes can represent large networks compactly. In previous work, we presented a new approach where networks are encoded indirectly as a set of Fourier-type coefficients that decorrelate weight matrices such that they can often be represented by a small number of genes, effectively reducing the search space dimensionality and speeding up search. Up to now, the complexity of networks using this encoding was fixed a priori, both in terms of (1) the number of free parameters (topology) and (2) the number of coefficients. In this paper, we introduce a method, called Compressed Network Complexity Search (CNCS), for automatically determining network complexity that favors parsimonious solutions. CNCS maintains a probability distribution over complexity classes that it uses to select which class to optimize. Class probabilities are adapted based on their expected fitness. Starting with a prior biased toward the simplest networks, the distribution grows gradually until a solution is found. Experiments on two benchmark control problems, including a challenging non-linear version of the helicopter hovering task, demonstrate that the method consistently finds simple solutions.
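The class-selection loop described in the abstract can be sketched roughly as follows: maintain a probability distribution over complexity classes, sample a class, run one optimization step in it, and re-weight classes by a running estimate of their expected fitness, starting from a prior biased toward the simplest classes. The evaluation stand-in, learning rate, and softmax-style weighting below are hypothetical, not CNCS's actual update.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 5
prior = np.exp(-np.arange(n_classes))   # prior biased toward simplicity
expected = np.zeros(n_classes)          # running expected-fitness estimates

def evaluate(cls, rng):
    # stand-in for "run one optimization step in complexity class `cls`";
    # here richer classes tend to score higher but all evaluations are noisy
    return cls / n_classes + 0.3 * rng.standard_normal()

for _ in range(500):
    probs = prior * np.exp(expected)    # combine prior with fitness estimates
    probs /= probs.sum()
    cls = rng.choice(n_classes, p=probs)
    expected[cls] += 0.1 * (evaluate(cls, rng) - expected[cls])
print(probs)
```

The point of the sketch is the control flow: simple classes get most of the budget early, and probability mass only migrates to richer classes as their estimated fitness justifies it.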
A Natural Evolution Strategy for Multi-Objective Optimization
Cited by 6 (5 self)
The recently introduced family of natural evolution strategies (NES), a novel stochastic descent method employing the natural gradient, is providing a more principled alternative to the well-known covariance matrix adaptation evolution strategy (CMA-ES). Until now, NES could only be used for single-objective optimization. This paper extends the approach to the multi-objective case, by first deriving a (1+1) hill-climber version of NES which is then used as the core component of a multi-objective optimization algorithm. We empirically evaluate the approach on a battery of benchmark functions and find it to be competitive with the state-of-the-art.
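To fix intuition for the "(1+1) hill-climber" building block, here is a generic (1+1) scheme with success-based step-size adaptation in the classic 1/5th-rule style. This is only the general pattern; the actual (1+1)-NES derives its step-size update from the natural gradient instead, so the constants and rule below are illustrative.

```python
import numpy as np

def one_plus_one(f, x, sigma=1.0, iters=2000, seed=0):
    # (1+1) hill climber: one parent, one offspring per iteration.
    rng = np.random.default_rng(seed)
    fx = f(x)
    for _ in range(iters):
        y = x + sigma * rng.standard_normal(x.size)
        fy = f(y)
        if fy > fx:                      # offspring wins: accept, widen search
            x, fx = y, fy
            sigma *= np.exp(0.8)
        else:                            # offspring loses: shrink step size
            sigma *= np.exp(-0.2)
    return x, fx

# maximize the negated sphere function in 5 dimensions
x, fx = one_plus_one(lambda v: -np.sum(v**2), np.full(5, 4.0))
print(fx)
```

The asymmetric multipliers balance at a success rate of 0.2, which is the classical 1/5th target; a multi-objective wrapper can then run such hill climbers on its population of candidate solutions.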
Novelty-Based Restarts for Evolution Strategies
Proc. of the 12th IEEE Congress on Evolutionary Computation (CEC’11)
Cited by 5 (0 self)
A major limitation in applying evolution strategies to black-box optimization is the possibility of convergence into bad local optima. Many techniques address this problem, mostly through restarting the search. However, deciding the new start location is non-trivial, since neither a good location nor a good scale for sampling a random restart position is known. A black-box search algorithm can nonetheless obtain some information about this location and scale from past exploration. The method proposed here makes explicit use of such experience, through the construction of an archive of novel solutions during the run. Upon convergence, the most "novel" individual found so far is used to position the new start in the least explored region of the search space, actively looking for a new basin of attraction. We demonstrate the working principle of the method on two multimodal test problems.
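The restart rule described above can be sketched as: keep an archive of points seen so far and, upon convergence, restart from the candidate that is most "novel", i.e. maximizes the minimum distance to the archive. The uniform candidate sampling below is a simplification of the paper's archive construction, used only to show the selection criterion.

```python
import numpy as np

def most_novel(candidates, archive):
    # pairwise distances: (n_candidates, n_archive)
    dists = np.linalg.norm(candidates[:, None, :] - archive[None, :, :], axis=2)
    novelty = dists.min(axis=1)          # distance to the nearest archived point
    return candidates[novelty.argmax()]  # candidate farthest from explored space

rng = np.random.default_rng(0)
archive = rng.uniform(-1, 1, size=(50, 2))       # region explored so far
candidates = rng.uniform(-5, 5, size=(200, 2))   # possible restart locations
restart = most_novel(candidates, archive)
print(restart)
```

Since the archive here fills the small box [-1, 1]^2, the selected restart lands near the edge of the larger candidate region, i.e. in the least explored part of the search space.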
Convergence of the Continuous Time Trajectories of Isotropic Evolution Strategies on Monotonic C^2-composite Functions
Cited by 5 (4 self)
Information-Geometric Optimization (IGO) has been introduced as a unified framework for stochastic search algorithms. Given a parametrized family of probability distributions on the search space, IGO turns an arbitrary optimization problem on the search space into an optimization problem on the parameter space of the probability distribution family and defines a natural gradient ascent on this space. From the natural gradients defined over the entire parameter space we obtain continuous time trajectories which are the solutions of an ordinary differential equation (ODE). Via discretization, IGO naturally defines an iterated gradient ascent algorithm. Depending on the chosen distribution family, IGO recovers several known algorithms such as the pure rank-μ update CMA-ES. Consequently, the continuous time IGO trajectory can be viewed as an idealization of the original algorithm. In this paper we study the continuous time trajectories of IGO given the family of isotropic Gaussian distributions. These trajectories are a deterministic continuous time model of the underlying evolution strategy in the limit of population size going to infinity and change rates going to zero. On functions that are the composite of a monotone function and a convex-quadratic function, we prove the global convergence of the solution of the ODE towards the global optimum. We extend this result to composites of monotone and twice continuously differentiable functions and prove local convergence towards local optima.
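For reference, the continuous-time object this abstract refers to is, in the notation commonly used in the IGO literature (stated here from memory, so notation may differ slightly from the paper), the IGO flow

```latex
\frac{\mathrm{d}\theta_t}{\mathrm{d}t}
  = \left. \widetilde{\nabla}_{\theta}
    \int_X W_f^{\theta_t}(x)\, P_\theta(\mathrm{d}x) \right|_{\theta=\theta_t}
```

where $\widetilde{\nabla}_\theta$ denotes the natural gradient (the vanilla gradient preconditioned by the inverse Fisher information matrix) and $W_f^{\theta_t}$ is the quantile-based rewriting of the objective $f$. Discretizing time and estimating the integral from a finite sample recovers the iterated gradient ascent algorithm mentioned above.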