Code Growth in Genetic Programming
1998
Cited by 106 (9 self)
Genetic programming is a technique for the automatic generation of computer programs loosely based on the theory of evolution. It has produced successful solutions to a wide variety of problems and can be effective even in noisy and changing environments. However, genetic programming produces solutions with large amounts of unnecessary code. The amount of unnecessary code increases over time and is not proportional to increases in the quality of the solutions produced. Thus, this additional code seriously hinders the genetic programming processes by requiring extra resources without producing equivalent returns. This dissertation examines the causes of this "code growth." We use three test problems from very different fields of interest to confirm the generality of the results. We tested the destructive hypothesis, that code growth is a protective response to the destructiveness of crossover, as a potential cause of code growth. It is a definite cause, but is not sufficient to explai...
Effects of Code Growth and Parsimony Pressure on Populations in Genetic Programming
Evolutionary Computation, 1998
Cited by 61 (1 self)
Parsimony pressure, the explicit penalization of larger programs, has been increasingly used as a means of controlling code growth in genetic programming. However, in many cases parsimony pressure degrades the performance of the genetic program. In this paper we show that poor average results with parsimony pressure are a result of "failed" populations that overshadow the results of populations that incorporate parsimony pressure successfully. Additionally, we show that the effect of parsimony pressure can be measured by calculating the relationship between program size and performance within the population. This measure can be used as a partial indicator of success or failure for individual populations.
Keywords: code growth, code bloat, parsimony, genetic programming, introns.
1. Introduction. The use of parsimony pressure as a means of controlling the size of programs generated with genetic programming (GP) has grown considerably in recent years. In many cases parsimony pr...
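A minimal sketch of the two ideas in this abstract: penalizing program size, and measuring how size relates to performance within a population. The abstract does not give the paper's formulas, so the linear penalty, the coefficient `alpha`, and the use of Pearson correlation as the size-performance measure are all assumptions made for illustration.

```python
from statistics import mean

def penalized_fitness(raw_fitness, size, alpha=0.01):
    """Linear parsimony pressure (assumed form): higher fitness is better,
    and each node costs `alpha` fitness units. `alpha` is hypothetical."""
    return raw_fitness - alpha * size

def size_fitness_correlation(sizes, fitnesses):
    """Pearson correlation between program size and fitness in one population.
    Used here as a stand-in for the paper's size-performance measure."""
    ms, mf = mean(sizes), mean(fitnesses)
    cov = sum((s - ms) * (f - mf) for s, f in zip(sizes, fitnesses))
    var_s = sum((s - ms) ** 2 for s in sizes)
    var_f = sum((f - mf) ** 2 for f in fitnesses)
    if var_s == 0 or var_f == 0:
        return 0.0  # correlation is undefined when either variable is constant
    return cov / (var_s * var_f) ** 0.5
```

Under this sketch, a strongly positive correlation would suggest that the population still gains performance from larger programs, which is the situation in which a size penalty is most likely to hurt.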
Removal Bias: a New Cause of Code Growth in Tree Based Evolutionary Programming
In 1998 IEEE International Conference on Evolutionary Computation, 1998
Cited by 39 (7 self)
This paper presents a new cause of code growth, termed removal bias. We show that growth due to removal bias can be expected to occur whenever operations which remove and replace a variable sized section of code, e.g. crossover or subtree mutation, are used in an evolutionary paradigm. Two forms of non-destructive crossover are used to examine the causes of code growth. Results support the protective value of inviable code and removal bias as two distinct causes of code growth. Both causes of code growth are shown to exist in at least two different problems.
Keywords: code growth, variable length representations, removal bias, parsimony
I. Introduction. The rapid growth of fitness neutral code in genetic programming (GP), often referred to as code growth or code bloat, is a well documented phenomenon [1], [2], [3], [4], [5]. Code growth is a serious issue because larger programs require additional memory and CPU time, often taxing available resources and limiting GP usefulness. Add...
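The kind of operation the abstract refers to, one that removes a variable sized section of code and replaces it with another, can be sketched as a plain subtree crossover that also reports the sizes of the removed and inserted branches. The tuple-based tree encoding and the function names are hypothetical, and this is not the paper's non-destructive crossover, only the basic remove-and-replace step.

```python
import random

def size(tree):
    """Node count; a tree is a leaf (str) or a tuple (op, child, ...)."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(size(child) for child in tree[1:])

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node, root included."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` swapped for `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent, donor, rng=random):
    """Remove a random branch of `parent`, insert a random branch of `donor`.
    Returns the child plus the removed and inserted branch sizes."""
    path, removed = rng.choice(list(subtrees(parent)))
    _, inserted = rng.choice(list(subtrees(donor)))
    return replace(parent, path, inserted), size(removed), size(inserted)
```

Logging the two sizes alongside the child's change in fitness over many crossovers is one way to look for the asymmetry the abstract calls removal bias.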
Breeding Decision Trees Using Evolutionary Techniques
Cited by 35 (3 self)
We explore the use of genetic algorithms to directly evolve classification decision trees. We argue for the suitability of such a concept learner, given its ability to efficiently search complex hypothesis spaces and to discover conditionally dependent as well as irrelevant attributes. The performance of the system is measured on a set of artificial and standard discretized concept-learning problems and compared with the performance of two known algorithms (C4.5, OneR). We demonstrate that the hypotheses derived by standard algorithms can deviate substantially from the optimum. This deviation is partly due to their non-universal procedural bias, and it can be reduced using global metrics of tree quality like the one proposed.
Solving High-Order Boolean Parity Problems with Smooth Uniform Crossover, Sub-Machine Code GP and Demes
2000
Cited by 34 (2 self)
We propose and study new search operators and a novel node representation that can make GP fitness landscapes smoother. Together with a tree evaluation method known as sub-machine code GP and the use of demes, these make up a recipe for solving very large parity problems using GP. We tested this recipe on parity problems with up to 22 input variables, solving them with a very high success probability.
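The tree evaluation method named in this abstract, sub-machine code GP, relies on bitwise operations to evaluate a Boolean program on many fitness cases at once. The sketch below illustrates that idea under stated assumptions: Python's arbitrary-length integers stand in for machine words so that all 2**n cases fit in one value, and the tuple tree encoding, operator names, and odd-parity target are illustrative choices rather than the paper's setup.

```python
def input_columns(n):
    """Pack each input's value over all 2**n cases into one integer.
    Bit c of column i equals bit i of the case index c."""
    cols = []
    for i in range(n):
        col = 0
        for c in range(1 << n):
            if (c >> i) & 1:
                col |= 1 << c
        cols.append(col)
    return cols

def evaluate(tree, cols, mask):
    """Evaluate a tree on every case at once.
    A tree is an input index (int) or a tuple (op, left, right)."""
    if isinstance(tree, int):
        return cols[tree]
    op, left, right = tree
    x, y = evaluate(left, cols, mask), evaluate(right, cols, mask)
    if op == "and":
        return x & y
    if op == "or":
        return x | y
    if op == "xor":
        return x ^ y
    if op == "nand":
        return ~(x & y) & mask  # mask keeps the result within 2**n bits
    raise ValueError(f"unknown operator: {op}")

def parity_hits(tree, n):
    """Count the cases on which `tree` matches n-input odd parity."""
    mask = (1 << (1 << n)) - 1
    cols = input_columns(n)
    target = 0
    for col in cols:
        target ^= col  # XOR of all inputs is 1 when an odd number are set
    agree = ~(evaluate(tree, cols, mask) ^ target) & mask
    return bin(agree).count("1")
```

A single pass over the tree scores all cases, so the cost per program depends on tree size rather than on tree size times the number of cases.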
R.B.: An analysis of the causes of code growth in genetic programming
Genetic Programming and Evolvable Machines, 2002
Cited by 32 (5 self)
This research examines the cause of code growth (bloat) in genetic programming (GP). Currently there are three hypothesized causes of code growth in GP: protection, drift, and removal bias. We show that single node mutations increase code growth in evolving programs. This is strong evidence that the protective hypothesis is correct. We also show a negative correlation between the size of the branch removed during crossover and the resulting change in fitness, but a much weaker correlation for added branches. These results support the removal bias hypothesis, but seem to refute the drift hypothesis. Our results also suggest that there are serious disadvantages to the tree structured programs commonly evolved with GP, because the nodes near the root are effectively fixed in the very early generations.
Keywords: genetic programming, code growth, code bloat, crossover
Application of Genetic Programming to Induction of Linear Classification Trees
In Proceedings of the Third European Conference on Genetic Programming, 2000
Cited by 28 (1 self)
A common problem in datamining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using strong typing in GP is introduced. With this representation it is possible to let the GP classify into any number of classes. Results indicate that GP can be applied successfully to classification problems. Comparisons with current state-of-the-art algorithms in machine learning are presented and areas of future research are identified.
1 Introduction. Classification problems form an important area in datamining. For example, a bank may want to classify its clients into good and bad credit risks, or a doctor may want to classify patients as having diabetes or not. Classifiers may take the form of decision trees [11] (see Figure 1). In each node, a...
An Investigation of Supervised Learning in Genetic Programming
1998
Cited by 23 (0 self)
This thesis is an investigation into Supervised Learning (SL) in Genetic Programming (GP). With its flexible tree-structured representation, GP is a type of Genetic Algorithm, using the Darwinian idea of natural selection and genetic recombination, evolving populations of solutions over many generations to solve problems. SL is a common approach in Machine Learning where the problem is presented as a set of examples. A good or fit solution is one which can successfully deal with all of the examples. In common with most Machine Learning approaches, GP has been used to solve many trivial problems. When applied to larger and more complex problems, however, several difficulties become apparent. When focusing on the basic features of GP, this thesis highlights the immense size of the GP search space, and describes an approach to measure this space. A stupendously flexible but frustratingly useless representation, Anarchically Automatically Defined Functions, is described. Some difficulties...
Small Populations over Many Generations can beat Large Populations over Few Generations in Genetic Programming
Genetic Programming 1997: Proceedings of the Second Annual Conference, 1997
Cited by 20 (1 self)
This paper looks at the use of small populations in Genetic Programming (GP), where the trend in the literature appears to be towards using as large a population as possible, which requires more memory and uses CPU time less efficiently. Dynamic Subset Selection (DSS) and Limited Error Fitness (LEF) are two different, adaptive variations of the standard supervised learning method used in GP. This paper compares the performance of GP, GP+DSS, and GP+LEF on a 958-case classification problem, using a small population size of 50. A similar comparison between GP and GP+DSS is done on a larger and messier 3772-case classification problem. For both problems, GP+DSS with the small population size consistently produces a better answer using fewer tree evaluations than other runs using much larger populations. Even standard GP can be seen to perform well with the much smaller population size, indicating that it is certainly worth an exploratory run or three with a small population size b...
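The abstract names Dynamic Subset Selection without describing it. The sketch below follows one common description of the scheme, in which each training case carries a difficulty count and an age, and cases are drawn for the current generation with probability proportional to a weight built from both. The weight formula, the exponents `d_exp` and `a_exp`, and all function names are assumptions for illustration and are not taken from this abstract.

```python
import random

def select_subset(difficulty, age, target_size, d_exp=1.0, a_exp=1.0, rng=random):
    """Pick training cases for one generation.
    difficulty[i]: times case i was misclassified since it was last selected.
    age[i]: generations since case i was last selected.
    Each case is chosen independently, so the subset size only averages
    `target_size`. The exponents are hypothetical tuning parameters."""
    weights = [d ** d_exp + a ** a_exp for d, a in zip(difficulty, age)]
    total = sum(weights)
    if total == 0:
        # No history yet: fall back to a uniform random subset.
        return sorted(rng.sample(range(len(weights)), target_size))
    return [i for i, w in enumerate(weights)
            if rng.random() < w * target_size / total]

def update_after_generation(difficulty, age, chosen, misclassified_counts):
    """Reset chosen cases and age the rest.
    misclassified_counts[i]: how many programs got chosen case i wrong."""
    chosen_set = set(chosen)
    for i in range(len(age)):
        if i in chosen_set:
            difficulty[i] = misclassified_counts.get(i, 0)
            age[i] = 1
        else:
            age[i] += 1
```

Because only the selected cases are evaluated each generation, the number of tree evaluations per generation scales with the subset size rather than with the full training set.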