Genetic programming is distinguished from other evolutionary algorithms in that it uses tree representations of variable size instead of linear strings of fixed length. This flexible representation scheme is important because it allows the underlying structure of the data to be discovered automatically. One primary difficulty, however, is that the solutions may grow too large without any improvement in their generalization ability. In this paper we investigate the fundamental relationship between the performance and complexity of the evolved structures. The essence of the parsimony problem is demonstrated empirically by analyzing error landscapes of programs evolved for neural network synthesis. We consider genetic programming as a statistical inference problem and apply the Bayesian model-comparison framework to introduce a class of fitness functions with error and complexity terms. An adaptive learning method is then presented that automatically balances the model-complexity factor to evolve parsimonious programs without losing the diversity of the population needed for achieving the desired training accuracy. The effectiveness of this approach is shown empirically on the induction of sigma-pi neural networks for solving a real-world medical diagnosis problem as well as benchmark tasks.
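To make the idea of a fitness function with separate error and complexity terms concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes, purely for illustration, that complexity is measured as the node count of a program tree and that the weight `alpha` is a fixed constant; the abstract describes an adaptive method for balancing this factor, whose details are not given here. The names `Node`, `tree_size`, and `penalized_fitness` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """A node of a variable-size program tree (hypothetical structure)."""
    label: str
    children: Tuple["Node", ...] = ()


def tree_size(tree: Node) -> int:
    """Complexity proxy (an assumption): total number of nodes in the tree."""
    return 1 + sum(tree_size(child) for child in tree.children)


def penalized_fitness(error: float, complexity: int, alpha: float) -> float:
    """Lower is better: training error plus a weighted complexity penalty.

    alpha is held fixed here for illustration only; the abstract describes
    adapting this balance automatically during evolution.
    """
    return error + alpha * complexity


# Two programs that compute the same function, one of them bloated.
small = Node("+", (Node("x"), Node("y")))
large = Node("+", (
    Node("*", (Node("x"), Node("1"))),
    Node("*", (Node("y"), Node("1"))),
))

same_error = 0.10  # assumed identical training error for both programs
fit_small = penalized_fitness(same_error, tree_size(small), alpha=0.01)
fit_large = penalized_fitness(same_error, tree_size(large), alpha=0.01)
```

With equal error, the complexity term breaks the tie in favour of the smaller tree; with `alpha=0` the two programs would be indistinguishable, which is the situation in which solutions can grow without any gain in generalization.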
|
4828
|
Genetic Algorithms
– Goldberg
- 1989
|
|
2438
|
Classification and Regression Trees
– Breiman, Friedman, et al.
- 1984
|
|
1782
|
Genetic Programming: On the Programming of Computers by Means of Natural Selection
– Koza
- 1992
|
|
655
|
UCI Repository of Machine Learning Databases [machine-readable data repository]
– Murphy, Aha
- 1992
|
|
494
|
Genetic Programming II: Automatic Discovery of Reusable Programs
– Koza
- 1994
|
|
489
|
Neural networks and the bias/variance dilemma
– Geman, Bienenstock, et al.
- 1992
|
|
469
|
Some studies in machine learning using the game of checkers
– Samuel
- 1959
|
|
268
|
An overview of evolutionary algorithms for parameter optimization
– Bäck, Schwefel
- 1993
|
|
265
|
Inferring decision trees using the minimum description length principle
– Quinlan, Rivest
- 1989
|
|
241
|
Predictive models for the breeder genetic algorithm
– Mühlenbein, Schlierkamp-Voosen
- 1993
|
|
233
|
Learning structural descriptions from examples, The Psychology of Computer Vision
– Winston
- 1975
|
|
198
|
Universal coding, information, prediction, and estimation
– Rissanen
- 1984
|
|
194
|
Connectionist models and their properties
– Feldman, Ballard
- 1982
|
|
127
|
Competitive environments evolve better solutions for complex tasks
– Angeline, Pollack
- 1993
|
|
113
|
Learning machines
– Nilsson
- 1965
|
|
102
|
The genetic algorithm and the structure of the fitness landscape
– Manderick, Weger, et al.
- 1991
|
|
80
|
A general framework for Parallel Distributed Processing
– Rumelhart, Hinton, et al.
- 1986
|
|
80
|
Genetic programming for feature discovery and image discrimination
– Tackett
- 1993
|
|
70
|
Product units: A computationally powerful and biologically plausible extension to backpropagation networks
– Durbin, Rumelhart
- 1989
|
|
47
|
Learning, invariance, and generalization in high-order neural networks
– Giles, Maxwell
- 1987
|
|
34
|
Genetic programming using a minimum description length principle
– Iba, Garis, et al.
- 1994
|
|
32
|
Polynomial theory of complex systems
– Ivakhnenko
- 1971
|
|
28
|
Generality and difficulty in genetic programming: Evolving a sort
– Kinnear
- 1993
|
|
25
|
Evolving optimal neural networks using genetic algorithms with Occam's razor
– Zhang, Mühlenbein
- 1993
|
|
25
|
Genetic Programming of Minimal Neural Nets Using Occam's Razor
– Zhang, Mühlenbein
- 1993
|
|
22
|
Hierarchical automatic function definition in genetic programming
– Koza
- 1993
|
|
22
|
Stochastic complexity and modeling. The Annals of Statistics
– Rissanen
- 1986
|
|
20
|
Introduction: Paradigms for Machine Learning
– Carbonell
- 1989
|
|
18
|
System identification using structured genetic algorithms
– Iba, Kurita, et al.
- 1993
|
|
17
|
An information criterion for optimal neural network selection
– Fogel
- 1991
|
|
16
|
Dualistic geometry of the manifold of higher-order neurons
– Amari
- 1991
|
|
16
|
Fitness landscapes and difficulty in genetic programming
– Kinnear
- 1994
|
|
15
|
Evolutionary algorithms: theory and applications
– Mühlenbein
- 1993
|
|
13
|
Simultaneous discovery of reusable detectors and subroutines using genetic programming
– Koza
- 1993
|
|
5
|
Classification and Regression Trees. Wadsworth Statistics/Probability Series
– Breiman, Friedman, et al.
- 1984
|
|
5
|
The genetic algorithm and the structure of the fitness landscape
– Manderick, Weger, et al.
- 1991
|
|
3
|
Effects of Occam's razor in evolving sigma-pi neural networks
– Zhang
- 1994
|
|
3
|
Generality and difficulty in genetic programming: Evolving a sort
– Kinnear
- 1993
|
|
3
|
Fitness landscapes and difficulty in genetic programming
– Kinnear
- 1994
|
|
2
|
System identification using structured genetic algorithms
– Iba, Kurita, et al.
- 1993
|
|
1
|
Hierarchical automatic function definition in genetic programming
– Koza
- 1992
|