A well-known fundamental limitation of selective induction algorithms is that their performance, in terms of prediction accuracy and/or theory complexity, is poor when the task-supplied attributes are not adequate for, or not directly relevant to, describing hypotheses. One solution to this problem is constructive induction, which uses the task-supplied attributes to construct new attributes that are expected to be more appropriate for describing the target concepts. This thesis focuses on constructive induction with decision trees as the theory description language. It explores (1) novel approaches to constructing new binary attributes using existing constructive operators, and (2) novel methods of constructing new nominal and new continuous-valued attributes based on a newly proposed constructive operator. The thesis investigates a fixed rule-based approach to constructing new binary attributes for decision tree learning. It generates conjunctions from production rules that are converted from decision trees using the C4.5rules algorithm. During the process of transforming a decision tree into production rules, C4.5rules eliminates some irrelevant