Results 1 - 10
of
34
Supervised and Unsupervised Discretization of Continuous Features
- MACHINE LEARNING: PROCEEDINGS OF THE TWELFTH INTERNATIONAL CONFERENCE
, 1995
"... Many supervised machine learning algorithms require a discrete feature space. In this paper, we review previous work on continuous feature discretization, identify defining characteristics of the methods, and conduct an empirical evaluation of several methods. We compare binning, an unsupervised dis ..."
Abstract
-
Cited by 335 (9 self)
- Add to MetaCart
Many supervised machine learning algorithms require a discrete feature space. In this paper, we review previous work on continuous feature discretization, identify defining characteristics of the methods, and conduct an empirical evaluation of several methods. We compare binning, an unsupervised discretization method, to entropy-based and purity-based methods, which are supervised algorithms. We found that the performance of the Naive-Bayes algorithm significantly improved when features were discretized using an entropy-based method. In fact, over the 16 tested datasets, the discretized version of Naive-Bayes slightly outperformed C4.5 on average. We also show that in some cases, the performance of the C4.5 induction algorithm significantly improved if features were discretized in advance; in our experiments, the performance never significantly degraded, an interesting phenomenon considering the fact that C4.5 is capable of locally discretizing features.
Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
- Data Mining and Knowledge Discovery
, 1997
"... Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial ne ..."
Abstract
-
Cited by 121 (1 self)
- Add to MetaCart
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, tree-structured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
Wrappers For Performance Enhancement And Oblivious Decision Graphs
, 1995
"... In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are stu ..."
Abstract
-
Cited by 94 (6 self)
- Add to MetaCart
In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs).
Enhancing Supervised Learning with Unlabeled Data
, 2000
"... In many practical learning scenarios, there is a small amount of labeled data along with a large pool of unlabeled data. Many supervised learning algorithms have been developed and extensively studied. We present a new "co-training" strategy for using unlabeled data to improve the performance ..."
Abstract
-
Cited by 94 (0 self)
- Add to MetaCart
In many practical learning scenarios, there is a small amount of labeled data along with a large pool of unlabeled data. Many supervised learning algorithms have been developed and extensively studied. We present a new "co-training" strategy for using unlabeled data to improve the performance of standard supervised learning algorithms. Unlike much of the prior work, such as the co-training procedure of Blum and Mitchell (1998), we do not assume there are two redundant views both of which are sufficient for perfect classification. The only requirement our co-training strategy places on each supervised learning algorithm is that its hypothesis partitions the example space into a set of equivalence classes (e.g. for a decision tree each leaf defines an equivalence class). We evaluate our co-training strategy via experiments using data from the UCI repository. 1. Introduction In many practical learning scenarios, there is a small amount of labeled data along with a lar...
MLC++: a machine learning library in C++
- IN TOOLS WITH ARTIFICIAL INTELLIGENCE
, 1994
"... We present MLC++, a library of C++ classes and tools for supervised Machine Learning. While MLC++ provides general learning algorithms that can be used by end users, the main objective is to provide researchers and experts with a wide variety of tools that can accelerate algorithm development, incre ..."
Abstract
-
Cited by 90 (8 self)
- Add to MetaCart
We present MLC++, a library of C++ classes and tools for supervised Machine Learning. While MLC++ provides general learning algorithms that can be used by end users, the main objective is to provide researchers and experts with a wide variety of tools that can accelerate algorithm development, increase software reliability, provide comparison tools, and display information visually. More than just a collection of existing algorithms, MLC++ is an attempt to extract commonalities of algorithms and decompose them for a unified view that is simple, coherent, and extensible. In this paper we discuss the problems MLC++ aims to solve, the design of MLC++, and the current functionality.
The Power of Decision Tables
- Proceedings of the European Conference on Machine Learning
, 1995
"... . We evaluate the power of decision tables as a hypothesis space for supervised learning algorithms. Decision tables are one of the simplest hypothesis spaces possible, and usually they are easy to understand. Experimental results show that on artificial and real-world domains containing only discre ..."
Abstract
-
Cited by 79 (5 self)
- Add to MetaCart
. We evaluate the power of decision tables as a hypothesis space for supervised learning algorithms. Decision tables are one of the simplest hypothesis spaces possible, and usually they are easy to understand. Experimental results show that on artificial and real-world domains containing only discrete features, IDTM, an algorithm inducing decision tables, can sometimes outperform state-of-the-art algorithms such as C4.5. Surprisingly, performance is quite good on some datasets with continuous features, indicating that many datasets used in machine learning either do not require these features, or that these features have few values. We also describe an incremental method for performing crossvalidation that is applicable to incremental learning algorithms including IDTM. Using incremental cross-validation, it is possible to cross-validate a given dataset and IDTM in time that is linear in the number of instances, the number of features, and the number of label values. The time for incre...
Simplifying Decision Trees: A Survey
, 1996
"... Induced decision trees are an extensively-researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpl ..."
Abstract
-
Cited by 32 (5 self)
- Add to MetaCart
Induced decision trees are an extensively-researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy and no attempt has been made to survey the literature from the perspective of simplification. We present a framework that organizes the approaches to tree simplification and summarize and critique the approaches within this framework. The purpose of this survey is to provide researchers and practitioners with a concise overview of tree-simplification approaches and insight into their relative capabilities. In our final discussion, we briefly describe some empirical findings and discuss the application of tree i...
Oblivious Decision Trees, Graphs, and Top-Down Pruning
- Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence
, 1995
"... We describe a supervised learning algorithm, EODG, that uses mutual information to build an oblivious decision tree. The tree is then converted to an Oblivious read-Once Decision Graph (OODG) by merging nodes at the same level of the tree. For domains that are appropriate for both decision trees and ..."
Abstract
-
Cited by 25 (5 self)
- Add to MetaCart
We describe a supervised learning algorithm, EODG, that uses mutual information to build an oblivious decision tree. The tree is then converted to an Oblivious read-Once Decision Graph (OODG) by merging nodes at the same level of the tree. For domains that are appropriate for both decision trees and OODGs, performance is approximately the same as that of C4.5, but the number of nodes in the OODG is much smaller. The merging phase that converts the oblivious decision tree to an OODG provides a new way of dealing with the replication problem and a new pruning mechanism that works top down starting from the root. The pruning mechanism is well suited for finding symmetries and aids in recovering from splits on irrelevant features that may happen during the tree construction. 1 Introduction Decision trees provide a hypothesis space for supervised machine learning algorithms that is well suited for many datasets encountered in the real world (Breiman, Friedman, Olshen & Stone 1984, Quinlan...
Transforming Rules and Trees into Comprehensible Knowledge Structures
, 1996
"... The problem of transforming the knowledge bases of expert systems using induced rules or decision trees into comprehensible knowledge structures is addressed. A knowledge structure is developed that generalizes and subsumes production rules, decision trees, and rules with exceptions. It gives rise t ..."
Abstract
-
Cited by 18 (3 self)
- Add to MetaCart
The problem of transforming the knowledge bases of expert systems using induced rules or decision trees into comprehensible knowledge structures is addressed. A knowledge structure is developed that generalizes and subsumes production rules, decision trees, and rules with exceptions. It gives rise to a natural complexity measure that allows them to be understood, analyzed and compared on a uniform basis. The structure is a directed acyclic graph with the semantics that nodes are premises, some of which have attached conclusions, and the arcs are inheritance links with disjunctive multiple inheritance. A detailed example is given of the generation of a range of such structures of equivalent performance for a simple problem, and the complexity measure of a particular structure is shown to relate to its perceived complexity. The simplest structures are generated by an algorithm that factors common sub-premises from the premises of rules. A more complex example of a chess dataset is used t...
On Neurobiological, Neuro-Fuzzy, Machine Learning and Statistical Pattern Recognition Techniques
, 1997
"... In this paper, we propose two new neuro--fuzzy schemes, one for classification and one for clustering problems. The classification scheme is based on Simpson's Fuzzy Min Max method, and relaxes some assumptions he makes. This enables our scheme to handle mutually non exclusive classes. The neuro--fu ..."
Abstract
-
Cited by 12 (1 self)
- Add to MetaCart
In this paper, we propose two new neuro--fuzzy schemes, one for classification and one for clustering problems. The classification scheme is based on Simpson's Fuzzy Min Max method, and relaxes some assumptions he makes. This enables our scheme to handle mutually non exclusive classes. The neuro--fuzzy clustering scheme is a multiresolution algorithm that is modeled after the mechanics of human pattern recognition. We also present data from an exhaustive comparison of these techniques with neural, statistical, machine learning and other traditional approaches to pattern recognition applications. The data sets used for comparisons include those from the machine learning repository at the University of California, Irvine. We find that our proposed schemes compare quite well with the existing techniques, and in addition offer the advantages of one pass learning and on--line adaptation. Keywords--- Pattern Recognition, Classification, Clustering, Neuro-Fuzzy Systems, Multiresolution, Visi...

