Constructing Nominal X-of-N Attributes
- Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995
Cited by 14 (6 self)
Most constructive induction researchers focus only on new boolean attributes. This paper reports a new constructive induction algorithm, called XofN, that constructs new nominal attributes in the form of X-of-N representations. An X-of-N is a set containing one or more attribute-value pairs. For a given instance, its value corresponds to the number of its attribute-value pairs that are true. The promising preliminary experimental results, on both artificial and real-world domains, show that constructing new nominal attributes in the form of X-of-N representations can significantly improve the performance of selective induction in terms of both higher prediction accuracy and lower theory complexity.
1 Introduction
A well-known elementary limitation of selective induction algorithms is that when task-supplied attributes are not adequate for describing hypotheses, their performance in terms of prediction accuracy and/or theory complexity is poor. To overcome this limitation, constructiv...
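The X-of-N definition in the abstract above (the attribute's value for an instance is the number of its attribute-value pairs that hold) can be illustrated with a minimal Python sketch. The function name `x_of_n_value`, the dictionary representation of an instance, and the example attributes are assumptions made for illustration; they are not taken from the paper, and the sketch does not cover how the XofN algorithm searches for good sets of pairs.

```python
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical representation: an attribute-value pair is (attribute name, value).
AttributeValue = Tuple[str, Hashable]


def x_of_n_value(instance: Dict[str, Hashable],
                 pairs: Iterable[AttributeValue]) -> int:
    """Return how many of the attribute-value pairs are true for the instance.

    The result is an integer between 0 and N, where N is the number of
    distinct pairs, so the constructed attribute is nominal, not boolean.
    """
    return sum(
        1
        for attr, val in set(pairs)
        if attr in instance and instance[attr] == val
    )


# Illustrative data (assumed, not from the paper).
x_of_n = {("colour", "red"), ("shape", "round"), ("size", "small")}
instance = {"colour": "red", "shape": "square", "size": "small"}

print(x_of_n_value(instance, x_of_n))  # 2
```

Here two of the three pairs hold for the instance, so the new attribute takes the value 2 out of the possible values 0 to 3.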
Automatic Feature Construction and a Simple Rule Induction Algorithm for Skin Detection
- In Proc. of the ICML Workshop on Machine Learning in Computer Vision, 2002
Cited by 12 (1 self)
Many vision systems use skin detection as a principal component. Skin detection algorithms normally evaluate a single, and thus limited, color model, such as HSV, YCrCb, YUV, RGB, normalized RGB, etc. Their limited performance, however, suggests that they are looking at the incorrect color models.
Learning By Discovering Concept Hierarchies
- Artificial Intelligence, 1999
Cited by 11 (4 self)
We present a new machine learning method that, given a set of training examples, induces a definition of the target concept in terms of a hierarchy of intermediate concepts and their definitions. This effectively decomposes the problem into smaller, less complex problems. The method is inspired by the Boolean function decomposition approach to the design of switching circuits. To cope with the high time complexity of finding an optimal decomposition, we propose a suboptimal heuristic algorithm. The method, implemented in the program HINT (Hierarchy INduction Tool), is experimentally evaluated using a set of artificial and real-world learning problems. In particular, the evaluation addresses the generalization property of decomposition and its capability to discover meaningful hierarchies. The experiments show that HINT performs well in both respects.
Keywords: Function decomposition, Machine learning, Concept hierarchies, Concept discovery, Constructive induction, Generalization
Lookahead-based Algorithms for Anytime Induction of Decision Trees
- In ICML’04, 2004
Cited by 9 (2 self)
The majority of the existing algorithms for learning decision trees are greedy: a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we present two lookahead-based algorithms for anytime induction of decision trees, thus allowing a tradeoff between tree quality and learning time. The first one is depth-k lookahead, where a larger time allocation permits larger k. The second algorithm uses a novel strategy for evaluating candidate splits: a stochastic version of ID3 is repeatedly invoked to estimate the size of the tree that each split would produce, and the split that minimizes the expected size is preferred. Experimental results indicate that for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.
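The idea of depth-k lookahead mentioned in the abstract above can be sketched generically in Python. This is an illustration under stated assumptions, not the authors' algorithm: the scoring criterion (weighted class entropy remaining after the best subtree of depth k), the function names, and the small XOR-style data set are all hypothetical choices made here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def lookahead_entropy(rows, labels, attrs, attr, k):
    """Weighted entropy left after splitting on `attr` and then following the
    best further splits, for a total lookahead depth of k."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append((row, y))
    remaining = [a for a in attrs if a != attr]
    total = 0.0
    for members in groups.values():
        sub_rows = [r for r, _ in members]
        sub_labels = [y for _, y in members]
        best = entropy(sub_labels)
        if k > 1 and remaining and best > 0:
            # Look one level deeper and keep the best achievable score.
            deeper = min(
                lookahead_entropy(sub_rows, sub_labels, remaining, a, k - 1)
                for a in remaining
            )
            best = min(best, deeper)
        total += len(members) / len(rows) * best
    return total


def choose_split(rows, labels, attrs, k):
    """Pick the attribute with the lowest depth-k lookahead entropy."""
    return min(attrs, key=lambda a: lookahead_entropy(rows, labels, attrs, a, k))


# Assumed toy data: the label is a XOR b; c is a weakly correlated distractor.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]
labels = [0, 1, 1, 0]
attrs = ["a", "b", "c"]

print(choose_split(rows, labels, attrs, k=1))  # c
print(choose_split(rows, labels, attrs, k=2))  # a
```

With k=1 the choice is purely greedy and picks the distractor `c`, because neither `a` nor `b` reduces entropy on its own. With k=2 the lookahead sees that splitting on `a` and then `b` separates the classes completely. The cost grows quickly with k, which is why a larger time allocation is needed for a larger k.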
Time Series Learning with Probabilistic Network Composites
- University of Illinois, 1998
Cited by 9 (9 self)
The purpose of this research is to extend the theory of uncertain reasoning over time through integrated, multi-strategy learning. Its focus is on decomposable, concept learning problems for classification of spatiotemporal sequences. Systematic methods of task decomposition using attribute-driven methods, especially attribute partitioning, are investigated. This leads to a novel and important type of unsupervised learning in which the feature construction (or extraction) step is modified to account for multiple sources of data and to systematically search for embedded temporal patterns. This modified technique is combined with traditional cluster definition methods to provide an effective mechanism for decomposition of time series learning problems. The decomposition process interacts with model selection from a collection of probabilistic models such as temporal artificial neural networks and temporal Bayesian networks. Models are chosen using a new quantitative (metric-based) approach that estimates expected performance of a learning architecture, algorithm, and mixture model on a newly defined subproblem. By mapping subproblems to customized configurations of probabilistic networks for time series learning, a hierarchical, supervised learning system with enhanced generalization quality can be automatically built. The system can improve data fusion
Dimensionality Reduction via Discretization
- 1996
Cited by 8 (2 self)
- Add to MetaCart
The presence of numeric data and large numbers of records in a database makes extracting explicit concepts from the raw data a challenging task. This paper introduces a method that reduces data vertically and horizontally, keeps the discriminating power of the original data, and paves the way for extracting concepts. The method is based on discretization (vertical reduction) and feature selection (horizontal reduction). The experimental results show that (1) the data can be effectively reduced by the proposed method; (2) the predictive accuracy of a classifier (C4.5) can be improved after data and dimensionality reduction; and (3) the classification rules learned are simpler.
Keywords: Dimensionality Reduction, Discretization, Knowledge Discovery
1 Introduction
The wide use of computers has brought a proliferation of databases. Without the aid of computers, little of this flood of raw data will ever be seen and exploited by humans. Knowledge discovery systems in database...
Machine Learning in Prognosis of the Femoral Neck Fracture Recovery
- 1996
Cited by 8 (1 self)
- Add to MetaCart
We compare the performance of several machine learning algorithms in the problem of prognostics of the femoral neck fracture recovery: the K-nearest neighbours algorithm, the semi-naive Bayesian classifier, backpropagation with weight elimination learning of the multilayered neural networks, the LFC (lookahead feature construction) algorithm, and the Assistant-I and Assistant-R algorithms for top-down induction of decision trees using information gain and RELIEFF as search heuristics, respectively. We compare the prognostic accuracy and the explanation ability of different classifiers. Among the different algorithms, the semi-naive Bayesian classifier and Assistant-R seem to be the most appropriate. We analyze the combination of decisions of several classifiers for solving prediction problems and show that the combined classifier improves both performance and the explanation ability.
Keywords: learning from examples, estimating attributes, explanation ability, impurity function, empirica...
Complex concept acquisition through directed search and feature caching
- In Proceedings of the 13th International Joint Conference on Artificial Intelligence, 1993
Cited by 7 (3 self)
- Add to MetaCart
Difficult concepts arise in many complex, formative, or poorly understood real-world domains. High interaction among the data attributes causes problems for many learning algorithms, including greedy decision-tree builders, extensions of basic methods, and even backpropagation and MARS. A new algorithm, LFC, uses directed lookahead search to address feature interaction, improving hypothesis accuracy at reasonable cost. LFC also addresses a second problem, the general verbosity or global replication problem. The algorithm caches search information as new features for decision tree construction. The combination of these two design factors leads to improved prediction accuracy, concept compactness, and noise tolerance. Empirical results with synthetic boolean concepts, bankruptcy prediction, and bond rating show typical accuracy improvement of 15%-20% with LFC over several alternative algorithms in cases of moderate feature interaction. LFC also explicates latent relationships in the training data to provide useful intermediate concepts from the perspective of domain experts.
Constructing New Attributes for Decision Tree Learning
- 1996
Cited by 7 (3 self)
- Add to MetaCart
A well-known fundamental limitation of selective induction algorithms is that when task-supplied attributes are not adequate for, or directly relevant to, describing hypotheses, their performance in terms of prediction accuracy and/or theory complexity is poor. One solution to this problem is constructive induction. It constructs, by using task-supplied attributes, new attributes that are expected to be more appropriate than the task-supplied attributes for describing the target concepts. This thesis focuses on constructive induction with decision trees as the theory description language. It explores: (1) novel approaches to constructing new binary attributes using existing constructive operators, and (2) novel methods of constructing new nominal and new continuous-valued attributes based on a newly proposed constructive operator. The thesis investigates a fixed rule-based approach to constructing new binary attributes for decision tree learning. It generates conjunctions from producti...
Anytime learning of decision trees
- Journal of Machine Learning Research
Cited by 6 (3 self)
The majority of existing algorithms for learning decision trees are greedy: a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Even the few non-greedy learners cannot learn good trees when the concept is difficult. Furthermore, they require a fixed amount of time and are not able to generate a better tree if additional time is available. We introduce a framework for anytime induction of decision trees that overcomes these problems by trading computation speed for better tree quality. Our proposed family of algorithms employs a novel strategy for evaluating candidate splits. A biased sampling of the space of consistent trees rooted at an attribute is used to estimate the size of the minimal tree under that attribute, and an attribute with the smallest expected tree is selected. We present two types of anytime induction algorithms: a contract algorithm that determines the sample size on the basis of a pre-given allocation of time, and an interruptible algorithm that starts with a greedy tree and continuously improves subtrees by additional sampling. Experimental results indicate that, for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.

