Results 1 - 10 of 195
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
- Machine Learning, 1993
"... In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of t ..."
Abstract
-
Cited by 309 (3 self)
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other learning algorithms show that our nearest neighbor algorithm is comparable or superior ...
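A minimal sketch of the distance-table idea described in this abstract, assuming a value-difference style metric computed from class-conditional value frequencies plus an optional per-instance weight (the function and variable names are illustrative, not the paper's):

from collections import Counter, defaultdict

def value_difference_tables(X, y, classes):
    # For each symbolic feature, build a table d(v1, v2) from the
    # class-conditional frequencies of its values:
    #   d(v1, v2) = sum over classes c of |P(c | v1) - P(c | v2)|
    tables = []
    for f in range(len(X[0])):
        counts = defaultdict(Counter)          # feature value -> class counts
        for xi, yi in zip(X, y):
            counts[xi[f]][yi] += 1
        table = {}
        for v1 in counts:
            n1 = sum(counts[v1].values())
            for v2 in counts:
                n2 = sum(counts[v2].values())
                table[(v1, v2)] = sum(abs(counts[v1][c] / n1 - counts[v2][c] / n2)
                                      for c in classes)
        tables.append(table)
    return tables

def weighted_distance(a, b, tables, weight_b=1.0):
    # Real-valued distance between two symbolic instances; a stored
    # exemplar's weight rescales its distance to the query.  Value pairs
    # never seen in training fall back to a default distance of 1.0.
    return weight_b * sum(t.get((va, vb), 1.0) for t, va, vb in zip(tables, a, b))

Classification is then ordinary nearest-neighbor search under weighted_distance over the stored (and weighted) training instances.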
Theory Refinement on Bayesian Networks
, 1991
"... Theory refinement is the task of updating a domain theory in the light of new cases, to be done automatically or with some expert assistance. The problem of theory refinement under uncertainty is reviewed here in the context of Bayesian statistics, a theory of belief revision. The problem is reduced ..."
Abstract
-
Cited by 255 (5 self)
Theory refinement is the task of updating a domain theory in the light of new cases, to be done automatically or with some expert assistance. The problem of theory refinement under uncertainty is reviewed here in the context of Bayesian statistics, a theory of belief revision. The problem is reduced to an incremental learning task as follows: the learning system is initially primed with a partial theory supplied by a domain expert, and thereafter maintains its own internal representation of alternative theories, which can be interrogated by the domain expert and incrementally refined from data. Algorithms for refinement of Bayesian networks are presented to illustrate what is meant by "partial theory", "alternative theory representation", etc. The algorithms are an incremental variant of batch learning algorithms from the literature and so can work well in both batch and incremental mode.
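As a rough illustration of what incremental refinement of a Bayesian network can look like (a generic sketch, not the paper's algorithms), the code below treats the expert's partial theory as prior pseudo-counts on one node's conditional probability table and refines the estimates one case at a time:

from collections import defaultdict

class IncrementalCPT:
    # Conditional probability estimates for one node given its parents,
    # refined one case at a time.  The expert's partial theory enters as
    # prior pseudo-counts (a Dirichlet-style prior); data then shifts the
    # estimates incrementally, with no need to revisit earlier cases.

    def __init__(self, prior_counts=None):
        # prior_counts maps (parent_values, node_value) -> pseudo-count
        self.counts = defaultdict(float, prior_counts or {})

    def update(self, parent_values, node_value):
        self.counts[(parent_values, node_value)] += 1.0

    def prob(self, parent_values, node_value, node_domain):
        total = sum(self.counts[(parent_values, v)] for v in node_domain)
        if total == 0:
            return 1.0 / len(node_domain)          # no information yet
        return self.counts[(parent_values, node_value)] / total

# Hypothetical example: the expert says rain makes wet grass likely (8 vs 2
# pseudo-counts); one observed case then nudges the estimate.
cpt = IncrementalCPT({(("rain",), "wet"): 8.0, (("rain",), "dry"): 2.0})
cpt.update(("rain",), "wet")
print(cpt.prob(("rain",), "wet", ["wet", "dry"]))   # about 0.82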
Error Reduction through Learning Multiple Descriptions
, 1996
"... . Learning multiple descriptions for each class in the data has been shown to reduce generalization error but the amount of error reduction varies greatly from domain to domain. This paper presents a novel empirical analysis that helps to understand this variation. Our hypothesis is that the amount ..."
Abstract
-
Cited by 149 (3 self)
Learning multiple descriptions for each class in the data has been shown to reduce generalization error but the amount of error reduction varies greatly from domain to domain. This paper presents a novel empirical analysis that helps to understand this variation. Our hypothesis is that the amount of error reduction is linked to the "degree to which the descriptions for a class make errors in a correlated manner." We present a precise and novel definition for this notion and use twenty-nine data sets to show that the amount of observed error reduction is negatively correlated with the degree to which the descriptions make errors in a correlated manner. We empirically show that it is possible to learn descriptions that make less correlated errors in domains in which many ties in the search evaluation measure (e.g. information gain) are experienced during learning. The paper also presents results that help to understand when and why multiple descriptions are a help (irrelevant attribute...
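One simple way to make the "correlated errors" notion concrete (an assumed measure, not the paper's exact definition) is to compare how often pairs of learned descriptions err on the same examples against the error reduction obtained by majority voting:

import numpy as np

def pairwise_error_correlation(preds, y):
    # Average, over all pairs of classifiers, of the fraction of examples
    # on which both members of the pair are wrong.
    errors = [np.asarray(p) != np.asarray(y) for p in preds]
    k = len(errors)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return float(np.mean([np.mean(errors[i] & errors[j]) for i, j in pairs]))

def voting_error_reduction(preds, y):
    # Mean single-classifier error minus the error of the majority vote
    # (assumes non-negative integer class labels).
    y = np.asarray(y)
    stacked = np.vstack([np.asarray(p) for p in preds])
    individual = np.mean([np.mean(p != y) for p in stacked])
    majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, stacked)
    return float(individual - np.mean(majority != y))

Restated in these terms, the abstract's finding is that across domains the first quantity is negatively correlated with the second.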
Theory Refinement Combining Analytical and Empirical Methods
- Artificial Intelligence, 1994
"... This article describes a comprehensive approach to automatic theory revision. Given an imperfect theory, the approach combines explanation attempts for incorrectly classified examples in order to identify the failing portions of the theory. For each theory fault, correlated subsets of the examples a ..."
Abstract
-
Cited by 129 (7 self)
This article describes a comprehensive approach to automatic theory revision. Given an imperfect theory, the approach combines explanation attempts for incorrectly classified examples in order to identify the failing portions of the theory. For each theory fault, correlated subsets of the examples are used to inductively generate a correction. Because the corrections are focused, they tend to preserve the structure of the original theory. Because the system starts with an approximate domain theory, in general fewer training examples are required to attain a given level of performance (classification accuracy) compared to a purely empirical system. The approach applies to classification systems employing a propositional Horn-clause theory. The system has been tested in a variety of application domains, and results are presented for problems in the domains of molecular biology and plant disease diagnosis.
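For readers unfamiliar with the representation, a propositional Horn-clause theory of the kind being revised can be evaluated by simple forward chaining; recording which clauses fire for a misclassified example is the sort of trace a revision system can use when assigning blame. A minimal sketch with hypothetical propositions (not the paper's revision algorithm):

def classify(facts, clauses):
    # facts   -- set of observed propositions for one example
    # clauses -- list of (head, body) pairs, with body a set of propositions
    # Returns all derived propositions plus the clauses that fired.
    derived = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in derived and body <= derived:
                derived.add(head)
                fired.append((head, body))
                changed = True
    return derived, fired

# Toy plant-diagnosis style theory (hypothetical propositions).
theory = [("fungal", {"leaf_spots", "humid"}),
          ("diseased", {"fungal"})]
print(classify({"leaf_spots", "humid"}, theory))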
Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning
, 1996
"... This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The sp ..."
Abstract
-
Cited by 126 (2 self)
This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word "line" using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem, and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.
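As a stand-in for the statistical family of methods compared here (not a reimplementation of any system from the paper), a bag-of-words Naive Bayes classifier over the context words is a compact illustration of the task setup; the class and method names below are illustrative:

import math
from collections import Counter, defaultdict

class NaiveBayesWSD:
    # Bag-of-words Naive Bayes over the context of an ambiguous word,
    # with add-one smoothing of word counts per sense.

    def fit(self, contexts, senses):
        self.sense_counts = Counter(senses)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for words, sense in zip(contexts, senses):
            self.word_counts[sense].update(words)
            self.vocab.update(words)
        return self

    def predict(self, words):
        def score(sense):
            total = sum(self.word_counts[sense].values()) + len(self.vocab)
            s = math.log(self.sense_counts[sense])
            for w in words:
                s += math.log((self.word_counts[sense][w] + 1) / total)
            return s
        return max(self.sense_counts, key=score)

# Tiny hypothetical example with two senses of "line".
clf = NaiveBayesWSD().fit(
    [["draw", "a", "straight"], ["wait", "in", "the", "queue"]],
    ["geometry", "queue"])
print(clf.predict(["stand", "in", "a", "queue"]))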
Creating Advice-Taking Reinforcement Learners
- Machine Learning, 1996
"... . Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present and evaluate a design that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, ..."
Abstract
-
Cited by 117 (10 self)
Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present and evaluate a design that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the learner and occasionally makes suggestions, expressed as instructions in a simple imperative programming language. Based on techniques from knowledge-based neural networks, we insert these programs directly into the agent's utility function. Subsequent reinforcement learning further integrates and refines the advice. We present empirical evidence that investigates several aspects of our approach and show that, given good advice, a learner can achieve statistically significant gains in expected reward. A second experiment shows that advice improves the expected reward regardless of the...
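The paper compiles advice into a connectionist Q-function; the tabular analogue below conveys the same idea in a simpler form: advised actions receive an optimistic initial Q-value, and subsequent Q-learning refines or overrides that bias. The env object (with states, actions, reset, step) and the advice function are hypothetical:

import random
from collections import defaultdict

def q_learning_with_advice(env, advice, episodes=500,
                           alpha=0.1, gamma=0.9, epsilon=0.1):
    # advice(state) returns a list of suggested actions for that state.
    Q = defaultdict(float)
    for s in env.states:
        for a in advice(s):                 # bias advised actions upward
            Q[(s, a)] = 1.0
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(a)       # env.step returns (state, reward, done)
            best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in env.actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q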
Input/Output HMMs for Sequence Processing
- IEEE Transactions on Neural Networks, 1996
"... We consider problems of sequence processing and propose a solution based on a discrete state model in order to represent past context. Weintroduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation ..."
Abstract
-
Cited by 116 (13 self)
We consider problems of sequence processing and propose a solution based on a discrete state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call Input/Output Hidden Markov Model (IOHMM). It can be trained by the EM or GEM algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.
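A sketch of the defining recursion, assuming the per-state subnetworks are abstracted as callables that map the current input to a transition matrix and an output-likelihood vector (a simplification of the paper's architecture; the function name and signature are illustrative):

import numpy as np

def iohmm_forward(inputs, init, trans_fn, emit_fn):
    # Forward pass of an Input/Output HMM: unlike a plain HMM, both the
    # state-transition distribution and the output distribution are
    # conditioned on the current input u.
    #   trans_fn(u) -> (S, S) transition matrix given input u
    #   emit_fn(u)  -> (S,) likelihood of the observed output per state
    # Returns the filtered state distribution at each step.
    alpha = np.asarray(init, dtype=float)
    history = []
    for u in inputs:
        alpha = (alpha @ trans_fn(u)) * emit_fn(u)   # predict, then reweight
        alpha /= alpha.sum()                          # normalize (filtering)
        history.append(alpha)
    return np.vstack(history)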
A Comparative Evaluation of Voting and Meta-learning on Partitioned Data
- In Proceedings of the Twelfth International Conference on Machine Learning, 1995
"... Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of very large network computing, it is likely that orders of magnitude more data in databases will be available for various learning problems of real world importance. Some ..."
Abstract
-
Cited by 115 (14 self)
Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of very large network computing, it is likely that orders of magnitude more data in databases will be available for various learning problems of real world importance. Some learning algorithms assume that the entire data set fits into main memory, which is not feasible for massive amounts of data. One approach to handling a large data set is to partition the data set into subsets, run the learning algorithm on each of the subsets, and combine the results. In this paper we evaluate different techniques for learning from partitioned data. Our meta-learning approach is empirically compared with techniques in the literature that aim to combine multiple evidence to arrive at one prediction.
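A condensed sketch of the partition-train-combine scheme with a stacking-style meta-learner as the combiner. The learners are assumed to follow a scikit-learn-like fit/predict interface, X and y are assumed to be NumPy arrays, and for brevity the meta-level training examples are formed on the training data itself rather than on held-out predictions:

import numpy as np

def partition_and_combine(X, y, base_learner, meta_learner, n_parts=4, seed=0):
    # Split the data into disjoint subsets, fit one base classifier per
    # subset, then fit a meta-classifier on the base classifiers' outputs.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(y)), n_parts)
    models = [base_learner().fit(X[idx], y[idx]) for idx in parts]

    # Meta-level training set: each example is described by the vector of
    # base-classifier predictions for it.
    meta_X = np.column_stack([m.predict(X) for m in models])
    combiner = meta_learner().fit(meta_X, y)

    def predict(X_new):
        return combiner.predict(np.column_stack([m.predict(X_new) for m in models]))
    return predict

# e.g. predict = partition_and_combine(X, y, DecisionTreeClassifier, LogisticRegression)

Plain voting corresponds to replacing the meta-classifier with a majority over the base predictions; the paper compares such combiners empirically.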
Toward Parallel and Distributed Learning by Meta-Learning
- In Working, 1993
"... Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of very large network computing, it is likely that orders of magnitude ..."
Abstract
-
Cited by 99 (27 self)
Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of very large network computing, it is likely that orders of magnitude
Using Sampling and Queries to Extract Rules from Trained Neural Networks
- In Proceedings of the Eleventh International Conference on Machine Learning, 1994
"... Concepts learned by neural networks are difficult to understand because they are represented using large assemblages of real-valued parameters. One approach to understanding trained neural networks is to extract symbolic rules that describe their classification behavior. There are several existing r ..."
Abstract
-
Cited by 85 (3 self)
Concepts learned by neural networks are difficult to understand because they are represented using large assemblages of real-valued parameters. One approach to understanding trained neural networks is to extract symbolic rules that describe their classification behavior. There are several existing rule-extraction approaches that operate by searching for such rules. We present a novel method that casts rule extraction not as a search problem, but instead as a learning problem. In addition to learning from training examples, our method exploits the property that networks can be efficiently queried. We describe algorithms for extracting both conjunctive and M-of-N rules, and present experiments that show that our method is more efficient than conventional search-based approaches.
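The query-and-learn idea can be conveyed with a much simplified sketch (not the paper's conjunctive or M-of-N extraction algorithms): sample random inputs, use the trained network as an oracle to label them, and greedily grow a conjunctive rule consistent with the positive labels. Here query_net is assumed to map a binary feature vector to a 0/1 class:

import numpy as np

def extract_conjunctive_rule(query_net, n_features, n_samples=2000, seed=0):
    # Sample random binary inputs and label them by querying the network.
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_samples, n_features))
    y = np.array([query_net(x) for x in X])

    # Greedily add the literal (feature, value) that maximizes the fraction
    # of positive oracle labels among the still-covered samples.
    rule, used = [], set()
    covered = np.ones(n_samples, dtype=bool)
    while covered.any() and (y[covered] == 0).any() and len(used) < n_features:
        def precision(lit):
            mask = covered & (X[:, lit[0]] == lit[1])
            return y[mask].mean() if mask.any() else 0.0
        best = max(((f, v) for f in range(n_features) if f not in used
                    for v in (0, 1)), key=precision)
        used.add(best[0])
        rule.append(best)
        covered &= X[:, best[0]] == best[1]
    return rule           # list of (feature, value) literals forming a conjunction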