Results 1 - 10 of 133
Locally weighted learning
- ARTIFICIAL INTELLIGENCE REVIEW, 1997
"... This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, ass ..."
Abstract - Cited by 599 (51 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
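As a compact illustration of the central technique the survey focuses on, here is a minimal locally weighted linear regression sketch in Python. It assumes a Gaussian weighting function, a single bandwidth (smoothing) parameter h, and a small ridge term for regularization; the survey's treatment of these choices is far more general.

```python
import numpy as np

def lwlr_predict(x_query, X, y, h=1.0, ridge=1e-6):
    """Locally weighted linear regression at a single query point.

    X: (n, d) training inputs, y: (n,) targets.
    h: bandwidth of the Gaussian weighting function (the smoothing parameter).
    ridge: small regularizer on the local least-squares fit.
    """
    # Weight each stored example by its distance to the query.
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * h ** 2))

    # Weighted least squares with an intercept term (the local linear model).
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xa.T @ (w[:, None] * Xa) + ridge * np.eye(Xa.shape[1])
    b = Xa.T @ (w * y)
    beta = np.linalg.solve(A, b)
    return np.append(x_query, 1.0) @ beta

# Lazy use: nothing is fit until a query arrives.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
print(lwlr_predict(np.array([3.0]), X, y, h=0.3))
```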
SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition
1997
"... this article. ..."
(Show Context)
A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms
- ARTIFICIAL INTELLIGENCE REVIEW, 1997
"... Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k ..."
Abstract - Cited by 147 (0 self)
Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k-NN variants have been proposed to reduce this sensitivity by parameterizing the distance function with feature weights. However, these variants have been neither categorized nor empirically compared. This paper reviews a class of weight-setting methods for lazy learning algorithms. We introduce a framework for distinguishing these methods and empirically compare them. We observed four trends from our experiments and conducted further studies to highlight them. Our results suggest that methods that use performance feedback to assign weight settings demonstrate three advantages over other methods: they require less pre-processing, perform better in the presence of interacting features, and generally require less training data to learn good settings. We also found that continuous weighting methods tend to outperform feature selection algorithms for tasks where some features are useful but less important than others.
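A minimal sketch of the common setting these weight-setting methods share: a k-NN classifier whose distance function is parameterized by per-feature weights. The weight vector below is hand-set purely for illustration; the methods surveyed in the paper would learn it, for example from performance feedback.

```python
import numpy as np
from collections import Counter

def weighted_knn_predict(x_query, X, y, weights, k=3):
    """k-NN with a feature-weighted Euclidean distance.

    weights: per-feature relevance weights; larger values make a feature
    count more toward the distance (a weight of 0 effectively drops it).
    """
    diffs = X - x_query
    dists = np.sqrt(np.sum(weights * diffs ** 2, axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(y[nearest]).most_common(1)[0][0]

# Toy usage with a hand-set weight vector (a real weight-setting method
# would learn these values instead).
X = np.array([[0.1, 5.0], [0.2, -3.0], [0.9, 4.0], [1.0, -6.0]])
y = np.array(["a", "a", "b", "b"])
print(weighted_knn_predict(np.array([0.85, 0.0]), X, y,
                           weights=np.array([1.0, 0.0]), k=3))
```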
Semi-supervised Clustering with User Feedback
2003
"... We present a new approach to clustering based on the observation that \it is easier to criticize than to construct." Our approach of semi-supervised clustering allows a user to iteratively provide feedback to a clustering algorithm. The feedback is incorporated in the form of constraints w ..."
Abstract - Cited by 125 (2 self)
We present a new approach to clustering based on the observation that "it is easier to criticize than to construct." Our approach of semi-supervised clustering allows a user to iteratively provide feedback to a clustering algorithm. The feedback is incorporated in the form of constraints which the clustering algorithm attempts to satisfy on future iterations. These constraints allow the user to guide the clusterer towards clusterings of the data that the user finds more useful. We demonstrate semi-supervised clustering with a system that learns to cluster news stories from a Reuters data set.
Introduction
Consider the following problem: you are given 100,000 text documents (e.g., papers, newsgroup articles, or web pages) and asked to group them into classes or into a hierarchy such that related documents are grouped together. You are not told what classes or hierarchy to use or what documents are related; you have some criteria in mind, but may not be able to say exactly w...
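A rough sketch of how pairwise feedback can steer a clusterer, assuming the user's feedback has already been turned into must-link / cannot-link pairs and is enforced inside a k-means-style assignment step. This is a generic constrained-clustering illustration, not the specific algorithm or feedback protocol of this paper.

```python
import numpy as np

def constrained_kmeans(X, k, must_link, cannot_link, iters=20, seed=0):
    """k-means whose assignment step tries to respect pairwise constraints.

    must_link / cannot_link: lists of (i, j) index pairs derived from user
    feedback ("these two belong together" / "these two do not").
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    labels = np.full(len(X), -1)

    def violates(i, c):
        # Reject cluster c if it contradicts a constraint involving an
        # already-assigned point in this pass.
        for a, b in must_link:
            other = b if a == i else a if b == i else None
            if other is not None and labels[other] not in (-1, c):
                return True
        for a, b in cannot_link:
            other = b if a == i else a if b == i else None
            if other is not None and labels[other] == c:
                return True
        return False

    for _ in range(iters):
        labels[:] = -1
        for i in rng.permutation(len(X)):
            by_dist = np.argsort(np.linalg.norm(centers - X[i], axis=1))
            ok = [c for c in by_dist if not violates(i, c)]
            labels[i] = ok[0] if ok else by_dist[0]   # fall back if stuck
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Toy usage: two tight groups plus one ambiguous point constrained by feedback.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [0.5, 0.5]], dtype=float)
print(constrained_kmeans(X, k=2, must_link=[(0, 1)], cannot_link=[(1, 4)]))
```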
Lazy Decision Trees
1996
"... Lazy learning algorithms, exemplified by nearestneighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART create a single "best" ..."
Abstract - Cited by 117 (6 self)
Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART, create a single "best" decision tree during the training phase, and this tree is then used to classify test instances. The tests at the nodes of the constructed tree are good on average, but there may be better tests for classifying a specific instance. We propose a lazy decision tree algorithm, LazyDT, that conceptually constructs the "best" decision tree for each test instance. In practice, only a path needs to be constructed, and a caching scheme makes the algorithm fast. The algorithm is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees. Experiments on real and artificial problems are presented.
Introduction
Delay is preferable to error. -...
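A compact sketch of the per-instance idea, assuming categorical features and a plain entropy-based choice of test at each step; LazyDT itself uses a more careful selection criterion and a caching scheme, so this is only an illustration of growing a single path for one query.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def lazy_tree_classify(x_test, X, y, features=None):
    """Grow only the decision path relevant to x_test.

    At each step, pick the feature whose split (on the test instance's own
    value) most reduces entropy over the remaining training examples.
    """
    if features is None:
        features = list(range(X.shape[1]))
    labels = list(y)
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    base = entropy(labels)
    gains = {}
    for f in features:
        mask = X[:, f] == x_test[f]
        if mask.any():
            gains[f] = base - entropy(list(y[mask]))
    if not gains:
        return Counter(labels).most_common(1)[0][0]

    best = max(gains, key=gains.get)
    keep = X[:, best] == x_test[best]
    rest = [f for f in features if f != best]
    return lazy_tree_classify(x_test, X[keep], y[keep], rest)

# Toy usage: the path is built only for this one query.
X = np.array([["sunny", "hot"], ["sunny", "mild"],
              ["rain", "mild"], ["rain", "hot"]])
y = np.array(["no", "no", "yes", "yes"])
print(lazy_tree_classify(np.array(["rain", "mild"]), X, y))
```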
Discovering Structure in Multiple Learning Tasks: The TC Algorithm
- In International Conference on Machine Learning, 1996
"... Recently, there has been an increased interest in "lifelong " machine learning methods, that transfer knowledge across multiple learning tasks. Such methods have repeatedly been found to outperform conventional, single-task learning algorithms when the learning tasks are appropriately rela ..."
Abstract - Cited by 111 (3 self)
Recently, there has been increased interest in "lifelong" machine learning methods that transfer knowledge across multiple learning tasks. Such methods have repeatedly been found to outperform conventional, single-task learning algorithms when the learning tasks are appropriately related. To increase the robustness of such approaches, methods are desirable that can reason about the relatedness of individual learning tasks, in order to avoid the danger arising from tasks that are unrelated and thus potentially misleading. This paper describes the task-clustering (TC) algorithm. TC clusters learning tasks into classes of mutually related tasks. When facing a new learning task, TC first determines the most related task cluster, then exploits information selectively from this task cluster only. An empirical study carried out in a mobile robot domain shows that TC outperforms its non-selective counterpart in situations where only a small number of tasks are relevant.
1 INTRODUCTION
One of t...
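A schematic of the selective-transfer step, assuming pairwise task relatedness has already been measured and collected into a matrix. How relatedness is estimated, and the clustering procedure itself, are simplified stand-ins here, not the TC algorithm as defined in the paper.

```python
import numpy as np

def cluster_tasks(relatedness, threshold=0.5):
    """Group tasks into clusters of mutually related tasks.

    relatedness: (T, T) symmetric matrix of pairwise task relatedness.
    Tasks are linked when their relatedness exceeds `threshold`; clusters
    are the connected components of that link graph (a simplification).
    """
    T = relatedness.shape[0]
    parent = list(range(T))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(T):
        for j in range(i + 1, T):
            if relatedness[i, j] > threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(T):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def most_related_cluster(new_task_rel, clusters):
    """Pick the cluster whose members are, on average, most related to the
    new task; knowledge would then be transferred from that cluster only."""
    return max(clusters, key=lambda c: np.mean([new_task_rel[t] for t in c]))

# Toy usage: four old tasks, a new task clearly related to tasks 2 and 3.
R = np.array([[1.0, 0.8, 0.1, 0.1],
              [0.8, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]])
clusters = cluster_tasks(R, threshold=0.5)
print(clusters)                                  # e.g. [[0, 1], [2, 3]]
print(most_related_cluster(np.array([0.1, 0.2, 0.7, 0.8]), clusters))
```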
Adaptive Metric Nearest Neighbor Classification
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
"... Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose a ..."
Abstract - Cited by 104 (4 self)
Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose a locally adaptive nearest neighbor classification method to try to minimize bias. We use a Chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are highly adaptive to query locations. Neighborhoods are elongated along less relevant feature dimensions and constricted along the most influential ones. As a result, the class conditional probabilities tend to be smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using a variety of simulated and real world data.
1 Introduction
In a classification problem, we are given J classes and N tra...
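A schematic of a query-local metric, assuming feature relevance near the query is estimated from a first-pass neighborhood and then folded back into the distance. The relevance estimate below is a crude stand-in for the paper's Chi-squared distance analysis, not its actual formulation.

```python
import numpy as np
from collections import Counter

def adaptive_knn_predict(x_query, X, y, k=5, k0=50):
    """Two-pass nearest-neighbor rule with a query-local metric.

    Pass 1: take a large neighborhood (k0) under the plain Euclidean metric
    and score each feature by how strongly class means separate along it.
    Pass 2: re-rank with a weighted metric that constricts the neighborhood
    along influential features and elongates it along irrelevant ones.
    """
    d0 = np.linalg.norm(X - x_query, axis=1)
    hood = np.argsort(d0)[:k0]
    Xh, yh = X[hood], y[hood]

    # Crude local relevance: spread of per-class feature means around the
    # overall mean (a stand-in for the Chi-squared analysis).
    overall = Xh.mean(axis=0)
    spread = np.zeros(X.shape[1])
    for c in np.unique(yh):
        spread += np.abs(Xh[yh == c].mean(axis=0) - overall)
    w = spread / (spread.sum() + 1e-12)

    d = np.sqrt(np.sum(w * (X - x_query) ** 2, axis=1))
    nearest = np.argsort(d)[:k]
    return Counter(y[nearest]).most_common(1)[0][0]
```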
Probabilistic Feature Relevance Learning for Content-Based Image Retrieval
- Computer Vision and Image Understanding, 1999
"... Most of the current image retrieval systems use "one-shot" queries to a database to retrieve similar images. Typically a K-nearest neighbor kind of algorithm is used, where weights measuring feature importance along each input dimension remain fixed (or manually tweaked by the user), in th ..."
Abstract - Cited by 68 (18 self)
Most current image retrieval systems use "one-shot" queries to a database to retrieve similar images. Typically a K-nearest-neighbor style of algorithm is used, in which weights measuring feature importance along each input dimension remain fixed (or are manually tweaked by the user) in the computation of a given similarity metric. However, similarity does not vary with equal strength or in the same proportion in all directions in the feature space emanating from the query image. The manual adjustment of these weights is time consuming and exhausting. Moreover, it requires a very sophisticated user. In this paper, we present a novel probabilistic method that enables image retrieval procedures to automatically capture feature relevance based on the user's feedback and that is highly adaptive to query locations. Experimental results are presented that demonstrate the efficacy of our technique using both simulated and real-world data.
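An illustrative relevance-feedback loop, assuming the user marks returned images as relevant and that features along which the relevant images stay close to the query should be up-weighted. The exponential re-weighting below is an assumption made for the sketch, not the paper's probabilistic relevance estimate.

```python
import numpy as np

def retrieve(query, feats, weights, top=10):
    """Return indices of the `top` images under a feature-weighted distance."""
    d = np.sqrt(np.sum(weights * (feats - query) ** 2, axis=1))
    return np.argsort(d)[:top]

def update_weights(query, feats, relevant_idx, temperature=1.0):
    """Raise the weight of features along which relevant images hug the query.

    spread_j: average |x_j - q_j| over the user-marked relevant images;
    small spread -> the feature discriminates well locally -> larger weight.
    """
    spread = np.mean(np.abs(feats[relevant_idx] - query), axis=0)
    w = np.exp(-spread / temperature)
    return w / w.sum()

# One feedback round (toy data: 100 images, 4 features).
rng = np.random.default_rng(1)
feats = rng.random((100, 4))
query = feats[0]
weights = np.full(4, 0.25)                 # start with uniform relevance
shown = retrieve(query, feats, weights)
relevant = shown[:3]                       # pretend the user marked these
weights = update_weights(query, feats, relevant)
shown = retrieve(query, feats, weights)    # next round adapts to the feedback
print(weights, shown[:5])
```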
Improving minority class prediction using case-specific feature weights
- Proceedings of the Fourteenth International Conference on Machine Learning, 1997
"... This paper addresses the problem of handling skewed class distributions within the case-based learning (CBL) framework. We rst present as a baseline an informationgain-weighted CBL algorithm and apply it to three data sets from natural language processing (NLP) with skewed class distributions. Altho ..."
Abstract - Cited by 62 (4 self)
This paper addresses the problem of handling skewed class distributions within the case-based learning (CBL) framework. We first present as a baseline an information-gain-weighted CBL algorithm and apply it to three data sets from natural language processing (NLP) with skewed class distributions. Although the overall performance of the baseline CBL algorithm is good, we show that the algorithm exhibits poor performance on minority class instances. We then present two CBL algorithms designed to improve the performance of minority class predictions. Each variation creates test-case-specific feature weights by first observing the path taken by the test case in a decision tree created for the learning task, and then using path-specific information gain values to create an appropriate weight vector for use during case retrieval. When applied to the NLP data sets, the algorithms are shown to significantly increase the accuracy of minority class predictions while maintaining or improving overall classification accuracy.
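A sketch of the path-based weighting idea, assuming scikit-learn's decision tree as the tree learner and entropy as the impurity measure: the information gain of each test the case actually meets on its path becomes that feature's weight for the subsequent case retrieval. The exact weighting scheme in the paper differs; this only illustrates the mechanism.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def path_feature_weights(clf, x_test, n_features):
    """Weight features by the information gain of the tests the given
    case actually encounters on its path through the decision tree."""
    tree = clf.tree_
    path_nodes = clf.decision_path(x_test.reshape(1, -1)).indices
    weights = np.zeros(n_features)
    for n in path_nodes:
        f = tree.feature[n]
        if f < 0:            # leaf node: no test here
            continue
        left, right = tree.children_left[n], tree.children_right[n]
        n_node = tree.weighted_n_node_samples[n]
        gain = tree.impurity[n] - (
            tree.weighted_n_node_samples[left] / n_node * tree.impurity[left]
            + tree.weighted_n_node_samples[right] / n_node * tree.impurity[right])
        weights[f] += gain
    return weights / (weights.sum() + 1e-12)

def weighted_case_retrieval(x_test, X, y, weights, k=3):
    d = np.sqrt(np.sum(weights * (X - x_test) ** 2, axis=1))
    return Counter(y[np.argsort(d)[:k]]).most_common(1)[0][0]

# Toy usage: weights are recomputed per test case from its own tree path.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y = (X[:, 0] + 0.2 * X[:, 1] > 0.6).astype(int)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
x_test = X[0]
w = path_feature_weights(clf, x_test, X.shape[1])
print(w, weighted_case_retrieval(x_test, X, y, w))
```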