Learning in the Presence of Concept Drift and Hidden Contexts
- Machine Learning
, 1996
Cited by 135 (0 self)
On-line learning in domains where the target concept depends on some hidden context poses serious problems. A changing context can induce changes in the target concepts, producing what is known as concept drift. We describe a family of learning algorithms that flexibly react to concept drift and can take advantage of situations where contexts reappear. The general approach underlying all these algorithms consists of (1) keeping only a window of currently trusted examples and hypotheses; (2) storing concept descriptions and re-using them when a previous context reappears; and (3) controlling both of these functions by a heuristic that constantly monitors the system's behavior. The paper reports on experiments that test the systems' performance under various conditions such as different levels of noise and different extent and rate of concept drift. Keywords: Incremental concept learning, on-line learning, context dependence, concept drift, forgetting 1. Introduction The work presen...
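The three ingredients listed in this abstract (a window of trusted examples, stored concept descriptions, and a monitoring heuristic) can be illustrated with a toy sketch. This is not the algorithm family from the paper: the class name `WindowedLearner`, the lookup-table concept representation, the accuracy thresholds, and the rule for picking a stored concept are all assumptions made for illustration.

```python
from collections import deque


class WindowedLearner:
    """Toy drift-aware learner; a concept is a dict mapping instance -> label."""

    def __init__(self, max_window=12, monitor=4):
        self.window = deque(maxlen=max_window)  # (1) currently trusted examples
        self.concept = {}                       # current hypothesis
        self.stored = []                        # (2) saved concept descriptions
        self.hits = deque(maxlen=monitor)       # (3) recent prediction outcomes

    def predict(self, x):
        # Unseen instances default to False (an arbitrary choice for this toy).
        return self.concept.get(x, False)

    @staticmethod
    def fit(concept, examples):
        """Fraction of examples the concept labels correctly."""
        return sum(concept.get(x) == y for x, y in examples) / len(examples)

    def update(self, x, y):
        self.hits.append(self.predict(x) == y)
        self.window.append((x, y))
        self.concept[x] = y
        if len(self.hits) < self.hits.maxlen:
            return
        accuracy = sum(self.hits) / len(self.hits)
        if accuracy == 0.0:
            # Suspected drift: trust only the examples that were just missed.
            recent = list(self.window)[-self.hits.maxlen:]
            self.window = deque(recent, maxlen=self.window.maxlen)
            # Re-use a stored concept if it explains the recent examples at
            # least as well as a fresh one and covers more instances.
            candidates = [dict(recent)] + self.stored
            best = max(candidates, key=lambda c: (self.fit(c, recent), len(c)))
            self.concept = dict(best)
            self.hits.clear()
        elif accuracy == 1.0 and self.concept not in self.stored:
            self.stored.append(dict(self.concept))


def stream(label, n):
    """n examples cycling over instances 0..5, labelled by `label`."""
    return [(i % 6, label(i % 6)) for i in range(n)]


def is_even(x):
    return x % 2 == 0


def is_odd(x):
    return x % 2 == 1


learner = WindowedLearner()
# Context A, then context B, then the first four examples of A again.
for x, y in stream(is_even, 12) + stream(is_odd, 12) + stream(is_even, 4):
    learner.update(x, y)

print(len(learner.stored))  # 2: one stored description per context
print(learner.predict(4))   # True: instance 4 is answered from the stored A concept
```

When context A returns, instance 4 has not yet been shown again, so the correct prediction comes from the re-used stored concept rather than from the shrunken window.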
Detecting Concept Drift with Support Vector Machines
- In Proceedings of the Seventeenth International Conference on Machine Learning (ICML)
, 2000
Cited by 72 (8 self)
For many learning tasks where data is collected over an extended period of time, its underlying distribution is likely to change. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest. Both the interest of the user and the document content change over time. A filtering system should be able to adapt to such concept changes. This paper proposes a new method to recognize and handle concept changes with support vector machines. The method maintains a window on the training data. The key idea is to automatically adjust the window size so that the estimated generalization error is minimized. The new approach is both theoretically well-founded as well as effective and efficient in practice. Since it does not require complicated parameterization, it is simpler to use and more robust than comparable heuristics. Experiments with simulated concept drift scenarios based on real-world text data com...
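The key idea in this abstract, choosing the window size that minimizes an estimate of the generalization error, can be sketched as follows. The sketch makes several substitutions that are assumptions, not the paper's method: a one-dimensional nearest-class-mean classifier stands in for the support vector machine, brute-force leave-one-out error on the newest batch stands in for the paper's error estimate, and the function names are hypothetical.

```python
def nearest_mean_classifier(examples):
    """Return predict(x) that picks the class whose mean feature is closest."""
    means = {}
    for label in {y for _, y in examples}:
        xs = [x for x, y in examples if y == label]
        means[label] = sum(xs) / len(xs)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def loo_error_on_latest(batches, window):
    """Leave-one-out error on the newest batch, training on the last `window` batches."""
    older = [ex for batch in batches[-window:-1] for ex in batch]
    latest = batches[-1]
    errors = 0
    for i, (x, y) in enumerate(latest):
        # Train on the window with the i-th newest example held out.
        train = older + latest[:i] + latest[i + 1:]
        errors += nearest_mean_classifier(train)(x) != y
    return errors / len(latest)


def choose_window(batches):
    """Pick the window size (in batches) with the lowest estimated error.

    Ties go to the larger window, since more data is preferable when the
    estimate cannot tell the candidates apart.
    """
    sizes = range(1, len(batches) + 1)
    return min(sizes, key=lambda w: (loo_error_on_latest(batches, w), -w))


# Each example is (feature, label); `flipped` reverses the concept in `old`.
old = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
flipped = [(1.0, 1), (2.0, 1), (8.0, 0), (9.0, 0)]

print(choose_window([old, old, flipped]))  # 1: only the newest batch is trusted
print(choose_window([old, old, old]))      # 3: no drift, so all data is kept
```

Each batch needs at least two examples so that the held-out training set is never empty.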
Tracking drifting concepts by minimizing disagreements
- Machine Learning
, 1994
Cited by 55 (3 self)
In this paper we consider the problem of tracking a subset of a domain (called the target) which changes gradually over time. A single (unknown) probability distribution over the domain is used to generate random examples for the learning algorithm and measure the speed at which the target changes. Clearly, the more rapidly the target moves, the harder it is for the algorithm to maintain a good approximation of the target. Therefore we evaluate algorithms based on how much movement of the target can be tolerated between examples while predicting with accuracy $\epsilon$. Furthermore, the complexity of the class $\mathcal{H}$ of possible targets, as measured by $d$, its VC-dimension, also affects the difficulty of tracking the target concept. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class $\mathcal{H}$ can be approximated to within a factor $k$, then there is a simple tracking algorithm for $\mathcal{H}$ which can achieve a probability $\epsilon$ of making a mistake if the target movement rate is at most a constant times $\epsilon^2/(k(d + k) \ln(1/\epsilon))$, where $d$ is the Vapnik-Chervonenkis dimension of $\mathcal{H}$. Also, we show that if $\mathcal{H}$ is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of $7d + 1$, yielding an efficient tracking algorithm for $\mathcal{H}$ which tolerates drift rates up to a constant times $\epsilon^2/(d^2 \ln(1/\epsilon))$. In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times $\epsilon^2/d$.
The Problem of Concept Drift: Definitions and Related Work
, 2004
Cited by 32 (1 self)
In the real world concepts are often not stable but change with time. Typical examples of this are weather prediction rules and customers' preferences. The underlying data distribution may change as well. Often these changes make the model built on old data inconsistent with the new data, and regular updating of the model is necessary. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques, which treat arriving instances as equally important contributors to the final concept. This paper considers different types of concept drift, peculiarities of the problem, and gives a critical review of existing approaches to the problem.
Beating a Defender in Robotic Soccer: Memory-Based Learning of a Continuous Function
, 1995
Cited by 21 (8 self)
Learning how to adjust to an opponent's position is critical to the success of having intelligent agents collaborating towards the achievement of specific tasks in unfriendly environments. This paper describes our work on developing methods to learn to choose an action based on a continuous-valued state attribute indicating the position of an opponent. We use a framework in which teams of agents compete in a simulator of a game of robotic soccer. We introduce a memory-based supervised learning strategy which enables an agent to choose to pass or shoot in the presence of a defender. In our memory model, training examples affect neighboring generalized learned instances with different weights. We conduct experiments in which the agent incrementally learns to approximate a function with a continuous domain. Then we investigate the question of how the agent performs in nondeterministic variations of the training situations. Our experiments indicate that when the random variations fall within some bound of the initial training, the agent performs better with some initial training rather than from a tabula-rasa.
Density-adaptive learning and forgetting
- In Proceedings of the Tenth International Conference on Machine Learning
, 1993
Cited by 21 (2 self)
We describe a density-adaptive reinforcement learning and a density-adaptive forgetting algorithm. This learning algorithm uses hybrid k-D/2k-trees to allow for a variable resolution partitioning and labelling of the input space. The density adaptive forgetting algorithm deletes observations from the learning set depending on whether subsequent evidence is available in a local region of the parameter space. The algorithms are demonstrated in a simulation for learning feasible robotic grasp approach directions and orientations and then adapting to subsequent mechanical failures in the gripper.
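The forgetting rule described here, deleting an observation once newer evidence exists in its local region, can be sketched in a few lines. This is only an illustration under strong assumptions: a fixed Euclidean radius stands in for the paper's density-adaptive tree partitioning, and the function name and data layout are hypothetical.

```python
import math


def add_with_local_forgetting(observations, new_obs, radius):
    """Add new_obs and forget older observations within `radius` of it.

    Each observation is a (point, outcome) pair, where point is a tuple of
    coordinates. A new list is returned; the input list is left unchanged.
    """
    point, _ = new_obs
    kept = [(p, v) for p, v in observations if math.dist(p, point) > radius]
    kept.append(new_obs)
    return kept


# Two old grasp attempts, then a newer failure close to the first one.
history = [((0.0, 0.0), "ok"), ((5.0, 5.0), "ok")]
history = add_with_local_forgetting(history, ((0.1, 0.0), "fail"), radius=0.5)
print(history)
```

The old observation at the origin is dropped because newer evidence contradicts or supersedes it locally, while the distant observation is kept.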
Adapting to Drift in Continuous Domains
- In Proceedings of the 8th European Conference on Machine Learning
, 1995
Cited by 18 (1 self)
The paper presents the system FRANN, which exploits the idea of radial-basis functions for the needs of learning in numeric domains under concept drift. The classification accuracy of the program compares favourably to that of older algorithms that are based on symbol manipulation. The system tolerates noise and is able to learn symbolic, numeric, and mixed concepts with nonlinear boundaries in environments with abrupt as well as gradual concept drift. Research area. Inductive learning Key words. concept drift, radial-basis functions Demo request. No Address for Correspondence: Miroslav Kubat, Institute for Systems Sciences, Johannes Kepler University, A-4040 Linz, Austria, e-mail: mirek@cast.uni-linz.ac.at 1 Introduction Recently, the problem of on-line learning in time-varying domains has received attention in the machine learning community. The essence is to make the learner recognize gradual or abrupt changes in the target concept and adjust accordingly the internal representa...
Concept Drift and the Importance of Examples
- Text Mining – Theoretical Aspects and Applications
, 2002
Cited by 10 (4 self)
For many learning tasks where data is collected over an extended period of time, its underlying distribution is likely to change. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest. Both the interest of the user and the document content change over time. A filtering system should be able to adapt to such concept changes.
Using Labeled and Unlabeled Data to Learn Drifting Concepts
- In Workshop notes of IJCAI-01 Workshop on Learning from Temporal and Spatial Data
, 2001
Cited by 8 (3 self)
For many learning tasks where data is collected over an extended period of time, one has to cope with two problems. The distribution underlying the data is likely to change and only little labeled training data is available at each point in time. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest. Both the interest of the user and the document content change over time. A filtering system should be able to adapt to such concept changes. Since users often give little feedback, a filtering system should also be able to achieve a good performance, even if only few labeled training examples are provided. This paper proposes a method to recognize and handle concept changes with support vector machines and to use unlabeled data to reduce the need for labeled data. The method maintains windows on the training data, whose size is automatically adjusted so that the estimated generalization error is minimized. The approach is both theoretically well-founded as well as effective and efficient in practice. Since it does not require complicated parameterization, it is simpler to use and more robust than comparable heuristics. Experiments with simulated concept drift scenarios based on real-world text data compare the new method with other window management approaches and show that it can effectively select an appropriate window size in a robust way. In order to achieve an acceptable performance with fewer labeled training examples, the proposed method exploits unlabeled examples in a transductive way.
On the Complexity of Learning from Drifting Distributions
- In Proceedings of the Workshop on Computational Learning Theory
, 1996
Cited by 8 (0 self)
We consider two models of on-line learning of binary-valued functions from drifting distributions due to Bartlett. We show that if each example is drawn from a joint distribution which changes in total variation distance by at most $O(\epsilon^3/(d \log(1/\epsilon)))$ between trials, then an algorithm can achieve a probability of a mistake at most $\epsilon$ worse than the best function in a class of VC-dimension $d$. We prove a corresponding necessary condition of $O(\epsilon^3/d)$. Finally, in the case that a fixed function is to be learned from noise-free examples, we show that if the distributions on the domain generating the examples change by at most $O(\epsilon^2/(d \log(1/\epsilon)))$, then any consistent algorithm learns to within accuracy $\epsilon$. 1 Introduction In prediction models [7, 11] like that studied in this paper, learning proceeds in trials, where in the $t$th trial, the algorithm (1) is given $x_t$ chosen from some set $X$, (2) is required to output a prediction $y_t \in \{0, 1\}$, and (3) discovers $y_t \in \{0,$...

