## Mining Concept-Drifting Data Streams Using Ensemble Classifiers (2003)

### Download Links

- [wis.cs.ucla.edu]
- [domino.watson.ibm.com]
- [www-faculty.cs.uiuc.edu]
- [magna.cs.ucla.edu]
- [www.cs.uiuc.edu]
- [web.engr.illinois.edu]
- [hanj.cs.illinois.edu]
- [www.cs.columbia.edu]
- [www1.cs.columbia.edu]

### Other Repositories/Bibliography

- CiteULike
- DBLP

Citations: 280 (37 self)

### Citations

2213 | Experiments with a New Boosting Algorithm
- Freund, Schapire
- 1996

Citation Context: ...ble 6: Benefits (US $) using Single Classifiers and Classifier Ensembles (original stream) ...sembles include changing the instances used for training through techniques such as Bagging [3] and Boosting [12]. The classifier ensembles have several advantages over single model classifiers. First, classifier ensembles offer a significant improvement in prediction accuracy [12, 24]. Second, building a classi...

1274 | Fast effective rule induction.
- Cohen
- 1995

Citation Context: ...ts on prediction accuracy, and to analyze the advantage of our approach over alternative methods such as incremental learning. The base models used in our tests are C4.5 [20], the RIPPER rule learner [5], and the Naive Bayesian method. The tests are conducted on a Linux machine with a 770 MHz CPU and 256 MB main memory. 6.1 Algorithms used in Comparison We denote a classifier ensemble with a capacity...

786 | Models and issues in data stream systems.
- Babcock, Babu, et al.
- 2002

Citation Context: ...data generating mechanism, or the concept that we try to learn from the data, is constantly evolving. Knowledge discovery on streaming data is a research topic of growing interest [1, 4, 7, 19]. The fundamental problem we need to solve is the following: given an infinite amount of continuous measurements, how do we model them in order to capture time-evolving trends and patterns in the stre...

730 | Neural networks and the bias/variance dilemma.
- Geman, Bienenstock, et al.
- 1992

Citation Context: ...well trained classifier are expected to approximate the a posteriori class distribution. In addition to the Bayes error, the remaining error of the classifier can be decomposed into bias and variance [15, 6]. More specifically, given a test example y, the probability output of classifier Ci can be expressed as: f_c^i(y) = p(c|y) + β_c^i + η_c^i(y) (1), where β_c^i + η_c^i(y) is the added error for y and p(c|y) is the a po...
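A minimal numeric sketch of the bias/variance decomposition described in this snippet. All values below are invented for illustration: each member's probability output deviates from the true posterior p(c|y) by a fixed bias plus zero-mean noise, and averaging several independent members leaves the bias but shrinks the variance part of the added error.

```python
import random

def classifier_output(p_true, bias, noise_sd, rng):
    """One classifier's probability estimate for class c on example y:
    f_c(y) = p(c|y) + beta_c + eta_c(y), per Eq. (1) in the snippet above."""
    return p_true + bias + rng.gauss(0.0, noise_sd)

def added_error(p_true, outputs):
    """Mean squared added error of a set of probability outputs."""
    return sum((o - p_true) ** 2 for o in outputs) / len(outputs)

rng = random.Random(0)
p_true = 0.7            # the a posteriori probability p(c|y), assumed known here
bias, noise_sd = 0.02, 0.1

single = [classifier_output(p_true, bias, noise_sd, rng) for _ in range(10000)]

# Averaging k independent members keeps the bias term but cuts the
# variance term by roughly a factor of k.
k = 10
avg = [sum(classifier_output(p_true, bias, noise_sd, rng) for _ in range(k)) / k
       for _ in range(10000)]

print(added_error(p_true, single) > added_error(p_true, avg))
```

The comparison printed at the end comes out true: the averaged outputs sit much closer to p(c|y) than any single member's.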

707 | An empirical comparison of voting classification algorithms: Bagging, boosting, and variants.
- Bauer, Kohavi
- 1999

Citation Context: ...Table 6: Benefits (US $) using Single Classifiers and Classifier Ensembles (original stream) ...sembles include changing the instances used for training through techniques such as Bagging [3] and Boosting [12]. The classifier ensembles have several advantages over single model classifiers. First, classifier ensembles offer a significant improvement in prediction accuracy [12, 24]. Second,...

402 | Mining high-speed data streams.
- Domingos, Hulten
- 2000

Citation Context: ...data generating mechanism, or the concept that we try to learn from the data, is constantly evolving. Knowledge discovery on streaming data is a research topic of growing interest [1, 4, 7, 19]. The fundamental problem we need to solve is the following: given an infinite amount of continuous measurements, how do we model them in order to capture time-evolving trends and patterns in the stre...

338 | Mining Time-changing Data Streams.
- Hulten, Spencer, et al.
- 2001

Citation Context: ...data generating mechanism, or the concept that we try to learn from the data, is constantly evolving. Knowledge discovery on streaming data is a research topic of growing interest [1, 4, 7, 19]. The fundamental problem we need to solve is the following: given an infinite amount of continuous measurements, how do we model them in order to capture time-evolving trends and patterns in the stre...

312 | Sprint: A scalable parallel classifier for data mining.
- Shafer, Agrawal, et al.
- 1996

Citation Context: ...community. One of the goals of traditional data mining algorithms is to learn models from large databases with bounded memory. It has been achieved by several classification methods, including Sprint [21], BOAT [14], etc. Nevertheless, the fact that these algorithms require multiple scans of the training data makes them inappropriate in the streaming environment where examples are coming in at a highe...

308 | Continuous queries over data streams.
- Babu, Widom
- 2001

295 | Clustering data streams.
- Guha, Mishra, et al.
- 2000

Citation Context: ...k has been done on modeling [1], querying [2, 13, 16], and mining data streams, for instance, several papers have been published on classification [7, 19, 23], regression analysis [4], and clustering [17]. Traditional data mining algorithms are challenged by two characteristic features of data streams: the infinite data flow and the drifting concepts. As methods that require multiple scans of the data...

212 | Bias plus variance decomposition for zero-one loss functions.
- Kohavi, Wolpert
- 1996

206 | Space-efficient online computation of quantile summaries.
- Greenwald, Khanna
- 2001

Citation Context: ...e above tests are given in Table 6 and 5. 7. DISCUSSION AND RELATED WORK Data stream processing has recently become a very important research domain. Much work has been done on modeling [1], querying [2, 13, 16], and mining data streams, for instance, several papers have been published on classification [7, 19, 23], regression analysis [4], and clustering [17]. Traditional data mining algorithms are challeng...

198 | Incremental Induction of Decision Trees.
- Utgoff
- 1989

Citation Context: ...f the training data makes them inappropriate in the streaming environment where examples are coming in at a higher rate than they can be repeatedly analyzed. Incremental or online data mining methods [25, 14] are another option for mining data streams. These methods continuously revise and refine a model by incorporating new data as they arrive. However, in order to guarantee that the model trained increm...

185 | Error correlation and error reduction in ensemble classifiers.
- Tumer, Ghosh
- 1996

Citation Context: ...ty of y being an instance of class c. A classifier ensemble pools the outputs of several classifiers before a decision is made. The most popular way of combining multiple classifiers is via averaging [24], in which case the probability output of the ensemble is given by: Ek produces a smaller classification error than Gk, if classifiers in Ek are weighted by their expected classification accuracy on t...
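The averaging combination this snippet describes can be sketched as a normalized weighted mean of the members' probability outputs. The member outputs and weights below are hypothetical, standing in for f_c^i(y) values and accuracy-based weights:

```python
def weighted_average_proba(outputs, weights):
    """Ensemble probability for class c: weighted average of member
    outputs f_c^i(y), normalized so the weights sum to one."""
    total = sum(weights)
    return sum(w * f for f, w in zip(outputs, weights)) / total

# Three hypothetical members' estimates of p(c|y), and their weights.
members = [0.9, 0.6, 0.3]
weights = [3.0, 2.0, 1.0]
print(weighted_average_proba(members, weights))
```

With these numbers the ensemble output is 0.7, pulled toward the more heavily weighted (presumably more accurate) members.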

167 | A streaming ensemble algorithm (SEA) for large-scale classification.
- Street, Kim
- 2001

Citation Context: ...tly become a very important research domain. Much work has been done on modeling [1], querying [2, 13, 16], and mining data streams, for instance, several papers have been published on classification [7, 19, 23], regression analysis [4], and clustering [17]. Traditional data mining algorithms are challenged by two characteristic features of data streams: the infinite data flow and the drifting concepts. As m...

144 | Multi-dimensional regression analysis of time-series data streams.
- Chen, Dong, et al.
- 2002

120 | Rainforest - a framework for fast decision tree construction of large datasets.
- Gehrke, Ramakrishnan, et al.
- 2000

117 | BOAT - optimistic decision tree construction.
- Gehrke, Ganti, et al.
- 1999

Citation Context: ...One of the goals of traditional data mining algorithms is to learn models from large databases with bounded memory. It has been achieved by several classification methods, including Sprint [21], BOAT [14], etc. Nevertheless, the fact that these algorithms require multiple scans of the training data makes them inappropriate in the streaming environment where examples are coming in at a higher rate than...

67 | Credit Card Fraud Detection Using Meta Learning: Issues and Initial Results.
- Stolfo, Fan
- 1997

Citation Context: ...the data include the time of the transaction, the merchant type, the merchant location, past payments, the summary of transaction history, etc. A detailed description of this data set can be found in [22]. We use the benefit matrix shown in Table 1 with the cost of disputing and investigating a fraud transaction fixed at cost = $90. The total benefit is the sum of recovered amount of fraudulent transa...

59 | A unified bias-variance decomposition and its applications.
- Domingos
- 2000

Citation Context: ...well trained classifier are expected to approximate the a posteriori class distribution. In addition to the Bayes error, the remaining error of the classifier can be decomposed into bias and variance [15, 6]. More specifically, given a test example y, the probability output of classifier Ci can be expressed as: f_c^i(y) = p(c|y) + β_c^i + η_c^i(y) (1), where β_c^i + η_c^i(y) is the added error for y and p(c|y) is the a po...

57 | Continually evaluating similarity-based pattern queries on a streaming time series.
- Gao, Wang
- 2002

Citation Context: ...e above tests are given in Table 6 and 5. 7. DISCUSSION AND RELATED WORK Data stream processing has recently become a very important research domain. Much work has been done on modeling [1], querying [2, 13, 16], and mining data streams, for instance, several papers have been published on classification [7, 19, 23], regression analysis [4], and clustering [17]. Traditional data mining algorithms are challeng...

41 | Pasting bites together for prediction in large data sets.
- Breiman
- 1999

26 | Modeling and
- Wang, Liao
- 2005

Citation Context: ...ion of large databases. Previously, we used averaging ensemble for scalable learning over very-large datasets [10]. We show that a model’s performance can be estimated before it is completely learned [8, 9]. In this work, we use weighted ensemble classifiers on concept-drifting data streams. It combines multiple classifiers weighted by their expected prediction accuracy on the current test data. Compare...
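The snippet says members are weighted by their expected prediction accuracy on current test data. A simplified sketch of one way to do that: score each member on the most recent chunk of the stream. The toy classifiers and chunk below are invented for illustration; the paper's actual scheme derives weights from estimated errors and benefits, not raw accuracy.

```python
def accuracy_weights(classifiers, chunk):
    """Weight each ensemble member by its accuracy on the most recent
    chunk of (x, label) pairs from the stream; a simplified stand-in
    for the paper's error/benefit-based weighting."""
    n = len(chunk)
    return [sum(1 for x, y in chunk if clf(x) == y) / n for clf in classifiers]

# Two hypothetical classifiers evaluated on a toy chunk.
chunk = [(0, 0), (1, 1), (2, 0), (3, 1)]
always_zero = lambda x: 0
parity = lambda x: x % 2
print(accuracy_weights([always_zero, parity], chunk))  # [0.5, 1.0]
```

Recomputing the weights on every new chunk is what lets the ensemble track a drifting concept: members trained on outdated concepts earn low weight and fade out.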

12 | Distributed learning on very large data sets.
- Hall, Bowyer, et al.
- 2000

Citation Context: ...re efficient than building a single model, since most model construction algorithms have super-linear complexity. Third, the nature of classifier ensembles lends itself to scalable parallelization [18] and on-line classification of large databases. Previously, we used averaging ensemble for scalable learning over very-large datasets [10]. We show that a model’s performance can be estimated before i...

11 | Pruning and dynamic scheduling of cost-sensitive ensembles.
- Fan, Chu, et al.
- 2002

Citation Context: ...the KL-distance criterion is limited to streams with no or mild concept drifting only, since concept drifts also enlarge the KL-distance D(p||q) = Σ_x p(x) log(p(x)/q(x)). In this paper, we apply the instance-based pruning technique [11] to data streams with conceptual drifts. 5.2 Instance Based Pruning Cost-sensitive applications usually provide higher error tolerance. For instance, in credit card fraud...
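For reference, the KL-distance criterion this snippet mentions, in its standard discrete form (the snippet itself only shows a fragment of the formula):

```python
import math

def kl_distance(p, q):
    """D(p||q) = sum_x p(x) * log(p(x) / q(x)) for discrete distributions
    p and q; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

print(kl_distance([0.5, 0.5], [0.5, 0.5]))   # 0.0 for identical distributions
print(kl_distance([0.9, 0.1], [0.5, 0.5]))   # positive once they differ
```

This asymmetry and growth under distribution shift is exactly why, as the snippet notes, concept drift also enlarges the KL-distance, limiting the criterion to streams with little or no drift.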

8 | A framework for scalable cost-sensitive learning based on combining probabilities and benefits.
- Fan, Wang, et al.
- 2002

Citation Context: ...ssifier ensembles lend themselves to scalable parallelization [18] and on-line classification of large databases. Previously, we used averaging ensemble for scalable learning over very-large datasets [10]. We show that a model’s performance can be estimated before it is completely learned [8, 9]. In this work, we use weighted ensemble classifiers on concept-drifting data streams. It combines multiple ...

1 | Inductive learning in less than one sequential scan.
- Fan, Wang, et al.
- 2003

Citation Context: ...ion of large databases. Previously, we used averaging ensemble for scalable learning over very-large datasets [10]. We show that a model’s performance can be estimated before it is completely learned [8, 9]. In this work, we use weighted ensemble classifiers on concept-drifting data streams. It combines multiple classifiers weighted by their expected prediction accuracy on the current test data. Compare...