Clustering Association Rules
, 1997
Cited by 99 (0 self)
We consider the problem of clustering two-dimensional association rules in large databases. We present a geometric-based algorithm, BitOp, for performing the clustering, embedded within an association rule clustering system, ARCS. Association rule clustering is useful when the user desires to segment the data. We measure the quality of the segmentation generated by ARCS using the Minimum Description Length (MDL) principle of encoding the clusters on several databases, including ones with noise and errors. Scale-up experiments show that ARCS, using the BitOp algorithm, scales linearly with the amount of data.
1 Introduction
Data mining, or the efficient discovery of interesting patterns from large collections of data, has been recognized as an important area of database research. The most commonly sought patterns are association rules as introduced in [AIS93b]. Intuitively, an association rule identifies a frequently occurring pattern of information in a database. Consider a supermarket database w...
The Quest Data Mining System
- In Proc. of the 2nd Int'l Conference on Knowledge Discovery in Databases and Data Mining
, 1996
Cited by 72 (2 self)
This paper is a capsule summary of the current functionality and architecture of the Quest data mining system. Our overall approach has been to identify basic data mining operations that cut across applications and to develop fast, scalable algorithms for their execution (Agrawal, Imielinski, & Swami 1993a). We wanted our algorithms to:
The application of AdaBoost for distributed, scalable and online learning
- Pages 362–366 of: SIGKDD Conference on Knowledge and Data Mining (KDD
, 1999
Cited by 25 (1 self)
We propose to use AdaBoost to efficiently learn classifiers over very large and possibly distributed data sets that cannot fit into main memory, as well as for on-line learning where new data become available periodically. We propose two new ways to apply AdaBoost. The first allows the use of a small sample of the weighted training set to compute a weak hypothesis. The second uses AdaBoost to re-weight classifiers in an ensemble, and thus to reuse previously computed classifiers along with a new classifier computed on a new increment of data. These two techniques of using AdaBoost provide scalable, distributed and on-line learning. We discuss these methods and their implementation in JAM, an agent-based learning system. Empirical studies on four real-world and artificial data sets have shown results that are either comparable to or better than learning classifiers over the complete training set and, in some cases, are comparable to boosting on the complete data set. However, our algorithms use much smaller samples of the training set and require much less memory.
MINTO: A Software Tool for Mining Manufacturing Databases
In this report, we discuss the design and implementation of the MINTO software mining tool. MINTO features a variety of mining algorithms catering to simple association rules, generalized association rules, quantitative association rules, parallel mining, sampling-based mining, incremental mining and vertical mining. Some of these algorithms are taken from the literature, while the others are new algorithms that we have designed as part of this project.

