by Chang-hung Lee, Cheng-ru Lin, Ming-syan Chen
In Proc. of the Int’l Conf. on Information and Knowledge Management
http://www2.ee.ntu.edu.tw/~mschen/paperps/cikm414.pdf
Abstract:
We explore in this paper an effective sliding-window filtering (SWF) algorithm for incremental mining of association rules. In essence, SWF divides a transaction database into several partitions and employs a filtering threshold in each partition to control candidate itemset generation. Under SWF, the cumulative information from mining previous partitions is selectively carried over toward the generation of candidate itemsets for subsequent partitions. Algorithm SWF not only significantly reduces I/O and CPU cost through cumulative filtering and scan reduction techniques but also effectively controls memory utilization through the sliding-window partition technique. Algorithm SWF is particularly powerful for efficient incremental mining of an ongoing time-variant transaction database. With proper scan reduction techniques, SWF needs only one scan of the incremented dataset. The I/O cost of SWF is orders of magnitude smaller than that required by prior methods, thus resolving the performance bottleneck. Experimental studies are performed to evaluate the performance of algorithm SWF. The improvement achieved by SWF becomes even more prominent as the incremented portion of the dataset increases and as the size of the database increases.
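The abstract describes per-partition filtering with selective carry-over of candidates but gives no pseudocode. The sketch below is one possible reading of that idea, restricted to 2-itemsets, and is not the authors' implementation. The function names `cumulative_filter` and `frequent_pairs`, the `[start, count]` bookkeeping, and the rule that a carried candidate must meet the threshold over all partitions since it was first admitted are assumptions made for illustration.

```python
from itertools import combinations
from math import ceil


def cumulative_filter(partitions, min_support):
    """Return candidate 2-itemsets that survive cumulative filtering.

    partitions  -- list of partitions, each a list of transactions (sets of items)
    min_support -- fractional support threshold in (0, 1]

    Result maps each surviving pair to [start_partition, cumulative_count].
    """
    candidates = {}
    sizes = []
    for k, part in enumerate(partitions):
        sizes.append(len(part))

        # Count every pair that occurs in the current partition.
        local = {}
        for txn in part:
            for pair in combinations(sorted(txn), 2):
                local[pair] = local.get(pair, 0) + 1

        # Carried-over candidates: add the new occurrences and re-check the
        # threshold over all partitions seen since the candidate was admitted.
        for pair in list(candidates):
            start, count = candidates[pair]
            count += local.pop(pair, 0)
            span = sum(sizes[start:k + 1])
            if count >= ceil(min_support * span):
                candidates[pair] = [start, count]
            else:
                del candidates[pair]

        # Pairs not yet tracked: admit them only if they pass the
        # threshold within this partition alone.
        for pair, count in local.items():
            if count >= ceil(min_support * len(part)):
                candidates[pair] = [k, count]
    return candidates


def frequent_pairs(partitions, min_support):
    """Verify the surviving candidates with a single pass over all transactions."""
    counts = dict.fromkeys(cumulative_filter(partitions, min_support), 0)
    total = sum(len(part) for part in partitions)
    for part in partitions:
        for txn in part:
            for pair in combinations(sorted(txn), 2):
                if pair in counts:
                    counts[pair] += 1
    return {p: c for p, c in counts.items() if c >= ceil(min_support * total)}


if __name__ == "__main__":
    parts = [
        [{"a", "b"}, {"a", "b"}, {"c"}],
        [{"a", "b"}, {"c", "d"}, {"c", "d"}],
    ]
    print(cumulative_filter(parts, 0.5))
    print(frequent_pairs(parts, 0.5))
```

In this toy run the pair (c, d) passes the filter in the second partition but fails the final verification pass over the whole database, which illustrates why a candidate set built by filtering still needs one confirming scan.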
Citations
364 | Fast algorithms for mining association rules in large databases – Agrawal, Srikant - 1994
361 | Mining Generalized Association Rules – Srikant, Agrawal - 1995
343 | Dynamic itemset counting and implication rules for market basket data – Brin, Motwani, et al. - 1997
292 | An efficient algorithm for mining association rules in large databases – Savasere, Omiecinski, et al. - 1995
280 | Sampling Large Databases for Association Rules – Toivonen - 1996
149 | Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique – Cheung, S, et al. - 1996
103 | Efficient data mining for path traversal patterns – Chen, Park, et al. - 1998
102 | A tree projection algorithm for generation of frequent itemsets. Journal of Parallel and Distributed Computing (Special Issue on High Performance Data Mining), accepted for publication – Agarwal, Aggarwal, et al. - 1993
67 | Optimization of constrained frequent set queries: 2-var constraints – Lakshmanan, Ng, et al. - 1998
66 | Mining frequent item sets with convertible constraints – Pei, Han, et al. - 2001
60 | Freespan: Frequent pattern-projected sequential pattern mining – Han, Pei, et al. - 2000
58 | An efficient algorithm for the incremental updation of association rules – Thomas, Bodagala, et al. - 1997
47 | Can we push more constraints into frequent pattern mining? SIGKDD’00 – Pei, Han - 2000
36 | Constraint-based clustering in large databases – Tung, Ng, et al. - 2001
33 | Mining association rules: Anti-skew algorithms – Lin, Dunham - 1997
9 | An efficient algorithm to update large itemsets with early pruning – Ayn, Tansel, et al. - 1999
7 | A general incremental technique for updating discovered association rules – Cheung, Lee, et al. - 1997
1 | Algorithms for association rule mining: a general survey and comparison – Hipp, Nakhaeizadeh - 2000