J. L. Han and A. W. Plank. Background for Association Rules and Cost Estimate of Selected Mining Algorithms. In Proc. of CIKM, Rockville, MD, USA, 1996.
....studied in this paper. Above all, the same approach is what makes the mining of informative rules (in a single run of the data mining task) possible. However, the strength of the Apriori is also its weakness. The generate and test process incurs costly computations. According to Han and Plank [13], the algorithm has dependency on many parameters including the amount of memory, the number of transactions, candidates, frequent itemsets, and the length of a frequent itemset. This meant that the Apriori has varying performance under different environment and data conditions (which is also ....
....algorithms proposed by the authors of each variant. This general framework maintains the simplicity of the Apriori model (i.e. the generate and test process) while improving its performance and scaling up its ability to handle large data sets. The main insight, based on an analysis made by [7] and our experience, is Apriori's dependency on the database size and transaction size. Given an itemset, the time to determine its support is primarily bounded by the size of each transaction and the number of transactions in the database. With large databases and long transactions, Apriori-based ....
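The dependency described in the excerpt above can be illustrated with a minimal sketch of support counting. This is an illustration under stated assumptions, not code from the cited or citing papers: the function name `support_count` and the sample transactions are hypothetical, and transactions are assumed to be stored as plain lists of item names.

```python
from typing import Iterable, List


def support_count(itemset: Iterable[str], transactions: List[List[str]]) -> int:
    """Return the number of transactions that contain every item in `itemset`."""
    target = frozenset(itemset)
    count = 0
    # One pass over the database: work grows with the number of transactions.
    for transaction in transactions:
        # Converting the list to a set takes time linear in the transaction's
        # length, so long transactions make each check more expensive.
        if target <= set(transaction):
            count += 1
    return count


# Hypothetical toy database, used only for illustration.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

print(support_count(["diapers", "beer"], transactions))  # prints 3
```

In this sketch a single support query touches every transaction once, which is why the total cost scales with both the number of transactions and their lengths.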
....candidate set C_k is generated by the natural join of L_{k-1} with L_1 in the attribute TID, and it is implemented by a merge sort join. The candidates are counted using SQL commands. SETM generates too many candidates with respect to Apriori and is less efficient. Readers are referred to [HP96] for the evaluation of the algorithms above, and their cost of computation. Lower and upper bounds for their computational complexity are provided in this paper. The motivation behind Direct Hashing and Pruning (DHP) [PCY95a] is the attempt to reduce the size of candidate 2-itemsets. Park et ....
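The TID join mentioned in the excerpt above can be sketched roughly as follows. This is a simplified illustration, not the SETM implementation: the excerpt describes a merge sort join with SQL-based counting, whereas this sketch swaps in a dictionary-based (hash) join and an in-memory counter. The function names, the row layout `(tid, items)`, and the sample data are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

ItemsetRow = Tuple[int, Tuple[str, ...]]  # (TID, sorted itemset)
ItemRow = Tuple[int, str]                 # (TID, single item)


def extend_by_tid_join(prev_rows: List[ItemsetRow],
                       item_rows: List[ItemRow]) -> List[ItemsetRow]:
    """Join (TID, (k-1)-itemset) rows with (TID, item) rows on TID."""
    # Index single items by TID (hash join; the excerpt uses a merge sort join).
    items_by_tid: Dict[int, List[str]] = {}
    for tid, item in item_rows:
        items_by_tid.setdefault(tid, []).append(item)

    candidates: List[ItemsetRow] = []
    for tid, items in prev_rows:
        for item in items_by_tid.get(tid, []):
            # Keep itemsets in sorted order so each one is produced only once.
            if item > items[-1]:
                candidates.append((tid, items + (item,)))
    return candidates


def count_candidates(rows: List[ItemsetRow]) -> Counter:
    """Count how many transactions support each candidate itemset."""
    return Counter(items for _, items in rows)


# Hypothetical (TID, item) rows, used only for illustration.
l1_rows = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "c"), (3, "b"), (3, "c")]
prev_rows = [(tid, (item,)) for tid, item in l1_rows]

c2_rows = extend_by_tid_join(prev_rows, l1_rows)
print(count_candidates(c2_rows))
```

Because every candidate is materialized once per supporting transaction, the intermediate relation can grow large, which is consistent with the excerpt's remark that this style of generation produces many candidates.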
Jia Liang Han and Ashley W. Plank. Background for association rules and cost estimate of mining algorithms. In Proceedings of 5th Intl. Conf. on Information and Knowledge Management (CIKM'96), pages 73--80, Rockville, Maryland, USA, November 1996.