A short poster version to appear in KDD-2001
Real World Performance of Association Rule Algorithms
Abstract:
Association rule discovery has been an active research area over the past few years, with several new proposals for algorithms that improve the running time for generating association rules or frequent itemsets. Several new algorithms were shown by their authors to run faster than previously existing algorithms, although benchmarks were typically done on artificial datasets. Unlike classification algorithms, for which several large evaluations were done by third parties, there have been no such evaluations for the correctness and runtime performance of association algorithms. This study compares five well-known association rule algorithms using three real-world datasets and an artificial dataset from IBM Almaden. The experimental results confirm the performance improvements previously claimed by the authors on the artificial data, but some of these gains do not carry over to the real datasets.

