
  Mining frequent patterns without candidate generation (2000) [537 citations — 37 self]

by Jiawei Han, Jian Pei, Yiwen Yin
ftp://ftp.fas.sfu.ca/pub/cs/han/kdd/sigmod00.pdf

Abstract:

SIGMOD'2000 Paper ID: 196

Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based divide-and-conquer method is used to dramatically reduce the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.
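
The three techniques named in the abstract (a compressed prefix tree, pattern fragment growth, and divide-and-conquer over projected databases) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names FPNode, build_fp_tree, fp_growth, and mine are hypothetical, min_support is assumed to be an absolute count, and each conditional tree is simply rebuilt from weighted prefix paths, without any of the optimizations the paper itself may describe.

```python
from collections import defaultdict


class FPNode:
    """One node of the prefix tree: an item, a count, and child links."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(weighted_transactions, min_support):
    """Build an FP-tree from a list of (items, count) pairs.

    Returns (root, header), where header maps each frequent item to the
    list of tree nodes that carry it.
    """
    # First scan: count the support of every single item.
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    # Second scan: insert each transaction, keeping only frequent items,
    # sorted by descending support so shared prefixes collapse together.
    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        kept = sorted((i for i in set(items) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return root, header


def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    _, header = build_fp_tree(weighted_transactions, min_support)
    patterns = {}
    for item, nodes in header.items():
        # Grow the current pattern fragment by one frequent item.
        pattern = suffix | {item}
        patterns[pattern] = sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each occurrence
        # of the item, weighted by that occurrence's count.
        conditional = []
        for n in nodes:
            path = []
            parent = n.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        # Divide and conquer: recurse on the smaller projected database.
        patterns.update(fp_growth(conditional, min_support, pattern))
    return patterns


def mine(transactions, min_support):
    """Convenience wrapper: every input transaction has weight 1."""
    return fp_growth([(t, 1) for t in transactions], min_support)
```

For example, calling mine on the five transactions abc, ab, ac, bc, abcd with min_support=3 yields the three single items with support 4 and the three pairs with support 3. No candidate itemsets are enumerated; each pattern is produced only by extending a suffix that is already known to be frequent.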

Citations

1607 Fast Algorithms for Mining Association Rules – Agrawal, Srikant - 1994
659 Mining sequential patterns – Agrawal, Srikant - 1995
358 Mining generalized association rules – Srikant, Agrawal - 1995
347 Fast Discovery of Association Rules – Agrawal - 1995
342 Dynamic itemset counting and implication rules for market basket data – Brin, Motwani, et al. - 1997
322 Beyond market basket: Generalizing association rules to correlations – Brin, Motwani, et al.
299 Mining sequential patterns: Generalizations and performance improvements – Srikant, Agrawal - 1996
297 Discovery of Multiple-Level Association Rules from Large Databases – Han, Fu - 1995
285 An Efficient Algorithm for Mining Association Rules in Large Databases – Savasere, Omiecinski, et al. - 1995
252 Efficiently mining long patterns from databases – Bayardo - 1998
209 Exploratory mining and pruning optimizations of constrained associations rules – Ng, Lakshmanan, et al. - 1998
191 Discovering frequent closed itemsets for association rules – Pasquier, Bastide, et al.
182 Discovery of frequent episodes in event sequences – Mannila, Toivonen, et al. - 1997
174 An Effective Hash-Based Algorithm for Mining Association Rules – Park, Chen, et al. - 1995
174 Finding Interesting Rules from Large Sets of Discovered Association Rules – Klemettinen, Mannila, et al. - 1994
172 Mining association rules with item constraints – Srikant, Vu, et al. - 1997
155 Efficient algorithms for discovering association rules – Mannila, Toivonen, et al. - 1994
147 CLOSET: an efficient algorithm for mining frequent closed itemsets – Pei, Han, et al.
119 Charm: An efficient algorithm for closed itemset mining – Zaki, Hsiao - 2002
106 Efficient mining of emerging patterns: discovering trends and differences – Dong, Li - 1999
105 Constraint-based rule mining in large, dense databases – Bayardo, Agrawal, et al.
99 A Tree Projection Algorithm for Generation of Frequent Itemsets – Agarwal, Aggarwal, et al. - 2000
96 Integrating association rule mining with relational database systems: alternatives and implications – Sarawagi, Thomas, et al. - 1998
93 PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth – Pei, Han, et al.
81 Clustering Association Rules – Lent, Swami, et al.
78 Efficient mining of partial periodic patterns in time series database – Han, Dong, et al. - 1999
66 Mining frequent item sets with convertible constraints – Pei, Han, et al. - 2001
59 Scalable techniques for mining causal structures – Silverstein, Brin, et al. - 1998
57 Metarule-guided mining of multi-dimensional association rules using data cubes – Kamber, Han, et al. - 1997
57 Association Rules Over Interval Data – Miller, Yang - 1997
37 Efficient mining of constrained correlated sets – Grahne, Lakshmanan, et al. - 2000
36 H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases – Pei, Lu, et al. - 2001
31 Parallel Algorithms for Discovery of Association Rules – Zaki, Parthasarathy, et al. - 1997
11 An effective hash-based algorithm for mining association rules – Park, Chen, et al. - 1995
10 Efficient mining of emerging patterns: discovering trends and differences – Dong, Li - 1999
7 Mining partial periodicity using frequent pattern trees – Han, Pei, et al. - 1999
3 Depth-first generation of large itemsets for association rules – Agarwal, Aggarwal, et al. - 1999