
Distribution-Based Synthetic Database Generation Techniques for Itemset Mining
Ganesh Ramesh (University of British Columbia), Mohammed J....



View or download:
rpi.edu/~zaki/./PS/IDEAS05.pdf

From:  rpi.edu/~zaki/papers

Abstract: The resource requirements of frequent pattern mining algorithms depend mainly on the length distribution of the mined patterns in the database. Synthetic databases, which are used to benchmark the performance of algorithms, tend to have distributions far different from those observed in real datasets. In this paper we focus on the problem of synthetic database generation and propose algorithms to effectively embed any given set of maximal pattern collections within the database, and make the...

Active bibliography (related documents):
0.5:   Mining Multiple Private Databases Using a kNN Classifier - Li Xiong Emory
0.4:   On Inverse Frequent Set Mining - Mielikäinen (2003)
0.3:   Indexing and Data Access Methods for Database Mining - Ramesh, Maniatty, Zaki (2001)


BibTeX entry:

@misc{ university-distributionbased,
  author = "Ganesh Ramesh and others",
  title = "Distribution-Based Synthetic Database Generation Techniques for Itemset
    Mining",
  url = "citeseer.ist.psu.edu/751196.html" }
Citations (may not include all citations):
910   Fast algorithms for mining association rules - Agrawal, Srikant - 1994
400   Fast discovery of association rules - Agrawal - 1996
225   Data mining: concepts and techniques - Han, Kamber - 2000
161   Exploratory mining and pruning optimizations of constrained .. - Ng, Lakshmanan et al. - 1998
109   New algorithms for fast discovery of association rules - Zaki, Parthasarathy et al. - 1997
108   Efficiently mining long patterns from databases - Bayardo - 1998
100   Levelwise search and borders of theories in knowledge discov.. - Mannila, Toivonen - 1997
60   Privacy preserving mining of association rules - Evfimievski, Srikant et al. - 2002
54   Real world performance of association rule algorithms - Zheng, Kohavi et al. - 2001
25   A simple algorithm for finding frequent elements in streams .. - Karp - 2003
24   Efficiently mining maximal frequent itemsets - Gouda, Zaki - 2001
22   A condensed representation to find frequent patterns - Bykowski - 2001
17   Discovering all most specific sentences - Gunopulos - 2003
4   Feasible itemset distributions in data mining: Theory and ap.. - Ramesh, Maniatty et al. - 2003
3   Indexing and Data Access Methods for Database Mining - Ramesh, Maniatty et al. - 2002
2   Privacy preserving data mining of association rules on verti.. - Kantarcioglu, Clifton - 2003
1   The Complexity of Mining Maximal Frequent Itemsets and Maxim.. - Yang - 2004
1   Mining all non-derivable itemsets - Calders, Goethals - 2002
1   A tight upper bound on the number of candidate patterns - Goethals, Geerts et al. - 2001

Documents on the same site (http://www.cs.rpi.edu/~zaki/papers.html):
Parallel Classification for Data Mining on Shared-Memory.. - Zaki (1998)
Efficient Enumeration of Frequent Sequences - Zaki (1998)
PlanMine: Predicting Plan Failures using Sequence Mining - Zaki, Lesh, Ogihara (1999)


CiteSeer.IST - Copyright Penn State and NEC