  Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications (1998) [96 citations — 4 self]

by Sunita Sarawagi, Shiby Thomas, Rakesh Agrawal
In SIGMOD
http://sage.chungbuk.ac.kr/Damine/Papers/Integration/Integ01.ps

Abstract:

Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL cursor interface; encapsulation of a mining algorithm in a stored procedure; caching the data to a file system on-the-fly and mining; tight-coupling using primarily user-defined functions; and SQL implementations for processing in the DBMS. We comprehensively study the option of expressing the mining algorithm in the form of SQL queries, using association rule mining as a case in point. We consider four options in SQL-92 and six options in SQL enhanced with object-relational extensions (SQL-OR). Our evaluation of the different architectural alternatives shows that, from a performance perspective, the Cache-Mine option is superior, although the performance of the SQL-OR option is within a factor of two. Both the Cache-Mine and the SQL-OR approaches incur a higher storage penalty than the loose-coupling approach, which performance-wise is a factor of 3 to 4 worse than Cache-Mine. The SQL-92 implementations were too slow to qualify as a competitive option. We also compare these alternatives on the basis of qualitative factors like automatic parallelization, development ease, portability and interoperability.
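
To make the idea of "expressing the mining algorithm in the form of SQL queries" concrete, the following is a minimal sketch of support counting for 2-itemsets using a plain self-join. It is not one of the paper's SQL-92 or SQL-OR formulations. The table name `sales`, its `(tid, item)` layout, the function `frequent_pairs`, and the use of SQLite through Python are all assumptions chosen for illustration.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support baskets.

    Hypothetical helper: loads baskets into an in-memory (tid, item) table
    and counts co-occurring pairs with a self-join plus GROUP BY / HAVING.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    # One row per (transaction id, distinct item) pair.
    rows = [
        (tid, item)
        for tid, items in enumerate(transactions)
        for item in set(items)
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # a.item < b.item keeps each unordered pair exactly once.
    query = """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM sales a JOIN sales b
          ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """
    result = conn.execute(query, (min_support,)).fetchall()
    conn.close()
    return result


baskets = [
    ["bread", "milk"],
    ["bread", "beer", "milk"],
    ["beer", "milk"],
    ["bread", "milk", "eggs"],
]
pairs = frequent_pairs(baskets, 2)
print(pairs)
```

Larger itemsets would need further joins or a candidate-generation pass between queries, which is the kind of cost the abstract's comparison of SQL-based and file-based options is concerned with.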

Citations

1449 Mining association rules between sets of items in large databases – Agrawal, Imielinski, et al. - 1993
358 Mining generalized association rules – Srikant, Agrawal - 1995
342 Dynamic itemset counting and implication rules for market basket data – Brin, Motwani, et al. - 1997
299 Mining sequential patterns: Generalizations and performance improvements – Srikant, Agrawal - 1996
272 Sampling Large Databases for Association Rules – Toivonen - 1996
212 Fast discovery of association rules – Agrawal, Mannila, et al. - 1996
210 Understanding the New SQL: A Complete Guide – Melton, Simon - 1992
194 New Algorithms for fast discovery of association rules – Zaki, Ogihara, et al. - 1997
173 A database perspective on knowledge discovery – Imielinski, Mannila - 1996
157 Parallel mining of association rules – Agrawal, Shafer - 1996
121 A New SQL-like Operator for Mining Association Rules – Meo, Psaila, Ceri - 1996
85 DMQL: A Data Mining Query Language for Relational Databases – Han, Fu, et al. - 1996
63 Set-oriented mining of association rules – Houtsma, Swami - 1993
52 The Quest data mining system – Agrawal, Mehta, et al. - 1996
49 Developing tightly-coupled data mining applications on a relational database system – Agrawal, Shim - 1996
37 Using the new DB2: IBM's Object-relational database system – Chamberlin - 1996
13 Query Flocks: A Generalization of Association Rule Mining – Tsur, Ullman, et al. - 1998
9 Discovery board application programming interface and query language for database mining – Imielinski, Virmani, Abdulghani - 1996
4 Using DB/2's object relational extensions for mining association rules – Rajamani, Iyer, et al. - 1997
2 DB2 Universal Database Application programming guide Version 5 – IBM Corporation - 1997
2 Object oriented extensions in SQL3: a status report (SIGMOD Record) – Kulkarni - 1994
2 Oracle RDBMS Database Administrator's Guide Volumes – Oracle - 1992
2 SQL table function open architecture and data access middleware – Pirahesh, Reinwald - 1998