M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, CA, October 1993.

This paper is cited in the following contexts:

Knowledge Discovery From Distributed And Textual Data - Cho (1999)   (1 citation)

....the large itemsets with confidence greater than the predetermined threshold. The first stage, which usually involves counting on the original database, is the most time-consuming process and thus attracts many researchers' efforts in speeding it up. Popular algorithms such as the AIS [5] and SETM [67] generate candidate itemsets on the fly while reading the data. These algorithms have the disadvantage of generating and counting too many candidate itemsets that turn out to be small. Agrawal and Srikant [3] later on modified the candidate itemset generation so that only itemsets found large in ....

Houtsma M. and Swami A., "Set-oriented Mining of Association Rules". Research Report RJ 9567, IBM Almaden Research Center, San Jose, California, 1993.


Mining Association Rules From Market Basket Data.. - Hilderman.. (1998)

....Machine Learning, Itemsets, Association Rules. 1 Introduction The problem of mining association rules from market basket data has recently been an important research topic in the area of knowledge discovery from databases. It was originally introduced in [2] and studied extensively in [1, 5, 25, 26, 31, 19, 23, 29, 30, 3, 4, 33, 14]. The problem is typically examined in the context of discovering buying patterns from retail sales transactions. Although there are many similar data mining applications which can be modelled in this way, we again study the problem using the retail store example because of its intuitive nature ....

....of share for itemsets, and redefine the notions of frequent itemsets and confidence. We refer to this extended formalism as the share-confidence framework for association rules and refer to the new itemset measures as simply share measures. In this framework, any of the algorithms presented in [2, 3, 16, 19, 22, 23, 29, 30, 31, 32, 33] can be used to generate frequent itemsets using our new definition for frequent itemset. The definitions in this section have been implemented in a data mining system for analyzing market basket data. This system is an extension of DB Discover, a software tool for knowledge discovery from databases ....

M. Houtsma and A. Swami. Set-oriented mining of association rules. In Proceedings of the 11th International Conference on Data Engineering (ICDE'95), pages 25--34, 1995.


An Algorithm for Mining Association Rules Using Perfect.. - Özel, Güvenir

....is greater than or equal to the minimum confidence. However, the first step of association rule mining, finding the frequent itemsets, is a very resource-consuming task, and for that reason it has been one of the most popular research fields in data mining. Several algorithms, AIS [3], SETM [8], Apriori [4], Direct Hashing and Pruning [5, 9], Partition [10], Sampling [11], and some other parallel algorithms [12] have been developed. In this study, a fast algorithm based on the Direct Hashing and Pruning (DHP) algorithm is proposed. The DHP algorithm is described in Section II, our algorithm ....

M. Houtsma and A. Swami, "Set-Oriented Mining of Association Rules", Research Report RJ 9567, IBM Almaden Research Center, San Jose, California, (Oct. 1993).


Intension Mining: A New Paradigm in Knowledge Discovery - Gupta, Bhatnagar, Wasan.. (2000)

....cation with no change in semantics. Data Mining algorithms operate on previously selected, cleaned and transformed data. The choice of the mining algorithm depends on the type of knowledge to be discovered. Intra record links, termed Association Rules, can be discovered using algorithms given in [4, 5, 6, 29, 31, 35, 41] etc. Database segmentation can be performed using various clustering techniques [18, 28, 37, 38, 51] etc. Classification can be performed by inducing either a decision tree or decision rule [1, 3, 8, 21, 23] or by neural network techniques [32]. Presentation of the Discovered Knowledge is the ....

M. Houtsma and A. Swami. Set oriented Mining of Association Rules. In Proceedings of the Int'l Conf. on Data Engineering, pages 25-33, 1995.


Discovering Interesting Association Rules in Medical Data - Ordonez, Santana, de Braal (2000)

....are used with basket data. Medical data sets are more complex and thus present many new challenges. This paper incorporates some ideas from our previous work to mine rules on segmented images [16]. Most papers published in the database literature concentrate on optimizing the first phase [18, 7, 12, 13, 14, 19, 17], but a few look at the problem of also improving rule generation (2nd phase) [6, 18, 15]. For instance, [14] proposes an algorithm to summarize associations when they are too many. [7] attacks the problem of inserting transactions on an already mined set and proposes an algorithm that ....

M. Houtsma and A. Swami. Set-oriented mining of association rules. Technical Report RJ 9567, IBM, October 1993.


Fast Algorithms for Mining Association Rules - Agrawal, Srikant (1994)   (881 citations)

....example, D could be a data file, a relational table, or the result of a relational expression. An algorithm for finding all association rules, henceforth referred to as the AIS algorithm, was presented in [AIS93b]. Another algorithm for this task, called the SETM algorithm, has been proposed in [HS93]. In this paper, we present two new algorithms, Apriori and AprioriTid, that differ fundamentally from these algorithms. We present experimental results, using both synthetic and real life data, showing that the proposed algorithms always outperform the earlier algorithms. The performance gap is ....

....person who orders a comforter also orders a flat sheet, a fitted sheet, a pillow case, and a ruffle. The algorithms in Section 3 generate such multi-consequent rules. In Section 4, we show the relative performance of the proposed Apriori and AprioriTid algorithms against the AIS [AIS93b] and SETM [HS93] algorithms. To make the paper self-contained, we include an overview of the AIS and SETM algorithms in this section. We also describe how the Apriori and AprioriTid algorithms can be combined into a hybrid algorithm, AprioriHybrid, and demonstrate the scale-up properties of this algorithm. We ....

[Article contains additional citation context not shown here]

Maurice Houtsma and Arun Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, California, October 1993.


SQL Based Association Rule Mining using Commercial.. - Yoshizawa.. (2000)

....and credit card fraud indications are widely recognized. One method of data mining is finding association rules [1]. Basket data analysis is typical of this method. There are some approaches proposed to mine association rules [1,2,6,9]; some of them are based on relational database standard SQL [3,7,8]. But this kind of mining is known as a CPU power demanding application and it has to handle very large amounts of transaction data. Unfortunately, the SQL approach is reported to have a drawback in performance, although it has many advantages such as seamless integration with existing systems and high ....

....required by association rule mining. This fact motivated us to examine how efficiently SQL based association rule mining can be parallelized and sped up using a commercial parallel database system (IBM DB2 UDB EEE). We propose two techniques to enhance association rule mining queries based on SETM [3]. We have also compared the performance with a commercial mining tool (IBM Intelligent Miner). Our performance evaluation shows that we can achieve comparable performance with the commercial mining tool using only 4 nodes. Some considerable works on effective SQL queries to mine association rule ....

[Article contains additional citation context not shown here]

M. Houtsma, A. Swami. Set-oriented Mining of Association Rules. In Proc. of International Conference on Data Engineering (ICDE), 1995.


Extended Concepts for Association Rule Discovery - Rantzau (1997)

....at most twice. In the first scan it generates all candidates and in the second their support is computed. Apriori outperforms Partition only when the minimum support threshold is set high. The Partition algorithm lends itself to an implementation on parallel computers. The SETM algorithm [HS95, AS94a] uses SQL to generate frequent itemsets. Like AIS, candidates are generated while transactions are read from the database. However, SETM separates candidate generation from counting. It has a worse performance than AIS for both synthetic and real life datasets. The algorithm presented in ....

Maurice Houtsma and Arun Swami. Set-oriented Mining of Association Rules. In Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, pages 25--33, March 1995.


Performance Evaluation and Optimization of Join Queries.. - Thomas, Chakravarthy (1998)   (3 citations)

....object-relational extensions to execute mining operations. This entails transforming the mining operations into database queries and in some cases developing newer techniques that are more appropriate in the database context. The UDF based (user defined function) approach in [2], the SETM algorithm [5], the formulation of association rule mining as query flocks [10], and SQL queries for mining [9] all belong to this category. Two categories of SQL implementations for association rule mining, one based purely on SQL-92 and the other using the object-relational extensions to SQL (SQL-OR), are ....

....= q.tid and q.tid = r.tid. We can also use the Subquery approach to generate T_3 if that is less expensive. T_3 will contain exactly the same tuples produced by subquery Q_3. The Set-oriented Apriori algorithm bears some resemblance to the three-way join approach in [9], the SETM algorithm in [5], and the AprioriTid algorithm in [3]. In the three-way join approach, the temporary table T_k stores, for each transaction, the identifiers of the candidates it supported. T_k is generated by joining two copies of T_{k-1} with C_k. The generation of F_k requires a further join of T_k with C_k. The ....

[Article contains additional citation context not shown here]

M. Houtsma and A. Swami. Set-oriented mining of association rules. In Int'l Conference on Data Engineering, Taipei, Taiwan, March 1995.


Algorithms For Computing Association Rules Using A.. - Graham Goulbourne Frans (2000)   (1 citation)

....support for all members of C_k, and from this, produces the set L_k of interesting sets of size k. This is then used to derive the candidate sets C_{k+1}, using the downward closure property, that all the subsets of any member of C_{k+1} must be members of L_k. Other algorithms, AIS [1] and SETM [3], have the same general form but differ in the way the candidate sets are derived. Two aspects of the performance of these algorithms are of concern: the number of passes of the database that are required, which will in general be one greater than the number of attributes in the largest ....

Houtsma, M. and Swami, A. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Centre, San Jose, October 1993.
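
The candidate-derivation step described in the excerpt above (building candidates of size k+1 from the frequent sets of size k, and keeping only those whose size-k subsets are all frequent) can be sketched as follows. This is an illustrative sketch only, not code from any of the cited papers; the function name `generate_candidates` and the use of Python frozensets are assumptions made for the example.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Derive size-(k+1) candidates from frequent size-k itemsets.

    Hypothetical helper for illustration; all itemsets in frequent_k
    are assumed to have the same size k.
    """
    frequent_k = {frozenset(s) for s in frequent_k}
    if not frequent_k:
        return set()
    k = len(next(iter(frequent_k)))
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            # Only pairs that differ in exactly one item give a (k+1)-set.
            if len(union) != k + 1:
                continue
            # Downward closure: every size-k subset must itself be frequent.
            if all(frozenset(sub) in frequent_k
                   for sub in combinations(union, k)):
                candidates.add(union)
    return candidates


if __name__ == "__main__":
    frequent_pairs = [{1, 2}, {1, 3}, {2, 3}, {2, 4}]
    print(generate_candidates(frequent_pairs))
```

With the frequent pairs shown in the usage lines, only {1, 2, 3} survives, because {1, 2, 4} would need {1, 4} and {2, 3, 4} would need {3, 4}, neither of which is frequent.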


Performance Evaluation and Optimization of Join Queries.. - Thomas, Chakravarthy (1998)   (3 citations)

....extensions to execute mining operations. This entails transforming the mining operations into database queries and in some cases developing newer techniques that are more appropriate in the database context. The UDF based (user defined function) approach in [AS96], the SETM algorithm [HS95], the formulation of association rule mining as query flocks [TUA 98], and SQL queries for mining [STA98] all aim at tighter database integration. [STA98] presents a detailed study of the various architectural alternatives for mining data stored in a DBMS. It has been reported that for ....

....Section 3.3. Figure 11: Comparison of Subquery and Set-oriented Apriori approaches. In Figure 11, we show the relative performance of Subquery and Set-oriented Apriori approaches for the two datasets. The chart shows the total time taken for each of the different passes. We ran the SETM algorithm [HS95] also for a few support values and found that it is an order of magnitude slower. Set-oriented Apriori performs better than Subquery for all the support values. The first two passes of both the approaches are similar and they take approximately equal amount of time. The difference between ....

Maurice Houtsma and Arun Swami. Set-oriented mining of association rules. In Int'l Conference on Data Engineering, Taipei, Taiwan, March 1995.


Incremental Mining of Constrained Associations - Thomas, Chakravarthy (1998)   (6 citations)

....frequent itemsets) generated at each level is based on the observation that if an itemset S appears in c baskets, then any subset of S appears in at least c baskets. The need for applying association rule mining to data stored in databases and data warehouses has motivated researchers to [SK97, HS95, AS96, STA98, TC98, TS98] i) study alternative architectures for mining over data stored in databases, ii) translate association rule mining algorithms to work with relational and object-relational databases, iii) optimize the mining algorithms beyond what the current relational query optimizers ....

Maurice Houtsma and Arun Swami. Set-oriented mining of association rules. In Int'l Conference on Data Engineering, Taipei, Taiwan, March 1995.


Parallel Mining of Association Rules - Agrawal, Shafer (1996)   (53 citations)

....data mining is that it will deliver technology that will enable development of a new breed of decision support applications. Discovering association rules is an important data mining problem [1]. Recently, there has been considerable research in designing fast algorithms for this task [1] [3] [5] [6] [8] [12] [9] [11]. However, with the exception of [10], the work so far has been concentrated on designing serial algorithms. Since the databases to be mined are often very large (measured in gigabytes and even terabytes), parallel algorithms are required. We present in this paper three parallel ....

....upon the patterns the different transactions support. This algorithm also incorporates load balancing. These algorithms are based upon the serial algorithm Apriori which was first presented in [3]. We chose the Apriori algorithm because of its superior performance over the earlier algorithms [1] [6], as shown in [3]. We preferred Apriori over AprioriHybrid, a somewhat faster algorithm in [3], because AprioriHybrid is harder to parallelize; the performance of AprioriHybrid is sensitive to heuristically determined parameters. Furthermore, the performance of Apriori can be made to approximate ....

[Article contains additional citation context not shown here]

Maurice Houtsma and Arun Swami. Set-oriented mining of association rules. In Int'l Conference on Data Engineering, Taipei, Taiwan, March 1995.


Parallel SQL Based Association Rule Mining on.. - Pramudiono.. (1999)

....called large itemsets. 2. Generate the desired rules using large itemsets. Since the first step consumes most of the processing time, development of mining algorithms has been concentrated on this step. In our experiment we employed an ordinary standard SQL query that is similar to the SETM algorithm [3]. It is shown in Figure 1. CREATE TABLE SALES (id int, item int); PASS 1: CREATE TABLE C_1 (item_1 int, cnt int); CREATE TABLE R_1 (id int, item_1 int); INSERT INTO C_1 SELECT item AS item_1, COUNT(*) FROM SALES GROUP BY item HAVING COUNT(*) >= :min_support; INSERT INTO R_1 SELECT ....

M. Houtsma, A. Swami. Set- oriented Mining of Association Rules. In Proc. of International Conference on Data Engineering, 1995.
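
The first pass quoted in the excerpt above can be tried out with an in-memory SQLite database. The sketch below is only an illustration under stated assumptions: it is not the authors' code, it runs on SQLite rather than the parallel database system used in the paper, it assumes the usual "count at least the minimum support" comparison, and because the excerpt is cut off before the `R_1` insert is complete, the join used to fill `R_1` is a guess at the intent (keep only sales rows whose item is frequent). The function name `first_pass` is hypothetical.

```python
import sqlite3


def first_pass(transactions, min_support):
    """Run a SETM-like first pass and return (C_1 counts, R_1 rows).

    transactions: dict mapping transaction id -> iterable of item ids.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE SALES (id INT, item INT)")
    cur.execute("CREATE TABLE C_1 (item_1 INT, cnt INT)")
    cur.execute("CREATE TABLE R_1 (id INT, item_1 INT)")

    rows = [(tid, item) for tid, items in transactions.items()
            for item in items]
    cur.executemany("INSERT INTO SALES VALUES (?, ?)", rows)

    # C_1: items whose support count reaches the threshold.
    cur.execute(
        "INSERT INTO C_1 "
        "SELECT item AS item_1, COUNT(*) FROM SALES "
        "GROUP BY item HAVING COUNT(*) >= ?",
        (min_support,),
    )
    # R_1 (assumed form): sales rows restricted to frequent items.
    cur.execute(
        "INSERT INTO R_1 "
        "SELECT s.id, s.item FROM SALES s "
        "JOIN C_1 c ON s.item = c.item_1"
    )

    c1 = dict(cur.execute("SELECT item_1, cnt FROM C_1").fetchall())
    r1 = sorted(cur.execute("SELECT id, item_1 FROM R_1").fetchall())
    conn.close()
    return c1, r1


if __name__ == "__main__":
    baskets = {1: [10, 20], 2: [10, 30], 3: [10, 20]}
    print(first_pass(baskets, min_support=2))
```

In the usage lines, item 30 appears in only one basket, so it is dropped from both `C_1` and `R_1` when the minimum support is 2.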


Pincer-Search: An Efficient Algorithm for Discovering the.. - Lin, Kedem (1999)

....Frequent Set Discovery We briefly discuss existing frequent set discovery algorithms in a roughly chronological order. AIS and SETM Algorithms The problem of association rule mining was first introduced in [2]. An algorithm called AIS was given for discovering the frequent set. The SETM algorithm [13] was later designed to use only standard SQL commands to find the frequent set. The Apriori algorithm [3], described above, performs much better than AIS and SETM. The OCD Algorithm It is worth adding that, concurrently with the Apriori algorithm, the OCD algorithm [19] used the same closure property ....

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, Oct. 1993.


Computing Association Rules Using Partial Totals - Graham Goulbourne Frans (2001)   (2 citations)

....support for all members of C_k, and from this, produces the set L_k of interesting sets of size k. This is then used to derive the candidate sets C_{k+1}, using the downward closure property, that all the subsets of any member of C_{k+1} must be members of L_k. Other algorithms, AIS [1] and SETM [3], have the same general form but differ in the way the candidate sets are derived. Two aspects of the performance of these algorithms are of concern: the number of passes of the database that are required, which will in general be one greater than the number of attributes in the largest ....

Houtsma, M. and Swami, A. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Centre, San Jose, October 1993.


Fast Algorithms for Discovering the Maximum Frequent Set - Lin (1998)

....Itemsets {1,2,3,5} and {1,2,5} were not considered, since the item 5 was not in the transaction. Two complicated heuristics, remaining tuples optimization and pruning function optimization, were used to prune candidates. Unfortunately, this algorithm still generates too many candidates. The SETM [HS93] algorithm was later designed to use only standard SQL commands to find the frequent set. However, like AIS, SETM also creates candidates on the fly while reading the database. Both algorithms are not efficient, since they generate and count too many unnecessary candidates. 2.4.2 Apriori and OCD ....

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, Oct. 1993.


Integrating Data Mining with Relational DBMS: A.. - Nestorov, Tsur (1999)   (1 citation)

....that requires consideration: can we achieve a comparable, or at least an acceptable, level of performance from these integrated methods when compared to the special purpose external methods? This question was previously examined in a more narrow context of association rules and a particular DBMS in [7] and [2]. Section 2 of this paper elaborates on the general architectural choices available and their comparison. The idea of flocks [11] was presented as a framework for performing complex data analysis tasks on relational database systems. The method consists of a generator of candidate query ....

M. Houtsma and A. Swami. Set-oriented mining of association rules. In Proceedings of International Conference on Data Engineering, pages 25--33, Taipei, Taiwan, March 1995.


Efficient Mining for Association Rules with Relational.. - Rajamani, Cox, Iyer, al. (1999)   (4 citations)

....With the (Transaction id, Item) schema the Transaction id value would be repeated for every item bought in that transaction. The SC data model would be useful for performing conventional relational queries against items bought in transactions. Some of the early work in association rule mining [11] proposes the use of such relational queries for discovering association rules, and works with this data model. However, later work [3] has shown significant performance improvement by using Apriori based algorithms that did not use relational queries in their implementation. But, to the best of ....

....proposed for interoperability in multi-database systems and not for providing the flexibility and functionality required by data mining applications. Agrawal and Shim [2] show the benefit of using UDFs for the development of applications tightly coupled with the database engine. Houtsma and Swami [11] proposed SETM, an SQL based algorithm for association rule mining. Their algorithm uses simple database operations: sorting and merge-scan joins. However, their joins are more expensive as they are against the input data table and they do not have an efficient candidate set pruning such as ....

M. Houtsma and A. Swami. Set-oriented Mining of Association Rules. Technical Report RJ 9567, IBM Almaden Research Center, October 1993.


Algorithms for Clustering High Dimensional and - Tao

No context found.

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, CA, October 1993.


Association-Based Similarity Testing and Its Applications - Tao Li Department

No context found.

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, CA, October 1993.


March 2002 - Un Vers Ty

No context found.

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, CA, October 1993.


Itemset Materializing for Fast Mining of Association Rules - Wojciechowski, Zakrzewicz

No context found.

Houtsma M., Swami A., "Set-Oriented Mining of Association Rules", Research Report RJ 9567, IBM Almaden Research Center, San Jose, California, USA, October 1993


Similarity Testing Between Heterogeneous Basket Datasets - Li, al. (2002)

No context found.

M. Houtsma and A. Swami. Set-oriented mining of association rules. Research Report RJ 9567, IBM Almaden Research Center, San Jose, CA, October 1993.


MINTO: A Software Tool for Mining Manufacturing Databases - Haritsa

No context found.

M. Houtsma and A. Swami, "Set-Oriented Mining of Association Rules", International Conference on Data Engineering, March 1995.


CiteSeer.IST - Copyright Penn State and NEC