Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama and K. Yoda, "Algorithms for mining association rules for binary segmentations of huge categorical databases", Proc. of VLDB, 1998.
....There has been much research done in recent years quantifying interestingness of a rule, and several metrics have been proposed and used as a result of this work. Among objective metrics, besides confidence and support [8], there are gain [36], variance and chi squared value [59], gini [58], strength [27], conviction [20], sc and pc optimality [13], etc. Subjective metrics include unexpectedness [73, 53, 79, 60] and actionability [66, 73, 1]. Any of these metrics can be used as a part of the interestingness based filtering tool, and the validation system can support different ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for mining association rules for binary segmentations of huge categorical databases. In Proceedings of the 24th VLDB Conference, pages 380-391, 1998.
....measures can be found in [1]. Here we concentrate on measures that assess how much knowledge we gain about the joint distribution of a set of attributes Q by knowing the joint distribution of some set of attributes P. Examples of such measures are entropy gain, mutual information, Gini gain [9, 10, 4, 1, 12, 11]. The rules considered here are different from classical association rules studied in data mining, since we consider full joint distributions of both antecedent and consequent, while association rules consider only the probability of all attributes having some specified value. This approach has the ....
Morimoto Y., Fukuda T., Matsuzawa H., Tokuyama T. and Yoda K. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases, Proc. of the 24th Conf. on Very Large Databases, pp. 380-391, 1998
....for a survey). In this paper we concentrate on measures that assess how much knowledge we gain on the joint distribution of a set of attributes Q from knowing the joint distribution of some set of attributes P. Examples of such measures are entropy gain, mutual information, Gini gain [7, 9, 3, 1, 11, 10]. The rules considered here are thus different from association rules studied in data mining, since we consider full joint distributions of both antecedent and consequent, while association rules consider only the probability of all attributes having some specified value. This approach has the ....
Morimoto Y., Fukuda T., Matsuzawa H., Tokuyama T. and Yoda K. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases, Proc. of the 24th Conf. on Very Large Databases, pp. 380-391, 1998
....and revealing a specific type of hypothesis regarding the interrelation or correlation between these sets of attributes. The interestingness or usefulness of the rule is usually measured by some predefined metric function such as confidence and support [2], gain [9], chi squared value [4], gini [22], entropy gain [23, 22], laplace [6, 32], lift [16], interest [5], strength [8] and conviction [5]. Several proposals for mining different types of rules according to different types of pre-specified interest metrics have been suggested in the literature. The suggested techniques are fully ....
....specific type of hypothesis regarding the interrelation or correlation between these sets of attributes. The interestingness or usefulness of the rule is usually measured by some predefined metric function such as confidence and support [2], gain [9], chi squared value [4], gini [22], entropy gain [23, 22], laplace [6, 32], lift [16], interest [5], strength [8] and conviction [5]. Several proposals for mining different types of rules according to different types of pre-specified interest metrics have been suggested in the literature. The suggested techniques are fully automatic but need to have ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for mining association rules for binary segmentations of huge categorical databases. In Proc. of the 24th VLDB conf., pages 380--391, 1998.
....expensive. As our algorithms can be faster than flat classification at a taxonomy of as low as four levels (Data Four), they represent a good trade off between speed and accuracy for most applications. 5 Related Work and Conclusion Classification has been studied extensively in the last decades [2, 3, 9, 13, 15, 14, 17, 21]. However, most of the work on the classification ignores the hierarchical structure of classes. In [1] the authors explore the hierarchical structure of attributes to improve the efficiency, but assume only a single level of classes. The work reported in [4, 12] propose hierarchical ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama and K. Yoda, "Algorithms for mining association rules for binary segmentations of huge categorical databases", Proc. of VLDB, 1998.
....of the tuple. Since during the hash join each attribute list is read and distributed sequentially, the initial sort order of the attribute list is preserved. In recent work, Morimoto et al. developed algorithms for decision tree construction for categorical predictor variables with large domains [YFM 98]; the emphasis of this work is to improve the quality of the resulting tree. Rastogi and Shim developed PUBLIC, a scalable decision tree classifier using top down pruning [RS98]. Since pruning is an orthogonal dimension to tree growth, their techniques can be easily incorporated into our schema. ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proc. of VLDB, 1998.
....Categorical attributes need to be handled in a different fashion since it is difficult to define similarity between the values of categorical attributes due to lack of a priori structure. There has been some work on categorical databases in the areas of clustering [GKR98] and association rule finding [YFM98], although the problem has yet to be tackled in the area of classification. In order to have one value to represent others for a categorical attribute, the simplest method is to take the majority: a value with the largest count is chosen to represent the others. Another is to create a vector of ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Procs. of VLDB Conference, New York, August 1998.
....the performance of generated association rules are significantly superior to that of one dimensional rules. 1 Introduction In recent years, data mining has made it possible to discover valuable rules by analyzing huge databases. Efficient algorithms for finding association rules have been proposed [1, 9, 10, 20, 24], and classification and regression trees that use these rules as branching tests have been extensively studied [19, 21, 22]. One of important application fields of data mining is marketing. In particular, we are interested in developing an effective strategy of direct mail distribution ....
....[14] the recent popularity of decision trees such as C4.5 by Quinlan [22] is due to their simplicity and efficiency and one of the advantages of using decision trees is potential interpretability to humans. One dimensional association rules for categorical attributes can be efficiently obtained [20]. On the other hand, as will be shown in this paper, finding two-dimensional association rules for categorical attributes is NP-hard. Nevertheless we shall develop a practically efficient approximation algorithm for obtaining two dimensional association rules for categorical attributes. One of the ....
Y. Morimoto, T. Fukuda, H. Matsuzawa, K. Yoda and T. Tokuyama, "Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases", Proceedings of VLDB 98, New York, USA, August 1998.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama and K. Yoda, "Algorithms for mining association rules for binary segmentations of huge categorical databases", Proc. of VLDB, 1998.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda, "Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases," Proceedings of the VLDB, 1998, New York, NY, pp. 380-391.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for mining association rules for binary segmentations of huge categorical databases. In Proceedings of the 24th VLDB Conference, pages 380--391, 1998.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for mining association rules for binary segmentations of huge categorical databases. In Proceedings of VLDB 1998, pages 380--391, Aug. 1998.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proc. of VLDB, 1998.
No context found.
Y. Morimoto, T. Fukuda, H. Matsuzawa, T. Tokuyama, and K. Yoda. Algorithms for mining association rules for binary segmentations of huge categorical databases. VLDB 1998.