Cheng, J., Fayyad, U. M., Irani, K. B., & Qian, Z. (1988). Improved decision trees: A generalized version of ID3. Proceedings of the Fifth International Conference on Machine Learning (pp. 100-106). Ann Arbor, MI: Morgan Kaufmann.
....implementing them listed below. 3.1 Decision Trees. Decision trees are perhaps the most widely studied inductive learning models in the machine learning community. The literature abounds with papers proposing new models or variations of existing models and case studies using decision trees ([14, 21, 22, 25, 30, 34, 22 40, 43, 49, 50, 51, 53, 89, 93, 98, 99, 100, 101, 102, 104, 105, 106, 107, 109, 110, 111, 112, 113, 114, 118, 120, 123, 126, 129, 130, 131, 133, 134, 136]). For this case study, we use decision tree software from Quinlan and Buntine. Quinlan introduces decision trees and illustrates the use of his C4.5 software for decision trees (c4.5tree) and production rules derived therefrom (c4.5rule) in [105]. Several decision tree algorithms (cart, id3, c4, ....
Jie Cheng et al. (1988). Improved Decision Trees: A Generalized Version of ID3. Proceedings of the 5th International Conference on Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA. 100-106.
....its descendant C4.5 [Quinlan, 1993a] are the best known. They use information-based heuristic functions for test selection. Other members of the TDIDT family include ACLS [Patterson and Niblett, 1983], Assistant [Cestnik, Kononenko, and Bratko, 1987], Expert-Ease, EX-TRAN, RuleMaster, and GID3 [Cheng et al. 1988]. ACLS [Patterson and Niblett, 1983] generalizes ID3 by allowing attributes to be integer valued. Expert-Ease, EX-TRAN, and RuleMaster are commercial derivatives of ACLS. Furthermore, based on ID3, Assistant [Cestnik et al. 1987] introduces mechanisms to handle real-valued and nominal attributes ....
....values. It can split the values of a nominal attribute into two subsets to create a binary test. For null leaves (leaves without any training examples in them) [Cestnik et al. 1987], Assistant applies the Bayesian classification principle when an unseen example falls into a null leaf. GID3 [Cheng et al. 1988] can build more general trees than ID3. Instead of creating one branch for each possible outcome of a nominal-attribute-based test at a decision node, GID3 generates branches only for those outcomes of the test that are relevant to the classification depending on the information measure and a ....
J. Cheng, U.M. Fayyad, K.B. Irani, and Z. Qian, Improved decision trees: a generalized version of ID3. Proceedings of the Fifth International Conference on Machine Learning, San Mateo, CA: Morgan Kaufmann, 100-106.
....3.3 Vacuous extension of belief functions. Let X and Y be two sets of variables such that $Y \subseteq X$. Let $m^{Y}$ be a bba defined on the domain $\Theta_Y$ of Y. The vacuous extension of $m^{Y}$ to $\Theta_X$, denoted $m^{Y \uparrow X}$, is the bba obtained by extending the information in $m^{Y}$ to a larger frame X [4]:

$$m^{Y \uparrow X}(A \times \Theta_{X \setminus Y}) = m^{Y}(A) \quad \text{for } A \subseteq \Theta_Y \tag{19}$$

$$m^{Y \uparrow X}(B) = 0 \quad \text{if } B \text{ is not of the form } A \times \Theta_{X \setminus Y} \tag{20}$$

3.4 Pignistic transformation. The problem of decision making in the context of the Transferable Belief Model is handled by the pignistic transformation. The TBM is based on a two level mental ....
J. Cheng, U. M. Fayyad, K. B. Irani, Z. Qian, Improved decision trees: a generalized version of ID3, Proceedings of the Fifth International Conference on Machine Learning, pp 100-106, June 12-14, 1988.
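The vacuous extension quoted in the excerpt above can be illustrated with a small sketch. It is not taken from the citing paper: the representation (a bba as a dictionary from frozensets of configurations to masses) and the names `vacuous_extension`, `sub_vars`, `all_vars` and `frames` are assumptions made for illustration. Each focal set A on the smaller frame is mapped to the cylinder of all configurations on the larger frame that agree with some element of A, and its mass is kept unchanged; every other subset implicitly has mass 0.

```python
from itertools import product


def vacuous_extension(bba, sub_vars, all_vars, frames):
    """Extend a bba on the product frame of sub_vars to that of all_vars.

    bba      : dict mapping frozenset of tuples (one value per variable in
               sub_vars, in that order) to a mass
    sub_vars : tuple of variable names the bba is defined on (Y)
    all_vars : tuple of variable names of the larger frame (X), Y subset of X
    frames   : dict mapping each variable name to its list of possible values
    (All of these names are hypothetical, chosen for this sketch.)
    """
    extra_vars = [v for v in all_vars if v not in sub_vars]
    # Every combination of values for the variables that Y says nothing about.
    extra_configs = list(product(*(frames[v] for v in extra_vars)))

    extended = {}
    for focal_set, mass in bba.items():
        cylinder = set()
        for config in focal_set:
            known = dict(zip(sub_vars, config))
            for extra in extra_configs:
                full = {**known, **dict(zip(extra_vars, extra))}
                cylinder.add(tuple(full[v] for v in all_vars))
        # The mass of A is transferred unchanged to its cylinder extension.
        extended[frozenset(cylinder)] = mass
    return extended
```

For example, a bba on a single variable with two focal sets extends to two focal sets on the joint frame, each carrying the same mass as before, so the extended masses still sum to 1.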
....the scalability requirements of a data mining environment. The main insight, based on a careful analysis of the algorithms in the literature, is that most (to our knowledge, all) algorithms (including C4.5 [Qui93], CART [BFOS84], CHAID [Mag93], FACT [LV88], ID3 and extensions [Qui79, Qui83, Qui86, CFIQ88, Fay91], SLIQ and Sprint [MAR96, MRA95, SAM96], and QUEST [LS97]) access the data using a common pattern, as described in Figure 1. We present data access algorithms that scale with the size of the database, adapt gracefully to the amount of main memory available, and are not restricted to a ....
....is designed to enable efficient sequential access to ordered attributes in sorted order. Thus, decision tree algorithms that exhibit this access pattern (e.g. CART [BFOS84]) can be implemented with the data management of Sprint. But other decision tree algorithms (e.g. ID3 [Qui86] or GID3 [CFIQ88]) that do not exhibit a sequential access pattern cannot be scaled using this approach. 4 Algorithms. In this section, we present algorithms for two of the three cases listed above. The first three algorithms, RF Write, RF Read and RF Hybrid, require that the AVC group of the root node r (and ....
J. Cheng, U.M. Fayyad, K.B. Irani, and Z. Qian. Improved decision trees: A generalized version of ID3. In Proc. of Machine Learning, 1988.
....The first method was developed at IBM Almaden Research Center by Agrawal and Srikant [2], to which we refer the reader for details. In the second method, test data is generated according to a set of prespecified correlation rules, following standard practice in machine learning experiments [7, 6]. While the purpose of the first method is to simulate the real world, that of the second is to verify that our algorithms do really correctly mine out all the correlation rules, which are known in advance. In the first method, we varied the number of baskets from 10,000 to 100,000 to study the ....
J. Cheng et al. Improved Decision Trees: A Generalized Version of ID3. In Proc. of the Fifth International Conference on Machine Learning. 1988.
....classification algorithms with data warehouse facilities are discussed in Section 2.3. 2.1 Scalable classification tree algorithms. The machine learning community has proposed many algorithms for classification tree induction. Examples include Hunt's Concept Learning System [23], CART [3], ID3 [36, 37, 38, 12] and its extension to C4.5 [39], and FACT [28]. Incremental versions of ID3 include ID4 [44] and ID5 [50]. KATE [29] learns classification trees from complex structured data, while INFERULE [51] learns classification trees from inconclusive data. Classification tree algorithms that address ....
J. Cheng, U. M. Fayyad, K. B. Irani, and Z. Qian. Improved decision trees: a generalized version of ID3. In Proc. Fifth Int. Conf. Machine Learning, pages 100--107, San Mateo, California, 1988.
....are made. One assumption is that the activation of a unit is either very close to 1 or very close to 0. This can restrict the capability of the network since when the sigmoid transfer function is used as the activation function, the activation of a unit can have any value in the interval [0,1]. In this paper, a novel way to understand a neural network is proposed. Understanding a neural network is achieved by having a symbolic representation of it, or extracting rules from it. The proposed algorithm (NNRE, neural network rule extraction) consists of four parts: first, a weight decay ....
....attained. If necessary, the intermediate process can also be explicitly explained by rules R_{i-h} and R_{h-o}. C4.5 and C4.5rules [12] were run on the above three datasets to generate DT rules. Briefly, C4.5 generates a decision tree which C4.5rules generalizes to rules. Since researchers [1, 17, 20] observed that mapping many-valued variables to two-valued variables results in decision trees with higher classification accuracy 3, the same binary coded data for neural networks were used for C4.5 and C4.5rules. Being explicable is only one aspect of understandability. A rule with many ....
J. Cheng, U.M. Fayyad, K.B. Irani, and Z. Qian. Improved decision trees: A generalized version of ID3. In Proceedings of the Fifth International Conference on Machine Learning, pages 100--106. Morgan Kaufmann, 1988.
....how the prediction is attained. If necessary, the intermediate process can also be explicitly explained. C4.5 and C4.5rules [Quinlan, 1993] were run on the above three datasets to generate DT rules. Briefly, C4.5 generates a decision tree which C4.5rules generalizes to rules. Since researchers [Cheng et al. 1988; Shavlik et al. 1991] observed that mapping many-valued variables to two-valued variables results in decision trees with higher classification accuracy 4, the same binary coded data for neural networks were used for C4.5 and C4.5rules. Being explicable is only one aspect of ....
J. Cheng, U.M. Fayyad, K.B. Irani, and Z. Qian. Improved decision trees: A generalized version of ID3. In Proceedings of the Fifth International Conference on Machine Learning, pages 100--106. Morgan Kaufmann, 1988.
....will be measured. The chapter concludes with a review of several prediction task requirements that are closely related to on line classification, but that are outside the scope of this framework. Where possible, the terminology in this chapter was drawn from previous studies of classification [12, 22, 42, 52, 58]. Section 2.1 presents an example that will help to tie in the discussions within this and other chapters in this thesis. Sections 2.2, 2.3, 2.4 describe the input, output and control requirements for these tasks. The focus will be on mandatory requirements, however some of the more common ....
....change the values of the attributes. In Section 4.8 we propose a method that avoids the need for discretization. Hypothesis Generation. The constraint imposed by LazyDT on its hypothesis generator is that it develop nodes for a univariate binary decision tree [12] that applies to event #e. To achieve this, each node performs a true or false test against only one attribute. Within this framework, LazyDT chooses to generate tests in the form of A i # = a ij , where a ij # Domain(A i ) except for the value that the value being used by the event vector, #e i ....
J. Cheng, U. M. Fayyad, K. B. Irani, and Z. Qian. Improved decision trees: a generalized version of ID3. In Proc. Fifth Int. Conf. Machine Learning, pages 100--107, San Mateo, California, 1988.
....was developed at IBM Almaden Research Center by Agrawal and Srikant [2]. In the second method test data was generated according to a set of prespecified correlation rules. This generation method has been widely used, in particular in experiments for machine learning of classification rules [7, 6]. In all experiments the minimum support threshold was set to 25% of the number of transactions. The threshold for CT support was also set to be 25%. We used a confidence level of 90% for the χ² tests. All the experiments were conducted on a Pentium PC with a 200 MHz processor and 64 MB of ....
J. Cheng, U. M. Fayyad, K. B. Irani, and Z. Qian. 1988. Improved Decision Trees: A Generalized Version of ID3. In Proc. of the Fifth International Conference on Machine Learning. Morgan Kaufmann. San Mateo, California. pp. 100--107, 1988
....to be useful in general. There are many aspects of the Backpropagation algorithm which can be changed; for example, number of hidden units, learning rate, and encoding strategies. We have only investigated changing the number 5. At least one variation of ID3 has been proposed which avoids this (Cheng et al. 1988) 6. The BP training curve is not the result of directly testing the net on the training cases, but rather an internal measure used by the back propagation algorithm which tests output patterns against training data without decoding. Thus only the shape of the curve, not its absolute value, should ....
Cheng, J., Fayyad, U., Irani, K., and Qian, Z. (1988). Improved decision trees: A generalized version of ID3. In Proceedings of the Fifth International Conference on Machine Learning, pages 100--106. Morgan Kaufmann.
....The process repeats for the subtrees until all objects in the subtree are from a single class. Utgoff proposed an algorithm for the incremental update of the decision tree based on ID3 [107]. Cheng et al. proposed grouping some branches into one to improve the quality of the induced tree [17]. Smyth and Goodman [102] used a measurement called J-measure to induce classification rules directly from databases. Manago and Kodratoff induced decision trees from complex structured data [70]. Later enhancement of ID3 by Quinlan led to C4.5 which could extract compact classification rules and ....
....and classifies sky objects in digitalized sky images. These images are processed to generate image segmentations. The features of these segmentations are extracted to represent the objects. The objects are classified by a classifier and put into the catalog. SKICAT used a generalized ID3 algorithm [17] to induce a decision tree. Some objects are classified by the experts (astronomers) and used as training data to help the induction of the decision tree. A set of rules is extracted from the decision tree to form the classifier. 2.2.4 Other KDD Systems Many other data mining systems have been ....
J. Cheng, U. M. Fayyad, K. B. Irani, and Z. Qian. Improved decision trees: a generalized version of ID3. In Proc. Fifth Int. Conf. on Machine Learning, pages 100--107, San Mateo, California, 1988.
....easier to understand and communicate to others, and they can also be used as the basis for an experience-based decision support system. Algorithms for learning and refining classification rules from examples include the AQ family (Michalski et al. 1983, 1986), the ID3 family (Quinlan 1993, 1986; Cheng et al. 1988; Fayyad et al. 1993, 1994), and CN2 (Clark and Niblett 1989). The continuing development of rule induction algorithms is motivated by the increasing application of knowledge discovery from database methods (Fayyad et al. 1993; Gemello and Mana 1989; Piatetsky-Shapiro et al. 1991) which apply ....
....attribute value system is created based on all attribute-value pairs whose information gains reach or exceed a threshold. The threshold can be calculated as the product of MaxGain (the maximum information gain of all attribute-value pairs) and the user-specified tolerance level TL (0 ≤ TL ≤ 1) (Cheng et al. 1988; Fayyad 1994). After generalization, the approximate classification tries to partition it into the β-positive, β-boundary, and β-negative regions. If the β-positive region exists, then we can apply the following steps to generate rules. For the sake of convenience, the generalized ....
Cheng, J., Fayyad, U.M., Irani, K.B. and Qian, Z. 1988. "Improved Decision Trees: A Generalized Version of ID3," Proc. of the Fifth International Conference on Machine Learning. Morgan Kaufmann.
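The thresholding rule described in the excerpt above (keep every attribute-value pair whose information gain is at least TL times MaxGain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from Cheng et al.: the gain of a pair is computed here as the gain of the binary test `attribute == value`, and the function names `entropy`, `pair_gain` and `select_pairs` as well as the row-as-dictionary data layout are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def pair_gain(rows, labels, attribute, value):
    """Information gain of the binary test `attribute == value` (an assumption)."""
    inside = [y for row, y in zip(rows, labels) if row[attribute] == value]
    outside = [y for row, y in zip(rows, labels) if row[attribute] != value]
    n = len(labels)
    remainder = (len(inside) / n) * entropy(inside) + (len(outside) / n) * entropy(outside)
    return entropy(labels) - remainder


def select_pairs(rows, labels, tolerance_level):
    """Return (MaxGain, set of pairs whose gain >= tolerance_level * MaxGain)."""
    if not 0.0 <= tolerance_level <= 1.0:
        raise ValueError("tolerance_level must lie in [0, 1]")
    gains = {}
    for row in rows:
        for attribute, value in row.items():
            if (attribute, value) not in gains:
                gains[(attribute, value)] = pair_gain(rows, labels, attribute, value)
    max_gain = max(gains.values())
    threshold = tolerance_level * max_gain
    selected = {pair for pair, gain in gains.items() if gain >= threshold}
    return max_gain, selected
```

With TL = 1 only the best-scoring pairs survive, while TL = 0 keeps every pair, so TL interpolates between a single binary split and a branch per value.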
....This encoding scheme makes it possible to describe instances by any mix of variable types, numeric or symbolic. There is an additional advantage to this encoding scheme; mapping many-valued variables to two-valued variables has been observed to produce trees with higher classification accuracy (Cheng, Fayyad, Irani, & Qian, 1988; Mooney, Shavlik, Towell, & Gove, 1989). The encoded variables are also normalized automatically at each node as part of the encoding process. The scaling is accomplished for each encoded variable by mapping it to the mean of the observed values plus one standard deviation to 1 and the mean minus ....
Cheng, J., Fayyad, U. M., Irani, K. B., & Qian, Z. (1988). Improved decision trees: A generalized version of ID3. Proceedings of the Fifth International Conference on Machine Learning (pp. 100-106). Ann Arbor, MI: Morgan Kaufmann.
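The mapping of a many-valued variable to two-valued variables mentioned in the excerpt above can be sketched as a plain one-indicator-per-value encoding. The exact encoding used by the citing paper is not recoverable from the truncated excerpt, so this is only one plausible reading; the function name `to_two_valued` and the `attribute=value` column naming are assumptions made for illustration.

```python
def to_two_valued(rows, attribute):
    """Replace one many-valued attribute with one 0/1 indicator per observed value.

    rows      : list of dicts mapping attribute names to values
    attribute : name of the many-valued attribute to encode
    Returns new row dicts; the input rows are left unmodified.
    """
    # Sort the observed values so the generated column order is deterministic.
    values = sorted({row[attribute] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attribute}
        for value in values:
            new_row[f"{attribute}={value}"] = int(row[attribute] == value)
        encoded.append(new_row)
    return encoded
```

Each encoded row has exactly one indicator set to 1 among the columns generated for the attribute, so a tree learner that sees the encoded data can only make binary tests on it.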
....induction of rules represented as a decision tree has gained widespread popularity in the machine learning community. Much research has been devoted to improving ID3 to make it more applicable to real world problems by finding ways to deal with noise [37], avoiding some of its inherent weaknesses [8], making it incremental (such as ID4 and ID5 [46]), and handling inconclusive data [42]. ID3's search of the space of decision trees is essentially a hill climbing approach guided locally by an information theoretic measure. Other machine learning algorithms are also guided by similar heuristics and ....
J. Cheng, U.M. Fayyad, K. Irani, and Z. Qian. Improved decision trees: A generalized version of ID3. In Proceedings of the Fifth International Conference on Machine Learning, pages 100--107, San Mateo, Ca., 1988. Morgan Kaufmann.
....that we prefer a different term, action-based hierarchies, for our decision trees. There is a fairly extensive body of literature devoted to exploring different heuristics for decision tree structuring. An information theoretic heuristic is used by various researchers: by Cheng et al. in GID3 [6], by Fayyad in GID3 [19], by Quinlan in C4 [34], by Breiman et al. in CART [4], and by Clark and Niblett in CN2 [9]. Fayyad and Irani introduce a class separation approach (C-SEP in [20]) where the heuristic measures not the information content of a test but the degree to which it separates ....
Cheng, J., Fayyad, U., Irani, K., Qian, Z. Improved decision trees: a generalized version of ID3, Proceedings of the Fifth International Conference on Machine Learning (1988) 100-108.
....first was chosen because it is somewhat of a standard in the machine learning community; the latter two were chosen because they were developed at the University of Michigan and because they compare favorably to other decision tree systems in terms of their predictive accuracy on a range of tasks [32, 33, 18]. ID3 and GID3 both use information gain as the evaluation measure; O-Btree uses the orthogonality measure. ID3 creates standard decision trees that branch on every value of every attribute; GID3 allows for default branches, and O-Btree creates binary trees. Due to the default branches ....
J. Cheng, U. M. Fayyad, K. B. Irani, and Z. Qian. Improved decision trees: A generalized version of ID3. In Proceedings of the Fifth International Conference on Machine Learning, Ann Arbor, MI, pages 100--108. Morgan Kaufmann, San Mateo, CA, 1988.
No context found.
Cheng, J., Fayyad, U. M., Irani, K. B., & Qian, Z. (1988). Improved decision trees: A generalized version of ID3. Proceedings of the Fifth International Conference on Machine Learning (pp. 100-106). Ann Arbor, MI: Morgan Kaufmann.
No context found.
J. Cheng, U.M. Fayyad, K.B. Irani, and Z. Qian, "Improved Decision Trees: A Generalized Version of ID3," Proc. of the Fifth Int'l Conf. on Machine Learning, Morgan Kaufmann, 1988, pp. 100--106.
No context found.
Cheng, J.; Fayyad, U. M.; Irani, K. B.; Qian, Z. (1988). Improved Decision Trees: A Generalized Version of ID3, 5th Int'l Conf. on Machine Learning, pp. 100-106.