Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1022--1027. Morgan Kaufmann.
....one experiment, shown in Section 7.3, where I show that using the automatic method is often the better, or at least competitive, choice. The second method, the one I make use of in the majority of my experiments, uses an existing entropy-based method for discretization to find good split points [FI93, KS96]. This method makes use of information-theoretic techniques to analyze the values of a numerical feature and create split points that have high information gain [FI93, KS96]. The pseudocode for the algorithm I use, as derived from Fayyad and Irani [FI93], is shown in Figure 6.2, and works in ....
.... of in the majority of my experiments, uses an existing entropy-based method for discretization to find good split points [FI93, KS96]. This method makes use of information-theoretic techniques to analyze the values of a numerical feature and create split points that have high information gain [FI93, KS96]. The pseudocode for the algorithm I use, as derived from Fayyad and Irani [FI93], is shown in Figure 6.2, and works in the following way: given a set of data, S, pruned to contain the values of only one numerical feature, A, then sorted on the values of A, it finds the partitioning value, #, ....
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on AI, pages 1022--1027. Morgan Kaufmann, 1993.
....highest probability given an instance. It is plausible that it is less important to form intervals dominated by a single class for naive Bayes classifiers than for decision trees or decision rules. Thus discretization methods that pursue pure intervals (containing instances with the same class) [1, 5, 10, 11, 14, 15, 19, 29] might not suit naive Bayes classifiers. Besides, naive Bayes classifiers deem attributes conditionally independent of each other and do not use attribute combinations as predictors. There is no need to calculate the joint probabilities of multiple attribute values. Thus discretization methods that ....
....probabilities. But when the training instances' influence on each interval does not follow the normal distribution, FD's performance can degrade. The number of initial intervals k is a predefined parameter and is set as 10 in our experiments. 4.4 Entropy Minimization Discretization (EMD). EMD [10] evaluates as a candidate cut point the midpoint between each successive pair of the sorted values. For evaluating each candidate cut point, the data are discretized into two intervals and the resulting class information entropy is calculated. A binary discretization is determined by selecting the ....
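The binary step described in the excerpt above can be sketched in Python as follows. This is an illustration only, not code from any of the cited papers: the function names `class_entropy` and `best_binary_cut` are hypothetical, and the sketch assumes that the truncated sentence goes on to select the candidate with the lowest weighted class information entropy.

```python
from collections import Counter
from math import log2


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels inside one interval."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_binary_cut(values, labels):
    """Return (cut_point, weighted_entropy) of the best two-interval split.

    Candidates are midpoints between successive distinct sorted values.
    Returns None when every value is identical (no candidate exists).
    Assumption: "best" means lowest weighted class information entropy.
    """
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    best = None
    for i in range(1, total):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # equal neighbours give no midpoint to test
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * class_entropy(left)
                 + len(right) * class_entropy(right)) / total
        if best is None or score < best[1]:
            best = ((low + high) / 2, score)
    return best


if __name__ == "__main__":
    print(best_binary_cut([1, 2, 3, 10, 11, 12], list("aaabbb")))
```

On the toy data in the `__main__` guard, the two classes are separated perfectly by the midpoint between 3 and 10, so that candidate wins with zero weighted entropy.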
Fayyad, U. M., and Irani, K. B. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (1993), pp. 1022--1027.
....flag indicating whether the parent was cut using a line or a space; 2. the four coordinates of the bounding box (in normalized document space coordinates); 3. the average grey level of the region. Real variables were quantized by running the maximum entropy discretization algorithm described in [8], obtaining discrete attributes with 10 possible realizations each. Note that since the discretization algorithm collects statistics over the available data, it must be run .... 4 Implementation and results. We compared HTMMs to the document decision trees (DDT) algorithm described in [1]. DDT is an ....
U. M. Fayyad and K. B. Irani, "Multi-interval discretization of continuous-valued attributes for classification learning," in Proc. 13th Int. Joint Conf. on Artificial Intelligence, pp. 1022--1027, Morgan Kaufmann, 1993.
....it possible to compute conditional probabilities, we discretize the attributes prior to applying the learning scheme, quantizing the numeric attributes into ranges so that each value of the resulting new attribute represents a range of values of the original numeric attribute. Fayyad and Irani's [8] discretization scheme, which is based on the Minimum Description Length principle, is suitable for this purpose. The naive Bayes learning scheme is a simple application of Bayes' formula. It assumes that the attributes (in this case TF×IDF and distance) are independent given the class. Making ....
Fayyad, U. M. and Irani, K. B. (1993) "Multi-interval discretization of continuous-valued attributes for classification learning." In Proc International Joint Conference on Artificial Intelligence, pp. 1022--1027.
....makes the task of guaranteeing quick response times difficult. In order to solve this problem researchers have taken a two-pronged approach. To minimize the I/O traffic involved in these applications researchers have evaluated the viability of using data reduction techniques such as discretization [FI93] and sampling [ZPOL97b] while sacrificing little in terms of result quality. Simultaneously, to compute results faster, researchers are turning to effective parallelization of existing data mining algorithms [SAM96, ZPOL97, CHN 96]. Modern day enterprises usually contain a cluster of shared ....
....it is often the case that the format in which data is stored for report generation is not the most appropriate format for mining the data. In such cases the data has to be transformed into a form which is acceptable to the application. Furthermore, several applications like Discretization [FI93, SVC97], Clustering [SGIF97], and Similarity Analysis [AF93] may accept a distilled compressed form of the output. Reduction of the output data prior to transmission reduces the bandwidth requirement of the interconnection network and storage requirements at the compute server. The need to ....
Usama M. Fayyad and Keki B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1022-1027, 1993.
....Chapter 1 Introduction. 1.1 Motivation. Among the foremost of the techniques proposed for data mining has been association rule mining. Entire sections in major database conferences are devoted solely to this topic. The technique was first introduced in 1993 by Agrawal, Imielinski and Swami in [AIS93b]. Since then it has generated a lot of research and commercial interest. Algorithms proposed include mining multiple level (using hierarchies) association rules, mining numerical, categorical and interval data, etc. In this thesis, we approach ....
....drilldown to deeper levels of consolidation etc. [Coub]. Interestingly, OLAP was initiated in the commercial world well before it became of interest to researchers. Several products like Express, Comshare, Metaphor, Cognos PowerPlay and Essbase were in the market before the term was first coined in 1993 [CCS93]. The top three current providers are Hyperion Solutions, Oracle and Cognos [Cos00]. Data in OLAP is typically organized in cubes (or hypercubes) and defined over a multidimensional space, consisting of several dimensions. From the start there has been a debate on the underlying architecture ....
U. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93), pages 1022--1027, Chambery, France, 1993. Morgan Kaufmann.
....level of the region associated with the node. Since the HTMM model is formulated for discrete variables only, real variables labeling each node (the last 5 attributes described above) must be quantized. This was accomplished by running the maximum entropy discretization algorithm described in [4], obtaining 5 discrete attributes with 10 values each. It must be remarked that this discretization algorithm collects statistics over the available data and, therefore, it was run using the documents belonging to the training set. 5 Implementation and results. We have used one HTMM, k , for ....
U. M. Fayyad and K. B. Irani, "Multi-interval discretization of continuous-valued attributes for classification learning," in Proc. 13th Int. Joint Conf. on Artificial Intelligence, pp. 1022-1027, Morgan Kaufmann, 1993.
.... A new input attribute is selected to maximize the total significant decrease in the conditional entropy, which is termed conditional mutual information [1]. If an input attribute is continuous, it is partitioned into intervals by a discretization procedure, which is based on the algorithm of [3]. The nodes of a new hidden layer are defined for a Cartesian product of split nodes of the previous hidden layer and values of the new input attribute. The construction stops when there is no candidate input attribute that significantly decreases the conditional entropy of the target attribute. ....
U. Fayyad and K. Irani, "Multi-interval discretization of continuous-valued attributes for classification learning," in Proc. 13th Int. Joint Conference on Artificial Intelligence, San Mateo, CA, 1993, pp. 1022--1027.
....improve the quality of the prediction. 5.2 Experimental setting. We tested five algorithms over 14 datasets from the Irvine repository [2] plus our own credit screening database. The dataset characteristics are described in Table 1. To discretize continuous attributes we tried maximum entropy [7] discretization and equal frequency discretization with 5 intervals. We present the results for equal frequency because it provided better accuracy. For each dataset and algorithm we tested both accuracy and LogScore. LogScore is calculated by adding the minus logarithm of the probability assigned ....
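For comparison with the entropy-based method, equal-frequency discretization as mentioned in the excerpt above can be sketched as below. The excerpt does not say how ties or bin boundaries were handled, so the midpoint placement and the skipping of duplicate boundaries are assumptions of this sketch, and `equal_frequency_cuts` is a hypothetical name.

```python
def equal_frequency_cuts(values, bins=5):
    """Cut points that put roughly the same number of values in each bin.

    Assumptions: each boundary is the midpoint between the two sorted
    neighbours at the bin edge, and edges that fall between equal values
    (or repeat an earlier cut) are skipped, so fewer than bins - 1 cut
    points may be returned.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for b in range(1, bins):
        idx = b * n // bins  # first index of the next bin
        if idx == 0 or idx >= n:
            continue
        low, high = ordered[idx - 1], ordered[idx]
        cut = (low + high) / 2
        if low != high and (not cuts or cut > cuts[-1]):
            cuts.append(cut)
    return cuts


if __name__ == "__main__":
    print(equal_frequency_cuts(range(10), bins=5))
```

With ten distinct values and five bins, each bin receives two values. Unlike the entropy-based method, this procedure never looks at the class labels.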
Usama M. Fayyad and Keki B. Irani. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference of Artificial Intelligence, pages 1022--1027, 1993.
....indicates that the normal distribution is not appropriate in this application. Discretization quantizes a numeric attribute into ranges so that the resulting new attribute can be treated as a nominal one: each value represents a range of values of the original numeric attribute. Kea uses Fayyad and Irani's (1993) discretization scheme, which is based on the Minimum Description Length principle. It recursively splits the attribute into intervals, at each stage minimizing the entropy of the class distribution. It stops splitting when the total cost for encoding both the discretization and the ....
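The recursive scheme summarized in the excerpt above can be sketched as follows. The acceptance test uses the commonly stated form of the Fayyad and Irani criterion, Gain > (log2(N - 1) + Delta) / N with Delta = log2(3^k - 2) - [k Ent(S) - k1 Ent(S1) - k2 Ent(S2)]. The function names are hypothetical, and this is not the Kea implementation or any other cited system's code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Class entropy in bits; an empty list yields 0."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def mdl_accepts(parent, left, right):
    """MDL acceptance test for splitting `parent` into `left` and `right`."""
    n = len(parent)
    gain = entropy(parent) - (len(left) * entropy(left)
                              + len(right) * entropy(right)) / n
    k, k1, k2 = len(set(parent)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * entropy(parent)
                                - k1 * entropy(left) - k2 * entropy(right))
    return gain > (log2(n - 1) + delta) / n


def mdl_cut_points(pairs):
    """Recursively split sorted (value, label) pairs; return sorted cuts."""
    n = len(pairs)
    labels = [label for _, label in pairs]
    best = None  # (weighted entropy, split index)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        score = (i * entropy(labels[:i]) + (n - i) * entropy(labels[i:])) / n
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return []
    i = best[1]
    if not mdl_accepts(labels, labels[:i], labels[i:]):
        return []  # encoding the split costs more than it saves
    cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return mdl_cut_points(pairs[:i]) + [cut] + mdl_cut_points(pairs[i:])


def discretize(values, labels):
    """Entry point: sort by value, then split recursively."""
    return mdl_cut_points(sorted(zip(values, labels)))
```

The stopping rule is what makes the number of intervals data-dependent: a cleanly separated attribute yields a cut, while an attribute whose labels alternate at random typically yields none.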
Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1022--1027. Morgan Kaufmann, 1993.
....of cut points examined. Subsequently we test the practical effects of static pruning. 4.2 Dynamic Pruning. We contrast the multisplitting algorithm without pruning, with static pruning only, and with both pruning schemes. As baseline we use a breadth-first implementation of Fayyad and Irani's [10] widely used heuristic greedy multisplitting method. Keep in mind that this method does not produce optimal partitions, even though the scores of the resulting partitions often are very close to optimal [6]. As the evaluation function we use Information Gain [13], which is convex and cumulative. ....
.... bars represent the relative number of evaluations per attribute for the static pruning algorithm operating on boundary points, the white bars are those of the algorithm where the dynamic candidate pruning is also employed, and the black ones correspond to those of the greedy heuristic selection [10]. [Figure residue: # cut points 25 50 75; Abalone 863.7, Adult 3,673.7, Annealing 27.5, Australian 188.2, Auto insurance 61.3, Breast Wisconsin 9.9, Colic 85.8, Diabetes 156.8, Euthyroid 164.0, German 203.0, Glass 115.3, Heart Cleveland 30.5, Heart Hungarian 26.8, Hepatitis 53.8 ....]
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1022--1027, San Francisco, Calif., 1993. Morgan Kaufmann.
....under the Borland C 5.02 environment, but are also usable with the GNU C compiler. The new classifying algorithms, PatMat and its variant, were compared with C4.5, a famous classifier [8]. We used eight data sets from the UCI ML Repository [5]. The data sets were discretized using the method in [3]. The three algorithms under consideration were applied to each of the eight data sets. In order to compare the performance of the algorithms, we measured the error rate of each classification run. The error rate is defined as the number of wrongly classified records divided by the total number of ....
U. M. Fayyad and K.B. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, IJCAI-93, pp. 1022-1027.
....as decision tree and rule learning algorithms on many datasets [9, 6]. Because the current version of LazyRule only accepts nominal attribute inputs, continuous-valued attributes are discretized as a pre-process in the experiments. The discretization method is based on an entropy-based method [11]. For each pair of training set and test set, both the training set and the test set are discretized by using cut points found from the training set alone. LazyRule with MV or NB uses the N-fold cross-validation method (also called leave-one-out estimation) [4] in the attribute evaluation process ....
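The protocol in the excerpt above (cut points found on the training set only, then applied unchanged to both sets) can be sketched as follows. The cut points and helper name are hypothetical, and the convention that a value equal to a cut point falls into the upper interval is an assumption of this sketch.

```python
from bisect import bisect_right


def discretize_column(values, cut_points):
    """Map each numeric value to the index of its interval.

    `cut_points` must be sorted in ascending order. Interval 0 lies below
    the first cut; a value equal to a cut point goes to the upper interval
    (an assumption; the excerpt does not specify boundary handling).
    """
    return [bisect_right(cut_points, v) for v in values]


if __name__ == "__main__":
    train_cuts = [2.5, 7.0]  # hypothetical cuts learned from training data
    train_column = [0.4, 3.1, 8.8]
    test_column = [1.0, 2.5, 6.9, 9.3]
    # The same cuts are reused for the test set; nothing is refit on it.
    print(discretize_column(train_column, train_cuts))
    print(discretize_column(test_column, train_cuts))
```

Reusing the training cut points keeps test data from influencing the discretization, which would otherwise leak information into the evaluation.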
Fayyad, U.M. & Irani, K.B. Multi-interval discretization of continuous-valued attributes for classification learning. Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1022-1027, 1993. San Mateo, CA: Morgan Kaufmann.
....more than once on a path leading from the root to a leaf) which certainly damages the intelligibility of the result, if nothing else. More importantly, the repetitive binarization of an attribute range cannot guarantee optimal multi-way partitioning even if the binary splits were optimal (cf. Fayyad and Irani, 1992, 1993). This paper tackles the problem of finding optimal (w.r.t. a given evaluation function) multi-way partitionings (up to k intervals) for continuous attribute value ranges. The usual setting in this task is the following. The underlying assumption is that we have sorted our N examples into ....
....level of the tree, and possibly still after that. Even though the example set, which determines the goodness of an attribute, changes dynamically as the construction process proceeds, we do not have to keep sorting the examples over and over again, but can rely on careful bookkeeping instead (Fayyad and Irani, 1993; Fulton, Kasif, and Salzberg, 1995). One remaining topic is the arity, k, of the resulting partition. Usually we would like to penalize increase in k; that is, we would like to keep the arity of the partition relatively low because of the following reasons: The utility of an attribute in class ....
Fayyad, U. and Irani, K. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proc. Thirteenth International Joint Conference on Artificial Intelligence (pp. 1022--1027). San Mateo, CA: Morgan Kaufmann.
....domain at hand has a very high number of candidate cut points. Even a linear time method like binarization can require substantial amount of time. This presents a particular problem for learning algorithms that have to manipulate numerical attributes exhaustively; e.g. optimal [13, 9] or greedy [11] multisplitters in decision tree learning, rule induction, and nearest neighbor methods. The inconvenience for all attribute selection strategies alike is that the time consumption of attribute selection is dominated by the attributes that are the heaviest ones to evaluate. Hence, even a single ....
Fayyad, U. & Irani, K. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proc. Thirteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann.
....runs. As well as naive Bayes for regression, we also ran the state-of-the-art decision tree learner C4.5 Revision 8 (1993) and the standard naive Bayes procedure for classification on these datasets. Our implementation of naive Bayes for classification discretizes numeric attributes using Fayyad and Irani's (1993) method, ignores missing values, and employs the Laplace estimator to avoid zero counts (Domingos & Pazzani, 1997). The results in Table 6 show that C4.5 performs significantly better than naive Bayes for regression on five datasets, and significantly worse on eleven. Compared to naive Bayes for ....
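The role of the Laplace estimator mentioned above can be illustrated with a small sketch for one discretized attribute. It assumes the add-one form of the estimator; the excerpt does not state the exact variant used, and `laplace_conditional` is a hypothetical helper rather than the authors' code.

```python
from collections import Counter


def laplace_conditional(attr_values, labels, n_intervals):
    """Return a function prob(value, label) estimating P(value | label).

    Uses add-one (Laplace) smoothing, so an interval never observed with
    a class still gets a non-zero probability. `n_intervals` is the number
    of intervals produced by the discretization.
    """
    pair_counts = Counter(zip(labels, attr_values))
    class_counts = Counter(labels)

    def prob(value, label):
        return (pair_counts[(label, value)] + 1) / (class_counts[label] + n_intervals)

    return prob


if __name__ == "__main__":
    # Interval indices of one discretized attribute, with class labels.
    prob = laplace_conditional([0, 0, 1], ["a", "a", "b"], n_intervals=2)
    print(prob(1, "a"))  # interval 1 was never seen with class "a"
```

Without smoothing, the unseen combination would get probability zero and wipe out the whole product of conditional probabilities for that class.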
Fayyad, U. M. & Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (pp. 1022--1027). San Francisco: Morgan Kaufmann.
....in performance in continuous-valued attribute domains and nominal attribute domains that motivates the work reported in this paper. Relatively little research on the discretization of continuous-valued attributes has been reported in the machine learning research literature. Two recent works (Fayyad & Irani, 1993; Van de Merckt, 1993) under the framework of decision trees have benefited the work reported here. The objective of this paper is to explore a way in which continuous-valued attributes and nominal attributes can be treated cohesively in instance-based learning. We begin the next section with a ....
....let us examine the methods of discretization of continuous-valued attributes in the next section. 3 Discretization of Continuous Valued Attributes. We describe three discretization methods used in decision trees in this section and use these methods for the experiments described in Section 4. Fayyad and Irani (1993) devised a method for multi-value splitting for continuous-valued attributes. The same criterion (i.e. information gain) as used in binary splitting (Quinlan, 1986) is employed; however, multi-value splitting is realised by recursively applying the same criterion to the subsets of the first split ....
Fayyad, U.M. & Irani, K.B. (1993), Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, in Proceedings of 13th International Joint Conference on Artificial Intelligence, pp. 1022-1027, Morgan Kaufmann.
....ideas easily apply to classifiers that learn a different tree for each class value. 3 GAUSSIAN TAN. The TAN classifier, as described by FGG, applies only to discrete attributes. In experiments run on data sets with continuous attributes, FGG use the prediscretization described by Fayyad and Irani [7] before learning a classifier. In this paper, we attempt to model the continuous attributes directly within the TAN network. To do so, we need to learn CPDs for continuous attributes. In this section, we discuss Gaussian distributions for such CPDs. The theory of training such representations is ....
...., A_k are the continuous attributes in our domain. We denote by A*_1, ..., A*_k the corresponding discretized attributes (i.e. A*_1 is the discretized version of A_1) based on a predetermined discretization policy (e.g. using a standard method, such as Fayyad and Irani's [7]). Given this semantics for the discretized variables, we know that each A*_i is a deterministic function of A_i. That is, A*_i's state corresponds to the interval [x_1, x_2] if and only if A_i ∈ [x_1, x_2]. Thus, even though the discretized variables are not observed in the training ....
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In IJCAI '93. 1993.
....and NB (Ting, 1994; 1997) are used in our experiments. IB1* is a variant of IB1 (Aha, Kibler & Albert, 1991) that incorporates the modified value difference metric (Cost & Salzberg, 1993) and NB is an implementation of the Naive Bayes (Cestnik, 1990) algorithm. Both algorithms include a method (Fayyad & Irani, 1993) for discretising continuous-valued attributes in the preprocessing. This preprocessing improved the performance of the two algorithms in most of the continuous-valued attribute domains studied by Ting (1994). We use the nearest neighbour for making prediction in IB1 and the default settings are ....
Fayyad, U.M. & K.B. Irani (1993), Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, in Proceedings of 13th International Joint Conference on Artificial Intelligence, pp. 1022-1027.
.... ID5, which provides an incremental method for building ID3-type decision trees but differs from ID4 in its method for replacing the test attribute [Utgoff, 1988], GID3 and GID3*, which do not branch on each value of the chosen attribute to reduce the unnecessary sub-division of data [Irani, Cheng, Fayyad, and Qian, 1993], and C4.5, which handles uncertain data [Quinlan, 1993] at the expense of increasing classification error rate. ID3 is a top-down, nonbacktracking decision tree algorithm. One of the problems with ID3 is that the decision tree produced overfits the training examples because it performs a ....
Fayyad, U.M., and Irani, K.B. (1993). "Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning", Proceedings of 13th International Joint Conference on Artificial Intelligence, (Ed. R. Bajcsy) Philadelphia, PA: Morgan Kaufmann 1022-1027.
....] This indicates that the normal distribution is not appropriate in this application. Discretization quantizes a numeric attribute into ranges so that the resulting new attribute can be treated as a nominal one: each value represents a range of values of the original numeric attribute. Kea uses Fayyad and Irani's [1993] discretization scheme, which is based on the Minimum Description Length principle. It recursively splits the attribute into intervals, at each stage minimizing the entropy of the class distribution. It stops splitting when the total cost for encoding both the discretization and the class ....
Usama M. Fayyad and Keki B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1022--1027. Morgan Kaufmann, 1993.
Usama M. Fayyad and Keki B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of IJCAI-93, volume 2, pages 1022--1027. Morgan Kaufmann Publishers, August/September 1993.
U. M. Fayyad, K. B. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, in: Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann, San Francisco, CA, 1993, pp. 1022--1027.
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proc. of the 13th International Joint Conference on Artificial Intelligence IJCAI-93, 1993.
Fayyad, U. M., Irani, K.B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In "Proc. of the 13th International Joint Conference on Artificial Intelligence", Morgan Kaufmann, pp. 1022-1027.
U.M. Fayyad and K.B. Irani, Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, In Proceedings of the Thirteenth International Joint Conference on AI, pp. 1022-1027, Morgan Kaufmann, 1993.
Fayyad, U. M., Irani, K.B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proc. of the 13th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, pp. 1022--1027.