D. Michie, D.J. Spiegelhalter and C.C. Taylor, "Machine learning, neural and statistical classification". Ellis Horwood, London 1994
....index might fall. Figure 2 shows a hypothetical gini curve, and three alive intervals (shaded areas in the figure). Our experiments of the estimation method are summarized in Table 1. The first four small datasets (Letter, Satimage, Segment and Shuttle) in the table are from the STATLOG project [6], and the two large datasets (Function 2 and Function 7) are synthetic datasets described in [5]. In these test cases there were at most N = 2 alive intervals: i) the one whose left boundary (or right boundary, depending on boundary gradient) has gini index gini min (the middle shaded area in ....
D. Michie, D. J. Spiegelhalter, and C. C. Taylor. "Machine Learning, Neural and Statistical Classification. " Ellis Horwood, 1994
....large (multiple gigabytes) and are often archived in deep storage before valuable information can be obtained from them. An objective of spatial data stream mining is to mine such data in near real time prior to deep storage archiving. Classification is one of the important areas of data mining [6,7,8]. In classification task, a training set (or called learning set) is identified for the construction of a classifier. Each record in the learning set has several attributes, one of which, the goal or class label attribute, indicates the class to which each record belongs. The classifier, once ....
....[3] and SPRINT [3, 5] which concentrate on making it possible to mine databases that do not fit in main memory by only requiring sequential scans of the data. Classification has been applied in many fields, such as retail target marketing, customer retention, fraud detection and medical diagnosis [8]. Spatial data is a promising area for classification. In this paper, we propose a decision tree based model to perform classification on spatial data streams. We use the Peano Count Tree (P tree) structure [11] to build the classifier. P trees [11] represent spatial data bit by bit in a ....
D. Michie, D. J. Spiegelhalter, and C. C. Taylor, "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994.
....dragging them to a desired position. Second, as for numerical attributes, the user is not restricted to set just one split point for the selected attribute, but he can set an arbitrary number of split points. Figure 3 depicts the visualization of the DNA training data from the Statlog benchmark [MST 94] which have only categorical attributes. Note that this figure shows only a small subset (the most interesting ones) of the 120 attributes. The visualization indicates that attributes 85 and 90 are good candidates for splitting. In fact, attribute 90 is chosen as the optimal one if using the ....
....clearly fails for more complex information such as the class distribution. We argue that the proposed visualization provides a lot of additional information in a rather compact way. Figure 5 illustrates the visualization of a decision tree for the Segment training data from the Statlog benchmark [MST 94] having 19 numerical attributes. Figure 5. Visualization of a decision tree for the Segment training data root leaf split point inherited split point 11 4 Incorporating Class Semantics We have introduced a new method for visualizing multi dimensional data with a class label such that ....
[Article contains additional citation context not shown here]
Michie D., Spiegelhalter D.J., Taylor C.C.: "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994. See also http://www.ncc.up.pt/liacc/ML/statlog/datasets.html.
....and explicit logical rules are almost never published. Several neural methods have been compared experimentally [1] on the mushroom and the three monk problems benchmark datasets [7] but no comparison with machine learning methods has been given. There is a strong competition from decision trees [8], which are fast, accurate and can easily be converted to sets of logical rules, from inductive methods of machine learning [4] and from systems based on fuzzy [9, 10] and rough sets [11, 12]. Despite this competition neural networks seem to have important advantages, especially for problems ....
....condition) of 94.9%. More complex network gave 5 disjunctive rules for the malignant cases, with benign cases covered by the ELSE condition: R 1 : f 2 6# f 4 4# f 7 2# f 8 5 (100) R 2 : f 2 6# f 5 4# f 7 2# f 8 5 (100) R 3 : f 2 6# f 4 4# f 5 4# f 7 2 (100) R 4 : f 2 # [6, 8] # f 4 4# f 5 4# f 7 2# f 8 5 (100) R 5 : f 2 6# f 4 4# f 5 4# f 7 # [2, 7] # f 8 5 (92.3) The first 4 rules achieve 100% accuracy (i.e. they cover cases of malignant class only); the last rule covers only 39 cases, 36 malignant and 3 benign. The confusion matrix is: P = ....
[Article contains additional citation context not shown here]
D. Michie, D.J. Spiegelhalter and C.C. Taylor, "Machine learning, neural and statistical classification". Ellis Horwood, London 1994
....consist of submodels of high dimensionality. Table 3: Comparison of results for the Pima diabetes detection problem. Approach. Accuracy( Additive Cartesian granule feature Model 79.7 Mass Assignment based MATI [5] 79.7 Oblique Decision Trees [17] 78.5 Neural Net (normalised Data) 78 C4.5 [36] 73 Data browser 70 The Pima diabetes problem is a notoriously difficult machine learning problem. Part of this difficulty arises from the fact the dependent output variable is really a binarised form of another variable which itself is highly indicative of certain types of diabetes but does not ....
D. Michie, D. J. Spiegelhalter and C. C. Taylor (Ed), (1993), "Machine Learning, Neural and Statistical Classification",
....consist of submodels of high dimensionality. Table 3: Comparison of results for the Pima diabetes detection problem. Approach. Accuracy( Additive Cartesian granule feature Model 79.7 Mass Assignment based MATI [4] 79.7 Oblique Decision Trees [16] 78.5 Neural Net (normalised Data) 78 C4.5 [37] 73 Data browser 70 The Pima diabetes problem is a notoriously difficult machine learning problem. Part of this difficulty arises from the fact the dependent output variable is really a binarised form of another variable which itself is highly indicative of certain types of diabetes but does not ....
D. Michie, D. J. Spiegelhalter and C. C. Taylor (Ed), (1993), "Machine Learning, Neural and Statistical Classification", Wiley, New York.
....that take all the features into account (IBk, naive Bayes and Neural Net) had an error rate of less than 4%. The rule induction classifiers ID3 and J4.8 used 4 and 5 of the available attributes respectively, from which to induce their decision trees. Both had very similar error rates. Others [Mic1] have also found the performance of C4.5 and ID3 to be similar. The neural network was the classifier that produced the best results on this data, with an error rate of (just) less than 2%. It should be noted that the error rates generally accepted by haematologists, from flow cytometer based ....
D. Michie, D. J. Spiegelhalter and C. C. Taylor, "Machine Learning, Neural and Statistical Classification", Ellis Horwood Limited, Hemel Hempstead, Hertfordshire, Great Britain, 1994.
....this approach can be used for building the foundations for approximate reasoning. Our approach is based on rough mereology, the recently developed extension of mereology of Leśniewski. 1 INTRODUCTION Different aspects of theory of decision systems are extensively investigated (see e.g. [13], [19], [20], [22], [23], [24], [25], [29], [42], [47], [51]). We adopt here the point of view that decision systems are built as hierarchies of teams of intelligent agents, and we discuss some logical tools for synthesis of this kind of systems. Our approach is computationally ....
....given decision table [3, 4]. We show here how to compute dynamic reducts from reduct (approximations) and how to generate dynamic rules from dynamic reducts. The dynamic reducts have shown their utility in various experiments with data sets of various kinds e.g. market data [14], monk's problems [19], handwritten digits recognition [5] or medical data [3, 4]. The quality of unseen objects classification by decision rules generated from dynamic reducts increases especially when data are very noisy e.g. market data [14]. In all cases we have obtained a substantial reduction of the decision rule ....
Michie D., Spiegelhalter D.J., Taylor C.C., "Machine Learning: Neural and Statistical Classification", Ellis Horwood, New York 1994.
....for classification problems has been extended to include a range of new techniques, such as neural networks and decision tree induction. This observation has led to an increase in the number of empirical comparisons of classification methods on a variety of problems. The European StatLog project [19] can be considered the epitome of this work, comparing 24 techniques on 23 datasets. Unfortunately, the main conclusion of this project was that the performance of the techniques, both in absolute and relative terms, varied considerably for different datasets. As a result, the choice of technique ....
....PO model was fitted using the LOGISTIC procedure of the SAS statistical package [28]. For the DT, the popular C4.5 program [22] was used. The main measure of comparison for the two methods is classification performance on test data independent of the design dataset, as carried out in [17] and [19]. Percentage correct classification (agreement with Dr. James' diagnosis) is the overall measure of performance used; crosstabulations of the computed diagnoses with Dr. James' diagnoses give a more detailed picture of the methods' strengths and weaknesses. However, in addition to this evaluation, ....
D. MICHIE, D. J. SPIEGELHALTER, and C. C. TAYLOR. "Machine Learning, Neural and Statistical Classification". Ellis Horwood, New York, 1994.
....dragging them to a desired position. Second, as for numerical attributes, the user is not restricted to set just one split point for one attribute but he can set an arbitrary number of split points. Figure 2 depicts the visualization of the DNA training data from the STATLOG benchmark [MST 94] which have only categorical attributes. As suggested in the description of the training data, only 60 of the 180 attributes were used. The visualization indicates that attributes 85 and 90 are good candidates for splitting. In fact, attribute 90 is chosen as the optimal one if using the ....
....clearly fails for more complex information such as the class distribution. We argue that the proposed visualization provides a lot of additional information in a rather compact way. Figure 3 illustrates the visualization of a decision tree for the Segment training data from the Statlog benchmark [MST 94] having 19 numerical attributes. 4 Integrating Algorithms into Cooperative Decision Tree Construction We have argued for an effective cooperation between the user and the computer so that both contribute what they do best. Our fundamental paradigm is the user as the supervisor , i.e. the ....
[Article contains additional citation context not shown here]
Michie D., Spiegelhalter D.J., Taylor C.C.: "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994. See also http://www.ncc.up.pt/liacc/ML/statlog/datasets.html.
....and match processes. 5. Experimental Results on Benchmarks The performance of the CMM classifier has been evaluated and compared with the conventional or simple k NN method. Four benchmarks were used in the evaluation, which consist of large sets of real world problems from the Statlog project [9], including a satellite image database, letter image recognition database, shuttle data set and image segmentation data set. Table 2 gives input and output dimensions and numbers of training and test patterns in each data set. In the evaluation we used both the PCI implementation of the CMM ....
Michie D, Spiegelhalter DJ, Taylor CC. "Machine learning, neural and statistical classification (Chapter 9)". New York, Ellis Horwood, 1994.
....index might fall. Figure 2 shows a hypothetical gini curve, and three alive intervals (shaded areas in the figure). Our experiments of the estimation method are summarized in Table 1. The first four small datasets (Letter, Satimage, Segment and Shuttle) in the table are from the STATLOG project [6], and the two large datasets (Function 2 and Function 7) are synthetic datasets described in [5]. In these test cases there were at most N = 2 alive intervals: i) the one whose left boundary (or right boundary, depending on boundary gradient) has gini index gini min (the middle shaded area in ....
....second split 1 3 4 2 3 4 Figure 6. Splitting Matrices Twice into 4 Sub nodes [1] For each attribute i do [2] If (i is the X axis or Y axis of submatrix) then [3] gini = gini index on the submatrix along X or Y axis [4] Else [5] gini = gini Index on attribute i of the parent node [6] End If [7] End For [8] Return the attribute having the minimal gini Index Figure 7. predictSplit(node,submatrix) However, if the first splitting attribute is on any of the Y axis, we won't be able to split the subnodes again without a data scan since we do not have the histogram of attributes ....
D. Michie, D. J. Spiegelhalter, and C. C. Taylor. "Machine Learning, Neural and Statistical Classification. " Ellis Horwood, 1994
....from the network. It is found that the networks can be constructed very rapidly without any need for user interaction. Klaus Peter Huber and Michael Berthold also conducted a comparative study between two learning algorithms for generating rules from data. Many such comparative studies, e.g. (Michie et al. 1994), can be found in the literature, with some reporting very little difference between methods, while others report sharp contrasts. This raised an issue which concerns many of the analysts working in practical problem solving: if there are many methods which appear to be applicable to the problem ....
Michie, D., Spiegelhalter, D. J., and Taylor, C.C. (eds) (1994) "Machine Learning, Neural and Statistical Classification", London: Ellis Horwood.
....computed from the examples that fall at the leaves. Both processes, Pruning the tree and missing values, explore the smoothed class distributions. Another side effect is when considering problems with cost matrices: there are very simple algorithms for minimizing costs that use class probabilities[12]. Pruning is considered to be the most important part of the tree building process at least in noisy domains. Statistics computed at deeper nodes of a tree have low level of significance due to the small number of examples that fall at these nodes. Deeper nodes reflect too much the training set ....
Michie, D., Spiegelhalter, J., Taylor, C., "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994
....terms of accuracy, it was not statistically significant. Pruning the tree and processing missing values explore the smoothed class distributions. Another side effect is when considering problems with cost matrices: there are very simple algorithms for minimizing costs that use class probabilities[9]. Pruning Pruning is considered to be the most important part of the tree building process, at least in noisy domains. Statistics computed at deeper nodes of a tree have low level of significance due to the small number of examples that fall at these nodes. Deeper nodes reflect the training set ....
Michie, D., Spiegelhalter, J., Taylor, C., "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994
No context found.
D. Michie, D.J. Spiegelhalter and C.C. Taylor, "Machine learning, neural and statistical classification". Ellis Horwood, London 1994
No context found.
Michie D., Spiegelhalter D.J., Taylor C.C. "Machine Learning, Neural and Statistical Classification". Ellis Horwood 1994.
No context found.
D. Michie, D.J. Spiegelhalter and C.C. Taylor, "Machine learning, neural and statistical classification". Ellis Horwood, London 1994
No context found.
D. Michie, D. J. Spiegelhalter, and C. C. Taylor, editors. "Machine Learning: Neural and Statistical Classification", Ellis Horwood, London, (1994).
No context found.
D. Michie, D. J. Spiegelhalter, and C. C. Taylor, "Machine learning, neural and statistical classification," 1994, vol. 6, pp. 84-106.
No context found.
D. Michie, D. J. Spiegelhalter & C. C. Taylor, eds, "Machine Learning, Neural and Statistical Classification", Ellis Horwood, 1994.
CiteSeer.IST - Copyright Penn State and NEC