Brodley C. E. and Utgoff P.E. Multivariate versus univariate decision tree (technical report 92-8). Technical report, University of Massachusetts at Amherst, Amherst, Massachusetts 01003 USA, 1992.
....a problem inherent in any hill climbing procedure: it can be deceived by local maxima. Therefore, although the hypothesis space LMDT searches includes univariate decision trees, the heuristic nature of LMDT's search may result in selecting a test from an inappropriate part of the hypothesis space [7]. Overfitting training data is a profound problem and avoiding it an important component of any learning algorithm. During training, the Q # algorithm adds a new prototype into the set of existing prototypes if the first prototype nearest the training instance wrongly classifies this ....
Brodley C. E. and Utgoff P.E. Multivariate versus univariate decision tree (technical report 92-8). Technical report, University of Massachusetts at Amherst, Amherst, Massachusetts 01003 USA, 1992.
....as good a split as OC1 because of local minima, and (2) the impurity of the split found by CART LC s does not monotonically decrease with time. LMDT: Another oblique decision tree algorithm, one that uses a very different approach from CART LC, is the Linear Machine Decision Trees (LMDT) system [483, 48], which is a successor to the Perceptron Tree method [480, 482]. Each internal node in an LMDT tree is a Linear Machine [364]. The training algorithm presents examples repeatedly at each node until the linear machine converges. Because convergence cannot be guaranteed, LMDT uses heuristics to ....
....which we know the correct concept definition. This allows us to quantify more precisely how the parameters of our algorithm affect its performance. A second purpose of this experiment is to compare OC1's search strategy with that of two existing oblique decision tree induction systems, LMDT [48] and SADT [204]. We show that the quality of trees induced by OC1 is as good as, if not better than, that of the trees induced by these existing systems on three artificial domains. We also show that OC1 achieves a good balance between amount of effort expended in search and the quality of the ....
CARLA E. BRODLEY AND PAUL E. UTGOFF. Multivariate versus univariate decision trees. Technical Report COINS-CR-92-8, Dept. of Computer Science, University of Massachusetts, Amherst, MA, January 1992.
....decision trees, but it uses at least M of N representations. LFC (Ragavan and Rendell, 1993) uses negation and conjunction as constructive operators. It creates one conjunction for each decision node by using a directed lookahead search method. Another multivariate tree learning algorithm is LMDT (Brodley and Utgoff, 1992) that generates a linear machine at each decision node. There are also constructive rule learning methods that use the data driven constructive strategy. AQ17 DCI (Bloedorn and Michalski, 1998a; 1998b) is a system of this type. 15 Examples of this selected class are referred to as positive ....
BRODLEY, C.E. and UTGOFF, P.E. (1992): Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA.
....using constructive induction methods is referred to as constructive decision tree learning, while that using selective induction methods is referred to as selective decision tree learning. Decision trees built by selective decision tree learning algorithms such as C4.5 are called univariate trees [Brodley and Utgoff, 1992] since a test at each decision node is based on a single primitive attribute. By contrast, constructive decision tree learning algorithms such as Fringe and ID2-of-3 create multivariate trees where tests consist of multiple primitive attributes. Figure 1.3 on page 12 shows a simple multivariate ....
.... of constructing new continuous valued attributes using mathematical operators (e.g. Bacon [Langley, Simon, Bradshaw, and Zytkow, 1987], Induce [Michalski, 1978]) or attribute counting attributes (e.g. Induce, AQ17 dci, and AQ17 mci [Bloedorn et al. 1993]). In addition, systems such as Lmdt [Brodley and Utgoff, 1992], Swap1 [Indurkhya and Weiss, 1991] and Ccaf [Yip and Webb, 1994] construct linear machines, linear discriminant functions, or canonical discriminant functions as new attributes. Note that linear machines, as tests, have multiple values, one for each class. They can be considered as nominal ....
[Article contains additional citation context not shown here]
C.E. Brodley and P.E. Utgoff, Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA.
....ice cream in the restaurant, paying for just what you ordered. The emphasis of the research in the machine learning and statistics community has been on improving the accuracy of classifiers. Many studies have been performed to determine which algorithm has the highest prediction accuracy [SMT91, BU92, DBP93, CM94, MST94]. These studies indicate that no algorithm is uniformly most accurate over all the datasets studied. Mehta et al. also show quality studies [MRA95, MAR96] which indicate that the accuracy of the decision tree built by Sprint is not uniformly superior. We have therefore ....
C.E. Brodley and P.E. Utgoff. Multivariate versus univariate decision trees. TR 8, Department of Computer Science, University of Massachusetts, 1992.
....a classifier. Usually there is a tradeoff between accuracy and efficiency, since improved accuracy tends to require more cost. As indicated by numerous studies comparing classifiers based on their classification accuracy, no single method has been found to outperform all others in all situations [46, 5, 13, 14, 34]. Conciseness measures the size of a classifier. In the case of a classification tree, the tree size is reflected in both the number of leaf nodes and the depth of a tree. Interpretability refers to the level of understanding and insight that is provided by the learned model. Usually conciseness ....
C. E. Brodley and P. E. Utgoff. Multivariate versus univariate decision trees. In Technical Report 8, Department of Computer Science, Univ. of Massachusetts, 1992.
....Basis Function Network [3] Therefore, the 2 result of StatLog tell us nothing about any remaining algorithms. It is also a fact that some results of comparative studies accompanying software packages. However, these publications only make the comparison between two or three algorithms [21] [22] [9] [28] [26] [24] [25] and are seldom selected to be impartial to the product. The lack of information about comparative studies of public domain data mining tools 1 is the main motivation of this research. In addition, a characterization of these analysis tools and their suitability to ....
....the cross validation process is costly in regard to computing time and OC1 is unable to run on very large data (e.g. the Letter Recognition database). OC1 is supplied with all necessary facilities for knowledge reuse and classification from unlabeled samples. 2.2.4 LMDT General Information: LMDT [9] [26] stands for Linear Machine Decision Tree and is an algorithm for inducing multi class decision trees with multivariate tests at internal decision nodes. LMDT was created by Paul E. Utgoff and Carla E. Brodley and can be obtained through contact with the authors by e-mail: brodley ecn.purdue.edu. ....
[Article contains additional citation context not shown here]
Brodley C. E. and Utgoff P.E. Multivariate versus univariate decision tree (technical report 92-8). Technical report, University of Massachusetts at Amherst, Amherst, Massachusetts 01003 USA, 1992.
....the soft entropy function, a continuous approximation to the entropy criterion used previously in decision tree induction. Previous work in neural trees has generally used off-the-shelf training algorithms for neural networks, thus minimizing a mean squared error criterion (Sankar & Mammone 1991, Brodley & Utgoff 1992). Soft entropy is to be preferred over MSE for several reasons. Figure 1 shows a dataset and the hyperplanes (neural nets with no hidden units) minimizing MSE and soft entropy. Because the number of grey patterns in the middle is outnumbered by both clusters of black patterns on either side, any ....
Brodley, C. E. & Utgoff, P. E. (1992), Multivariate versus univariate decision trees, Technical Report COINS TR 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA, 01003.
....on the accuracy of the classifiers. One study, called the StatLog project (Michie, Spiegelhalter and Taylor, 1994), compared the accuracy of several decision tree classifiers against some non decision tree classifiers on a large number of datasets. Other studies that are smaller in scale include Brodley and Utgoff (1992), Brown, Corruble and Pittard (1993), Curram and Mingers (1994) and Shavlik, Mooney and Towell (1991). Supported in part by grants from the U. S. Army Research Office and Pfizer, Inc. and a University of Wisconsin Vilas Associateship. Supported in part by Republic of China National Science ....
Brodley, C. E. and Utgoff, P. E. (1992). Multivariate versus univariate decision trees, Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA.
....is on the accuracy of the algorithms. One study, called the StatLog Project (Michie, Spiegelhalter and Taylor, 1994), compares the accuracy of several decision tree algorithms against some non decision tree algorithms on a large number of datasets. Other studies that are smaller in scale include Brodley and Utgoff (1992), Brown, Corruble and Pittard (1993), Curram and Mingers (1994) and Shavlik, Mooney and Towell (1991). Recently, comprehensibility of the tree structures has received some attention. Comprehensibility typically decreases with increase in tree size and complexity. If two trees employ the same kind ....
Brodley, C. E. and Utgoff, P. E. (1992). Multivariate versus univariate decision trees, Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA.
....to promise more generality as each node in our tree implements a separate linear discriminant function while only the leaves of a Perceptron Tree have this generality and the remaining nodes in both the Perceptron Tree and the trees produced by ID3 perform a test on only one feature. Recently, Brodley and Utgoff (1992) have also shown that the use of multivariate tests at each node of a decision tree often provides greater generalization when learning concepts in which there are irrelevant attributes. Furthermore, as presented in (Brent 1990), we show how such an LTU tree can be transformed into a three-layer ....
....k $\Delta q_k = \mathit{lrate} \cdot O_k (1 - O_k)(d - O_k)$, where w is the weight vector being updated and x is a given instance vector. We set lrate = 1.0 and momentum = 0.5 in our experiments. There are many possible extensions to this LTU tree-building algorithm, including irrelevant attribute elimination (Brodley & Utgoff 1992), producing several hyperplanes at each node using different weight updating procedures and selecting the hyperplane which causes the fewest number of incorrect classifications, using Bayesian analysis to determine instance separations (Langley 1992), post-processing of the tree to reduce its ....
Brodley, C. E., and Utgoff, P. E. 1992. Multivariate Versus Univariate Decision Trees. COINS Technical Report 92-8, Dept. of Computer Science, Univ. of Mass.
....ice cream in the restaurant, paying for just what you ordered. The emphasis of the research in the machine learning and statistics community has been on improving the accuracy of classifiers. Many studies have been performed to determine which algorithm has the highest prediction accuracy [SMT91, BU92, DBP93, CM94, MST94]. These studies indicate that no algorithm is uniformly most accurate over all the datasets studied. Mehta et al. also show quality studies [MRA95, MAR96] which indicate that the accuracy of the decision tree built by Sprint is not uniformly superior. We have therefore ....
C.E. Brodley and P.E. Utgoff. Multivariate versus univariate decision trees. TR 8, Department of Computer Science, University of Massachusetts, 1992.
....number of single feature tests may be required to properly separate the data set, while just a few oblique separating functions could perform just as well. Such multivariate decision trees have only very recently begun to attract the attention of researchers in the machine learning community [BU, 1992; 1994]. Furthermore, we show how any such TLU trees can be mechanically transformed into a three-layer neural network as first suggested by Brent [BR, 1990] and developed by Sahami [SA, 1993]. In our investigation, we compare several different methods for learning the linear discriminant at each ....
Brodley, C.E., and Utgoff, P.E. 1992. Multivariate Versus Univariate Decision Trees. COINS Technical Report 92-8, Computer Science Dept., UMass.
....jumps to escape local minima are marked. Note that the impurity of OC1's perturbations is monotonically decreasing unlike that of CART LC. LMDT: Another oblique decision tree algorithm, one that uses a very different approach from CART LC, is the Linear Machine Decision Trees (LMDT) system [483, 48], which is a successor to the Perceptron Tree method [480, 482]. Each internal node in an LMDT tree is a Linear Machine [364]. The training algorithm presents examples repeatedly at each node until the linear machine converges. Because convergence cannot be guaranteed, LMDT uses heuristics to ....
....which we know the correct concept definition. This allows us to quantify more precisely how the parameters of our algorithm affect its performance. A second purpose of this experiment is to compare OC1's search strategy with that of two existing oblique decision tree induction systems, LMDT [48] and SADT [204]. We show that the quality of trees induced by OC1 is as good as, if not better than, that of the trees induced by these existing systems on three artificial domains. We also show that OC1 achieves a good balance between amount of effort expended in search and the quality of the ....
Carla E. Brodley and Paul E. Utgoff. Multivariate versus univariate decision trees. Technical Report COINS-CR-92-8, Dept. of Computer Science, University of Massachusetts, Amherst, MA, January 1992.
....function, a new continuous approximation to the entropy criterion used previously in univariate decision tree induction. Previous work in multivariate decision trees has used off-the-shelf training algorithms for perceptrons (linear discriminants), thus minimizing a mean squared error criterion (Brodley & Utgoff 1992). Soft entropy is to be preferred over MSE for several reasons. Figure 1 shows a dataset and the linear discriminants minimizing MSE and soft entropy. Clearly the objective of partitioning in a decision tree is not to minimize error immediately (which is what the MSE solution gives) but rather to ....
Brodley, C. E. & Utgoff, P. E. (1992), Multivariate versus univariate decision trees, Technical Report COINS TR 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA, 01003.
....a problem inherent in any hill climbing procedure: it can be deceived by local maxima. Therefore, although the hypothesis space LMDT searches includes univariate decision trees, the heuristic nature of LMDT's search may result in selecting a test from an inappropriate part of the hypothesis space [7]. Overfitting training data is a profound problem and avoiding it an important component of any learning algorithm. During training, the Q algorithm adds a new prototype into the set of existing prototypes ....
Brodley C. E. and Utgoff P.E. Multivariate versus univariate decision tree (technical report 92-8). Technical report, University of Massachusetts at Amherst, Amherst, Massachusetts 01003 USA, 1992.
....to completely classify the data, and hence fewer nodes in the decision tree. This notion of using multivariate as opposed to univariate separating functions in the induction of decision trees has only very recently begun to attract the attention of researchers in the machine learning community [2, 3]. While earlier attempts in this direction have been made, most notably with the Perceptron Tree algorithm [14], even the Perceptron Tree does not capture the full generality of our structure. In Perceptron Trees only the leaf nodes of the tree may implement a general linear discriminant function. ....
....only the leaf nodes of the tree may implement a general linear discriminant function. All internal nodes of the tree may only use univariate tests (similar to ID3) to shatter the instance space, again causing this algorithm to suffer from the same shortcomings as ID3. Recently, Brodley and Utgoff [2] have shown that learning multivariate (as opposed to univariate) decision trees also has the potential for greater generalization capabilities, but the results of these experiments are still preliminary and conflicting results have also been reported [12]. This report helps to address these ....
Brodley, C. E., and Utgoff, P. E. 1992. Multivariate Versus Univariate Decision Trees. COINS Technical Report 92-8, Dept. of Computer Science, U. Mass.
.... is concerned, related work includes the Fringe family of algorithms such as Fringe, Dual Fringe, Symmetric Fringe [2, 5], SymFringe, DCFringe [6], and SFringe [7], the Citre algorithm [1, 14], the CI algorithms [3, 7], the LFC algorithm [15], the ID2-of-3 algorithm [16], the Lmdt algorithm [17], and the XofN algorithm [10]. They use different constructive operators and different strategies to create new attributes. As far as systematic search is concerned, the closest related work is Opus [4]. Some ideas in CAT about the systematic search with pruning are from it. Opus carries out ....
C.E. Brodley and P.E. Utgoff, Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA, 1992.
....decision trees, but it uses at least M of N representations. LFC (Ragavan and Rendell, 1993) uses negation and conjunction as constructive operators. It creates one conjunction for each decision node by using a directed lookahead search method. Another multivariate tree learning algorithm is Lmdt (Brodley and Utgoff, 1992) that generates a linear machine at each decision node. 7. Conclusions. Based on the following three observations, this paper has proposed a novel fixed rule based approach to constructing conjunctions as new attributes for decision tree learning. 1) Existing hypothesis driven constructive ....
Brodley, C.E. and Utgoff, P.E. (1992): Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA.
....are true. M of N representations are also boolean attributes. A few systems such as BACON [Langley et al. 1987] and INDUCE [Michalski, 1978] explore methods to construct new continuous valued attributes using mathematical operators such as multiplication and division. Systems such as LMDT [Brodley and Utgoff, 1992] and Swap1 [Indurkhya and Weiss, 1991] construct linear discriminant functions as new attributes. To the best of the author's knowledge, few researchers have developed constructive induction systems that construct new nominal attributes. One exception is that INDUCE, AQ17 DCI, and AQ17 MCI ....
....and previously constructed new attributes. After growing a tree, XofN applies the pruning mechanism used by C4.5 [Quinlan, 1993]. Decision trees built by conventional tree learning algorithms such as C4.5 use a test based on one attribute at each decision node. They are called univariate trees [Brodley and Utgoff, 1992]. By contrast, XofN creates multivariate trees in which tests can refer to multiple attributes. We call them X of N trees as X of N representations are used as multivariate tests. At each decision node, besides creating one new X of N attribute, XofN considers reusing X of N attributes constructed ....
[Article contains additional citation context not shown here]
C.E. Brodley and P.E. Utgoff, Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, Massachusetts, USA, 1992.
....for finding the appropriate representational bias for each test in the decision tree. Specifically, rather than search the space of multivariate tests using a fixed bias (like LMDT), such a system would have the capability to focus its search using heuristic measures of the learning process (Brodley & Utgoff, 1992). The problem of bias is not restricted to decision trees. It is a well-known problem that the ability of a chosen algorithm to induce a good generalization depends on how well the hypothesis space underlying the learning algorithm and the bias for searching that space fit the given task. Given ....
Brodley, C. E., & Utgoff, P. E. (1992). Multivariate versus univariate decision trees, (Coins Technical Report 92-8), Amherst, MA: University of Massachusetts, Department of Computer and Information Science.
....is the combined model classes of linear discriminant functions, decision trees and instance-based classifiers. We choose these model classes because previous research has illustrated that each has different strengths (Schlimmer, 1987; Pagallo, 1990; Utgoff, 1989; Utgoff & Brodley, 1990; Aha, 1990; Brodley & Utgoff, 1992). Because the ease with which the different representation languages can describe a particular concept varies, combining them allows the system to learn good generalizations for a wider class of learning tasks (Utgoff, 1989). To validate our hypothesis we will evaluate the implemented system ....
....The control strategy for the perceptron tree algorithm employs a fixed order selection strategy; it first tries an LTU, if that fails, it then grows a symbolic decision tree node. There are two successor systems to the original perceptron tree algorithm, PT2 (Utgoff & Brodley, 1990) and LMDT (Brodley & Utgoff, 1992). PT2 induces decision trees in which each node in the tree is an LTU. At each node in the tree, the algorithm trains an LTU based on all n of the input variables. When a good set of weights has been found, as measured by Gallant's pocket algorithm (Gallant, 1986), the system then tries to ....
[Article contains additional citation context not shown here]
Brodley, C. E., & Utgoff, P. E. (1992). Multivariate versus univariate decision trees, (Coins Technical Report 92-8), Amherst, MA: University of Massachusetts, Department of Computer and Information Science.
No context found.
Brodley C. E. and Utgoff P.E. Multivariate versus univariate decision tree (technical report 92-8). Technical report, University of Massachusetts at Amherst, Amherst, Massachusetts 01003 USA, 1992.
No context found.
C. E. Brodley and P. E. Utgoff. Multivariate versus univariate decision trees. Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA, 1992.
No context found.
C.E. Brodley and P.E. Utgoff, Multivariate versus univariate decision trees. COINS Technical Report 92-8, Department of Computer Science, University of Massachusetts, Amherst, MA (1992).