Andreas Buja and Yung-Seop Lee. Data mining criteria for tree-based regression and classification. In Proceedings KDD, 2001.
....L to infer a hypothesis from the complete dataset D. That is, $C[I_d(D_1), I_d(D_2), \ldots, I_d(D_n)] = I(D)$. Thus, we can guarantee that $L_d(D_1, \ldots, D_n) = H(C[I_d(D_1), \ldots, I_d(D_n)])$ will be exact with respect to $L(D) = H(I(D))$.
3 Decision Tree Induction from Distributed Data
Decision tree algorithms [4, 5, 6] represent a widely used family of machine learning algorithms for building pattern classifiers from labeled training data. They can also be used to learn associations among different attributes of the data. Some of their advantages over other machine learning techniques include their ability to: ....
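The exactness condition in this excerpt can be illustrated with a minimal sketch. It assumes, purely for illustration, that the information extracted from each site is a table of class counts and that the composition operator is a sum; the names `local_counts` and `combine` are hypothetical and do not come from the cited paper.

```python
from collections import Counter

def local_counts(labels):
    """Assumed stand-in for I_d(D_i): class counts computed at one site."""
    return Counter(labels)

def combine(counts_list):
    """Assumed stand-in for the composition operator C: add the local counts."""
    total = Counter()
    for counts in counts_list:
        total.update(counts)  # Counter.update adds counts key by key
    return total

# Three hypothetical fragments of one dataset, split by rows.
d1 = ["yes", "no", "yes"]
d2 = ["no", "no"]
d3 = ["yes"]

combined = combine([local_counts(d) for d in (d1, d2, d3)])
central = local_counts(d1 + d2 + d3)  # counts from the complete dataset D
```

Under these assumptions `combined` equals `central`, so any hypothesis computed only from the counts is the same whether the data are pooled or kept at their sites.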
....splitting criteria are based on entropy [4], which is used by Quinlan's ID3 algorithm and its variants, and the Gini index [5], which is used by Breiman's CART algorithm, among others. More recently, additional splitting criteria that are useful for exploratory data analysis have been proposed [6]. Consider a set of instances $S$ which is partitioned into $M$ disjoint subsets (classes) $C_1, C_2, \ldots, C_M$ such that $S = \bigcup_{i=1}^{M} C_i$ and $C_i \cap C_j = \emptyset$ for all $i \neq j$. The estimated probability that a randomly chosen instance $s \in S$ belongs to the class $C_j$ is $p_j = |C_j| / |S|$, where $|X|$ denotes the ....
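As a rough sketch of the two criteria named in this excerpt, the following computes the class probabilities $p_j$ from class counts and then the standard entropy and Gini index. The function names are hypothetical, and the formulas are the textbook definitions rather than anything quoted from the citing paper, whose own formulas are cut off here.

```python
import math

def class_probabilities(counts):
    """p_j = |C_j| / |S| for each class, given the class sizes |C_j|."""
    total = sum(counts)
    return [c / total for c in counts]

def entropy(counts):
    """Shannon entropy in bits: -sum_j p_j * log2(p_j), skipping empty classes."""
    return -sum(p * math.log2(p) for p in class_probabilities(counts) if p > 0)

def gini(counts):
    """Gini index: 1 - sum_j p_j^2."""
    return 1.0 - sum(p * p for p in class_probabilities(counts))
```

Both measures are zero for a pure set such as `[4, 0]` and largest for an even split such as `[5, 5]`, which is why a tree learner prefers splits that reduce them.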
A. Buja, Y. Lee, Data Mining Criteria for Tree-Based Regression and Classification, 2000.