
Global Data Analysis and the Fragmentation Problem in Decision Tree Induction (1997) (9 citations)
Ricardo Vilalta, Gunnar Blix, Larry Rendell



Abstract: We investigate an inherent limitation of top-down decision tree induction, known as the fragmentation problem, in which the continuous partitioning of the instance space progressively lessens the statistical support of every partial (i.e., disjunctive) hypothesis. We show, both theoretically and empirically, how the fragmentation problem adversely affects predictive accuracy as variation r (a measure of concept difficulty) increases. Applying feature-construction techniques at every tree...

Context of citations to this paper:

...experts in the domain. Others (e.g., Holte, Acker, Porter [16]; Pagallo, Haussler [7]; Murphy, Pazzani [17]; Vilalta, Blix, Rendell [18]) have also reported on the problems associated with unreliably estimating descriptive statistics from small groups of examples and have...

...are either eliminated or alleviated by using compound features constructed automatically. The three datasets are: the DNF9b dataset [21], which has 9 binary features x1, x2, ..., x9. The 512 instances are labeled as follows: a) Class 1: x1 x2 x3 x1 x2 x7...

Cited by:
Rule Induction of Computer Events - Vilalta, Ma, Hellerstein (2001)
Intuitive Representation of Decision Trees Using General Rules .. - Liu, Hu, Hsu (2000)
Feature Transformation and Multivariate Decision Tree Induction - Liu, Setiono

Similar documents (at the sentence level):
11.8%: Learning Algorithms: Generating Flexible And Adaptable Concept.. - Vilalta (1998)

Active bibliography (related documents):
0.5: Quantifying the Value of Constructive Induction, Knowledge, and.. - Kadie (1991)
0.5: GalaII: Integrating the Construction of Boolean and.. - Hu, Kibler (1997)
0.2: Constructing New Attributes for Decision Tree Learning - Zheng (1996)

Similar documents based on text:
0.8: Integrating Feature Construction with Multiple Classifiers.. - Vilalta, Rendell
0.6: The Value of Lookahead Feature Construction in Decision.. - Vilalta, Blix, Rendell (1995)
0.4: On the Importance of Change of Representation in Induction - Pérez, Vilalta, Rendell

Related documents from co-citation:
7: Programs for machine learning - Quinlan - 1993
4: UCI Repository of machine learning databases - Merz, Murphy - 1996
4: A penalty function approach for pruning feedforward neural networks - Setiono - 1994

BibTeX entry:

Vilalta, R., Blix, G. & Rendell, L. (1997). Global data analysis and the fragmentation problem in decision tree induction. In Proceedings of the 9th European Conference on Machine Learning (pp. 312-328). http://citeseer.ist.psu.edu/vilalta97global.html

@inproceedings{ vilaltaglobal,
    author = "Ricardo Vilalta and Gunnar Blix and Larry Rendell",
    title = "Global Data Analysis and the Fragmentation Problem in Decision Tree Induction",
    booktitle = "Proceedings of the 9th European Conference on Machine Learning",
    year = "1997",
    pages = "312--328",
    url = "citeseer.ist.psu.edu/vilalta97global.html" }

Citations (may not include all citations):
1262 Classification And Regression Trees - Breiman, Friedman et al. - 1984
233 The CN2 Induction Algorithm - Clark, Niblett - 1989
147 Boolean Feature Discovery in Empirical Learning - Pagallo, Haussler - 1990
83 Generating Production Rules from Decision Trees - Quinlan - 1987
69 Multivariate Decision Trees - Brodley, Utgoff - 1995
36 Oversearching and Layered Search in Empirical Learning - Quinlan, Cameron-Jones - 1995
34 Lookahead Feature Construction for Learning Hard Concepts - Ragavan, Rendell - 1993
31 An SE-tree Based Characterization of the Induction Problem - Rymon - 1993
31 Learning Hard Concepts through Constructive Induction: Frame.. - Rendell, Seshu - 1990
18 A Probabilistic Theory of Pattern Recognition - Devroye, Györfi et al. - 1996
2 Learning Despite Concept Variation by Finding Structure in A.. - Pérez, Rendell - 1996
1 Techniques for Efficient Feature Construction in Decision Tr.. - Vilalta, Blix et al. - 1997
1 [title lost in extraction], pp. 61--69 - Weiss, Indurkhya - 1993
1 The Multipurpose Incremental Learning System AQ15 and its Te.. - Michalski - 1986





Documents on the same site (http://www.research.ibm.com/people/v/vilalta/publications.html):
Understanding Accuracy Performance Through Concept.. - Vilalta (1999)
A Quantification Of Distance-Bias Between Evaluation Metrics .. - Vilalta, Oblinger (2000)
On the Importance of Change of Representation in Induction - Pérez, Vilalta, Rendell
