Scaling Up Inductive Algorithms: An Overview (1997)  (10 citations)
Foster Provost, Venkateswarlu Kolluri
Knowledge Discovery and Data Mining



View or download:
nyu.edu/~fprovost/Pa...kdd97scaling.ps

From: nyu.edu/~fprovost

Abstract: This paper establishes common ground for researchers addressing the challenge of scaling up inductive data mining algorithms to very large databases, and for practitioners who want to understand the state of the art. We begin with a discussion of important, but often tacit, issues related to scaling up. We then overview existing methods, categorizing them into three main approaches. Finally, we use the overview to recommend how to proceed when dealing with a large problem and where future...

Context of citations to this paper:

...on very large data sets. In fact, there is a large body of literature on attempts to scale up algorithms to handle large data sets [1, 2, 3]. This body of work primarily addresses the issue of how to reduce the high computational costs of traditional learning algorithms so...

...approaches are surveyed in more detail in [Freitas Lavington 98]. A related survey of approaches for scaling up data mining: [Provost Kolluri 97,98]. Divides scalable data mining into three main approaches: 1) Data Partitioning: instance sampling, attribute sampling...

Cited by:
A Hybrid Model for Delivering Internet-based Distributed Data.. - Krishnaswamy (2002)
Distributed Data Mining Systems - Prodromidis (1999)
Effective and Efficient Pruning of Meta-Classifiers in a .. - Prodromidis, Stolfo.. (1998)

Similar documents (at the sentence level):
43.6%:   A Survey of Methods for Scaling Up Inductive Learning Algorithms - Provost, Kolluri (1997)
16.3%:   A Survey of Methods for Scaling Up Inductive Algorithms - Provost, Kolluri (1999)

Active bibliography (related documents):
0.1:   Collective Data Mining: A New Perspective Toward Distributed.. - Kargupta, al (1999)
0.1:   PKDD'98 Tutorial on Scalable, High-Performance Data Mining with.. - Freitas (1998)
0.0:   A Study of Support Vectors on Model Independent Example Selection - Syed, Li, Sung (1999)

Similar documents based on text:
0.2:   Distributed Data Mining: Scaling up and beyond - Provost (1999)
0.2:   Scaling up and Evaluation - Esther Duflo Paper
0.1:   The WoRLD: Knowledge Discovery from Multiple Distributed.. - Aronis, Provost, Buchanan (1997)

Related documents from co-citation:
7:   Programs for machine learning - Quinlan - 1993
6:   Jam: Java agents for meta-learning over distributed databases - Stolfo, Prodromidis et al. - 1997
6:   Classification and Regression Trees - Breiman, Friedman et al. - 1984

BibTeX entry:

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, pages 239--242, 1997. http://citeseer.ist.psu.edu/provost97scaling.html

@inproceedings{ provost97scaling,
    author = "Foster J. Provost and Venkateswarlu Kolluri",
    title = "Scaling Up Inductive Algorithms: An Overview",
    booktitle = "Knowledge Discovery and Data Mining",
    pages = "239--242",
    year = "1997",
    url = "citeseer.ist.psu.edu/provost97scaling.html" }

Citations (may not include all citations):
274   Generalization as search - Mitchell - 1982
216   Very Simple Classification Rules Perform Well on Most Common.. - Holte - 1993
209   Mining Quantitative Association Rules in Large Relational Ta.. - Srikant, Agrawal - 1996
149   Quantifying inductive bias: AI learning algorithms and Valia.. - Haussler - 1988
62   Megainduction: machine learning on very large databases - Catlett - 1991
60   A theory of learning classification rules - Buntine - 1991
53   Knowledge Discovery and Data Mining: Towards a Unifying Fram.. - Fayyad, Piatetsky-Shapiro et al. - 1996
28   Inductive Policy: The pragmatics of bias selection - Provost, Buchanan - 1995
25   Feature subset selection using wrapper model: Overfitting an.. - Kohavi, Sommerfield - 1995
23   Scaling up inductive learning with massive parallelism - Provost, Aronis - 1996
14   Problem solving and rule induction: A unified view - Simon, Lea - 1973
13   A Survey of Methods for Scaling Up Inductive Learning Algori.. - Provost, Kolluri - 1997



Documents on the same site (http://www.stern.nyu.edu/~fprovost):
On Applied Research in Machine Learning - Provost (1998)
Well-Trained PETs: Improving Probability Estimation Trees - Provost, Domingos (2000)
Machine Learning from Imbalanced Data Sets 101 (Extended Abstract) - Provost
