Provost, F. J., and Kolluri, V. (1997), "Scaling Up Inductive Algorithms: An Overview", Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), AAAI Press, August 14-17, Newport Beach, California, USA, pp. 239-242.

This paper is cited in the following contexts:
Distributed Data Mining Systems - Prodromidis (1999)

....platforms like an armada of ships. In such situations, it may not be feasible to inspect all of the data at one processing site to compute one primary global concept or model. 1.2 Thesis Statement Meta learning is a technique recently developed [Chan, Stolfo, Wolpert, 1996; Dietterich, 1997; Provost Kolluri, 1997] that deals with the problem of learning useful new information from large and inherently distributed databases. Meta learning aims to compute a number of independent classifiers (concepts or models) by applying learning programs to a collection of independent and inherently distributed databases ....

....classification model. Recently, however, there has been considerable interest in metalearning techniques that combine or integrate an ensemble of models computed by the same or different learning algorithms over a single or multiple data subsets [Chan, Stolfo, Wolpert, 1996; Dietterich, 1997; Provost Kolluri, 1997]. 2.2 Meta Learning Meta learning is itself a learning process. Loosely defined, meta learning is about learning from learned knowledge. The idea is to execute a number of concept learning processes on a number of data subsets, and combine their collective results through an extra level of ....

[Article contains additional citation context not shown here]

Provost, F., and Kolluri, V. 1997. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, 239--242.


Effective and Efficient Pruning of Meta-Classifiers in a .. - Prodromidis, Stolfo.. (1998)

....JAM is the only data mining system to date that employs meta learning as a means to mine distributed databases. However, the literature reports an extensive collection of methods that facilitate the use of inductive learning algorithms for mining very large databases [10]. Provost and Kolluri [32] categorized the available methods into three main groups: methods that rely on the design of fast algorithms, methods that reduce the problem size by partitioning the data, and methods that employ a relational representation. Meta learning can be considered primarily as a method that reduces the ....

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, pages 239--242, 1997.


PKDD'98 Tutorial on Scalable, High-Performance Data Mining with.. - Freitas (1998)

.... than code optimization. Note that some of the above approaches are not mutually exclusive, so we can use two or more of these approaches at the same time. The above approaches are surveyed in more detail in [Freitas Lavington 98]. A related survey of approaches for scaling up data mining [Provost Kolluri 97,98] divides scalable data mining into three main approaches: (1) Data Partitioning: instance sampling, attribute sampling, process samples incrementally, process samples concurrently; (2) Fast Algorithms: restrict model space, powerful search heuristics, optimized representation ....

F.J. Provost and V. Kolluri. Scaling up inductive algorithms: an overview. Proc. 3rd Int. Conf. Knowledge Discovery and Data Mining (KDD-97), 239-242. AAAI Press, 1997.


On the Effect of Data Set Size on Bias and Variance in.. - Brain, Webb

....used relatively small data sets. Therefore, there is little evidence to support the notion that standard versions of common classification algorithms perform well on very large data sets. In fact, there is a large body of literature on attempts to scale up algorithms to handle large data sets [1, 2, 3]. This body of work primarily addresses the issue of how to reduce the high computational costs of traditional learning algorithms so as to make tractable their application to large data sets. However, this begs the question of whether machine learning algorithms developed for small data sets are ....

Provost, F.J. and Kolluri, V (1997) "Scaling Up Inductive Algorithms: An Overview," Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, Newport Beach, CA, pages 239-242. AAAI Press.


A Scalable Bottom-Up Data Mining Algorithm for.. - Giovanni Giuffrida..

....with (1) a small number of records, (2) small value sets and (3) a large number of independent variables. The SQL based nature of KDS is beneficial when the size of a problem is too big to fit in physical memory. Smaller datasets can be better processed by other memory bound induction algorithms [13]. The execution of KDS on a real world large database (1.6 million records, 6 independent variables for a total of 4,334 different values) required a total of about 5 hours on a dual Pentium Pro system with 128Mb of physical memory and over 30Gb of disk storage. For the sake of performance ....

....process). Once Ripper exhausted the physical memory it resorted to using virtual memory (set up to 1 Gb) resulting in a tremendous performance decrease. 3. Related Work Recently, integration of data mining algorithms with relational databases has been receiving attention. Provost and Kolluri [13] mention the problem of mining relational databases (instead of a single flat file) and the integration of KDD with DBMS as a direction in scaling up to very large datasets (when not enough main memory is available). John and Lent [11] propose a middle layer between data mining algorithms and SQL ....

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. KDD-97, 1997.


Effective and Efficient Pruning of Meta-Classifiers in a.. - Prodromidis, Stolfo (1999)

....JAM is the only data mining system to date that employs meta learning as a means to mine distributed databases. However, the literature reports an extensive collection of methods that facilitate the use of inductive learning algorithms for mining very large databases [10]. Provost and Kolluri [32] categorized the available methods into three main groups: methods that rely on the design of fast algorithms, methods that reduce the problem size by partitioning the data, and methods that employ a relational representation. Meta learning can be considered primarily as a method that reduces the ....

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, pages 239--242, 1997.


On the Management of Distributed Learning Agents - Prodromidis (1997)

....group has attempted to take advantage of the benefits of meta learning to develop a distributed data mining system. On the other hand, the literature is quite rich in methods that facilitate the use of inductive learning algorithms for mining very large databases. In fact, Provost and Kolluri [35] have conducted a survey on the available methods and have categorized them into three main groups: the methods that rely on the design of fast algorithms, the methods that reduce the problem size by partitioning the data, and the methods that employ a relational representation. Meta Learning can be ....

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, pages 239--242, 1997.


A Comparative Evaluation of Meta-Learning Strategies over.. - Prodromidis, Stolfo (1999)   (1 citation)

....many realistic problems and databases. One means to address this problem is to apply various inductive learning programs over the distributed subsets of data in parallel and integrate the resulting classification models or classifiers in some principled fashion to boost overall predictive accuracy [8, 20]. This approach has two advantages: first, it uses serial code (standard off the shelf learning programs) at multiple sites without the time consuming process of writing parallel programs, and second, the learning processes can use small subsets of data that can fit in main memory (a data reduction ....

F. Provost and V. Kolluri. Scaling up inductive algorithms: An overview. In Proc. Third Intl. Conf. Knowledge Discovery and Data Mining, pages 239--242, 1997.


A Survey of Methods for Scaling Up Inductive Algorithms - Provost, Kolluri (1999)   (31 citations)  Self-citation (Provost Kolluri)

.... Lederberg, and Djerassi 1976) (Buchanan and Feigenbaum 1978). Examples of MetaDENDRAL style rule learning include the Brute programs (Riddle, Segal, and Etzioni 1994; Segal and Etzioni 1994a), PVM (Weiss, Galen, and Tadepalli 1990), ITRULE (Smyth and Goodman 1992), the RL programs (Clearwater and Provost 1990; Provost and Buchanan 1995; Fawcett and Provost 1997), SE trees (Rymon 1993), and even Schlimmer's determination learning algorithm (Schlimmer 1993). These programs view rule learning as an explicit search of the rule space rooted at the rule with no conditions in the antecedent, with rules ....

....or unnecessary computations the models induced will not be affected. Some such optimizations are remarkable enough to have appeared in published work. Some of the rule space pruning techniques used in MetaDENDRAL style rule learners can be guaranteed not to discard good rules (Clearwater and Provost 1990; Segal and Etzioni 1994b; Webb 1995). Webb (1995) takes this idea even further, introducing techniques for dynamic search space restructuring to maximize the amount of search space removed with each pruning. He shows that it is possible to search exhaustively for the rule that optimizes the ....

Provost, F. and V. Kolluri (1997a). Scaling up inductive algorithms: An overview. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, Menlo Park, CA, pp. 239--242. AAAI Press.


A Hybrid Model for Delivering Internet-based Distributed Data.. - Krishnaswamy (2002)

No context found.

Provost, F. J., and Kolluri, V. (1997), "Scaling Up Inductive Algorithms: An Overview", Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), AAAI Press, August 14-17, Newport Beach, California, USA, pp. 239-242.

CiteSeer.IST - Copyright Penn State and NEC