Abstract: Classification is an important data mining problem. Although classification is a well-studied problem, most current classification algorithms require that all or a portion of the entire dataset remain permanently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algorithm, called SPRINT, that removes all of the memory restrictions and is fast and scalable. The algorithm has also been designed to be easily...
Cited by:
Pipelining of Fuzzy ARTMAP without Matchtracking.. - Castro, Secretan, ..
A Data Partitioning Approach to speed up the Fuzzy ARTMAP.. - Castro, al.
The Use of Emerging Patterns in the Analysis of Gene.. - Dong, Li, Wong (2003)
Similar documents (at the sentence level):
65.6%: SPRINT: A Scalable Parallel Classifier for Data Mining - Shafer, Agrawal, Mehta (1996)
Active bibliography (related documents):
0.5: CLOUDS: A Decision Tree Classifier for Large Datasets - Alsabti, Ranka, Singh (1998)
0.3: Effect of Data Skewness and Workload Balance in Parallel Data .. - Cheung, Lee, Xiao
0.3: Parallel Mining of Association Rules - Agrawal, Shafer (1996)
Similar documents based on text:
0.2: The Quest Data Mining System - Agrawal, Mehta, Shafer, Srikant.. (1996)
0.2: Parallel Classification on SMP Systems - Zaki, Ho, Agrawal (1998)
0.1: Data Engineering - March Vol No
Related documents from co-citation:
46: SLIQ: A fast scalable classifier for data mining - Mehta, Agrawal et al. - 1996
41: Programs for machine learning (context) - Quinlan - 1993
39: Classification and Regression Trees (context) - Breiman, Friedman et al. - 1984
BibTeX entry:
J. Shafer, R. Agrawal, M. Mehta. SPRINT: A scalable parallel classifier for data mining. In 22nd VLDB Conference, Sept 1996. http://citeseer.ist.psu.edu/article/shafer96sprint.html
@inproceedings{ shafer96sprint,
author = "John C. Shafer and Rakesh Agrawal and Manish Mehta",
title = "{SPRINT}: {A} Scalable Parallel Classifier for Data Mining",
booktitle = "Proc. 22nd Int. Conf. Very Large Databases, {VLDB}",
month = "3--6~" # sep,
publisher = "Morgan Kaufmann",
editor = "T. M. Vijayaraman and Alejandro P. Buchmann and C. Mohan and Nandlal L. Sarda",
isbn = "1-55860-382-4",
pages = "544--555",
year = "1996",
url = "http://citeseer.ist.psu.edu/article/shafer96sprint.html" }
Citations (may not include all citations):
2177: Programs for Machine Learning (context) - Quinlan - 1993
1359: Induction of decision trees (context) - Quinlan - 1986
1262: Classification and Regression Trees (context) - Breiman, Friedman et al. - 1984
1051: Optimization and Machine Learning (context) - Goldberg, in - 1989
912: MPI: A Message-Passing Interface Standard - Interface - 1994
417: Stochastic Complexity in Statistical Inquiry (context) - Rissanen - 1989
227: An introduction to computing with neural nets (context) - Lippmann - 1987
200: Neural and Statistical Classification (context) - Michie, Spiegelhalter et al. - 1994
145: SPRINT: A scalable parallel classifier for data mining - Shafer, Agrawal et al. - 1996
117: IEEE Transactions on Knowledge and Data Engineering (context) - Agrawal, Imielinski et al. - 1993
111: SLIQ: A fast scalable classifier for data mining - Mehta, Agrawal et al. - 1996
102: Computer Systems that Learn: Classification and Prediction M.. (context) - Weiss, Kulikowski - 1991
95: An interval classifier for database mining applications - Agrawal, Ghosh et al. - 1992
89: The Gamma database machine project - DeWitt, Ghandeharizadeh et al. - 1990
62: Megainduction: Machine Learning on Very Large Databases (context) - Catlett - 1991
54: Metalearning for multistrategy and parallel learning (context) - Chan, Stolfo - 1993
51: Parallel sorting on a shared-nothing architecture using prob.. - DeWitt, Naughton et al. - 1991
45: Experiments on multistrategy learning by metalearning - Chan, Stolfo - 1993
14: Experiments on the costs and benefits of windowing in ID (context) - Wirth, Catlett - 1988
11: Induction over large databases (context) - Quinlan - 1979
9: Distributed tree construction from large data-sets (context) - Fifield - 1992
3: Scalable POWERparallel Systems (context) - Machines - 1995
2: Introduction to IND Version (context) - Research - 1992
1: Classification Algorithms (context) - James - 1985
Documents on the same site (http://www.almaden.ibm.com/cs/people/ragrawal/pubs.html):
Mining Sequential Patterns: Generalizations And Performance.. - Srikant, Agrawal (1996)
Parallel Algorithms for High-dimensional Proximity Joins - Shafer, Agrawal (1997)
On the Computation of Multidimensional Aggregates - Agarwal, Agrawal.. (1996)