Abstract: Free Parallel Data Mining. Bin Li, New York University, 1998. Research Advisor: Professor Dennis Shasha. Data mining is the emerging field of applying statistical and artificial intelligence techniques to the problem of finding novel, useful, and non-trivial patterns from large databases. This thesis presents a framework for easily and efficiently parallelizing data mining algorithms. We propose an acyclic directed graph structure, exploration dag (E-dag), to characterize the computation model of...
Context of citations to this paper:
...The resultant single tree is more understandable than the multiple model hybrid. Shasha and his research group have implemented PC4.5 (Li 1998; Shasha 1998) a parallel version of C4.5 (Quinlan 1993) which uses a different instantiation of the framework of Figure 5....
.... architecture based on PLinda which enables parallel data mining programs to run on networks of workstations in a fault tolerant manner [3]. We provide templates for writing parallel data mining programs in PLinda and describe how to run these programs on a network of...
Cited by:
Free Parallel Data Mining - Li (1998)
A Survey of Methods for Scaling Up Inductive Algorithms - Provost, Kolluri (1999)
Active bibliography (related documents):
0.7: Fault-tolerant Parallel Processing Combining Linda, Checkpointing, .. - Jeong (1996)
0.5: Designing A Fault-Tolerant Jini Compute Server - Lazar (2001)
0.5: Semantic Client Caching in a Client-Server-based KDD Architecture - Kamp, Grupe
Similar documents based on text:
0.5: A Qualitative Profile-based Approach to Edge Detection - Yen
0.3: A Middleware for Developing Parallel Data Mining Applications - Jin, Agrawal (2001)
0.1: Lessons from Wall Street: case studies in configuration, tuning.. - Shasha (1997)
Related documents from co-citation:
2: Parallel Depth First Search on Multiprocessors - Kumar, Rao - 1987
2: Induction of Decision Trees - Quinlan - 1986
2: Data mining and statistics: What's the connection - Friedman - 1997
BibTeX entry:
Li, B. (1998). Free Parallel Data Mining. Ph.D. thesis, Department of Computer Science, New York University. http://citeseer.ist.psu.edu/li98free.html
@inproceedings{ li98free,
author = "Bin Li and Dennis Shasha",
title = "Free parallel data mining",
pages = "541--543",
year = "1998",
url = "citeseer.ist.psu.edu/li98free.html" }
Citations (may not include all citations):
2177: C4.5: Programs for Machine Learning - Quinlan - 1993
1359: Induction of decision trees - Quinlan - 1986
921: Mining Association Rules Between Sets of Items in Large Data.. - Agrawal, Imielinski et al. - 1993
901: Transaction Processing: Concepts and Techniques - Gray, Reuter - 1993
566: Condor: a hunter of idle workstations - Litzkow, Livny et al. - 1988
474: Advances in Knowledge Discovery and Data Mining - Piatetsky-Shapiro, Uthurusamy - 1995
468: Memory Consistency and Event Ordering in Scalable Shared-Mem.. - Gharachorloo, Lenoski et al. - 1990
422: Implementation and Performance of Munin - Carter, Bennett et al. - 1991
406: TreadMarks: Distributed Shared Memory on Standard Workstatio.. - Keleher, Dwarkadas et al. - 1994
328: PVM 3 User's Guide and Reference Manual - Geist, Beguelin et al. - 1994
309: Communications of the ACM - Carriero, Gelernter et al. - 1989
300: Lazy Release Consistency for Software Distributed Shared Mem.. - Keleher, Cox et al. - 1992
268: Mining Generalized Association Rules - Srikant, Agrawal - 1995
213: Discovery of multiple-level association rules from large dat.. - Han, Fu - 1995
208: Fast Algorithm for Mining Association Rules in Large Databas.. - Agrawal, Srikant - 1994
203: Multi-interval Discretization of Continuous-valued Attribute.. - Fayyad, Irani - 1993
190: Wadsworth International Group - Breiman, Friedman et al. - 1984
172: A space-economical suffix tree construction algorithm - McCreight - 1976
111: SLIQ: A Fast Scalable Classifier for Data Mining - Mehta, Agrawal et al. - 1996
76: How to Write Parallel Programs - Carriero, Gelernter - 1991
72: On the Handling of Continuous-valued Attributes in Decision Tre.. - Fayyad, Irani - 1992
68: The Attribute Selection Problem in Decision Tree Generation - Fayyad, Irani - 1992
60: Fast Parallel and Serial Approximate String Matching - Landau, Vishkin
51: Combinatorial Pattern Discovery for Scientific Data: Some Pr.. - Wang, Chirn et al. - 1994
51: Combinatorial pattern discovery for scientific data: Some pr.. - Wang, Chirn et al. - 1994
49: CALYPSO: A Novel Software System for Fault-Tolerant Parallel.. - Baratloo, Dasgupta et al. - 1995
45: Experiments in Multistrategy Learning by Meta-Learning - Chan, Stolfo - 1993
44: Adaptive and Reliable Parallel Computing on Networks of Work.. - Blumofe, Lisiecki - 1997
44: Fast Sequential and Parallel Algorithms for Association Rule.. - Mueller - 1995
40: The Human Genome Project and Informatics - Frenkel - 1991
37: A Robust Parallel Programming Model for Dynamic Non-Uniform .. - Kohn, Baden - 1994
35: An effective hash-based algorithm for mining association rul.. - Park, Chen et al. - 1995
34: Linda on Distributed Memory Multiprocessors - Bjornson - 1992
29: Parallel Best-First Search of State-Space Graphs: A Summary .. - Kumar, Rao et al. - 1988
28: Array Classes for Architecture Independent Finite Difference.. - Parsons, Quinlan - 1994
26: Persistent Linda: Linda + transactions + query processing - Anderson, Shasha - 1992
26: Color set size problem with applications to string matching - Hui - 1992
24: Enhancements to the Data Mining Process - John - 1997
19: Task Parallelism and High-Performance Languages - Foster - 1994
16: Artificial Intelligence and Molecular Biology - Hunter - 1993
15: Consistent Linear Speedups to a First Solution in Parallel S.. - Saletore, Kale - 1990
14: Bump Hunting in High-Dimensional Data - Friedman, Fisher
11: Runtime Support for Portable Distributed Data Structures - Wen, Chakrabarti et al. - 1995
11: Adaptive Parallelism with Piranha - Kaminsky - 1994
11: Fault-tolerant Parallel Processing Combining Linda - Jeong - 1996
9: Goblin: A DBPL Designed for Advanced Database Applications - Kersten - 1991
9: Implementing Tuple Space Machines - Carriero - 1987
9: Decision Trees and Multi-valued attributes - Quinlan - 1988
7: Finding Optimal Multi-Splits for Numerical Attributes in Dec.. - Elomaa, Rousu - 1996
7: Computational Approaches to Discovering Semantics in Molecul.. - Lipton, Marr et al. - 1989
6: An Approach to Fault Tolerant Parallel Processing on Intermi.. - Jeong, Shasha et al. - 1997
4: Department of Computer Science - Brown, Jeong et al. - 1996
3: Data Mining and Statistics: What's the Connection - Friedman
3: Towards Scalable and Parallel Inductive Learning: A Case Stu.. - Chan, Stolfo - 1994
2: Shared Tuple Memories: Shared Memories - Leichter - 1989
2: Persistent Linda 2: a transaction/checkpointing approach to f.. - Jeong, Shasha - 1994
2: Parallel Depth First Search on Multiprocessors - Kumar, Rao - 1987
2: Centrum voor Wiskunde en Informatica - Holsheimer, Kersten et al. - 1994
2: Free Parallel Data Mining - Li, Shasha - 1998
2: An Efficient Multithreaded Runtime System - Blumofe - 1995
1: Concurrency Control and Recoverability in Database Systems - Bernstein, Hadzilacos et al. - 1987
1: Data Mining Developments Gain Attention - Makulowich - 1997
1: A Framework for Biological Pattern Discovery on Networks of .. - Li, Shasha et al. - 1997
1: An Efficient Algorithm for Mining Association Rules in Large.. - Savasere, Omiecinski et al. - 1995
1: Data and Knowledge Bases for Genome Mapping: What Lies Ahead - Kamel, Delobel et al. - 1991
1: NASA Ames Research Center TR# FIA - Buntine, Caruana et al. - 1991
1: NyuMiner: Classification Trees by Optimal Sub-K-ary Splits - Li, Cheng et al. - 1998
1: van den Berg and Martin D - Carel - 1994
Documents on the same site (http://www.cs.nyu.edu/csweb/Research/theses.html):
Scheduling for Horizontal Systems: The VLIW Paradigm in.. - Gasperoni (1991)
Dynamic Impact Analysis: Analyzing Error Propagation In Program.. - Goradia (1993)
Matching Algorithms And Feature Match Quality Measures For.. - Keller (1999)
CiteSeer.IST - Copyright Penn State and NEC