  Java as a Basis for Parallel Data Mining in Workstation Clusters

by Matthias Gimbel, Bernhard Haumacher, Peter C. Lockemann, Walter F. Tichy
http://wwwipd.ira.uka.de/~gimbel/1999-1.ps.gz

Abstract:

The exploitation of hidden information from large datasets by means of data mining techniques suffers from long response times. We address this problem by using the processing power of workstation clusters and have studied the performance of OLAP queries as a first step towards a portable data mining platform. The results of our study suggest that, with the availability of parallel workstation clusters equipped with high-performance communication networks, fine-grained and communication-intensive parallelizations of queries are promising, even though they are considered too costly in traditional database systems. The paper describes our Java framework for parallel OLAP-type query execution and the necessary optimizations to the standard Java implementation, and analyzes the performance of non-standard parallel execution schemes on a workstation cluster.
