by Daniel C. Zilio, Anant Jhingran, Sriram Padmanabhan
IBM Research Report RC
http://www.db.toronto.edu/~zilio/pub/cfgv1.ps
Abstract:
A shared-nothing database system that leverages knowledge of the partitioning attributes of relations can outperform a system where such knowledge is either not available or not used. The performance improvements are typically obtained by function shipping more database operations (joins, aggregates, etc.), thus minimizing the communication overhead. In such a system, it is critical that the correct partitioning keys are selected so that the query workload is optimized. Previous research has ignored the importance of selecting the partitioning keys and has mostly focused on the degree of declustering. In this study we show that by following a systematic methodology, especially for the partitioning key selection and associated relation grouping issues, the entire data placement strategy for a given database schema and workload can be determined in a very efficient manner. We describe different flavors of this methodology and demonstrate the performance improvements resulting from them.
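To make the role of the partitioning key concrete, the following is a minimal sketch and not the authors' methodology. It assumes a toy `orders` relation, four nodes, and a modulo placement function standing in for a real hash function; all names (`node_of`, `partition`, `rows_shipped_for_join`) are hypothetical. It counts how many rows would have to be sent to another node before an equi-join on `cust_id` could run locally, under two different choices of partitioning key.

```python
NUM_NODES = 4  # assumed cluster size, for illustration only


def node_of(value, num_nodes=NUM_NODES):
    """Map a partitioning-key value to a node (modulo stands in for a hash)."""
    return value % num_nodes


def partition(rows, key, num_nodes=NUM_NODES):
    """Distribute rows across nodes according to the chosen partitioning key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_of(row[key], num_nodes)].append(row)
    return nodes


def rows_shipped_for_join(partitions, join_attr, num_nodes=NUM_NODES):
    """Count rows that are not already on the node where their join value lives."""
    shipped = 0
    for node_id, rows in enumerate(partitions):
        for row in rows:
            if node_of(row[join_attr], num_nodes) != node_id:
                shipped += 1
    return shipped


# Toy relation: 12 orders spread over 4 customers (hypothetical data).
orders = [{"order_id": i, "cust_id": i // 3} for i in range(12)]

# Partitioned on the join attribute: every row is already in place.
shipped_on_join_key = rows_shipped_for_join(partition(orders, "cust_id"), "cust_id")

# Partitioned on a different attribute: rows must be redistributed first.
shipped_on_other_key = rows_shipped_for_join(partition(orders, "order_id"), "cust_id")

print(shipped_on_join_key, shipped_on_other_key)
```

In this sketch, partitioning `orders` on the join attribute leaves nothing to redistribute, so the join can be function shipped to each node, whereas partitioning on `order_id` forces some rows across the network. A real workload contains many queries with competing join attributes, which is why key selection is treated in the abstract as an optimization problem over the whole workload.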