Parallelism is key to high-performance relational database systems. Since there are several parallel architectures suitable for database systems, a few interesting problems arise, mostly from an emphasis on the differences among the architectures. Specifically, the literature points out differences rather than similarities between the architectures, and it generally ignores the specific details of a particular architecture that are crucial to high performance. In this thesis we have attempted to remedy this situation by emphasizing the similarities between two popular parallel architectures, shared nothing and shared memory, and by developing a deeper understanding of both from a database perspective. We show that the two architectures are complementary and similar in two ways: software shared-memory support can be used to improve performance on shared-nothing hardware, and shared-nothing software can run on shared-memory hardware with performance comparable to that of "native" algorithms. We also show that by understanding the architectural details and tradeoffs, we can design algorithms that have superior performance. We illustrate this via examples of hash join algorithms on shared-memory hardware that exploit cache memories, hash aggregation algorithms on shared-nothing hardware that trade off communication for memory consumption, and hash aggregation algorithms on shared-memory hardware that trade off computation for reduced latch conflicts. All these algorithms show performance superior to that of the previously known algorithms.