
Practical Skew Handling in Parallel Joins (1992)  (56 citations)
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri
Proceedings of the 18th Conference on Very Large Databases, Morgan Kaufmann Publishers (Los Altos, CA), Vancouver




Abstract: We present an approach to dealing with skew in parallel joins in database systems. Our approach is easily implementable within current parallel DBMS, and performs well on skewed data without degrading the performance of the system on non-skewed data. The main idea is to use multiple algorithms, each specialized for a different degree of skew, and to use a small sample of the relations being joined to determine which algorithm is appropriate. We developed, implemented, and experimented with four ...
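
The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the strategy names, the function names, the default sample size, and the decision rule (treat the input as skewed when a single join-key value accounts for more than one node's fair share of the sample) are all assumptions made for illustration.

```python
import random
from collections import Counter


def top_key_fraction(keys, sample_size=1000, seed=0):
    """Estimate the share of tuples carrying the most frequent join key,
    using a uniform random sample without replacement."""
    if not keys:
        return 0.0
    rng = random.Random(seed)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    most_common_count = Counter(sample).most_common(1)[0][1]
    return most_common_count / len(sample)


def choose_join_strategy(keys, num_nodes, sample_size=1000, seed=0):
    """Pick a (hypothetical) strategy name from a sample-based skew estimate.

    Assumed rule: all tuples with the same key hash to the same node, so if
    one key alone exceeds a node's fair share (1 / num_nodes) of the data,
    plain hash partitioning cannot balance the load.
    """
    fraction = top_key_fraction(keys, sample_size, seed)
    if fraction > 1.0 / num_nodes:
        return "skew_aware"      # hypothetical label for a skew-handling algorithm
    return "hash_partition"      # hypothetical label for the ordinary algorithm


if __name__ == "__main__":
    uniform = list(range(10_000))
    skewed = [7] * 5_000 + list(range(5_000))
    print(choose_join_strategy(uniform, num_nodes=8))
    print(choose_join_strategy(skewed, num_nodes=8))
```

The sketch separates estimation from selection so that the ordinary path costs only one small sample, which matches the abstract's stated goal of not degrading performance on non-skewed data.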

Context of citations to this paper:

...than the degree of declustering of the consumer operator. Creating numerous partitions for handling skew is a well known technique [10]. These numerous, small partitions are distributed amongst the consumer operator instances at initialization time, and those instances are...

...than the degree of declustering of the consumer operator. Creating numerous partitions for handling skew is a well known technique [9]. These numerous, small partitions are distributed amongst the consumer operator instances at initialization time, and those instances are...
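
The "numerous, small partitions" technique mentioned in these excerpts can be sketched as follows. This is an illustration under stated assumptions, not the method of the cited or citing papers: it assumes partition sizes are known or estimated before assignment, uses a CRC32 hash for determinism, and assigns partitions greedily, largest first, to the least-loaded consumer. The function name and the `partitions_per_consumer` parameter are hypothetical.

```python
import heapq
import zlib
from collections import Counter


def assign_partitions(keys, num_consumers, partitions_per_consumer=8):
    """Hash keys into many more partitions than consumers, then place each
    partition on the currently least-loaded consumer (largest partitions first).

    Returns (assignment, loads): a dict mapping partition id -> consumer id,
    and a list with the number of tuples assigned to each consumer.
    """
    num_partitions = num_consumers * partitions_per_consumer

    # Count tuples per partition; CRC32 keeps the mapping stable across runs.
    sizes = Counter(
        zlib.crc32(repr(k).encode("utf-8")) % num_partitions for k in keys
    )

    # Min-heap of (current load, consumer id).
    heap = [(0, c) for c in range(num_consumers)]
    heapq.heapify(heap)

    assignment = {}
    loads = [0] * num_consumers
    for partition, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        load, consumer = heapq.heappop(heap)
        assignment[partition] = consumer
        loads[consumer] = load + size
        heapq.heappush(heap, (load + size, consumer))
    return assignment, loads


if __name__ == "__main__":
    keys = list(range(1_000)) + [42] * 300   # one heavily repeated key
    assignment, loads = assign_partitions(keys, num_consumers=4)
    print(loads)
```

Having more partitions than consumers gives the assignment step room to offset one heavy partition with several light ones. A single very frequent key still lands in one partition, so this balances load only down to the size of the largest partition.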

Cited by:
A Dynamic Load Balancing Strategy for Parallel Datacube.. - Muto, Kitsuregawa (1999)
Skew Handling Techniques in Sort-Merge Join - Wei Li And (2002)
Flux: An Adaptive Partitioning Operator for.. - Shah, Hellerstein, .. (2002)

Similar documents (at the sentence level):
73.9%:   Practical Skew Handling in Parallel Joins - DeWitt, Naughton, Schneider.. (1992)

Active bibliography (related documents):
0.7:   Parallel Query Processing - Yu, Chen, Wolf, Turek (1993)
0.5:   Optimizing Multi-Join Queries in Parallel Relational Databases - Srivastava, Elsesser (1993)
0.4:   Architectural Considerations For Parallel Query Evaluation.. - Shatdal

Similar documents based on text:
0.3:   Design and Evaluation of Alternative Selection Placement.. - Chen, DeWitt, Naughton (2002)
0.2:   An Evaluation of Non-Equijoin Algorithms - DeWitt, Naughton, Schneider (1991)
0.1:   The BUCKY Object-Relational Benchmark - Carey, DeWitt, Naughton.. (1997)

Related documents from co-citation:
21:   Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning - Hua, Lee - 1991
21:   Parallel database systems: The future of high performance database systems - DeWitt, Gray - 1992
18:   A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins - Walton, Dale et al. - 1991

BibTeX entry:

D.J. DeWitt, J.F. Naughton, D.A. Schneider, S. Seshadri, "Practical Skew Handling in Parallel Joins," Proc. 18th VLDB Conf. (1992), pp. 27-40. http://citeseer.ist.psu.edu/article/dewitt92practical.html

@inproceedings{ dewitt92practical,
    author = "D. J. DeWitt and J. F. Naughton and D. A. Schneider and S. Seshadri",
    title = "Practical Skew Handling in Parallel Joins",
    booktitle = "Proceedings of the 18th Conference on Very Large Databases, Morgan Kaufmann pubs. (Los Altos {CA}), Vancouver",
    year = "1992",
    url = "citeseer.ist.psu.edu/article/dewitt92practical.html" }
Citations (may not include all citations):
358   Universal classes of hash functions - Carter, Wegman - 1979
298   Parallel database systems: The future of high performance da.. - DeWitt, Gray - 1992
219   Bounds on multiprocessing timing anomalies - Graham - 1969
153   Sampling Techniques - Cochran - 1977
84   IEEE Transactions on Knowledge and Data Engineering - DeWitt, Ghandeharizadeh et al. - 1990
81   A performance evaluation of four parallel join algorithms in.. - Schneider, DeWitt - 1989
74   The case for shared nothing - Stonebraker - 1986
69   Multiprocessor hash-based join algorithms - DeWitt, Gerber - 1985
55   Application of hash to data base machine and its architectur.. - Kitsuregawa, Tanaka et al. - 1983
54   Design and implementation of the Wisconsin Storage System - Chou, DeWitt et al. - 1985
53   GAMMA --- a high performance dataflow database machine - DeWitt, Gerber et al. - 1986
42   A taxonomy and performance model of data skew effects in par.. - Walton, Dale et al. - 1991
34   Handling data skew in multiprocessor database computers usin.. - Hua, Lee - 1991
29   A performance analysis of the GAMMA database machine - DeWitt, Ghandeharizadeh et al. - 1988
24   Distributed query processing in a relational database system - Epstein, Stonebraker et al. - 1978
24   An effective algorithm for parallelizing hash joins in the p.. - Wolf, Dias et al. - 1990
19   Bucket spreading parallel hash: A new - Kitsuregawa, Ogawa - 1990
16   Random sampling from B+-trees - Olken, Rotem - 1989
14   Performance analysis of a load balancing hash-join algorithm.. - Omiecinski - 1991
13   Operating System Review - Blasgen, Gray et al. - 1979
11   Random sampling from hash files - Olken, Rotem et al. - 1990
10   Sampling issues in parallel database systems - Seshadri, Naughton - 1992
9   IEEE Transactions on Knowledge and Data Engineering - Lakshmi, Yu et al. - 1990
8   Algebra operations on a parallel computer -- performance eva.. - Bratbergsengen - 1987
7   A comparison of non-equijoin algorithms - DeWitt, Naughton et al. - 1991
2   Join on a cube: Analysis - Baru, Frieder et al. - 1987
2   Parallel external sorting using probabilistic splitting - DeWitt, Naughton et al. - 1991





