10 citations found.
G. Graefe, A. Linville, and L. Shapiro. Sort versus hash revisited. IEEE Trans. on Knowledge and Data Eng., 6(6):934--944, Dec. 1994.

This paper is cited in the following contexts:
Bucket Skip Merge Join: A Scalable Algorithm for Join.. - Kamath, Ramamritham (1996)

.... database systems [SC90, DLM93]. Several issues related to join processing are surveyed in [ME92]. The issue of which one is the better among merge join and hash join was debated a lot until a recent comprehensive study concluded that there exist dualities between the two schemes [Gra94, GLS94]. Commercial database products currently support both the merge join and hash join schemes [CHH 91]. Thus it can be observed that there has been no work to improve join performance by maintaining additional information and skipping data items during join processing. ....

....value of a bucket has to be changed in the bucket table. If a leaf split/merge occurs then an additional row is added/deleted in the bucket table and the low and high values of the old and the new nodes are changed accordingly. Handling data skew is an important consideration in join algorithms [GLS94]. Since the B tree is balanced, in our case data skew does not have an effect on bucket maintenance. Note that the overhead for maintaining buckets via the bucket table is very minimal compared to the cost of maintaining the B tree itself. The advantage of having a separate bucket table to ....

[Article contains additional citation context not shown here]

G. Graefe, A. Linville, and L. D. Shapiro. Sort versus hash revisited. IEEE Trans. on Knowledge and Data Eng., 1994.


SEEKing the Truth about Ad Hoc Join Costs - Haas, Carey, Livny, Shukla (1993)   (21 citations)

....allocation for hash joins as in [2] and derived an optimal allocation using his cost metric (assuming a fixed number of buckets). While this paper clearly showed that the I/O cost model does affect predictions, it made no attempt to show that the metric proposed was correct. The main focus of [9] was to explore the dualities and differences between sort-based and hash-based join methods. The paper presented interesting discussions of these dualities and gave several possible optimizations to the algorithms as a result. Included were experimental results using the Volcano system that ....

....Hagmann [11] argued for a different, but equally simplistic model, i.e. counting only the number of I/O requests. The predominant transfer-only model has been used for most comparisons of these algorithms, resulting in a certain set of beliefs about their relative merits. While several papers [11, 9, 2] have noted the importance of multi-page I/Os, exploring to a degree the impact of performing I/O in clusters [9, 8, 21], none has studied their implications as thoroughly or examined their impact for the range of join algorithms examined here. In this paper, we will analyze all five of the key ....

[Article contains additional citation context not shown here]

G. Graefe, A. Linville, L. Shapiro, "Sort versus Hash Revisited", IEEE Trans. on Knowledge and Data Eng., 6(6), June 1994.


Diag-Join: An Opportunistic Join Algorithm for 1:N.. - Helmer, Westmann.. (1997)   (4 citations)

....simple nested loop join algorithm, the first improvement was the introduction of the merge join [1]. Later, the hash join [2, 6] and its improvements [21, 25, 30, 37] became alternatives to the merge join. For overviews see [29, 35] and for a comparison between the sort-merge and hash joins see [12, 13]. A lot of effort has also been spent on parallelizing join algorithms based on sorting [9, 27, 28, 33] and hashing [5, 10, 34]. All of these algorithms concentrate on simple join predicates based on the comparison of two atomic values. Predominant is the work on equi-joins, i.e. where the join ....

G. Graefe, A. Linville, and L. Shapiro. Sort versus hash revisited. IEEE Trans. on Data and Knowledge Eng., 6(6):934--944, Dec. 1994.


Evaluation of Main Memory Join Algorithms for Joins with.. - Helmer, Moerkotte (1997)   (9 citations)

....a simple nested loop join algorithm, the first improvement was the introduction of the merge join [1]. Later, the hash join [2, 7] and its improvements [20, 23, 28, 39] became alternatives to the merge join. For overviews see [27, 37] and for a comparison between the sort-merge and hash joins see [13, 14]. A lot of effort has also been spent on parallelizing join algorithms based on sorting [10, 25, 26, 34] and hashing [6, 12, 36]. Another important research area is the development of index structures that allow to accelerate the evaluation of joins [16, 22, 21, 29, 40, 42]. All of these ....

G. Graefe, A. Linville, and L. Shapiro. Sort versus hash revisited. IEEE Trans. on Data and Knowledge Eng., 6(6):934--944, Dec. 1994.


Evaluation of Main Memory Join Algorithms for Joins with Set .. - Helmer, Moerkotte (1996)   (9 citations)

....a simple nested loop join algorithm, the first improvement was the introduction of the merge join [1]. Later, the hash join [2, 7] and its improvements [19, 22, 28, 39] became alternatives to the merge join. For overviews see [27, 37] and for a comparison between the sort-merge and hash joins see [13, 14]. A lot of effort has also been spent on parallelizing join algorithms based on sorting [10, 25, 26, 34] and hashing [6, 12, 36]. Another important research area is the development of index structures that allow to accelerate the evaluation of joins [16, 21, 20, 29, 40, 42]. All of these ....

G. Graefe, A. Linville, and L. Shapiro. Sort versus hash revisited. IEEE Trans. on Data and Knowledge Eng., 6(6):934--944, Dec. 1994.


Memory-Adaptive External Sorting - Pang, Carey, Livny (1993)   (8 citations)

....incurs extra cost, and should therefore merge as few runs as possible (without increasing the number of merge steps) to keep the extra cost down. By merging more runs, naive merging increases the cost of the preliminary steps unnecessarily. Thus, the general rule is to adopt optimized merging [Grae93]. Another important aspect of the merging strategy concerns the choice of input runs. All of the merge steps, other than the final merge, have a choice of input runs and should thus merge the shortest possible runs. Such a choice minimizes the cost of the preliminary merges in two ways: Firstly, ....

....preliminary merge step may in turn be combined by a subsequent merge step, the output run of which may be the input of yet another merge step, and so on. The decision of naive to include more runs in the first preliminary step thus leads to an increase in the cost of each of these affected steps [Grae93]. The more merge steps there are, the larger the number of affected steps becomes, and consequently the higher the penalty of naive gets. For small M values, the number of sorted runs that the merge phase has to combine is large relative to the available memory, as shown in Table 6. This results ....

[Article contains additional citation context not shown here]

G. Graefe, A. Linville, L.D. Shapiro, "Sort versus Hash Revisited", IEEE Transactions on Knowledge and Data Engineering, to appear, 1993.


Query Processing in Firm Real-Time Database Systems - Pang (1994)   (2 citations)

....runs, naive merging increases the cost of the preliminary steps unnecessarily. Thus, the general rule is to adopt optimized merging [Grae91]. [Figure 4.1: Merging Strategies. (a) Naive Merging, (b) Optimized Merging.] Another important aspect of the merging strategy concerns the choice of input runs. All of the merge steps, other than the final merge, have a choice of input runs and should thus merge the shortest possible runs. Such a choice minimizes the cost of the preliminary merges in two ways: Firstly, ....

....preliminary merge step may in turn be combined by a subsequent merge step, the output run of which may be the input of yet another merge step, and so on. The decision of naive to include more runs in the first preliminary step thus leads to an increase in the cost of each of these affected steps [Grae91]. The more merge steps there are, the larger the number of affected steps becomes, and consequently the higher the penalty of naive gets. For small M values, the number of sorted runs that the merge phase has to combine is large relative to the available memory, as shown in Table 4.3. This ....

[Article contains additional citation context not shown here]

G. Graefe, A. Linville, L.D. Shapiro, "Sort versus Hash Revisited", Technical Report CU-CS-534-91, University of Colorado, Boulder, July 1991.


Diag-Join: An Opportunistic Join Algorithm for 1:N.. - Helmer, Westmann.. (1998)   (4 citations)

No context found.

G. Graefe, A. Linville, and L. Shapiro. Sort versus hash revisited. IEEE Trans. on Data and Knowledge Eng., 6(6):934--944, Dec. 1994.


Fast Joins Using Join Indices - Li, Ross (1998)   (7 citations)

No context found.

Graefe, G., Linville, A., and Shapiro, L. (1994). Sort versus hash revisited. IEEE Transactions on Knowledge and Data Engineering, 6(6).
