Pang, H., Carey, M.J. and Livny, M. "Partially Preemptible Hash Joins" SIGMOD, May 1993.
....can adapt to changing memory conditions. It can use additional memory pages that become available during its processing or give up memory pages when they are needed elsewhere, at the cost of additional disk accesses to store intermediate results and manage additional passes over the relations [Zeller90, Pang93]. 3.3.2. Semi Join. Another way to address the problem of an R that is larger than the available memory is by performing a semi join [Bernstein81]. In a semi join, the amount of memory required is reduced by not retaining an entire copy of R at each disk, but only the join keys necessary for ....
Pang, H., Carey, M.J. and Livny, M. "Partially Preemptible Hash Joins" SIGMOD, May 1993.
....during its lifetime, a family of memory adaptive hash joins, called partially preemptible hash joins Technique LL PP Par Sch Focus AimFN FFC EnvAD real. Sorting [19] no sort no no mem fl trt mem av intra U O O Memory Adaptive Merge Sort [27] no sort no no mem fl thr mem av intra U O A PPHJ [20] no hash no no mem fl trt mem av intra U O O Ripple [9] no PJ no no user pr irt user in intra U O O XJoin [24] no PJ no no dar user pr trt irt in av intra U O O Dynamic Reordering [23] no sort no no user pr irt user in intra U O O Mid Query Re optimisation [15] rem any maybe ....
....Tukwila [13, 14] rem PJ maybe yes any trt irt stats inter D O G S Telegraph [11] op or parl no yes any trt stats l dr intra D O L S dQUOB [21, 22] op or no no no ac statstrt stats inter D L A Table 1. Properties of existing adaptive query processing techniques (PPHJs) is proposed in [20]. Initially, source relations are split and held in memory. When memory is insufficient, one partition held in memory flushes its hash table to disk and deallocates all but one of its buffer pages. The most efficient variant of PPHJs for utilising additional memory is when partitions of the inner ....
H. Pang, M. Carey, and M. Livny. Partially preemptible hash joins. In Proc. of ACM SIGMOD 1993, pages 59--68, 1993.
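The contraction step described in the context above (a memory-resident partition flushes its hash table to disk and keeps a single buffer page) can be sketched roughly as follows. This is a minimal illustration and not the algorithm as published in [20]: the names Partition, AdaptiveBuild and PAGE_TUPLES, the use of a Python list as a stand-in for disk, and the choice of the largest resident partition as the victim are all assumptions made for the example.

```python
from dataclasses import dataclass, field

PAGE_TUPLES = 2  # hypothetical page capacity, in tuples


@dataclass
class Partition:
    pid: int
    table: dict = field(default_factory=dict)    # in-memory hash table: key -> rows
    spilled: list = field(default_factory=list)  # stand-in for pages on disk
    resident: bool = True

    def pages(self):
        """Memory pages currently held by this partition."""
        if not self.resident:
            return 1  # a flushed partition keeps one output buffer page
        n = sum(len(rows) for rows in self.table.values())
        return max(1, -(-n // PAGE_TUPLES))  # ceiling division, at least 1

    def flush(self):
        """Write the hash table out and release all but one buffer page."""
        for key, rows in self.table.items():
            self.spilled.extend((key, row) for row in rows)
        self.table.clear()
        self.resident = False


class AdaptiveBuild:
    def __init__(self, n_partitions, memory_pages):
        self.parts = [Partition(i) for i in range(n_partitions)]
        self.memory_pages = memory_pages

    def used_pages(self):
        return sum(p.pages() for p in self.parts)

    def insert(self, key, row):
        part = self.parts[key % len(self.parts)]  # integer keys assumed
        if part.resident:
            part.table.setdefault(key, []).append(row)
        else:
            part.spilled.append((key, row))
        self._contract_if_needed()

    def set_memory(self, memory_pages):
        """Called when the memory granted to the join changes."""
        self.memory_pages = memory_pages
        self._contract_if_needed()

    def _contract_if_needed(self):
        while self.used_pages() > self.memory_pages:
            victims = [p for p in self.parts if p.resident]
            if not victims:
                break  # nothing left to flush
            # Assumption: flush the largest resident partition first.
            max(victims, key=lambda p: p.pages()).flush()
```

Calling set_memory with a smaller budget flushes partitions one at a time until the join fits again. Expansion (reading flushed partitions back in) is deliberately left out of this sketch.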
....of memory resident buckets of R can adjust depending on the availability of memory. ADH basically follows the same philosophy as that of DDS. Split of buffers is the main new feature added. Partially preemptible hash join (PPHJ) A class of partially preemptible hash joins was proposed in [50]. DDS and ADH can be regarded as restricted versions of PPHJ. Here we only discuss the most sophisticated version of it, i.e. with dynamic expansion and contraction of memory buffers and prioritized spooling policy, and name it PPHJ. This technique is more flexible than either DDS or ADH. In ....
....strategy. Pages from R have higher priority than pages from S and are more memory resilient, i.e. when memory space is not enough, pages of S are more likely to be flushed out of memory and when memory becomes more available, pages of R are more likely to be read in memory. Experiments [50] show that if memory availability fluctuates, hybrid hash does not have a satisfactory performance. Unless memory availability fluctuates too fast, PPHJ has a substantially better performance than DDS and ADH in a wide variety of parameter settings. The reduction of execution time comes mostly from ....
H. Pang, M. J. Carey, and M. Livny. Partially preemptible hash joins. In Proc of the 1993 ACM SIGMOD Int'l Conf. on the Management of Data, pp. 59-68, Washington, DC, USA, May 1993.
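The prioritized spooling idea in the context above (pages of S are given up before pages of R, and pages of R are brought back before pages of S) might look something like the sketch below. It is a simplification under stated assumptions: pages are modelled as hypothetical (relation, page number) tuples, the function names contract and expand are invented for this example, and the "more likely" of the quoted text is replaced by a strict priority order.

```python
def contract(resident, spooled, n_pages):
    """Spool up to n_pages resident pages to disk, S pages before R pages."""
    # sorted() is stable; key is False for S pages, so they come first.
    victims = sorted(resident, key=lambda page: page[0] == "R")[:n_pages]
    for page in victims:
        resident.remove(page)
        spooled.append(page)
    return victims


def expand(resident, spooled, n_pages):
    """Read up to n_pages spooled pages back into memory, R pages first."""
    chosen = sorted(spooled, key=lambda page: page[0] != "R")[:n_pages]
    for page in chosen:
        spooled.remove(page)
        resident.append(page)
    return chosen
```

With two R pages and two S pages resident, a contraction by three pages spools both S pages and one R page; a later expansion by one page brings the R page back first.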
....in memory to change dynamically, if the amount of memory available to perform the join changes. Partitions can be split by dividing the buckets between new partitions if the amount of memory available decreases, and partitions can be joined if the amount of memory available increases. Pang et al. [64] introduced a class of partially preemptible hash joins. It is more general than the scheme of Zeller and Gray. Not only does it turn internal partitions into external partitions when the amount of available memory decreases, but it turns external partitions into internal partitions if more memory ....
....and searches for a nearby solution) and a number of other algorithms of their own. Our analysis is a generalisation of these two cost models and allows the relative, or absolute, cost of each disk and CPU operation to be specified. Other researchers, such as Cheng et al. [11] and Pang et al. [64], have used a cost model of similar power to ours when evaluating their algorithms. However, they do not attempt to optimise the buffer usage based on this information and often read a block at a time from disk during each I/O operation. Graefe noted the importance of reading and writing clusters ....
H. Pang, M. J. Carey, and M. Livny. Partially preemptible hash joins. In Proceedings of the 1993 ACM SIGMOD International Conference on the Management of Data, pages 59--68, Washington, DC, USA, May 1993.
....the disk. This extension also makes the algorithm very attractive for use as an adaptive algorithm. Whereas the basic Hash Join requires that sufficient memory be available for the entire hash table of R before the operation can proceed, the Hybrid algorithm can adapt to changing memory conditions [Zeller90,Pang93]. It is straightforward for the Hybrid join to give up memory pages when they are needed elsewhere in the system or make use of additional pages that become available during its processing. It simply places more of the buckets onto disk, or loads additional buckets into memory. This algorithm ....
Pang, H., Carey, M.J. and Livny, M. "Partially Preemptible Hash Joins" SIGMOD, May 1993.
....resources such as internal memory from ongoing external memory computations. Traditional external memory algorithms exhibit drastic performance degradation (because of thrashing like behavior) when they are subjected to sudden fluctuations in internal memory allocation. It has become necessary [10, 11, 18] to design memory adaptive algorithms (or simply MA algorithms) that adapt in an online manner to arbitrary and unpredictable fluctuations in internal memory. An MA algorithm must dynamically reorganize its computation in response to sudden decreases or increases in available internal memory with ....
....or increases in available internal memory with the goal of optimal online internal memory utilization. The MA approach is fundamentally more powerful than the passive approach of dealing with internal memory fluctuations by means of virtual memory paging. 1.1 Our Contributions. Prior work [10, 11, 18, 19] on MA algorithms has been exclusively empirical in nature, and often with restrictions on the extent of internal memory fluctuations. This paper considers, for the first time, MA algorithms for various problems in a theoretical framework. By their nature, MA algorithms exhibit extremely ....
H. Pang, M. Carey, and M. Livny. Partially preemptible hash joins. Proc. 1993 ACM-SIGMOD Conf. on Management of Data, 1993.
.... Not only are the admission and initial allocation of query operators determined by a bidding process, but their allocations may also be dynamically adjusted in flight in order to insure that resources are always being used by the highest bidder (i.e. adaptive query processing algorithms [Pang 93a, Pang 93b, Davison 94] are exploited in this scheme). While both BrokerM and BrokerM D were shown to outperform the adaptive algorithm of [Mehta 93], it is not clear how such an approach could be used for goal oriented allocation. Because of the difficulty of accurately characterizing response ....
....operator; this phenomenon, combined with the variance in memory demands from transaction to transaction, can cause frequent over commitment of memory. One way to address the problem of goal class memory waits is to implement memory adaptive query operators, such as those described in [Zeller 90, Pang 93a, Pang 93b, Davison 94] that can dynamically adjust their working storage requirements during the execution of the operation. Using these algorithms, the no goal class can be throttled back to the new, smaller global pool size. However, because these memory adaptive mechanisms are not common in ....
H. Pang, M. Carey, M. Livny, "Partially Preemptible Hash Joins," Proc. ACM SIGMOD '93 Conf., Washington D.C., May 1993.
....memory management issues continue to be the focus of the database research community. Numerous studies have been devoted to static memory allocation and buffer management [Chou85, Sacc86, Corn89, Falo91, O'Neil93]. A very few recent papers address techniques that adapt to a dynamic environment [Zell90, Pang93, Brow92, Brow93, Meht93b]. A hash join algorithm requires a significant amount of memory to keep a hash table for efficient execution. This amount varies from the square root of the size of the inner relation to the full inner relation size. The hash table must be held in memory for the entire period of join execution. ....
....this method does not adjust memory during a lifetime of queries. However, the hash join can take a long execution time and the DBMS may desire to appropriate a part of the join's memory to satisfy the memory requirements of higher priority transactions. Several algorithms have been presented [Zell90, Pang93] for adjusting hash joins to work with fluctuating memory. They differ one from another in the way they handle memory shortages, and in the way they utilize excess memory. But each of them obviously causes the performance of the hash join to deteriorate when memory is reduced by the external ....
Pang H., Carey M., and Livny, M., "Partially Preemptible Hash Joins", Proceedings of the 1993 SIGMOD Conference on the Management of Data, Washington, 1993.
....same value for the join columns). More memory is required for sorting the two input tables and the performance of sort merge join depends largely on sort performance. Dynamic memory adjustment is more important to hash join algorithms. Memory adjustment for hash joins has been studied by [ZG90], [PCL93b] and [DG94]. However, their work focused on how a single join can use extra space or release part of its space to affect I/O transfer unit size. They did not take into account the memory requirements in different stages of a join and did not consider balancing memory requirements among ....
HweeHwa Pang, Michael J. Carey, and Miron Livny. Partially preemptible hash joins. In Proc. of ACM SIGMOD Conf., pages 59--68, May 1993.
....assumption often results in situations where more memory becomes available later, but ends up being wasted because all active queries have already received their allocations. In the area of adaptive algorithms, Pang, Carey and Livny propose a family of algorithms for adaptive sorting and hashing [PCL93a, PCL93b], but do not investigate memory allocation strategies. Zeller and Gray implemented an adaptive hash join [ZG90] but relied on the operating system for memory allocation. Our work is the first of which we are aware that investigates the use of a central agent to allocate memory among competing ....
....whether an idea makes use of adaptive algorithms. Second, whether it uses a central broker. In Table 2.1 we show where we believe the ideas discussed in this chapter fall. Not Adaptive Adaptive Not Global LRU Adaptive Hash Join [ZG90] Brokered DBMIN [CD85] Partially Preemptible Hash Join [PCL93b] Memory Adaptive External Sort [PCL93a] Brokered MG x y [NFS91] BAS ROC [YC93] Table 2.1: Classification of Related Work 2.1 DBMIN Memory Allocation One of the earliest investigations of memory management that directly addressed issues of partitioning memory among concurrent queries was ....
HweeHwa Pang, Michael J. Carey, and Miron Livny. Partially Preemptible Hash Joins. SIGMOD, 1993.
....it is unlikely that fragment fencing can ever be prevented from making mistakes, there are certainly ways to limit the penalty of doing so. One promising possibility is the exploitation of memory adaptive query processing algorithms, e.g. memory adaptive hash join and sorting methods [Zeller 90, Pang 93a, Pang 93b]. These join methods can dynamically adapt to changes in the amount of available working storage during execution, so fragment fencing could actually take back some of the working storage from long running queries when it is necessary to increase the resident volume while such queries ....
.... for individual fragments could also be a useful input to recently proposed techniques for run time selection of query plans [Hong 91, Ioann 92]. Finally, we would like to exploit the capabilities of memory adaptive query processing techniques, e.g. preemptible hash join and sorting methods [Pang 93a, Pang 93b]. Acknowledgements. The authors would like to thank Manish Mehta, Mike Franklin, Hwee Hwa Pang, and Joe Hellerstein for many helpful discussions and comments on previous versions of this paper. ....
H. Pang, M. Carey, M. Livny, "Partially Preemptible Hash Joins," to appear Proc. ACM SIGMOD '93 Conf., Washington D.C., May 1993.
.... of accurate estimation techniques for intermediate result sizes (e.g. see [17, 12]) and techniques to enable join algorithms to tolerate data skew (see [8] for a survey of skew issues and approaches) and to adapt to variations in system load due to the multiuser nature of database systems (see [5, 19] for two recently proposed approaches and pointers to other related work). These other challenges, while important, are subjects of ongoing database research and are beyond the scope of this paper. The remainder of this paper is organized as follows. In Section 2, we briefly review the relevant ....
H. Pang, M. Carey, and M. Livny, "Partially Preemptible Hash Joins", Proc. ACM SIGMOD Conf., Washington, DC, May 1993.
....pages because, for our system configuration, this choice gives a good compromise between reducing the number of random I/Os, and keeping pages around in the hope that these pages will be fetched again while they are still in memory, thus eliminating some I/O operations. It should be noted that, in [Pang93], spooled pages are written out one page at a time. This accounts for the different performance figures reported there. However, the relative performance between different algorithms/mechanisms remains the same. 1. Contraction. In step (1) of PPHJ, instead of assigning all ##### F R pages ....
H. Pang, M.J. Carey, M. Livny, "Partially Preemptible Hash Joins", Proc. of the ACM 1993 SIGMOD Conf., May 1993.
....# 4) and allowing no goal class MPLs to rise above one. Beyond these enhancements, our future work will integrate M&M with goal oriented processor and disk scheduling mechanisms and will exploit memory adaptive query processing techniques (such as the preemptible hash join and sorting methods of [Pang 93a, Pang 93b]). Finally, we would like to explore other approaches to specifying goals for low throughput classes. Response time goals for low throughput classes result in long search times for the appropriate solution, as they require a certain number of completions to achieve statistical ....
H. Pang, M. Carey, M. Livny, "Partially Preemptible Hash Joins," Proc. ACM SIGMOD '93 Conf., Washington D.C., May 1993.
....of memory. This seriously reduces the effectiveness of priority scheduling. Moreover, this practice does not allow a query to take advantage of excess memory that may become available. There is therefore a need for large queries to be adaptive when memory availability varies. In a recent paper [Pang93], we presented and evaluated techniques that allow hash joins to adapt to changes in their allocated memory. This study focuses on the same problem for large external sorts, i.e. sorts that involve relations that cannot fit entirely in the available memory, and for sort merge joins. Sorting is ....
H. Pang, M.J. Carey, M. Livny, "Partially Preemptible Hash Joins", Proc. of the ACM 1993 SIGMOD Conf., May 1993.
....pages because, for our system configuration, this choice gives a good compromise between reducing the number of random I/Os, and keeping pages around in the hope that these pages will be fetched again while they are still in memory, thus eliminating some I/O operations. It should be noted that, in [Pang93a], spooled pages are written out one page at a time. This accounts for the different performance figures reported there. However, the relative performance between different algorithms/mechanisms remains the same. (2) Scan R. Hash each tuple with h. If the tuple belongs to an expanded ....
....Instead, an RTDBS needs to be willing to run queries at memory allocations that are below their maximum requirements so that enough queries can be admitted to take advantage of the RTDBS's disk and CPU resources. This is facilitated by memory adaptive query processing techniques (such as those of [Pang93a, Pang93b]) that permit queries to execute efficiently in the face of memory fluctuations. Among the algorithms that do not insist on maximum memory allocations, Proportional allocation leads to very large miss ratios and should be avoided. This is why PMM employs MinMax, instead of Proportional, ....
H. Pang, M.J. Carey, M. Livny, "Partially Preemptible Hash Joins", Proc. of the ACM SIGMOD Conf., May 1993.