M. Mehta and D. DeWitt. Managing Intra-Operator Parallelism in Parallel Database Systems. In VLDB, 1995.
....can be serviced in parallel. For example, a disk drive can only service one request at a time, while a RAID 1 mirrored logical volume can service two reads in parallel. This attribute is useful in determining the appropriate level of parallelism for different parallel sort and join algorithms [13, 21] and proper data placement in parallel database systems [22]. ACCESS DELAY BOUNDARIES captures the track alignment performance characteristic described in Section 2.3. By allocating and accessing data within these boundaries, the storage device offers the most efficient access. These units are ....
Manish Mehta and David J. DeWitt. Managing intra-operator parallelism in parallel database systems. International Conference on Very Large Databases (Zurich, Switzerland, 11--15 September 1995).
....memory across a shared-nothing cluster to implement hybrid hash join. DeWitt et al. present practical techniques for handling data skew for a hash join and external sort [9, 10]. These techniques rely on sampling a static data set, which is infeasible in the streaming scenario. Mehta and DeWitt [17] and Rahm and Marek [21] describe how to account for current CPU utilization, memory usage, and I/O load to perform site selection and determine degree of declustering for hash joins. None of these previous schemes consider repartitioning the join operator during execution. Wolf et al. [31] and ....
M. Mehta and D. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. In VLDB, 1995.
....were proposed to compose such operators into a dataflow. In [10] and [9] the authors present practical techniques for handling data skew for a hash join and external sort, respectively. These techniques rely on sampling a static data set, which is infeasible in the streaming scenario. Work in [17] and [21] describes how to account for current CPU utilization, memory usage, and I/O load to perform site selection and determine degree of declustering for hash joins. None of these previous schemes consider repartitioning the join operator during execution. In [30] and [15] the authors ....
M. Mehta and D. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. In VLDB, 1995.
....the architectures experimented with. Then, we explain how the whole query can be executed on smart disk system. We also explain the notion of operation bundling. 4. 1 Individual Database Operations Query optimization and processing in distributed environments has been studied by many researchers [19, 20, 28, 23, 37]. Many of the algorithms we have used in this work are adopted from the algorithms developed for distributed systems. We had to simplify some of the algorithms. But, these simplifications do not invalidate our comparisons, because we use the same assumptions and similar algorithms for both the ....
M. Mehta and D.J. DeWitt. Managing Intra-Operator Parallelism in Parallel Database Systems. In Proc. 21st Conference on Very Large Databases (VLDB'95), pp. 382-394, 1995.
....operations to execute the whole query. We introduce the notion of operation bundling and explain the protocol we devised for reducing the communication. 3. 1 Individual Database Operations Query optimization and processing in distributed environments has been studied by many researchers [20, 21, 28, 23, 38]. Many of the algorithms we have used in this work are adopted from the algorithms developed for distributed systems. We had to simplify some of the algorithms. But, these simplifications do not invalidate our comparisons, because we use the same assumptions and similar algorithms for both the ....
M. Mehta, D.J. DeWitt. Managing Intra-Operator Parallelism in Parallel Database Systems. In Proc. 21st Conference on Very Large Databases (VLDB'95), pp. 382-394, 1995.
....the architectures experimented with. Then, we explain how the whole query can be executed on smart disk system. We also explain the notion of operation bundling. 4.1. Individual Database Operations Query optimization and processing in distributed environments has been studied by many researchers [19, 20, 28, 23, 37]. Many of the algorithms we have used in this work are adopted from the algorithms developed for distributed systems. We had to simplify some of the algorithms. But, these simplifications do not invalidate our comparisons, because we use the same assumptions and similar algorithms for both the ....
M. Mehta and D.J. DeWitt. Managing Intra-Operator Parallelism in Parallel Database Systems. In Proc. 21st Conference on Very Large Databases (VLDB'95), pp. 382-394, 1995.
....are neither a base relation nor a single join) 4 . Next section will position the new LBT processing strategy in relation to previous works. 3 Previous works Early related work in parallel query optimization (e.g. [18,20]) have concentrated on linear trees and intra-operator parallelism (e.g. [21,22]). These works have not yet considered operator orderings including inter-operation parallelism because of its high scheduling complexity and its difficult synchronization. In the last years, several parallel DBMS products, such as PARIS [19], Prisma [23] and the DB2 Parallel edition DBMS [24], have ....
M. Mehta and D.J. DeWitt. Managing Intra-operator Parallelism in Parallel Databases. In Proceedings of the International Conference on Very Large Databases, pages 382--394, Zurich, Switzerland, September 1995.
....Relation partitioning can be static (on the disks) or dynamic (at run time). With static partitioning, relations are physically partitioned using a parallel storage model based on a partitioning function like hashing. Relation partitioning typically dictates the degree of intra-query parallelism [Mehta95]. This approach is very popular in research prototypes, e.g. Bubba [Boral90], Gamma [Dewitt90] and Volcano [Graefe94], and commercial products, e.g. DB2, Informix, Tandem and Teradata. Static partitioning reduces well interference between processors as they ....
....which offsets the gains obtained from a better load balancing. We address this problem in the next section. 5.6 Expt 3: Varying the degree of partitioning The degree of relation partitioning on disks typically determines the degree of parallelism, hence the choice of full declustering [Mehta95] or partial declustering [Copeland88]. In DBS3, the degree of partitioning can be higher than the number of disks, which is useful to reduce the effect of skewed data distribution. However, having more fragments than disks can induce some overhead since there are more queues to be created and ....
M. Mehta, D. DeWitt, "Managing Intra-operator Parallelism in Parallel Database Systems", Int. Conf. on VLDB, Zurich, Switzerland, September 1995.
....down [Val93]. On the other hand, some authors propose to decide a processor allocation at compile time and to eventually readjust it at run time. Thus, Hong [Hon92] mentions such a readjustment for shared-memory systems, but his ideas have not been implemented. Rahm et al. [RR95] and Mehta et al. [MD95] study reallocation strategies for shared-nothing systems. Both authors only consider single join queries for which they propose a readjustment of the processors allocation based on the I/O and the CPU consumption of the operation. DB2 Version3 [TW94] also performs a dynamic reallocation, but ....
M. Mehta and D. DeWitt. Managing Intraoperator Parallelism in Parallel Database Systems. In Proceedings of the International Conference on Very Large Databases, Zurich, Switzerland, September 1995.
....combine CPU-bound and IO-bound tasks from different plans [Hon92], this may not be possible after the collapse. Wilschut et al. [WFA95] present a comparative performance evaluation of various multi-join execution strategies on the PRISMA/DB parallel main memory database system. Mehta and DeWitt [MD95] and Rahm and Marek [RM95] present experimental evaluations of various heuristic strategies for determining the degree of intra-operation parallelism and assigning processors in shared-nothing DBMSs. Both of these papers avoid dealing with complex query scheduling issues by assuming workloads ....
....is obvious. We should note that we decided to maintain the simulator's execution model intact even though we could have modified it to fit our earlier assumptions, since the same base simulation model has been used in a number of recent experimental studies on parallel query processing [MD95, MD97, Meh94, PI96]. In a real system, where some of the simulator's assumptions above may not hold, simple changes to our algorithm's cost model to improve effectiveness should be easily doable. 7.2 Experimental Testbed and Methodology We have experimented with scheduling a collection of ....
Manish Mehta and David J. DeWitt. "Managing Intra-operator Parallelism in Parallel Database Systems ". In Proceedings of the 21st International Conference on Very Large Data Bases, pages 382--394, Zurich, Switzerland, September 1995.
....of response time and query throughput. Indeed, resource sharing induces additional overhead due to context switches. Even in a multi-threaded environment, this cost is not negligible. In practice, MODPAROPT adopts a methodology proposed by Mehta et al. [MD95] for processing a join: if a processor is already used by one operator, allocating a second operator to that processor induces an execution overhead of 10 relative to the execution time estimated on a free processor. To determine the resource allocation ....
M. Mehta and D.J. DeWitt. Managing Intra-operator Parallelism in Parallel Databases. In Proceedings of the International Conference on Very Large Databases, Zurich, Switzerland, September 1995.
....different kinds of skew [Kit90, DeW92, Ber92, Sha93] based on dynamic data redistribution. With inter-operator parallelism, distributing the query's operators among all processors can also yield poor load balancing. Much research has been dedicated to inter-operator load balancing in shared-nothing [Meh95, Rah95, Gar96], which is done statically during optimization or dynamically prior to execution. The potential reasons for poor load balancing in shared-nothing are studied in [Wil95]. First, the degree of parallelism and the allocation of processors to operators, decided in the parallel optimization ....
....a shared virtual memory space which includes all local caches. To experiment with various hierarchical system configurations, we cluster processors as SM nodes 2 and simulate inter-node communication using the following typical network parameters (parameter: value): bandwidth (based on [Meh95]): infinite; end-to-end transmission delay: 0.5 ms; CPU cost for sending 8K byte: 10000 instr.; CPU cost for receiving 8K byte: 10000 instr. Furthermore, to experiment with multiple disks (only one disk of the KSR1 was available to us) we simulate disk accesses to base relations with the following ....
M. Mehta, D. DeWitt, "Managing Intraoperator Parallelism in Parallel Database Systems". Int. Conf. on VLDB, Zurich, September 1995.
....may be violated because the run-time workload is unpredictable. When these assumptions are violated, queries need to be re-optimized to avoid performance degradation. Some authors propose to decide a processors allocation at compile time and to eventually readjust it at runtime [HON92] [RAH95] [MEH95]. On the other hand, systems like DBS3 [VAL91] and Informix [INF94] determine the processors allocation only at run time. However this makes the code very complex and slows the execution down [VAL93]. The first phase of our two-phase optimization strategy only optimizes sequential query execution ....
M. Mehta, D. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. Proceedings of the International Conference on Very Large Databases, Zurich, Switzerland, September 1995.
....of an earlier phase of conventional centralized query optimization) [CHM95, GW93, HM94, Hon92, HCY94, LCRY93] and 2. run-time execution: achieving some system-wide performance goals (e.g. maximizing query throughput) by adaptive scheduling of the operators of multiple concurrent queries [MD93, MD95, RM95]. We address the first problem, i.e. parallelization of query execution plans. We consider the full variety of bushy plans and schedules that incorporate independent and pipelined forms of inter-operation parallelism as well as intra-operation (i.e. partitioned) parallelism. One of the ....
Manish Mehta and David J. DeWitt. "Managing Intra-operator Parallelism in Parallel Database Systems ". In Proceedings of the 21st International Conference on Very Large Data Bases, pages 382--394, Zurich, Switzerland, September 1995.
....of an earlier phase of conventional centralized query optimization) [CHM95, GW93, HM94, Hon92, HCY94, LCRY93] and 2. run-time execution: achieving some system-wide performance goals (e.g. maximizing query throughput) by adaptive scheduling of the operators of multiple concurrent queries [MD93, MD95, RM95]. We address the first problem, i.e. parallelization of query execution plans. We consider the full variety of bushy plans and schedules that incorporate independent and pipelined forms of inter-operation parallelism as well as intra-operation (i.e. partitioned) parallelism. One of the ....
M. Mehta and D. J. DeWitt. "Managing Intra-operator Parallelism in Parallel Database Systems". In Proc. of the 21st Intl. Conference on Very Large Data Bases, Zurich, Switzerland, September 1995.
....Mariposa's brokering and bidding process is more expensive but also more flexible. For example, each Mariposa bidder can formulate its bid in any way it chooses. Much research in parallel query processing has focused on speeding up single queries, primarily by exploiting intra-operator parallelism [MD95]. An operation which can be performed in parallel by several processors at once, such as sorting [DNS91] or hash joins [ZG90], is divided among all available processors. Intra-operator parallelism is sensitive to data skew; since each processor is performing essentially the same task over different ....
M. Mehta, D.J. DeWitt, "Managing Intra-Operator Parallelism in Parallel Database Systems," Proc. 21st VLDB Conf., (1995) pp. 382-394.
....query optimizer, but it also leads to instability in the plans generated. It is also much easier to adapt the two-phase approach to shared-nothing parallel database systems. The second area of related research concentrates on the optimal scheduling of operators and queries in a parallel system. [7] and [12] both propose strategies for determining the degree of parallelism and choosing the nodes on which the operators of the query should run. [4] also provides scheduling algorithms based on multi-dimensional bin packing that take into account both time-shared (CPU, disk, network) and ....
....the memory available for running the query Msys [k] might also be different on each node. We will assume that the number of clones to be activated for a particular operator as well as the choice of which nodes to run them on is decided by some scheduling algorithm such as the ones described in [4][7][12] The task of the LINEAR algorithm is to find optimal values for M [i] If M [i] pages of memory are allotted to each operator clone, we obtain the following 2N(OP; Q) operator constraints: 8 N(OP;Q) i=1 ( M [i] N(nodes) X k=1 N(OP i ; k) Mmax [i] 8 N(OP;Q) i=1 ( M [i] N(nodes) ....
M. Mehta, D. J. DeWitt. "Managing Intra-operator Parallelism in Parallel Database Systems". Proc. of VLDB Conf., 1995.
No context found.
M. Mehta and D. DeWitt. Managing Intra-Operator Parallelism in Parallel Database Systems. In VLDB, 1995.
No context found.
M. Mehta and D.J. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. Proc. 21st Intl. Conference on Very Large Data Bases, Zurich, Switzerland (September 1995)
No context found.
M. Mehta and D. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. In VLDB, 1995.
No context found.
M. Mehta and D. DeWitt. Managing Intra-operator Parallelism in Parallel Database Systems. In VLDB, 1995.