| Y. Ioannidis, Y. Kang. Randomized algorithms for Optimizing Large Join Queries. SIGMOD 1990. |
....Algorithms Traditionally, randomized or genetic algorithms have been proposed to replace dynamic programming when dynamic programming is infeasible. The most successful of these algorithms, called 2PO, combines iterative improvement (a variant of hill climbing) with simulated annealing [27]. The problem with any of these randomized algorithms is that they must compute the costs of the plans under consideration (typically the current plan and its neighbors in some plan space) after each step, which means the optimizer will require multiple rounds of messages for costing. A natural ....
....cost. Iterative Dynamic Programming ([33], Section 3.3) can be used if the exhaustive algorithm turns out to be too complex in time or space; this is not unlikely in a distributed scenario even with small queries. Randomized algorithms have also been proposed for complex join queries [27, 34]. The second row from the bottom is relevant even in centralized database systems, where run time conditions can significantly affect the execution cost of a query plan. [11, 29, 16, 10] discuss how parametric optimization can be used to compute a set of plans optimal for different values of the ....
[Article contains additional citation context not shown here]
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In SIGMOD, 1990.
....have been successfully applied to various combinatorial optimization problems in the past, including the optimization of queries with many joins. We adapt three such algorithms (simulated annealing (SA) [KGV83, IW87], iterative improvement (II) [NSS86, SG88], and two-phase optimization (2PO) [IK90, IK91]) for parametric query optimization of select-project-join queries, and present experimental results that show the effectiveness of the devised adaptations. Several projects have considered supporting multiple plans for a query. The earliest significant work in this area is by Graefe and Ward ....
....if it has the lowest cost among all states. It is on a plateau if it has no lower cost neighbor, and yet it can reach lower cost states without uphill moves. Using the above terminology, we briefly outline three randomized optimization algorithms that have been used for query optimization [IW87, SG88, IK90, IK91]. First, II performs a large number of local optimizations. A local optimization starts at a random state and improves the solution by repeatedly accepting random downhill moves until it reaches a local minimum. Its output at the end is the least-cost local minimum that has been visited. Second, ....
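The description of II in the excerpt above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the toy cost model (`CARD`, `SELECTIVITY`), the swap-based neighborhood, and the parameter `n_starts` are all assumptions made for illustration, and the local-minimum test enumerates every neighbor rather than sampling them.

```python
import random

def local_optimization(state, neighbors, cost, rng):
    """Accept random downhill moves until no neighbor is cheaper."""
    while True:
        downhill = [n for n in neighbors(state) if cost(n) < cost(state)]
        if not downhill:
            return state  # local minimum: no strictly cheaper neighbor
        state = rng.choice(downhill)

def iterative_improvement(random_state, neighbors, cost, n_starts=10, seed=0):
    """Run several local optimizations; return the least-cost local minimum."""
    rng = random.Random(seed)
    minima = [local_optimization(random_state(rng), neighbors, cost, rng)
              for _ in range(n_starts)]
    return min(minima, key=cost)

# Hypothetical toy problem: a state is a left-deep join order (a tuple of
# relation names); the cost is the sum of estimated intermediate result sizes.
CARD = {"A": 1000, "B": 10, "C": 200, "D": 50}   # assumed cardinalities
SELECTIVITY = 0.01                               # assumed uniform selectivity

def cost(order):
    size, total = CARD[order[0]], 0.0
    for rel in order[1:]:
        size = size * CARD[rel] * SELECTIVITY
        total += size
    return total

def neighbors(order):
    """All join orders reachable by swapping two positions."""
    result = []
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            swapped = list(order)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            result.append(tuple(swapped))
    return result

def random_state(rng):
    rels = list(CARD)
    rng.shuffle(rels)
    return tuple(rels)

best = iterative_improvement(random_state, neighbors, cost)
print(best, cost(best))
```

Because each accepted move strictly lowers the cost, every local optimization terminates on a finite search space.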
[Article contains additional citation context not shown here]
Ioannidis YE, Kang Y (1990) Randomized algorithms for optimizing large join queries. In: Proc. of the 1990 ACM-SIGMOD Conference on the Management of Data, Atlantic City, N.J., pp 312--321
....15, 16, 28, 44, 51, 52, 57, 58, 62, 77]. In particular, we are interested in scenarios, like on the Web, with many sources and subgoals. So, these schemes are too expensive. Some other solutions reduce the search space through techniques like simulated annealing, random probes, or other heuristics [19, 33, 34, 50, 60, 69, 70]. While these approaches may generate efficient plans in some cases, they do not have any performance guarantees in terms of the quality of plans generated (i.e., the plans generated by them can be arbitrarily far from the optimal one). Many of these techniques may even fail to generate a feasible ....
Y. Ioannidis, Y. Kang. Randomized Algorithms for Optimizing Large Join Queries. In Proc. ACM SIGMOD Conference, 1990.
....search algorithms are unacceptably slow. Although they cannot guarantee theoretical bounds for their results, they have been proven very efficient for some classes of optimization problems. In the database literature they have successfully been applied to the optimization of large join queries [24, 12]. Our contribution includes: i) the adaptation of randomized search methods for the view selection problem. We propose transformation rules that help the algorithms move through the search space of valid view selections, in order to identify sets of views that minimize the query cost. ii) the ....
....algorithms, like dynamic programming, which find the optimal ordering of joins is efficient. As the number of joins increases, however, the running time of systematic methods grows exponentially, rendering exhaustive optimization inapplicable. Motivated by this fact, a number of approaches [24, 12, 8] use fast randomized search heuristics that estimate suboptimal execution plans for large relational queries. In practice, however, the demand for query optimization using randomized search is limited, since queries that involve more than 10 joins are seldom posed. On the other hand, in OLAP ....
[Article contains additional citation context not shown here]
Y. Ioannidis, Y.C. Kang, "Randomized Algorithms for Optimizing Large Join Queries", in: Proc. ACM SIGMOD, 1990.
....heuristics incrementally construct a QEP by making an optimal decision at each step: If a decision taken in one step is affected by previous decisions or affects later decisions, then the optimality may be lost due to a wrong decision. This is the case when the merge sort join algorithm is used [9] or if data locality is taken into account in a shared nothing architecture [16, 17]. A viable alternative to heuristic based and dynamic optimization is found in the usage of combinatorial optimization techniques. The most promising results are observed for iterative improvement [35, 34] ....
....taken into account in a shared nothing architecture [16, 17]. A viable alternative to heuristic based and dynamic optimization is found in the usage of combinatorial optimization techniques. The most promising results are observed for iterative improvement [35, 34], simulated annealing [12, 9, 33], and combinations of the two [10]. The largest amount of work on nonexhaustive optimization techniques refers to uniprocessor query execution. Simulated annealing for parallel spaces is studied in [16], where an enhanced variation of the base technique is proposed. Lin et al. use parallel ....
[Article contains additional citation context not shown here]
Y. Ioannidis and Y. Kang, "Randomized Algorithms for Optimizing Large Join Queries," Proc. SIGMOD Int'l Conf. Management of Data, pp. 312--321, Atlantic City, N.J., ACM, 1990.
....methods must be designed to solve this optimization problem. Almost all the past methods adopt an existing algorithm or a modified version of it to find a near optimal solution from the huge search space. Some techniques using combinatorial algorithms [27, 28], such as iterative improvement [11] and simulated annealing [13, 12], require a long execution time to find a near optimal plan. Some algorithms using methods such as dynamic programming with branch and bound [26] achieved polynomial time, however, with a high complexity of at least O(n^3). A major problem of these methods is ....
....selectivity factor algorithm. The simulated annealing algorithm can theoretically generate an optimal solution (i.e., join sequence) if it is allowed to run for a long time. Many papers have proven that its produced result is close to optimum even if its running time is reasonably limited [27, 11, 18]. The other two algorithms are also often used algorithms for query optimization. 4.1 The Simulated Annealing (SA) Algorithm The simulated annealing algorithm applied to query optimization in [12] is a probability based hill climbing algorithm. The algorithm is given as follows: Algorithm ....
Y.E. Ioannidis and Y. Kang, Randomized Algorithm for Optimizing Large Join Queries, Proc. ACM SIGMOD Int'l Conf., pp. 312-321, 1990.
....restricted by w 1 . The number of plans increases exponentially with the number of involved relations (see [18, 25] for an analysis on spatial and non spatial domains). Optimization algorithms search either in a deterministic (e.g., dynamic programming) or a randomized way (e.g., hill climbing [12]) to find a cheap plan. The cost of a specific plan is computed using formulae for (i) the operators involved in the plan, and (ii) the output size of each sub query of the plan. The first provide an estimate for the cost of each node in the plan, while the second determine the cost of succeeding ....
Ioannidis Y., Kang Y. Randomized Algorithms for Optimizing Large Join Queries. ACM SIGMOD, 1990.
....various conditions we chose the following set of control parameters for the current problem: the initial probability of accepting an uphill move is 0.4, the temperature reduce factor is 0.975, and the equilibrium condition is n. Similar values were obtained for the optimization of relational joins [19, 51]. We also implemented versions of the above heuristics, called the II sortDN and SAsortDN, respectively, which, instead of using a random initial plan, choose a good seed by applying the following heuristic: based on the fact that the join cost depends on the data density D and the ....
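How the three control parameters quoted above might enter a simulated annealing loop can be sketched as follows. This is a hedged illustration, not the cited implementation: deriving the initial temperature from the 0.4 acceptance probability through the Metropolis rule, reading the equilibrium condition as `n` moves per temperature, the stopping threshold `t_min`, and the integer toy landscape are all assumptions.

```python
import math
import random

def initial_temperature(avg_uphill_delta, p_accept=0.4):
    """Temperature at which an average uphill move is accepted with
    probability p_accept under the rule exp(-delta / T) (assumed)."""
    return -avg_uphill_delta / math.log(p_accept)

def simulated_annealing(start, neighbors, cost, t0, n,
                        alpha=0.975, t_min=1e-3, seed=0):
    """alpha is the temperature reduction factor; n moves are tried at each
    temperature (one reading of 'the equilibrium condition is n')."""
    rng = random.Random(seed)
    state = best = start
    t = t0
    while t > t_min:                      # t_min is an assumed stopping rule
        for _ in range(n):
            cand = rng.choice(neighbors(state))
            delta = cost(cand) - cost(state)
            # Always accept downhill moves; accept uphill moves with
            # probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                state = cand
                if cost(state) < cost(best):
                    best = state
        t *= alpha
    return best

# Hypothetical toy landscape over the integers 0..99 with many local minima.
def cost(x):
    return (x - 37) ** 2 + 20 * (x % 5)

def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y <= 99]

t0 = initial_temperature(avg_uphill_delta=20.0)
result = simulated_annealing(start=90, neighbors=neighbors, cost=cost, t0=t0, n=10)
print(result, cost(result))
```

The function returns the best state seen rather than the final state, so the result is never worse than the starting plan.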
Ioannidis, Y., Kang, Y. Randomized Algorithms for Optimizing Large Join Queries. Proc. ACM SIGMOD, pp. 312--321, 1990.
....strategies. This work has been supported in part by ESPRIT III project 7091, Pythagoras. C. Galindo Legaria was supported by an ERCIM fellowship. Several probabilistic search strategies have been proposed, like Simulated Annealing (SA), Iterative Improvement (II), and others [IW87, IK90, IK91, SG88, LVZ93, OL90, INSS92, GD87]. These search strategies are probabilistic in the sense that they start at a randomly selected QEP and/or the generation of the next QEP involves some probability. To generate the next QEP, transformation rules are used. These rules are based on properties ....
....algebra, such as commutativity and associativity. The QEPs that can be generated using a single transformation are called the neighbors of the current QEP. The performance of these transformation based optimization algorithms depends on the cost distribution over the search space and its topology [IK90]. Since the set of used transformation rules defines which QEPs are neighbors, a topology is imposed on the search space. The probabilistic strategy we propose does not move around in the search space following transformations, but randomly chooses a QEP out of all alternatives. This means that ....
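The notion of neighbors induced by transformation rules can be made concrete with a small sketch. The representation is an assumption chosen for illustration: a plan is either a relation name or a pair `(left, right)` denoting a join, join predicates are ignored (so cross products are allowed), and only commutativity and the two directions of associativity are applied.

```python
def rewrites_at_root(plan):
    """Plans obtained by applying a single rule at the root join."""
    if isinstance(plan, str):
        return []                         # a base relation has no join to rewrite
    left, right = plan
    out = [(right, left)]                 # commutativity: A ⋈ B -> B ⋈ A
    if not isinstance(left, str):
        a, b = left
        out.append((a, (b, right)))       # associativity: (A ⋈ B) ⋈ C -> A ⋈ (B ⋈ C)
    if not isinstance(right, str):
        b, c = right
        out.append(((left, b), c))        # associativity: A ⋈ (B ⋈ C) -> (A ⋈ B) ⋈ C
    return out

def neighbors(plan):
    """All plans reachable by one rule application at any node of the tree."""
    if isinstance(plan, str):
        return []
    left, right = plan
    result = rewrites_at_root(plan)
    result += [(n, right) for n in neighbors(left)]
    result += [(left, n) for n in neighbors(right)]
    return result

def leaves(plan):
    """Relations of a plan, left to right."""
    return [plan] if isinstance(plan, str) else leaves(plan[0]) + leaves(plan[1])

plan = (("A", "B"), "C")
for n in neighbors(plan):
    print(n)
```

Changing the rule set in `rewrites_at_root` changes which plans count as neighbors, which is the sense in which the rules impose a topology on the search space.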
[Article contains additional citation context not shown here]
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. Proc. of the ACM-SIGMOD Conference on Management of Data, pages 312--321, 1990.
....time on the number of relations of the query [OL90]. This combinatorial explosion makes heuristics and probabilistic algorithms the prime vehicle for query optimization. Simulated Annealing (SA) and Iterative Improvement (II) are commonly used as reference points for research in this area [IW87, SG88, Swa89b, Swa89a, IK90, IK91, LVZ93]. The probabilistic search algorithms SA, II, and their variations rely heavily on transformation rules to generate candidate execution plans. These transformations are based on properties of the underlying algebra, such as commutativity and associativity of the relational join. The ....
....being used. In particular, a complete set of transformations (i.e., one that is sufficient to transform a starting plan into any other plan in the space) does not guarantee good behavior, and it is sometimes necessary to add redundant transformations to improve the performance of algorithms [IK90]. Several sets of transformation rules have been studied, but the extent to which they allow rigorous analysis and prediction of the behavior of transformation based algorithms is somewhat limited; rather, they serve to provide qualitative insight [IK91]. A question that motivates the present ....
[Article contains additional citation context not shown here]
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. Proc. of the ACM-SIGMOD Conference on Management of Data, pages 312--321, 1990.
....those questions for the class of acyclic queries, those whose query graph, defined below, is acyclic. The answer to the second question has a direct application to randomized query optimization, as selection of a random item in the search space is a basic primitive for most randomized algorithms [SG88, Swa89b, Swa89a, IK90, IK91, Kan91, LVZ93, GLPK94]. [Figure 1: Query graph and operator trees.] Acceptable operator trees are subject to restrictions on which relations ....
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. Proc. of the ACM-SIGMOD Conference on Management of Data, pages 312--321, 1990.
....feasible, but the number of join orders increases very fast as the number of relations grows. Heuristics and/or probabilistic algorithms are then a viable alternative. Research on probabilistic algorithms has focused on Simulated Annealing (SA) and Iterative Improvement (II) and their variations [IW87, SG88, Swa89b, Swa89a, IK90, IK91, LVZ93]. Those optimization algorithms rely heavily on transformation rules to generate alternative join evaluation orders. The transformation rules are usually based on algebraic properties of the join evaluation orders, like commutativity and associativity, and they impose a particular topology on ....
....accounts for CPU only and considered execution plans with only hash joins. In this paper we extend our previous experiments to assess the stability of the phenomenon observed. We use the same I/O-dominated cost model used at the University of Wisconsin in their randomized optimization work [IK90, Kan91]. We examine the impact of indices, changes on the statistical profiles of the catalogs, and the use of different join algorithms. For the problem of selecting a join order, the size of the space is exponential in the number of relations (see [GLPK95] for the exact size). When, in addition, a ....
[Article contains additional citation context not shown here]
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. Proc. of the ACM-SIGMOD Conference on Management of Data, pages 312--321, 1990.
....optimization problem which has to be incorporated into a set of transformations. Though applied successfully to various combinatorial optimization problems, in the context of join ordering they earned only a questionable reputation as different studies showed contradicting results [13, 7, 11]. These reports make them appear unreliable and very sensitive to a multitude of parameters needed for proper tuning. In this paper we tread a new path inspired by an observation first reported on by Ioannidis and Kang [8]: The distribution of costs in the search space is left weighted, i.e. the ....
....3. For a pair of query plans t and t 0 , different cost functions C 1 and C 2 do not define the same relation in general, however, for plans with extremal costs we often observe: C 1 (t) C 1 (t 0 ) C 2 (t) C 2 (t 0 ) 4. The majority of query plans have costs lower than the mean c [7]. Moreover, the distributions found bear strong resemblance with the Gamma distribution having shape parameters between 1 and 2. While points 1 through 3 are more or less expectable, the fourth needs special attention. This effect was first spotted by Ioannidis and Kang in ....
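The fourth point in the excerpt above (most plans cost less than the mean when costs resemble a Gamma distribution with shape between 1 and 2) can be checked numerically. The sketch below is only an illustration on synthetic samples, not on real plan costs: the shape 1.5, the scale 1.0, the sample size, and the seed are assumptions.

```python
import random

rng = random.Random(0)
SHAPE, SCALE = 1.5, 1.0        # assumed: a shape parameter between 1 and 2

# Synthetic "plan costs" drawn from a Gamma distribution.
costs = [rng.gammavariate(SHAPE, SCALE) for _ in range(10_000)]
mean_cost = sum(costs) / len(costs)

# Fraction of sampled plans that are cheaper than the mean cost.
below = sum(c < mean_cost for c in costs) / len(costs)
print(f"mean cost: {mean_cost:.3f}, fraction below mean: {below:.3f}")
```

Because such a distribution has a long right tail, the mean lies above the median, so more than half of the samples fall below it.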
Y. E. Ioannidis and Y. C. Kang. Randomized Algorithms for Optimizing Large Join Queries. In Proc. of the ACM SIGMOD Int'l. Conf. on Management of Data, pages 312--321, Atlantic City, NJ, USA, May 1990.
....carried out under project CONQUER. 1. Introduction Much effort has been spent on designing and implementing algorithms for database query optimization. Almost all current query optimizers are targeted at finding the best (or at least a good) execution plan for a single query at a time [SAC 79, IK90, GLPK94, VM96]. This is a reasonable approach for ad hoc querying and traditional applications firing isolated, but rather complex queries at a time. Modern database applications, such as data mining, however, strongly interact with the DBMS by sending a stream of query batches. This stream ....
Y. E. Ioannidis and Y. C. Kang. Randomized Algorithms for Optimizing Large Join Queries. In Proc. of the ACM SIGMOD Int'l. Conf. on Management of Data, pages 312-321, Atlantic City, NJ, USA, May 1990.
....spaces is an inevitable prerequisite. In this paper we tackle the underlying, general question: Can't we reduce the total number of plans that have to be explored by exploiting the topology of the processing trees? Our research was inspired by the cost model proposed by Ioannidis and Kang in [3]. A costing algorithm can anticipate the commutative exchange of the input relations to a join operator and choose the more cost efficient of the two alternatives on the fly. This decision is operator local and no larger context ....
Y. E. Ioannidis and Y. C. Kang. Randomized Algorithms for Optimizing Large Join Queries. In Proc. ACM SIGMOD Int'l. Conf., pages 312--321, Atlantic City, NJ, USA, May 1990.
....may be imperfect, the prediction error could be safe and there could be no penalty. 6.3 Experimental Evaluation of the WebPT Prediction The experimental evaluation was performed using a simulator of a distributed query processing environment, with a two phase randomized query optimizer [13]. The simulator is described in [25]. The QS algorithm is implemented on top of the simulator. The query processing environment has a query site, which executes queries, and data sites, which store relations used in queries. It assumes each relation is located in a different data site. All joins are ....
Y. Ioannidis and Y. Kang. Randomized algorithms for optimizing large join queries. Proceedings of the ACM Sigmod Conference, 1990.
No context found.
Y. Ioannidis, Y. Kang. Randomized algorithms for Optimizing Large Join Queries. SIGMOD 1990.
No context found.
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In Proc. of the ACM SIGMOD Conf. on Management of Data, pages 312--321, 1990.
No context found.
Y. E. Ioannidis and Y. Kang. Randomized algorithms for optimizing large join queries. In Proc. of the 1990.
No context found.
Yannis Ioannidis and Younkyung Cha Kang. Randomized algorithms for optimizing large join queries. In SIGMOD, 1990.
No context found.
Yannis Ioannidis and Younkyung Cha Kang. Randomized algorithms for optimizing large join queries. In SIGMOD, 1990.
No context found.
Y. E. Ioannidis and Younkyung Kang. Randomized algorithms for optimizing large join queries. In Hector Garcia-Molina and H. V. Jagadish, editors, Proceedings of the 1990.
No context found.
Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In Proc. of the ACM SIGMOD Conf. on Management of Data, pages 312--321, 1990.
No context found.
Y. Ioannidis, Y. Kang. Randomized algorithms for Optimizing Large Join Queries. SIGMOD 1990.
No context found.
Y.E. Ioannidis and Y.C. Kang. Randomized Algorithms for Optimizing Large Join Queries. In Proc. ACM SIGMOD Conference, pp. 312--321, Atlantic City, 1990.