| S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In VLDB, 1997. |
....true or a false drop) we have to examine the corresponding sequences with the containment criterion. In order to minimize the false drops, it has been proved [4] that, for sets of length T, the length of the signatures has to be equal to: F = mT / ln 2. Henceforth, based on the approach of Ref. [12] for the case of set valued object databases, we assume that m is equal to one. Given a collection of sequential patterns, in Section 4 we examine effective methods for organizing the representations of the patterns, which consist of signatures of equivalent sets. 4. Family of SEQ algorithms ....
....a main memory operation, it is much smaller compared to the Index and Data Scan costs that involve I/O. Therefore the latter two costs determine the cost of the searching algorithm. Moreover, it is a common method to evaluate indexing algorithms by comparing the number of disk accesses, e.g. [4,12,35]. For SEQ(C) the calculation of F (signature length) with Eq. (1) is done using the expected length, l # El; of equivalent sets (in place of T). Since l # El grows rapidly, F can take large values, which increase the possibility of collisions during the generation of signatures (i.e. elements ....
S. Helmer, G. Moerkotte, Evaluation of main memory join algorithms for joins with set comparison join predicates, Proceedings of International Conference on Very Large Databases (VLDB'97), 1997, pp. 386 -- 395.
....Pointer joins for efficiently traversing path expressions in object oriented databases has also been studied extensively [DLM93, SC90]. However, there is very little previous work on set containment joins. The only reported work of which we are aware is the work by Helmer and Moerkotte [HM96, HM97]. These papers investigate nested loops algorithms for computing a set containment join and propose a new signature based hash join. We discuss these algorithms in Sections 3 and 4.2. 1.2 Paper Organization The rest of the paper is organized as follows. Section 2 defines the problem of set ....
....this approach performs very poorly unless the set sizes and relation sizes are small; in fact, in many cases, it is so bad that the algorithm can arguably be called ugly. 3.3 Signature Nested Loops Algorithm for Nested Internal Representation The signature nested loops algorithm proposed by [HM97] attempts to reduce the cost of evaluating the containment predicate by approximating sets using signatures and evaluating the join predicate by comparing these signatures. A signature is a fixed length bit vector that is computed by applying a function M iteratively to every element e in the set ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. In Proceedings of International Conference on Very Large Databases (VLDB), Athens, Greece, 1997.
....[PD96]. Pointer joins for efficiently traversing path expressions in object oriented databases has also been studied extensively [DLM93, SC90]. However, there is very little previous work on set containment joins. The only reported work of which we are aware is the work by Helmer and Moerkotte [HM96, HM97]. These papers investigate nested loops algorithms for computing a set containment join and propose a new signature based hash join. We discuss these algorithms in Sections 3 and 4.2. 1.2 Paper Organization The rest of the paper is organized as follows. Section 2 defines the problem of ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. Technical report, University of Mannheim, 1996.
....studied in [19, 20, 21] where various variations of the bit sliced signature file were introduced. RD trees have been proposed for indexing set valued data and, when used with signatures, they exhibit similar performance to that of S trees [22]. Besides inclusive queries, [23] examines the performance of signature based structures for set valued objects under the join query with subset superset predicates. In [24, 25, 26] the use of signatures in path expressions has been also studied. In the present paper, we focus on optimizing S trees, which have been widely cited ....
Helmer, S. and Moerkotte, G. (1997) Evaluation of main memory join algorithms for joins with set comparison join predicates. Proc. 23rd VLDB Conf., Athens, Greece, pp. 386-- 395.
....algorithms to support new join predicates. In spatial domains, the most common join considered has been polygon overlap [3, 8, 13] here, r.A ## s.B if the polygon in r.A overlaps the polygon in s.B. In set valued domains, the most common predicate considered has been set containment [5, 14], in which r.A ## s.B if r.A # s.B. Since our goal is to study the join problems themselves, and not specific algorithms for their evaluation, we need an abstract model for join computation that is independent of any algorithm. In our work we model an instance of the join problem as a bipartite ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. In VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, pages 386--395, 1997.
....joins are required [SSJ 94, Zur 97]. A temporal join is basically a one dimensional spatial join, but because of the high overlap of the data in temporal databases, most of the methods known from spatial joins perform poorly. In object oriented databases, we need to support subset joins [HM 97]. The join predicate of the subset join is defined on two set valued attributes, say R.A and S.B, and all tuples of the cartesian product are retrieved where holds. Most of these join algorithms have in common that they are designed to support one specific join operation and it is not evident how ....
....joins, the subset join is not symmetric, so that we are not able to swap the roles of the inner and outer partition on demand. Despite the large amount of work in the area of join processing, we are not aware of a description of algorithms to process large subset joins. Note that [HM 97] presented several proposals, but input relations are assumed to be small enough to process the entire join in main memory. 4 Experiments In this section we present a few results from our experiments with Plug Join. In particular, we show the results of a comparison of Plug Join and a ....
Helmer, S.; Moerkotte, G.: Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates. VLDB 1997: 386-395
....ffl and Customers, on the other hand, do not produce any false drops: these two Customers have the same hash value, but they are stored in the same Customer partition. Statistically, the number of false drops can be estimated fairly easily; similar formulae have, e.g., been devised in [Car75, HM97]. To simplify the discussion, we will assume that the join is a functional join and that there is a referential integrity constraint so that every Order refers to exactly one Customer in the join. These assumptions can easily be relaxed for cases in which there is e.g. a predicate that ....
....to support decision support queries. In the database context, bitmaps have been used to speed up the execution of joins in distributed [Bab79, VG84] as well as centralized systems [Bra84]. In these proposals so called Bloom filters [Blo70] are used to filter out tuples without join partners. [HM97] use bitmap signatures for processing joins involving predicates on nested sets. Also, bitmap indexing is a well known concept; see, e.g. the early work on signature files [CS89] or the bitmap indices in Model 204 [O'N87]. Indexing attribute values via bitmaps [OQ97, CI98, WB98] and bitmap join ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In Proc. of the Conf. on Very Large Data Bases (VLDB), pages 386--395, Athens, Greece, August 1997.
No context found.
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In Proc. of the 23rd VLDB Conference, pages 386--395, Athens, August 1997.
No context found.
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. In Proc. Int. Conf. on Very Large Data Bases (VLDB), pages 386--395, 1997.
....must really be carried out, i.e. the sets are sorted (if necessary) and compared in a single linear scan. The smart anti semijoin variant of Query 1 employs this subset test. Details about set comparison techniques in join predicates, especially signature based set comparison, are discussed in [HM97]. 5 Benchmarking In this section, we present performance experiments comparing the alternative evaluation plans that we have discussed in Section 3. 5.1 Benchmark Platform Parameters The experiments were performed with the query engine as described in the previous section. The query client and ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In Proc. of the Conf. on Very Large Data Bases (VLDB), Athens, Greece, August 1997.
....∈ e1.SetAttribute, but works for a conjunctive predicate e2 ∈ e1.SetAttribute ∧ e1.a = e2.b by performing hashing over the second factor and then verifying the truth of the first [Gra93]. A signature based solution to employ hashing for predicates with set valued attributes is proposed in [HM96]. Grouping The implementation of a binary grouping operator E1 Γ_{g;p;aggr} E2 as used for our application, i.e. performing an aggregation on the groups, is similar to a semijoin. The hash implementations are based on the corresponding semijoin variants semi build and semi probe. The result ....
....must really be carried out, i.e. the sets are sorted (if necessary) and compared in a single linear scan. The "smart anti semijoin" variant of query 1 employs this subset test. Details about set comparison techniques in join predicates, especially signature based set comparison, are discussed in [HM96]. 6 Benchmarking In this section, we present performance experiments comparing the alternative evaluation plans that we have discussed in Section 3. 6.1 Benchmark Platform Parameters The experiments were performed with the query engine as described in the previous section. The query client and ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison predicates. Technical Report 13/1996, University of Mannheim, October 1996.
....must really be carried out, i.e. the sets are sorted (if necessary) and compared in a single linear scan. The smart anti semijoin variant of Query 1 employs this subset test. Details about set comparison techniques in join predicates, especially signature based set comparison, are discussed in [HM97]. 5 Benchmarking In this section, we present performance experiments comparing the alternative evaluation plans that we have discussed in Section 3. 5.1 Benchmark Platform Parameters The experiments were performed with the query engine as described in the previous section. The query client ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In Proc. of the Conf. on Very Large Data Bases (VLDB), Athens, Greece, August 1997.
....joins [7, 36]. An area where more complex join predicates occur is that of spatial database systems. Here, special algorithms to support spatial joins have been developed [3, 14, 19, 26, 32]. Another special join algorithm has been developed for joining objects on set valued attributes [18]. Another important research area is the development of index structures that allow to accelerate the evaluation of joins [16, 22, 23, 31, 39, 40]. However, if there is no selection prior to a join or the selections exhibit a high selectivity value (i.e. many output tuples are produced) the ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In Proc. of the 23rd VLDB Conference, pages 386--395, Athens, August 1997.
....is organized as follows. In the next section, we introduce some basic notions needed in order to develop our join algorithms. Section 3 introduces and evaluates several join algorithms where the join predicate is subset equal. Algorithms for other join predicates can be found in the full paper [17]. Section 4 concludes the paper. 2 Preliminaries 2.1 General Assumptions For the rest of the paper, we assume the existence of two relations R1 and R2 with set valued join attributes a and b. We do not care about the exact type of the attributes a and b, that is, whether it is e.g. a ....
....the paper is to compute efficiently the join expression R1 ⋈_{a ⊆ b} R2. More specifically, we introduce a join algorithm based on hashing and compare its performance with a nested-loop strategy. Two other join algorithms, sort merge and a tree based one, and set equality predicates are described in [17]. For convenience, we assume that there exists an (injective) function m which maps each element within the sets of R1.a and R2.b to the domain of integers. The function m is dependent on the element type of the set valued attributes. For integers, the function is identity, for strings and ....
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. Technical Report 13/96, University of Mannheim, Mannheim, Germany, 1996.
No context found.
S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with subset join predicates. In VLDB, 1997.
No context found.
Sven Helmer and Guido Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. In International Conference on Very Large Databases, pages 386-- 395, 1997.