P. Deshpande, S. Agarwal, J. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
....the case of the data cube operator. Sarawagi et al. [SAG96] present several optimizations, such as combining common operations across multiple group bys (e.g., sharing sort and partition costs and amortizing disk scans), caching, and using precomputed group bys for computing other group bys. Deshpande et al. [DANR96] proposed an algorithm that minimizes the number of disk accesses and overlaps the computation of the group bys by exploiting partially matching sort orders. These algorithms work well for Relational OLAP (ROLAP) systems that store their data in conventional relational tables. Zhao et al. [ZDN97] ....
....(MOLAP) systems, that store their data in sparse multidimensional arrays rather than in tables. For the synthetic and rather dense datasets that they used for their experiments, the authors showed that the MOLAP approach can be significantly faster than the ROLAP table-based algorithm proposed in [DANR96]. The benefit comes from the fact that the array representation allows direct access to individual cells. Real-world data, however, are often very large and sparse in many application domains. Ross and Srivastava in [RS97] argue that none of the previous algorithms is very efficient for sparse ....
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of Multidimensional Aggregates. Technical Report 1314, University of Wisconsin, Madison, 1996.
....and intermediate results between group bys with common attributes. The PIPESORT algorithm was introduced in [AAD+96, SAG96]. The idea is to convert the cube lattice (see Figure 2) to a processing tree and compute every group by from its smallest parent. The OVERLAP algorithm, proposed in [DANR96], overlaps the computation of the group bys by using partially matching sort orders, in order to reduce the number of sorting steps required. For example, OVERLAP uses the sorted abc group by to produce the ac sort order. It does that by independently sorting each one of the a ....
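The abc-to-ac step described in the excerpt above can be sketched as follows. This is a minimal in-memory illustration only, not the disk-based procedure of [DANR96]; the function name `resort_by_prefix` and the sample rows are assumptions made for the example, and `prefix_len` is assumed to be at least 1.

```python
from itertools import groupby
from operator import itemgetter

def resort_by_prefix(rows, prefix_len, new_key):
    """Re-sort rows that are already sorted on their first prefix_len columns.

    Each run of rows sharing the same prefix value is sorted independently
    on new_key, so no global sort over all rows is needed.
    (Hypothetical helper; assumes prefix_len >= 1.)
    """
    out = []
    for _, partition in groupby(rows, key=itemgetter(*range(prefix_len))):
        out.extend(sorted(partition, key=new_key))
    return out

# Rows (a, b, c), already sorted on (a, b, c).
abc_sorted = [(1, 1, 9), (1, 2, 3), (1, 3, 5), (2, 1, 7), (2, 2, 2)]

# Derive the (a, c) order by sorting each a-partition on c only.
ac_sorted = resort_by_prefix(abc_sorted, 1, new_key=itemgetter(2))
```

Because the input is already grouped by a, each partition can be sorted on its own, which is the saving the excerpt attributes to partially matching sort orders.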
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
....(product) customer) and ALL (for empty attribute). As such, for a CUBE operator on n attributes, 2^n GROUP BYs or cuboids have to be computed. [Footnote: On leave from The National University of Singapore.] Sorting-based, hash-based, and array-based algorithms were proposed to compute a single datacube [1, 4, 5, 7, 8, 9, 11]. Agarwal et al. summarized the possible optimization techniques for computing multiple group bys in datacube computation, such as smallest-parent, cache-results, amortize-scans, share-partitions, and share-sorts [1]. Study on multi cube computation is motivated by two facts. First, due to the ....
....a cuboid tree that minimizes the total number of sorts. Using their paths algorithm, the cuboid tree for a 4 attribute datacube is shown in Figure 1. In this study, we construct the same cuboid tree using the paths algorithm. 3. Previous Single Cube Algorithms The PipeSort algorithm [1, 4] attempts to optimize the overall cost of computing a datacube by using cost estimates to determine which cuboid will actually be used to compute other cuboids. Then it converts the resulting tree into a set of paths such that every edge in the tree is in one and ....
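The 2^n count in the first excerpt above can be checked by enumerating the attribute subsets directly. A small sketch, assuming three example attributes (`date` is a made-up third attribute, not taken from the excerpt); the empty subset corresponds to ALL.

```python
from itertools import combinations

def cuboids(attributes):
    """Yield every subset of the attributes, from the empty set (ALL) up."""
    for size in range(len(attributes) + 1):
        for subset in combinations(attributes, size):
            yield subset

# Three attributes give 2^3 = 8 group bys, including the empty one.
all_cuboids = list(cuboids(["product", "customer", "date"]))
```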
P. Deshpande et al. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin-Madison, 1996.
....DANR96, HaRU96]. However, relatively less work has been devoted to parallel processing of aggregates [ShNa94]. In this section, we discuss some interesting properties and issues related to data cube computation using parallel processors. 3.1 Data cube and cuboids We adopt the notations used in [DANR96]. Let R be a relation with k+1 attributes X = {A_1, A_2, ..., A_k, V}. A cuboid on j attributes S = {A_i1, A_i2, ..., A_ij} is defined as a group by on attributes A_i1, A_i2, ..., A_ij using aggregate function F() applied on attribute V. This cuboid can be represented as a k+1 ....
P.M. Deshpande et al. Computation of multidimensional aggregates. Technical Report 1314, Computer Sciences Department, University of Wisconsin-Madison, 1996.
....where each granularity is one of the 2^k possible subsets of the CUBE BY attributes B_1, ..., B_k; attributes that are not present in such a subset are replaced by a special value ALL in the datacube result. Each of these 2^k granularities is referred to as a cuboid following [DANR96, AAD+96], and we use the notation Q(B_i) to denote the cuboid at granularity B_i. The computations of the various cuboids are not independent of each other, but are closely related in that some of them can be computed using others. These relationships are captured in terms of the search ....
....as we show later, the I/O overhead of our technique, Partitioned Cube, is about the equivalent of 11:5 passes through the relation. It thus incurs an I/O overhead of only about 25 (We shall address the CPU cost separately.) 2.2 OVERLAP The OVERLAP algorithm proposed by Deshpande et al. [DANR96, AAD+96] tries to minimize the number of disk accesses by overlapping the computation of the cuboids, making use of partially matching sort orders to reduce the number of sorting steps performed. First, OVERLAP computes the finest granularity cuboid Q({B_1, ..., B_k}) from R, and sorts ....
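The remark in the first excerpt above, that some cuboids can be computed from others, can be illustrated with a small sketch. It assumes SUM as the aggregate (any distributive aggregate would behave the same way) and uses a hypothetical helper `roll_up` and made-up data; it is not the procedure of any of the cited papers.

```python
from collections import defaultdict

ALL = "ALL"  # special value for attributes absent from the granularity

def roll_up(parent, keep):
    """Aggregate a cuboid down to the attribute positions in keep.

    parent maps full-width key tuples to SUM values; positions not in keep
    are replaced by ALL, as in the datacube result.
    """
    child = defaultdict(int)
    for key, value in parent.items():
        child_key = tuple(k if i in keep else ALL for i, k in enumerate(key))
        child[child_key] += value
    return dict(child)

# Finest-granularity cuboid Q({B_1, B_2}) with SUM as the aggregate.
q_b1b2 = {("x", "p"): 3, ("x", "q"): 4, ("y", "p"): 5}

# Q({B_1}) is computed from Q({B_1, B_2}) instead of from the base relation.
q_b1 = roll_up(q_b1b2, keep={0})

# The empty granularity is in turn computed from the smaller Q({B_1}).
q_all = roll_up(q_b1, keep=set())
```

Computing `q_all` from `q_b1` rather than from `q_b1b2` reads fewer entries, which is the intuition behind choosing a small parent.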
P. M. Deshpande, S. Agarwal, J. F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin, Madison, 1996.
....much computation as PipeSort because PipeSort computes multiple group bys with one sort, whereas PipeHash must rehash the data for every group by. Second, PipeHash requires a significant amount of memory to store the hash tables for the group bys even after partitioning. • Overlap, proposed in [4, 1], aims to overlap as much sorting as possible by computing a group by from a parent with the maximum sort order overlap. The algorithm recognizes that if a group by shares a prefix with its parent, then the parent consists of a number of partitions, one for each value of the prefix. For example, ....
P. M. Deshpande, S. Agarwal, J. F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
....pre computed group bys for computing other group bys. Empirical evaluation shows that the resulting algorithms give much better performance compared to straightforward methods. This paper combines work done concurrently on computing the data cube by two different teams as reported in [SAG96] and [DANR96]. 1 Introduction The group by operator in SQL is typically used to compute aggregates on a set of attributes. For busi Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright ....
....is no published work in the statistical database literature on methods for optimizing the computation of related aggregates. This paper is in two parts and combines work done concurrently on computing the data cube. Part I presents the methods proposed by [SAG96] whereas the methods proposed by [DANR96] are described in Part II. Section 10 presents a summary and brief comparison of the two approaches. Part I 1 1 This part presents work done by Sunita Sarawagi, Rakesh Agrawal and Ashish Gupta at IBM Almaden Research Center, San Jose. 2 Optimizations Possible There are two basic methods for ....
Prasad M. Deshpande, Sameet Agarwal, Jeffrey F. Naughton and Raghu Ramakrishnan. Computation of Multidimensional Aggregates. Technical Report-1314, University of Wisconsin-Madison, 1996.
No context found.
P. Deshpande, S. Agarwal, J. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
No context found.
DESHPANDE, P., AGARWAL, S., NAUGHTON, J., AND RAMAKRISHNAN, R. 1996. Computation of multidimensional aggregates. Tech. Rep. 1314, Univ. Wisconsin, Madison, Madison, Wis.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan, "Computation of multidimensional aggregates," Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan, "Computation of multidimensional aggregates," Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan, "Computation of multidimensional aggregates," Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
No context found.
P. M. Deshpande, S. Agarwal, J. F. Naughton and R. Ramakrishnan, "Computation of Multidimensional Aggregates", Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P. Deshpande, S. Agarwal, J. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin - Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan, "Computation of multidimensional aggregates," Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P.M. Deshpande, S. Agarwal, J.F. Naughton, and R. Ramakrishnan. Computation of multidimensional aggregates. Technical Report 1314, University of Wisconsin, Madison, 1996.
No context found.
P. Deshpande, S. Agarwal, J. Naughton and R. Ramakrishnan. Computation of Multidimensional Aggregates. Technical Report 1314. University of Wisconsin, Madison, 1996.