Goil, S. & Choudhary, A. (1997). High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4): 391--417.
....this case, the initialization time may be prohibitively large. In addition, since the materialized views reside on disk, answering OLAP queries may require multiple disk I/Os. To address the scalability problem of MOLAP, Goil and Choudhary proposed a parallel MOLAP infrastructure called PARSIMONY [16, 17]. Their algorithm incorporates chunking, data compression, view optimization using a lattice framework, as well as data partitioning and parallelism. The chunks can be stored as multidimensional arrays or (OffsetInChunk, data) pairs depending on whether they are dense or sparse. The OffsetInChunk ....
....pass of input data. The freed space allocated to the dense buckets is proportional to their counts. Suppose the freed space is m. The buckets B_1, B_2, ..., B_k are dense with counts C_1, C_2, ..., C_k. [Figure: bucket ranges [1,1], [6,9], [12,13], [16,16], [20,24], [30,30] and [1,1], [6,16], [20,24], [30,30] with their counts] Suppose d_1, d_2, ..., d_k are the step lengths of these sub-buckets. D is the step length of the first-level division. Then [...] Using these parameters, the count arrays (CA) A_1, A_2, ..., A_k ....
[Article contains additional citation context not shown here]
S. Goil and A. Choudhary, "High Performance OLAP and Data Mining on Parallel Computers," Journal of Data Mining and Knowledge Discovery, vol. 1, pp. 391-417, 1997.
.... Theta 4 and 8 Theta 8. If the initial shape constraint is too large, it is difficult to extract suitable dense subarrays. In practice, we can sample some points in the problem space to select an initial shape. Finally, we illustrate our probabilistic inference schemes for an OLAP application [9]. We use 3D group-by operations related to OLAP (on-line analytical processing) [9, 10] for the example. The code segment in Figure 12 is a sequence of sum functions operated on sparse arrays ABC, AB, AC, BC, A, B, and C. In the group-by operation of OLAP, the relationships among these sparse arrays ....
....to extract suitable dense subarrays. In practice, we can sample some points in the problem space to select an initial shape. Finally, we illustrate our probabilistic inference schemes for an OLAP application [9]. We use 3D group-by operations related to OLAP (on-line analytical processing) [9, 10] for the example. The code segment in Figure 12 is a sequence of sum functions operated on sparse arrays ABC, AB, AC, BC, A, B, and C. In the group-by operation of OLAP, the relationships among these sparse arrays can be represented as a lattice. These are mainly dealing with large-scale sparse data ....
Sanjay Goil and Alok Choudhary. High Performance OLAP and Data Mining on Parallel Computers, In Proceedings of IPPS/SPDP '98, Orlando, April, 1998.
....sharing sort costs [1, 16] that minimize external memory sorting by partitioning the data into memory-size segments [3, 13] and that represent the views themselves as multidimensional arrays [8, 18]. By contrast, relatively little research effort has been focused upon parallel computation. In [7], the authors propose an algorithm that computes each possible view across all processors. While that technique does provide better performance than the purely sequential alternatives, it can also produce excessive inter-node communication, particularly in high-dimension spaces. In this paper, we ....
....turn, exploit the efficiency of the existing sequential approaches. More specifically, our techniques for building the datacube partition the original problem into a set of sub-cube computations which are then distributed to individual processors. This is in contrast to the technique described in [7] that calculates each sub-cube across all processors. Not only do these mechanisms fail to directly utilize current sequential algorithms, but they can create excessive inter-node communication. Our algorithms require very little communication overhead and are applicable to high-dimension spaces. ....
S. Goil and A. Choudhary. High performance olap and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
....this case, the initialization time may be prohibitively large. In addition, since the materialized views reside on disk, answering OLAP queries may require multiple disk I/Os. To address the scalability problem of MOLAP, Goil and Choudhary proposed a parallel MOLAP infrastructure called PARSIMONY [10, 11]. Their algorithm incorporates chunking, data compression, view optimization using a lattice framework, as well as data partitioning and parallelism. The chunks can be stored as multidimensional arrays or (OffsetInChunk, data) pairs depending on whether they are dense or sparse. The OffsetInChunk ....
....disk as a reference value. We call this set of measurements writing in the first row in Figure 6. [Figure 6: Setup time vs number of records. Figure 7: Response time vs number of records.] We have repeated this series of experiments using data set D2 and query q2 = (s1, s2, s3, s4) [1,10], [1,10], [1,10], [1,10]. This time the size of the data set increased from 2MB to 100MB. The domain size for each of the five dimensions is 15. The experiments for D2 were run on the PC, which has a larger available local disk than the SUN workstation, to accommodate the increased data volume. The ....
[Article contains additional citation context not shown here]
S. Goil and A. Choudhary, "High Performance OLAP and Data Mining on Parallel Computers," Journal of Data Mining and Knowledge Discovery, 1:4, pp. 391-417, 1997.
.... of some truly interesting, useful patterns [16]. In particular, Attribute Focusing has been successfully used to discover interesting, useful knowledge [1, 2] and it has been the target of parallelization techniques aimed at increasing its computational efficiency when mining large databases [7]. 2.1 The Interestingness Functions of Attribute Focusing. As mentioned above, the goal of Attribute Focusing is to detect interesting attribute values, i.e. attribute values that significantly deviate from their expected values. In order to achieve this goal, [1] has proposed two interestingness ....
....i.e. the main goal of our experiment is to evaluate the quality of discovered knowledge. The issues of computational efficiency and scalability are somewhat orthogonal to our proposal of our new HAF method. Readers interested in computational efficiency issues are referred to [7] and [11]. For our experiments, we have manually selected five dimensions, which seem to be the most promising dimensions for the discovery of interesting patterns. The selected dimensions were Claimant, Covered Item, Event Time, Insured Party, and Policy. Figure 1 shows, for each of these ....
[Article contains additional citation context not shown here]
Goil, S. and Choudhary, A. High-performance OLAP and data mining on parallel computers. Data Mining and Knowledge Discovery 1(4), 391-417. 1997.
No context found.
Goil, S. & Choudhary, A. (1997). High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4): 391--417.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, 1997.
No context found.
S. Goil and A. Choudhary. High performance olap and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, 1997.
No context found.
S. Goil and A. Choudhary. High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary. High performance olap and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, 1997.
No context found.
Goil, S. & Choudhary, A. (1997). High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4): 391--417.
No context found.
Goil, S. & Choudhary, A. (1997). High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4): 391--417.
No context found.
S. Goil and A. Choudhary. High performance olap and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, 1997.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, pp. 391--417, 1997.
No context found.
S. Goil and A. Choudhary, "High Performance OLAP and Data Mining on parallel computers", Journal of Data Mining and Knowledge Discovery, 1(4):391-417, 1997.
No context found.
Sanjay Goil and Alok Choudhary. High performance OLAP and data mining on parallel computers. Technical Report CPDC-TR-97-05, Center for Parallel and Distributed Computing, Northwestern University, December 1997.
No context found.
S. Goil and A. Choudhary. High Performance OLAP and Data Mining on Parallel Computers. Journal of Data Mining and Knowledge Discovery (Special Issue on Scalable High-Performance Computing for KDD), 1(4):391--417, 1997.
No context found.
S. Goil and A. Choudhary. High performance olap and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary. High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.
No context found.
S. Goil and A. Choudhary, "High performance OLAP and data mining on parallel computers," Journal of Data Mining and Knowledge Discovery, vol. 1, no. 4, 1997.
No context found.
S. Goil and A. Choudhary. High performance OLAP and data mining on parallel computers. In Proceedings of 12th International Parallel Processing Symposium & 9th Symposium on Parallel and Distributed Processing, Orlando, Fl., pp. 548--555, 1998.
No context found.
S. Goil and A. Choudhary. High performance OLAP and data mining on parallel computers. Journal of Data Mining and Knowledge Discovery, 1(4), 1997.