T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the 1999.

This paper is cited in the following contexts:
Armada: a parallel I/O framework for computational grids - Oldfield, Kotz (2002)   (Correct)

.... across disks is dependent on the value of the data, moving that function to the data server can halve network traffic [22]. Processors near the data servers can filter data in an application-specific way, passing only the necessary data on to the clients, saving network bandwidth and client memory [10,22-24]. Processors near the data servers can exchange blocks without passing the data through clients, e.g. to rearrange blocks between disks during a copy or permutation operation. Format conversion, compression, and decompression are also possible. In short, there are many ways to optimize memory and ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, Querying very large multi-dimensional datasets in ADR, in: Proceedings of the SC99 on High Performance Networking and Computing, ACM Press/IEEE Computer Society Press, Portland, OR, 1999.


Active Harmony: Towards Automated Performance Tuning - Tapus, Chung, Hollingsworth (2003)   (5 citations)  (Correct)

....our search algorithm can find a solution that is within 0.05 of the minimal value. 6.2.1 I/O Intensive Application To evaluate our optimization system using a real application, we selected a 3-D volume reconstruction application [2] built on top of the Active Data Repository (ADR) middleware [9]. The 3-D volume reconstruction application uses digital images of a space to reconstruct the objects that are visible from the various camera angles. The ADR is an infrastructure that integrates storage, retrieval and processing of large multi-dimensional data sets. ADR provides the user with ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz, "Querying Very Large Multi-dimensional Datasets in ADR," Proceedings of SC99. Nov. 1999, Orlando, FL, ACM Press.


A Scientific Data Management System for Irregular.. - No, Thakur, Kaushik..   (Correct)

....systems, such as file systems, Unitree, HPSS and database objects. However, it does not fully support the optimizations implemented in MPIIO. Shoshani et al. [28, 29] describe an architecture for optimizing access to large volumes of scientific data stored on tapes. The Active Data Repository [17] and DataCutter [4] optimize storage, retrieval, and processing of very large multidimensional datasets. The main difference between our work and other efforts in I/O is that SDM aims to combine the good features of parallel file I/O and databases, whereas other efforts focus on either parallel ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying Very Large Multi-dimensional Datasets in ADR. In Proceedings of SC99: High Performance Networking and Computing, November 1999.


Efficient Execution of Multiple Query Workloads in.. - Andrade, Kurc.. (2001)   (1 citation)  (Correct)

....minimum, sum and average, or complex functions such as visualization operations and data mining operations. In earlier work we developed runtime support and examined strategies for efficient execution of queries in data analysis applications on distributed memory parallel machines with a disk farm [6, 9] and in distributed, heterogeneous environments [3, 5]. Our previous work has focused on the efficient execution of a single query. Better utilization of system resources can be achieved if commonalities across multiple queries are exploited. For instance, intermediate data structures computed by ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the 1999 ACM/IEEE SC99 Conference. ACM Press, Nov. 1999.


Integrating Parallel File I/O and Database Support for.. - No, Thakur, Choudhary (2000)   (1 citation)  (Correct)

....to large volumes of scientific data stored on tapes. Chervenak et al. [5] describe a general architecture for managing distributed scientific data sets in a grid environment. An architecture for data-intensive distributed computing using DPSS is described in [37, 38]. The Active Data Repository [17] optimizes storage, retrieval and processing of very large multi-dimensional datasets. An initial discussion of a framework for scientific data management similar to the one described in this paper is given in [6]. Several efforts have involved optimizing I/O in parallel file systems and runtime ....

Tahsin Kurc, Chialin Chang, Renato Ferreira, Alan Sussman, and Joel Saltz. Querying Very Large Multidimensional Datasets in ADR. In Proceedings of SC99: High Performance Networking and Computing, November 1999.


Efficient Execution of Multiple Query Workloads - In Data Analysis   Self-citation (Kurc Sussman Saltz)   (Correct)

No context found.

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the 1999.


An Efficient System for Multi-perspective Imaging and.. - Borovikov, Sussman.. (2001)   (2 citations)  Self-citation (Sussman)   (Correct)

....on a parallel machine or a cluster of workstations employing an efficient software system for providing the desired functionality. To aid in efficiently implementing storage and processing of multi-perspective images, we employ an object-oriented framework called the Active Data Repository (ADR) [3, 4, 7, 8], that has been developed at the University of Maryland for managing and processing large amounts of scientific data on a parallel or distributed system. In this paper we describe how to customize ADR for building an efficient and flexible framework for storing multi-perspective image data and ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the


Processing Large-Scale Multidimensional Data in.. - Beynon, Chang.. (2002)   (2 citations)  Self-citation (Kurc Chang Ferreira Sussman Saltz)   (Correct)

....the speedup for five server processors is 3.6 compared to a one-processor server. 3.3 Query Processing Strategies Workload partitioning and tiling have significant effects on the performance of an application implemented using the ADR framework. We have evaluated several potential strategies [22,23,43] that use different workload partitioning and tiling schemes. To simplify the presentation, we assume that the target range query involves only one input and one output dataset. Both the input and output datasets are assumed to be already partitioned into data chunks and declustered across the ....

....Fig. 5. A cut for the aggregation hypergraph shown in Figure 4. 3.3.5 Experimental Results We compare the performance of HG to that of the DA and RA strategies on a 48-node PC cluster running Linux. We present experimental results using datasets derived from the VM application [43,64]. Each node in the cluster has two 450MHz Pentium II processors, 500MB memory, and one local disk. The nodes are interconnected via both Myrinet (120MB/sec max.) and Fast Ethernet (100Mb/sec max.) networks. In the experiments, only one process was executed on each node. For HG, we use a ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the


Multiple Query Optimization for Data Analysis.. - Andrade, Kurc..   (1 citation)  Self-citation (Kurc Sussman Saltz)   (Correct)

....the use of SMP clusters to improve response times and overall system performance. In particular, we look at the effective use of aggregate processing power and I/O bandwidth for executing single and multiple queries efficiently. Unlike previous work on query execution in parallel systems [5, 6, 10, 12], our system design combines parallel execution of queries with data caching and multi-threaded execution so that multiple queries can execute concurrently on multiple processors on an SMP node and also reuse cached results to improve performance and lower interprocess communication. Moreover, a ....

....are submitted to the system, some of the processors will be idle, causing underutilization of the aggregate processing power. In order to alleviate these problems, each query is executed in parallel. We have implemented two strategies based on the replicated accumulator scheme developed in [10]. In the Fully Replicated Accumulator (FRA) scheme, a query is assigned to all the SMP nodes in the system for evaluation. The entire accumulator structure associated with the query is allocated on all the nodes. Each SMP node is responsible for retrieving and carrying out the aggregation ....

[Article contains additional citation context not shown here]

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the


A Hypergraph-Based Workload Partitioning Strategy.. - Chang, Kurc.. (2000)   (4 citations)  Self-citation (Kurc Chang Sussman Saltz)   (Correct)

....Grant #N6600197C8534. Department of Computer Science, University of Maryland, College Park, MD 20742. Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21287. .... aggregation functions are commutative and associative [9]. That is, the correctness of the aggregation operation does not depend on the order the input data items are aggregated. In earlier work, we investigated three strategies for distributing the workload among processors [4,9]. The Distributed Accumulator (DA) strategy assigns the processing for an ....

....aggregation functions are commutative and associative [9]. That is, the correctness of the aggregation operation does not depend on the order the input data items are aggregated. In earlier work, we investigated three strategies for distributing the workload among processors [4,9]. The Distributed Accumulator (DA) strategy assigns the processing for an entire output element to a single processor. The output elements are partitioned across the processors and each processor carries out aggregation operations on the local output elements. Input elements are communicated to ....

[Article contains additional citation context not shown here]

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the 1999 ACM/IEEE SC99 Conference. ACM Press, Nov. 1999.


Exploration and Visualization of Very Large.. - Kurc.. (2001)   (4 citations)  Self-citation (Kurc Chang Sussman Saltz)   (Correct)

....good performance can be achieved for diverse applications. The Active Data Repository is an ongoing project. The version of ADR used in this paper employs a query processing strategy called fully replicated accumulator. We are also investigating other potential strategies. In our earlier work [11, 24], we examined the performance impact of two other strategies, referred to as sparsely replicated accumulator (SRA) and distributed accumulator (DA) under various application scenarios. The SRA strategy replicates the accumulator elements only on processors with local input elements that map to ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. In Proceedings of the 1999 ACM/IEEE SC99 Conference. ACM Press, Nov. 1999.


Optimizing Retrieval and Processing of.. - Chang, Kurc, Sussman.. (2000)   (3 citations)  Self-citation (Kurc Chang Sussman Saltz)   (Correct)

....the data items retrieved to generate the output products. Output products can be returned from the back-end nodes to the requesting client, or stored in ADR. This paper addresses optimization of processing for range queries on distributed-memory machines within the ADR framework. In earlier work [6, 14], we described three potential processing strategies, and evaluated the relative performance of these strategies for several application scenarios and machine configurations. Our experimental results showed that the relative performance of the strategies changes under varying application ....

....Strategies In this section we briefly describe three strategies for processing range queries in ADR. First we briefly describe how datasets are stored in ADR, and outline the main phases of query execution in ADR. More detailed descriptions of these strategies and of ADR in general can be found in [5, 6, 14]. 2.1 Storing Datasets in ADR A dataset is partitioned into a set of chunks to achieve high-bandwidth data retrieval. A chunk consists of one or more data items, and is the unit of I/O and communication in ADR. That is, a chunk is always retrieved, communicated and computed on as a whole during ....

[Article contains additional citation context not shown here]

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. Technical Report CS-TR-4022 and UMIACS-TR-99-29, University of Maryland, Department of Computer Science and UMIACS, May 1999. To appear in SC'99.


Cost Models for Query Processing Strategies in the Active Data.. - Chang (1999)   Self-citation (Chang)   (Correct)

....the parallel back end. During query execution, back-end nodes retrieve input data and perform user-defined operations over the data items retrieved to generate the output products. Output products can be returned from the back-end nodes to the requesting client, or stored in ADR. In earlier work [3, 7], we described three potential processing strategies, and evaluated the relative performance of these strategies for several application scenarios and machine configurations. Our experimental results showed that the relative performance of the strategies changes under varying application ....

....of ADR In this section we briefly describe three strategies for processing range queries in ADR. First we briefly describe how datasets are stored in ADR, and outline the main phases of query execution in ADR. More detailed descriptions of these strategies and of ADR in general can be found in [2, 3, 7]. 2.1 Storing Datasets in ADR A dataset is partitioned into a set of chunks to achieve high-bandwidth data retrieval. A chunk consists of one or more data items, and is the unit of I/O and communication in ADR. That is, a chunk is always retrieved, communicated and computed on as a whole during ....

T. Kurc, C. Chang, R. Ferreira, A. Sussman, and J. Saltz. Querying very large multi-dimensional datasets in ADR. Technical Report CS-TR-4022 and UMIACS-TR-99-29, University of Maryland, Department of Computer Science and UMIACS, May 1999. To appear in SC'99.

