R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....6 Related Work Numerous techniques for optimizing I/O accesses have been proposed in literature. These techniques can be classified into three categories: the parallel file system and run-time system optimizations [22, 8, 10, 19, 21, 16], compiler optimizations [4, 20, 17], and application analysis and optimization [20, 6, 29, 17, 7, 37]. Brown et al. [5] proposed a meta-data system on top of HPSS using DB2 DBMS. Our work, in contrast, focuses more on utilizing state-of-the-art I/O optimizations with minimal programming effort. Additionally, the design flexibility ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the ACM Symposium on Principles and Practice of Parallel Programming, pages 1--10, 1995.
....can be considered as developing an out-of-core Java compiler. Compiler optimizations for improving I/O accesses have been considered by several projects. The PASSION project at Northwestern University has considered several different optimizations for improving locality in out-of-core applications [6, 20]. Some of these optimizations have also been implemented as part of the Fortran D compilation system's support for out-of-core applications [29]. Mowry et al. have shown how a compiler can generate prefetching hints for improving the performance of a virtual memory system [26]. These projects have ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1-10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....can be considered as developing an out-of-core Java compiler. Compiler optimizations for improving I/O accesses have been considered by several projects. The PASSION project at Northwestern University has considered several different optimizations for improving locality in out-of-core applications [6, 30], including loop transformations. Some of these optimizations have also been implemented as part of the Fortran D compilation system's support for out-of-core applications [42]. Mowry et al. have shown how a compiler can generate prefetching hints for improving the performance of a virtual memory ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....can be considered as developing an out-of-core Java compiler. Compiler optimizations for improving I/O accesses have been considered by several projects. The PASSION project at Northwestern University has considered several different optimizations for improving locality in out-of-core applications [6, 16]. Some of these optimizations have also been implemented as part of the Fortran D compilation system's support for out-of-core applications [22]. Mowry et al. have shown how a compiler can generate prefetching hints for improving the performance of a virtual memory system [20]. These ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....can be considered as developing an out-of-core Java compiler. Compiler optimizations for improving I/O accesses have been considered by several projects. The PASSION project at Northwestern University has considered several different optimizations for improving locality in out-of-core applications [7, 17]. Some of these optimizations have also been implemented as part of the Fortran D compilation system's support for out-of-core applications [24]. Mowry et al. have shown how a compiler can generate prefetching hints for improving the performance of a virtual memory system [21]. These projects have ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....and code generation problems in parallel compilation [1, 19, 36]. Our work on providing high-level support for data-intensive computing can be considered as developing an out-of-core Java compiler. Compiler optimizations for improving I/O accesses have been considered by several projects [8, 21, 20, 32, 29]. These projects have concentrated on relatively simple stencil computations, with regular datasets. Our work is significantly different in the class of applications we focus on, and in our support for high-level abstractions of sparse datasets. 7 Conclusions High-level abstractions for ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
.... Examples in the literature include virtual processor approaches for cyclic and block-cyclic distributions [3, 12], support for dynamic and non-uniform computation partitioning of data parallel programs on heterogeneous systems [18], and support for managing computation on out-of-core arrays [6]. 1 We show execution behavior using space-time diagrams visualized using the AIMS toolkit [20]. Each horizontal line in the diagram represents a time line showing the activity of a processor. Time increases to the right. Along a processor's time line, solid bars represent computation, and blank ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1-10, Santa Barbara, CA, July 1995.
....for the array storage of radio astronomy applications. Parallel i/o researchers have emphasized the design of parallel file systems or parallel i/o libraries suitable for large scale scientific applications. Most of these works applied the collective i/o approach to out-of-core applications [3, 6]. Instead of optimizing the performance of the i/o system, [1] took a different direction. They tuned the applications, e.g. reorganizing the order of the loops, to achieve high performance. [1] experimented with part-time i/o nodes and found that part-time nodes slow down read-intensive ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel and M. Paleczny, A Model and Compilation Strategy for Out-of-Core Data Parallel Programs, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1995.
....Parallel I/O Some efforts have been made to support parallel I/O directly in the parallel programming language. For example, the Fortran D and Fortran 90D research projects explored the use of language-based parallel I/O with a combination of compiler directives and runtime library calls [4, 5, 39]. CM Fortran from Thinking Machines Corp. also supported reading and writing of parallel arrays. Although parallel I/O was discussed during the deliberations of the High Performance Fortran (HPF) Forum, it does not appear in the final HPF standard. In all, language-based parallel I/O remains ....
Rajesh Bordawekar, Alok Choudhary, Ken Kennedy, Charles Koelbel, and Michael Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, pages 1--10. ACM Press, July 1995.
....one is write-intensive, requiring the output of solution files and periodic checkpointing. Parallel i/o researchers have emphasized the design of parallel file systems or parallel i/o libraries suitable for large scale scientific applications. The collective i/o approach [Acharya96, Bennett94, Bordawekar95, Kotz94b, Kotz95, del Rosario93, Seamons95] was shown to produce high performance. Most of these works were applied to out-of-core applications. These are i/o intensive because memory cannot hold the entire problem, so data needs to be moved to and from secondary storage and each processor's main ....
....to out-of-core applications. These are i/o intensive because memory cannot hold the entire problem, so data needs to be moved to and from secondary storage and each processor's main memory periodically. [Kotz95] applied his disk-directed i/o approach to an out-of-core LU decomposition program. [Bordawekar95] used the PASSION runtime library for solving three out-of-core applications: a Laplace equation solver using the Jacobi iteration method, LU factorization with pivoting, and three-dimensional red-black relaxation. [Acharya96] worked on tuning the performance of i/o intensive parallel ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel and M. Paleczny, A Model and Compilation Strategy for Out-of-Core Data Parallel Programs, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, July, 1995.
....from parallel computers. As a result, significant effort has been put into trying to improve parallel I/O systems. To date, most of the effort has focused on observing the I/O behavior of existing applications and on trying to improve the ability of I/O systems to execute existing applications [2, 3, 4, 6, 8]. We decided to take a different approach; rather than tuning the I/O system to optimize fixed applications, we concentrated on tuning the applications to improve their I/O performance (and hopefully also improve their execution time). Our goal was to find out what strategies were required to ....
....addition, local accesses are guaranteed not to interfere with I/O requests from other processors. This increases the utility of the file cache and makes the overall behavior of the application becomes more predictable. Exploiting locality in this manner is beneficial for out-of-core applications [1, 2, 12] on both client-server and peer-to-peer configurations. In either configuration, exploiting locality improves I/O performance as well as total execution time. Diskful machines are important: As shown by the results in Sections 4 and 5, local disks attached to compute nodes can help convert ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....increases rapidly with the depth of access. Achieving and sustaining good performance in presence of deep memory hierarchies is a very important problem and has received significant attention in the last few years. Several research projects have worked on code transformations to improve locality [6, 31, 32]. Most of the programs still spend considerable amount of their time in accessing data from memory at a deeper level in the hierarchy. The overhead of deep memory accesses can be reduced by using asynchronous operations overlapped with computations. Substantial work has been done on compiler ....
....the secondary memory very frequently. Examples include several sensor data processing codes which perform out-of-core reductions on images which are several hundred MB in size. Restructuring the computation can often reduce the frequency and increase the granularity of secondary memory access [1, 6], however, such codes can still spend significant amount of their time in I/O operations. 3 Problem Definition In this section, we define the interprocedural balanced code placement problem and motivate our analysis framework. As we stated in Section 2, scientific applications may make frequent ....
[Article contains additional citation context not shown here]
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.
....the best execution time. [Bordawekar 95a] Experimental results have shown that the Passion compiler can produce code that runs about twice as fast as the regular HPF code (using virtual memory to support out-of-core computation) on applications such as LU factorization and red-black relaxation [Bordawekar 95b]. 2.4. Shore (http://www.cs.wisc.edu/shore/) Shore (Scalable Heterogeneous Object Repository) is a persistent object system developed at the University of Wisconsin. It combines many services usually provided separately by object-oriented databases (OODB) and file systems. From file systems, it ....
Rajesh Bordawekar, Alok Choudhary, Ken Kennedy, Charles Koelbel and Michael Paleczny. A Model and Compilation Strategy for Out-of-Core Data Parallel Programs. In Proc. of the Fifth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, July 1995. http://www.npac.syr.edu/techreports/html/0650/abs-0696.html
....stored back into AD if necessary. On each processor, the computation is performed in phases where each phase operates on only so large parts of the arrays which can fit into the local memory. The portion of the array which is in processor's memory is called the In-Core Local Array Portion (ICLAP) [2]. The computation performed in a phase corresponds to the execution of a tile of loop iterations. To determine the phases, work of the loop can be distributed among the processors, for example, in the following way: the global iteration space is first partitioned into a set of tiles and a number ....
R.R. Bordawekar, A.N. Choudhary, K. Kennedy, C. Koelbel, M. Paleczny. A Model and Compilation Strategy for Out-of-Core Data Parallel Programs.
....Pattern Figure 7. Execution times for read operations. 6. Related Work There are many proposed techniques for optimizing I/O accesses. These techniques can be divided into three main groups: the parallel file system and run-time system optimizations [21, 6, 9, 17, 19, 14], compiler optimizations [3, 18, 15], and application analysis and optimization [18, 5, 24, 15]. The closest work to ours is the one done by Brown et al. [4]. They propose a similar architecture to ours; however, they do not handle the advanced I/O optimizations proposed in this paper. They build their meta-data system on top of ....
....of parallel machines. The characterization information can be very useful in our framework in selecting suitable user-level directives to implement in order to capture access patterns in a better manner. Finally, some amount of work has been done in the area of out-of-core compilation [3, 2]. 7. Conclusions In this paper we present a program development environment based on maintaining performance-related system-level meta-data. This environment consists of a user code, a meta-data management system (MDMS) and a hierarchical storage system (HSS) and provides a seamless data ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symposium on Principles and Practice of Parallel Programming, pages 1--10, July 1995.
....maximizing the I/O performance. 7 Related Work Numerous techniques for optimizing I/O accesses have been proposed in literature. These techniques can be classified into three categories: the parallel file system and run-time system optimizations [24, 8, 10, 20, 22, 17], compiler optimizations [4, 21, 18], and application analysis and optimization [21, 7, 30, 18]. Brown et al. [6] proposed a meta-data system on top of HPSS using DB2 DBMS. Our work, in contrast, focuses more on utilizing state-of-the-art I/O optimizations with minimal programming effort. Additionally, the design flexibility of our ....
....performance of parallel machines. The characterization information can be very useful in our framework in selecting suitable user-level directives to implement in order to capture access patterns in a better manner. Finally, some amount of work has been done in the area of out-of-core compilation [4, 5]. 8 Conclusions This paper has presented a novel application development environment for large scale scientific computations. At the core of our framework is the Metadata Database Management System (MDMS) framework, which uses relational database technology in a novel way to support the ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symposium on Principles and Practice of Parallel Programming, pages 1--10, July 1995.
....which cannot be satisfied by secondary storage devices and for applications which cannot afford the cost or system complexity of a large number of disk drives. There has been a considerable amount of work in addressing the flow of data to and from secondary storage devices (e.g. magnetic disks) [1, 2, 3, 4, 5, 6, 7, 8, 9]. There has also been a significant amount of work on the management of large scale data in a storage hierarchy involving tertiary storage devices (e.g. tapes devices) [10, 11, 12, 13, 14]. Striping has been studied to improve the response time of tertiary storage devices [15, 16]. The ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symp. on Prin. and Prac. of Paral. Prg., pages 1--10, July 1995.
....when arbitrary range queries are allowed. Jagadish et al. [23] investigated the problem of efficient organization of a data warehouse on secondary storage. There has been a considerable amount of work in addressing the flow of data to and from secondary storage devices (e.g. magnetic disks) [6, 9, 11, 34, 16, 29, 31, 32, 5]. There has also been a significant amount of work on the management of large scale data in a storage hierarchy involving tertiary storage devices (e.g. tapes devices) [37, 22, 24, 25, 30]. Striping has been studied to improve the response time of tertiary storage devices [15, 19]. Also, ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symp. on Prin. and Prac. of Paral. Prg., pages 1--10, July 1995.
....which cannot be satisfied by secondary storage devices and for applications which cannot afford the cost or system complexity of a large number of disk drives. There has been a considerable amount of work in addressing the flow of data to and from secondary storage devices (e.g. magnetic disks) [3, 4, 5, 10, 11, 14, 15, 16, 2]. There has also been a significant amount of work on the management of large scale data in a storage hierarchy involving tertiary storage devices (e.g. tapes devices) [19, 12, 17, 18, 7]. Striping has been studied to improve the response time of tertiary storage devices [13, 6]. The Department ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symp. on Prin. and Prac. of Paral. Prg., pages 1--10, July 1995.
1 In the final version of this abstract we will report results obtained using several chunk shapes/sizes that satisfy a given access pattern.
No context found.
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A Model and Compilation Strategy for Out-of-core Data Parallel Programs. In Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPoPP), pages 1--10, ACM Press, July 1995.
....maximizing the I/O performance. 6. RELATED WORK Numerous techniques for optimizing I/O accesses have been proposed in literature. These techniques can be classified into three categories: the parallel file system and run-time system optimizations [21, 7, 9, 18, 20, 15], compiler optimizations [4, 19, 16], and application analysis and optimiza.... [Table 8: Total I/O times (in seconds) for volren on 4 processors (data set size is 64 MB). File No: 1, 2, 3, 4; Original: 31.18, 19.20, 61.86, 40.22; Optimized: 11.90, 11.74, 20.10, 18.38.] Table 9: Total I/O times (in seconds) for volren on 8 processors (Data set ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the ACM Symposium on Principles and Practice of Parallel Programming, pages 1--10, 1995.
....which cannot be satisfied by secondary storage devices and for applications which cannot afford the cost or system complexity of a large number of disk drives. There has been a considerable amount of work in addressing the flow of data to and from secondary storage devices (e.g. magnetic disks) [1, 2, 3, 4, 5, 6, 7, 8, 9]. There has also been a significant amount of work on the management of large scale data in a storage hierarchy involving tertiary storage devices (e.g. tapes devices) [10, 11, 12, 13, 14]. Striping has been studied to improve the response time of tertiary storage devices [15, 16]. The ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symp. on Prin. and Prac. of Paral. Prg., pages 1--10, July 1995.
....an overhead in this range is acceptable. 6. Related Work There are many proposed techniques for optimizing I/O accesses. These techniques can be divided into three main groups: the parallel file system and run-time system optimizations [17, 6, 8, 13, 16, 15, 10], compiler optimizations [3, 14, 11], and application analysis and optimization [14, 5, 18, 11]. The closest work to ours is the one done by Brown et al. [4]. They propose a similar architecture to ours; however, they do not handle the advanced I/O optimizations proposed in this paper. They build their meta-data system on top of ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. Proceedings of the ACM Symposium on Principles and Practice of Parallel Programming, pages 1--10, July 1995.
....for out-of-core C is described. Output of ViC is a standard C program with the appropriate I/O and library calls added for efficient access to out-of-core parallel variables. In [20] the compiler support for handling out-of-core arrays on parallel architectures is discussed. Bordawekar et al. [4] offer a strategy to compile out-of-core programs on distributed-memory message-passing systems. It should be noted that our algorithms are general in the sense that they can be incorporated to any out-of-core compilation framework for parallel and sequential machines. Previous work considers ....
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data-parallel programs. In Proc. 5th ACM Symposium on Principles and Practice of Parallel Programming, July 1995.
No context found.
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel, and M. Paleczny. A model and compilation strategy for out-of-core data parallel programs. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), pages 1--10. ACM Press, July 1995. ACM SIGPLAN Notices, Vol. 30, No. 8.