Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent Adaptive Parallelism on NOWs using OpenMP. Proceedings of the 7th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 96-106, May 1999.
.... usage of nested parallelism inside of OpenMP, but also OpenMP with cluster extensions, that are primarily based on a first-touch mechanism [11] or on data distribution extensions [15]. These cluster extensions may also benefit from the availability of software-based shared virtual memory (SVM) [5, 25, 26]. At NASA Ames, a hybrid approach was developed. The parallelization is organized in two levels: The upper level is process-based, and in the lower level each process is multi-threaded with OpenMP. The processes are using a Fortran wrapper around the System V shared memory module shm, that allows ....
Alex Scherer, Honghui Lu, Thomas Gross, Willy Zwaenepoel, Transparent Adaptive Parallelism on NOWs using OpenMP, in proceedings of the Seventh Conference on Principles and Practice of Parallel Programming (PPoPP '99), May 1999, pp 96-106.
....shared memory directives. Each of these categories of hybrid programming has different reasons why it is not appropriate for some classes of applications or classes of hybrid hardware architectures. The paper focuses on the first three methods; for pure OpenMP approaches, the reader is referred to [2, 11, 13, 16, 22, 24, 25]. Other parallel programming models that can also be used on clusters of SMPs are, e.g., UPC [4, 7], Co-Array Fortran [19], MLP [5], and HPF [1]. Different SMP parallelization strategies in the hybrid model are studied in [27] and in [3] for the NAS parallel benchmarks. The following section shows ....
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel, Transparent Adaptive Parallelism on NOWs using OpenMP, in proceedings of the Seventh Conference on Principles and Practice of Parallel Programming (PPoPP '99), May 1999, pp 96-106.
.... of OpenMP, but also OpenMP with cluster extensions that are primarily based on a first-touch mechanism [12] or on data distribution extensions [17]. These cluster extensions may also benefit from the availability of software-based shared virtual memory (SVM, or distributed shared memory [DSM]) [4, 28, 29]. At NASA Ames, a hybrid approach was developed. The parallelization is organized in two levels: The upper level is process-based, and in the lower level each process is multi-threaded with OpenMP. The processes are using a Fortran wrapper around the System V shared memory module shm that allows ....
Alex Scherer, Honghui Lu, Thomas Gross, Willy Zwaenepoel, Transparent Adaptive Parallelism on NOWs using OpenMP, in proceedings of the Seventh Conference on Principles and Practice of Parallel Programming (PPoPP '99), May 1999, pp 96-106.
....al. [9] present software DSMs which rely on aggressive compiler optimizations. However, compiler optimizations in these papers are limited to optimizations within interval between adjacent synchronization points and thus can not optimize across threads as of our methods. Hu et al. [8], Scherer et al. [23], and Sato et al. [21] show page-based software DSMs for OpenMP but compiler optimizations are not mentioned. Tseng [27] and Han and Tseng [6] show effectiveness of compiler optimizations for DSMs by eliminate or lessen control synchronizations. Koufaty and Torrellas [12] show compiler optimizations ....
A.Scherer, et al. Transparent Adaptive Parallelism on NOWs using OpenMP. PPoPP'99, pp.96-106, 1999.
....allocated a small amount of memory and a short quantum of running time on the number of processors it requests 4. Assuming a programming and runtime environment that enables each job to dynamically adjust to the number of processors allocated to it, such as Cilk [8] or the adaptive system in [21], to what extent are job turnaround times improved by a spatial processor equipartitioning policy (EQspatial) with low scheduling overhead? The alternative scheduling policies are evaluated using simulation with six one-month job traces that were logged on the O2K during October 1999 through ....
A. Scherer, H. Lu, T. Gross, and W. Zwaenepoel. Transparent adaptive parallelism on NOWs using OpenMP. Proc. 7th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, Atlanta, May 1999, pp. 96-106.
....WKT00] target loop-based parallel programs, whereas our technique targets object-based multithread programs. Computing systems that adaptively use idle machines. Various systems have been developed to allow parallel programs to execute on a network of workstations with a variable number of nodes [SLGZ99, CFGK95, BL97]. In these systems, computing nodes join the computation when they become idle, and withdraw when their users need them. The way in which these systems adjust the number of processors is based on the availability of idle computers and not based on performance or the amount of ....
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent Adaptive Parallelism on NOWs using OpenMP. In Proceedings of the Seventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '99), pages
....implement the removal of nodes; while our analysis could solve this in special cases, a general solution would require more in-depth analysis as well as a way to actually remove nodes. Work on how to efficiently remove nodes in DSM systems once the decision has been made has previously been done [SLGZ99]. 4.4 CYCLIC-based Distributions. We implemented LU decomposition, a canonical example where a CYCLIC-based distribution is used. As with Jacobi iteration, we inserted a parameterizable delay so that we could experiment with light, moderate, and heavy computational loads. Each test used a ....
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent adaptive parallelism on NOWs using OpenMP. In Seventh Conference on Principles and Practice of Parallel Programming, pages 96-106, May 1999.
....software DSMs that rely on aggressive compiler optimizations. However, the compiler optimizations in these papers are limited to optimizations within the interval between adjacent synchronization points and thus cannot optimize across threads, as our method can. Hu et al. [8], Scherer et al. [23], and Sato et al. [21] described page-based software DSMs for OpenMP but did not mention compiler optimizations. Tseng [27] and Han and Tseng [6] demonstrated the effectiveness of compiler optimizations for DSMs by eliminating or lessening the control synchronizations. Koufaty and Torrellas [12] ....
A.Scherer, et al. Transparent Adaptive Parallelism on NOWs using OpenMP. PPoPP'99, pp.96-106, 1999.
....specific network measurement and adaptation systems have been developed, some examples being [13, 3, 21, 24]. An important goal of this research is to develop a framework that can be used by a large class of applications. A shared-memory-based approach to adaptive parallelism is explored in [18]. Node assignment and scheduling algorithms in the literature typically do not treat communication in realistic detail, but some recent exceptions are [23, 2]. Several runtime support systems have been developed for partitioning and scheduling computation and communication, an example being [20] ....
SCHERER, A., LU, H., GROSS, T., AND ZWAENEPOEL, W. Transparent adaptive parallelism on NOWs using OpenMP. In Proceedings of the Seventh ACM Symposium on Principles and Practice of Parallel Programming (Atlanta, GA, May 1999).
....the QuO system includes system condition objects that drive adaptivity either implicitly or explicitly. An adaptive system that provides a shared memory programming model for a network of workstations or PCs can take advantage of additional nodes and also deal with withdrawal of nodes [17]. Here the control of adaptivity rests with the (software) distributed shared memory system, but the application (or the compiler) determines the points in the execution of the program where adaptivity is possible. 6 Concluding remarks. Figure 1 gives examples of the 4 different kinds of ....
A. Scherer, H. Lu, T. Gross, and W. Zwaenepoel. Transparent adaptive parallelism on NOWs using OpenMP. In Proc. 7th ACM Symp. on Principles and Practice of Parallel Prog. (PPoPP'99), page (to appear), Atlanta, GA, May 1999. ACM.
....parallel application may be started without having to abort some ongoing long-running program, simply by reducing this application's allocated resources and letting the new application use them. We have described how we achieve transparent adaptive parallelism for data-parallel programs in [15], therefore we focus more on task-parallel applications in this paper. We use the OpenMP [14] programming model, an emerging industry standard for shared memory programming. OpenMP frees the programmer from having to deal with lower-level issues such as the number of nodes, the data partitioning ....
.... the percentage of shared memory pages moved extra due to an adaptation is very small in nearly all cases for Quicksort (a few %), so the absolute costs remain small compared to our previous results of data-parallel applications, where redistribution of 30-60% of all shared memory pages is common [15]. To sum up, Table 4 shows that the costs of an adaptation are typically less than 0.1 seconds for TSP and less than 0.5 seconds for Quicksort even in the slower of the two environments tested, when using around 8 processes. 6 Analysis of Performance. The key cost component of an adaptation is ....
A. Scherer, H. Lu, T. Gross, and W. Zwaenepoel. Transparent Adaptive Parallelism on NOWs using OpenMP. In Proceedings of the Seventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP), Atlanta, May 1999. ACM.
No context found.
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent Adaptive Parallelism on NOWs using OpenMP. Proceedings of the 7th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 96-106, May 1999.
No context found.
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent adaptive parallelism on NOWs using OpenMP. In Seventh ACM PPOPP, pages 96--106, May 1999.
No context found.
Alex Scherer, Honghui Lu, Thomas Gross, Willy Zwaenepoel, Transparent Adaptive Parallelism on NOWs using OpenMP, in proceedings of the Seventh Conference on Principles and Practice of Parallel Programming (PPoPP '99), May 1999, pp 96--106.
No context found.
Alex Scherer, Honghui Lu, Thomas Gross, and Willy Zwaenepoel. Transparent adaptive parallelism on NOWs using OpenMP. In Seventh Conference on Principles and Practice of Parallel Programming, pages 96--106, May 1999.
No context found.
A. Scherer, H. Lu, T. Gross and W. Zwaenepoel, Transparent Adaptive Parallelism on NOWs using OpenMP, Proc. of the Thirteenth Intl Parallel Processing Symposium, April 1999.