48 citations found.
E. Lusk and R. Overbeek. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


This paper is cited in the following contexts:


Speculative Synchronization: Applying Thread-Level.. - Martinez, Torrellas (2002)   (4 citations)

....synchronization operations used by programmers and parallelizing compilers include barriers, locks, and flags. For example, parallelizing compilers typically use global barriers to separate sections of parallel code. Also, programmers frequently use locks and barriers in the form of M4 macros [22] or OpenMP directives [5] to ensure that codes are race free. This work was supported in part by the National Science Foundation under grants CCR 9970488, EIA 0081307, EIA 0072102, and CHE 0121357; by DARPA under grant F30602 01 C 0078; and by gifts from IBM, Intel, and Hewlett Packard. Jos ....

E. Lusk, R. Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart, and Winston, Inc., New York, NY, 1996.


Comparative Evaluation of Latency Reducing and.. - Gupta, Hennessy.. (1991)   (103 citations)

....this paper. This information will be useful in later sections for understanding the performance results. The selected applications are representative of algorithms used in an engineering computing environment. All of the applications are written in C. The Argonne National Laboratory macro package [19] is used to provide synchronization and sharing primitives. Some general statistics for the benchmarks are shown in Table 2. MP3D [20] is a 3 dimensional particle simulator. It is used to study the pressure and temperature profiles created as an object flies at high speed through the upper ....

E. Lusk, R. Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


Tolerating Latency Through Software-Controlled Prefetching in.. - Mowry, Gupta (1991)   (232 citations)

....this paper. This information will be useful in later sections for understanding the performance results. The selected applications are representative of algorithms used in an engineering computing environment. All of the applications are written in C. The Argonne National Laboratory macro package [19] is used to provide synchronization and sharing primitives. The three applications we studied are MP3D, LU, and PTHOR. Table 1 shows some general statistics for the benchmarks when using 16 processors (as is the case throughout this study). Table 1: General statistics for the benchmarks. ....

Ewing Lusk, Ross Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


The Design, Implementation, and Evaluation of Jade - Rinard, Lam (1998)

....ACM Transactions on Programming Languages and Systems, Vol. 20, No. 1, January 1998, Pages 1-63. 1. INTRODUCTION Programmers have traditionally developed software for parallel machines using explicitly parallel systems [Lusk et al. 1987; Sunderam 1990]. These systems provide constructs that programmers use to create parallel tasks. On shared memory machines the programmer synchronizes the tasks using low level primitives such as locks, condition variables, and barriers. On message passing machines the programmer must also manage ....

Lusk, E., Overbeek, R., Boyle, J., Butler, R., Disz, T., Glickfeld, B., Patterson, J., and Stevens, R. 1987. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc.


Commutativity Analysis: A New Analysis Technique for.. - Rinard, Diniz (1997)   (24 citations)

....and the String seismic code [Harris et al. 1990]. Explicitly parallel versions of the first two applications are available in the SPLASH [Singh et al. 1992] and SPLASH 2 [Woo et al. 1995] benchmark suites. We have developed an explicitly parallel version of String using the ANL macro package [Lusk et al. 1987]. This section presents performance results for the automatically parallelized and explicitly parallel versions of these applications on a 16 processor Stanford DASH machine [Lenoski 1992] running a modified version of the IRIX 5.2 operating system. The programs were compiled using the IRIX 5.3 CC ....

....time for four processors (approximately 30 ) indicates that lack of parallelism is the primary source of the lack of scalability for this data set. 6.4.4 Comparison with the Explicitly Parallel Version. We have developed an explicitly parallel version of String using the ANL macro package [Lusk et al. 1987]. Table XVII contains the execution times for this version. For the small data set, the explicitly parallel version does not scale beyond four processors. We attribute this lack of performance to a lack of parallelism in the small data set. For the big data set, the explicitly parallel version ....

LUSK, E., OVERBEEK, R., BOYLE, J., BUTLER, R., DISZ, T., GLICKFELD, B., PATTERSON, J., AND STEVENS, R. 1987. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc.


The ANL/GMD Macros (PARMACS) in FORTRAN for Portable Parallel.. - Hempel (1991)

....way for implementing the shared memory macros. So, for this class of machines a new mechanism had to be developed, and the result was a macro package for programs following the message passing paradigm. Unfortunately, however, these macros could only be used with the C programming language [7]. Therefore, during a visit of the author at Argonne in 1988 the corresponding PARMACS package for programs written in FORTRAN was defined and implemented on a variety of parallel systems with both local or shared memory. Meanwhile the PARMACS package has been extended, with emphasis on efficient ....

Lusk, E., et al.: Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc; New York, 1987.


A Framework To Study Automatically Parallelized Programs - Venkata Krishnan David

....are represented by threads in Tango Lite. Thus it provides a shared virtual memory environment in which all variables are shared unless allocated from a thread's stack. The application to be simulated on this virtual machine should utilize the m4 macros that originated at Argonne National Labs [5]. The macros provide the required shared memory and process creation primitives facilitating use of locks, barriers, self scheduling in parallel loops etc. Detailed description of the macros is provided in [4]. In our transformation the most common macros used are for barriers (bardec, barinit, ....

Lusk, Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


Design and Performance Evaluation of. . . - Oi (2000)

....to evaluate the performance of the proposed architecture in Section 6.3 is described. 6.2.1 Architecture Model For the performance evaluation, an execution driven simulator was developed using the ABSS multiprocessor toolkit [64]. ABSS takes a parallel application written with p4 macro [43] and augments it with instrumentation codes. Each chip has four processors as in [27]. A configuration of a bus connected 32 processors (8 chips) system is considered. Each on chip multiprocessor has the .... [Figure labels: L2 Cache, Off Chip Interface, Request Bus, Data ....]

E. Lusk, et al., "Portable Programs for Parallel Processors," Holt, Rinehart and Winston, Inc., New York, 1987.


Utilization of Cache Area in On-Chip Multiprocessor - Oi, Ranganathan (1999)

....environment to evaluate the performance of the proposed architecture in Section 4 is described. 3.1 Architecture Model For the performance evaluation, an execution driven simulator was developed using the ABSS multiprocessor toolkit [10]. ABSS takes a parallel application written with p4 macro [11] and augments it with instrumentation codes. Each chip has four processors as in [2]. A configuration of a bus connected 32 .... [Figure labels: L2 Cache, Off Chip Interface, Request Bus, Data Bus (16 Bytes), OCMP]

E. Lusk, et al., "Portable Programs for Parallel Processors", Holt, Rinehart and Winston, Inc., New York, 1987.


Implicitly Synchronized Data Types: Data Structures for Modular.. - Rinard (1998)

....the resources available to solve important computational problems. The widespread use of these machines has, however, been limited by the difficulty of developing useful parallel software. Programmers have traditionally developed software for parallel machines using explicitly parallel languages (Lusk et al. 1987, Sunderam 1990). These languages provide constructs that programmers use to create parallel tasks. The tasks typically interact using synchronization operations such as locks, condition variables and barriers and/or communication operations such as send and receive. Explicitly parallel languages ....

Lusk, E., Overbeek, R., Boyle, J. et al. (1987) Portable Programs for Parallel Processors. Holt, Rinehart and Winston, New York.


Speculative Multithreading Architectures - Krishnan (1998)

....applications. Three applications are SPEC95 benchmarks, namely, swim, tomcatv and mgrid; one is a NASA7 kernel, vpenta, and the remaining two are SPLASH 2 applications [91], namely fmm and ocean. The two SPLASH 2 applications are explicitly parallel programs written in C and use ANL m4 macros [46] for parallel constructs. The SPEC95 benchmarks and the NASA7 kernel are sequential programs written in Fortran. We use the Polaris automatic parallelizing compiler [4] to identify parallel sections of the code. Polaris uses several techniques, such as inter procedural symbolic program analysis, ....

E. Lusk et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston Inc., New York, 1987.


The Design Of A Standard Message Passing Interface For.. - Walker (1994)   (52 citations)

....or low level system support for, thereby enhancing scalability. The functionality that mpi is designed to provide is based on current common practice, and is similar to that provided by widely used message passing systems such as Express [12], NX 2 [13], Vertex [11], parmacs [8, 9], and P4 [10]. In addition, the flexibility and usefulness of mpi has been broadened by incorporating ideas from more recent and innovative message passing systems such as chimp [4, 5], Zipcode [14, 15], and the IBM External User Interface [7]. The general design philosophy followed by mpi is that while it ....

Ewing Lusk, Ross Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


MemSpy: Analyzing Memory System Bottlenecks in Programs - Martonosi, Gupta, Anderson (1992)   (46 citations)

....through different procedure call paths. Thus, to guarantee uniqueness, the names are disambiguated by prepending a string summarizing the state of the call stack. Our parallel programming model uses C language programs augmented with Argonne National Laboratory parallel programming macros [13]. In this model, all shared memory is dynamically allocated using the G_MALLOC macro. The final full name is of the form: ProcName.return pc.ProcName.return pc. DataType.VarName This method has both strengths and weaknesses. By prepending the bin name with call stack information, we ....

E. Lusk, R. Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


Shasta: a System for Supporting Fine-Grain Shared Memory.. - Daniel Scales (1997)   (2 citations)

....and are therefore likely to share much data and communicate more frequently across the cluster than processes that share explicitly allocated segments. The Shasta system supports several programming interfaces. The SPLASH 2 applications are written using a parallel macro package called ANL macros [6]. This package provides macros to create a process, allocate a segment of shared memory, and synchronize processes using locks, barriers, and event flags. Similar functions can be achieved directly by using UNIX system calls. Table 1 shows the names of the ANL macros and UNIX calls that accomplish ....

E. Lusk, R. Overbeek, et al., Portable Programs for Parallel Processors, Holt, Rinehart and Winston, Inc., 1987.


Parallel Evaluation of a Parallel Architecture by means of.. - Muller, al. (1994)   (3 citations)

....The user simply compiles the application program with dcc, a compiler driver analogous to cc, which performs the additional code translations after the compilation stage. After linking, the program is executed under the control of the DDM emulator. We provide support for the p4 programming model [12], a portable library providing parallel programming facilities (such as synchronisation, ....

Component            T800 Emulation   Projected Hardware   Ratio
Integer Instruction  50-300 ns        10 ns                5-30
Floating Point       .4-1.6 µs        40 ns                10-40
Local Memory time    33.8 µs          100 ns               338
Protocol Delay       81.0 µs          200 ns               405

Ewing Lusk, Ross Overbeek, James Boyle, Ralph Butler, Terence Disz, Barnett Glickfeld, James Patterson, and Rick Stevens. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


The DASH Prototype: Logic Overhead and Performance - Lenoski, Laudon, Joe.. (1993)   (92 citations)  Self-citation (Stevens)

....6.2.1 Application Runtime Environment The operating system running on the prototype DASH is a modified version of IRIX, a variant of UNIX System V.3 developed by Silicon Graphics. The applications for which we present results are coded in C and use the Argonne National Labs (ANL) parallel macros [12] to control synchronization and sharing. Before giving the speedup results in the next subsection, we first state the assumptions used in measuring the speedups. The speedups were measured as the time for the uniprocessor to execute the parallel version of the application code (i.e. not all ....

E. Lusk, R. Overbeek, J. Boyle, R. Butler, T. Disz, B. Glickfeld, J. Patterson, and R. Stevens, Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


The Common Case Transactional Behavior of Multithreaded.. - Jaewoong Chung Hassan (2006)

No context found.

E. Lusk and R. Overbeek. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


Emulation of a Virtual Shared Memory Architecture - Raina (1993)   (3 citations)

No context found.

E. Lusk, R. Overbeek, J. Boyle, R. Butler, T. Disz, B. Glickfeld, J. Patterson, and R. Stevens. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.


Unknown - Symbolic Parallel Programming

No context found.

Ewing Lusk, James Boyle, Ralph Butler, Terrence Disz, Barnett Glickfeld, Ross Overbeek, James Patterson, and Rick Stevens. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., New York, NY, 1987.


Massively Parallel Genetic Algorithms - Ghazfan, Srinivasan, Nolan (1994)

No context found.

Lusk, E. et al., Portable Programs for Parallel Processors, Holt, Rinehart and Winston, Inc., 1987.


Performance Evaluation by Simulation - Hlavacs, Ueberhuber (2001)

No context found.

Lusk E. et al.: Portable Programs for Parallel Processors. Holt, Rinehart, and Winston, New York, 1987.


Unknown - Often However The

No context found.

E. Lusk et al., Portable Programs for Parallel Processors, Holt, Rinehart, Winston, 1996.


A Parallel Database-driven Protocol Verification System Prototype - Frieder (1992)

No context found.

E. Lusk, R. Overbeek, J. Patterson, R. Stevens, J. Boyle, R. Butler, T. Disz, B. Glickfeld. Portable Programs for Parallel Processors, Holt, Rinehart and Winston, Inc., 1987.


Turbomachinery CFD on Parallel Computers Richard - Blech, Milner, Quealy, Townsend

No context found.

E. L. Lusk, et al., "Portable Programs for Parallel Processors," Holt, Rinehart and Winston, Inc., 1987.


Restructuring A Parallel Simulation To Improve Cache Behavior In .. - Cheriton (1991)   (21 citations)

No context found.

E. Lusk, R. Overbeek, et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987.



CiteSeer.IST - Copyright Penn State and NEC