Scales, D. J. and Lam, M. S. (1994). The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proc. of the First Symposium on Operating Systems Design and Implementation, pages 101--114.
....Munin applies a multiple writer protocol that corresponds to our basic protocol to the write shared pattern, and a single writer protocol to the conventional pattern. For the migratory pattern, the objects are migrated from machine to machine as critical regions are entered and exited. SAM [35] is an object based DSM runtime system that also has support for optimizing some object access patterns. SAM enumerates two patterns, corresponding to the producer consumer pattern and the accumulator pattern in our GOS. SAM lets the user explicitly tie the synchronization to the object accesses in ....
D. J. Scales, M. S. Lam, The Design and Evaluation of a Shared Object System for Distributed Memory Machines, in: Operating Systems Design and Implementation, 1994, pp. 101--114.
....are object oriented parallel programming languages. With distributed JVM as the executing platform, Java can also be considered a parallel programming language. Java's popularity makes it more promising to be acceptable by the parallel computing community than Orca and Jade. Munin [8] and SAM [33] are object based DSMs with supports to optimize some object access patterns. However, they require the programmer to explicitly annotate the object with some pattern declaration. Compared to our GOS, their approach is neither transparent to the programmer nor flexible in a dynamic situation. ....
Daniel J. Scales and Monica S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Operating Systems Design and Implementation, pages 101--114, 1994.
.... range from hardware [72] to automated software data consistency, to manually ensured (application level) data consistency [8]. Automatically ensuring data consistency, by hardware or software solutions, is often related to simulating a shared memory on top of a distributed memory architecture [16, 54, 72, 94]. Ensuring a consistent view of the shared memory built on top of a physically distributed memory usually requires replication of physical pages or data objects and invalidation or updating of remote copies of data that is locally written. The hardware coherence schemes are triggered on read and ....
....work is to make parallelism implicit by using a suitable data layout and by automatically ensuring consistency, with system support. The data layout is based on a general partitioning and a relaxed consistency scheme. Work on relaxed consistency originates in the virtual shared memory techniques [21, 60, 61, 72, 94]. One way of relaxing a strict consistency model is to ensure consistency only at synchronization points. In a software maintained consistency, the synchronization points can be either indicated by the user (programmer) or discovered by a compiler. The first option allows for high optimizations, ....
Daniel J. Scales and Monica S. Lam. The design and evaluation of a shared object system for distributed memory machines. In OSDI94, pages 101--114, Monterey, CA, November 1994. USENIX Association.
....on ideas found in earlier systems. In this section, we look at other theoretical results and systems that address scheduling issues for dynamic parallel computation. We shall not look at data parallel systems [8, 53] nor at systems focused on infrastructure such as distributed shared memory [4, 6, 29, 39, 59, 60, 66, 73, 87, 92, 93] or message passing [43, 96, 104, 105]. Substantial research has been reported in the theoretical literature concerning the scheduling of dynamic computations. In contrast to our research on multithreaded computations, however, other theoretical research ....
Daniel J. Scales and Monica S. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the First Symposium on Operating Systems Design and Implementation, pages 101--114, Monterey, California, November 1994.
....rather than invocation based. COOL provides executable annotations that describe where and when a method should execute. Object affinity for a method results in remote procedure call, whereas task affinity is used to execute tasks back to back to increase cache reuse. The SAM runtime system [84] provides a global address space and support for object oriented parallel programming languages. It provides two data types: single assignment values and accumulators. Accumulators are managed in a manner similar to migratory data in caching schemes: they are moved from processor to processor as ....
D.J. Scales and M.S. Lam. "The Design and Evaluation of a Shared Object System for Distributed Memory Machines". In Proceedings of the First USENIX Symposium on Operating Systems Design and Implementation, pages 101--114, Monterey, CA, November 14--17, 1994.
....of this type either require the use of an entirely new object oriented language [3, 35] or only allow the use of a subset of an existing one [17]. In contrast, CRL is not language specific; the basic CRL interface could easily be provided in any imperative programming language. Scales and Lam [69] have described SAM, a shared object system for distributed memory machines. SAM is based on a new set of primitives that are motivated by optimizations commonly used on distributed memory machines. Like CRL, SAM is implemented as a ....
Daniel J. Scales and Monica S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proceedings of the First USENIX Symposium on Operating Systems Design and Implementation, pages 101--114, November 1994.
....the acquisition or release of a lock. Causal memory [7] ensures consistency only to the extent that if a process A reads a value written by another process B, then all subsequent operations by A must appear to occur after the write by B. Most DSMs implement one of these relaxed consistency models [33, 87, 89, 130], though some implement a fixed collection of consistency models [20] while others merely implement a collection of mechanisms on top of which users write their own DSM consistency policies [97, 128]. All of these consistency models and the DSMs that implement these models take a low level view ....
D. J. SCALES AND M. S. LAM, The design and evaluation of a shared object system for distributed memory machines, in Proceedings of the First Symposium on Operating Systems Design and Implementation, Monterey, California, Nov. 1994, pp. 101--114.
....[Lampson and Redell 1980]. For the program to execute correctly, the programmer must ensure that all of the atomic operations commute. Four of the six parallel applications in the SPLASH benchmark suite [Singh et al. 1992] and three of the four parallel applications in the SAM benchmark suite [Scales and Lam 1994] rely on commuting operations to expose the concurrency and generate correct parallel execution. This experience suggests that compilers will be unable to parallelize a wide range of computations unless they recognize and exploit commuting operations. We have developed a new analysis technique ....
....to implement the abstraction of shared memory. It is clearly feasible, however, to generate code for message passing machines. The basic required functionality is a software layer that uses message passing primitives to implement the abstraction of a single shared object store [Rinard 1994a; Scales and Lam 1994]. The key question is how well the generated code would perform on such a platform. Message passing machines have traditionally suffered from much higher communication costs than shared memory machines. Compilation research for message passing machines has therefore emphasized the development of ....
SCALES, D. AND LAM, M. S. 1994. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the First USENIX Symposium on Operating Systems Design and Implementation. ACM Press, Monterey, CA.
....with other threads and which may not. Escape analysis is an obvious analysis to use for this purpose: it recognizes data that is captured within the current thread and therefore inaccessible to other threads [11, 21, 98, 13, 82]. The programming model may also separate shared and private data [92, 89, 81, 58]; in some cases the analysis may automatically infer when pointers point to private data [65]. More elaborate analyses may recognize actions (such as acquiring a mutual exclusion lock or obtaining the only existing reference to an object) that temporarily give the thread exclusive access to ....
D. Scales and M. S. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the 1st USENIX Symposium on Operating Systems Design and Implementation. ACM, New York, Nov. 1994.
....To utilize bandwidth, a runtime system has to buffer the remote memory access. There is another approach where a programmer can specify optimal granularity, protocol, and association between synchronization and shared data [3, 30]. However, with this approach, existing shared memory applications require rewriting. Our idea is that an optimizing compiler directly analyses shared memory source programs, and optimizes communication and consistency management for software DSM execution [28]. Our target is a page based software ....
....the benefit of message vectorization, synchronization messages, and support for sender initiated communication. However, this policy is only applicable to automatically parallelizable programs. The second is that a programmer declares shared data and association between data and synchronization [3, 10, 30, 6]. The programmer can select appropriate protocols for each data. The runtime system can utilize application specific information. Since this model hides a memory model from users, the system does not suffer from false sharing. ....
D. J. Scales and M. S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proc. of the 1st OSDI, Nov. 1994.
....pages. Mostly software page based DSM systems include TreadMarks [5], Brazos [6], and Mirage [7]. Software Object Based DSM systems: In this class of DSM systems, all three mechanisms mentioned above are implemented entirely in software. Software object based DSM systems include Orca [8], SAM [10], CRL [9], Midway [11], and Shasta [14]. Almost all DSM models employ a directory based cache coherence mechanism, implemented either in hardware or software. DSM systems have demonstrated the potential to meet the objectives of scalability, ease of programming, and the cost effectiveness. The ....
....three layers and the machine specific parts are isolated in the lowest layer. This layer (called Panda) implements a virtual machine which provides the communication and multi tasking primitives needed by a runtime. Portability of Orca only requires portability of Panda. 3.5.2 SAM. Stanford SAM [10] is a shared object system for distributed shared memory machines. SAM has been implemented as a C library. It is a portable runtime system that provides a global name space and automatic caching of shared data. SAM allows communication of data at the level of user defined data types, thus, ....
D. J. Scales and M. S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proceedings of the First Symposium on Operating Systems Design and Implementation, November 1994.
....1 Introduction. Programming models that support a shared object space simplify the programming of parallel and distributed applications. However, these advantages come at a cost: transparently supporting the shared object space using software distributed shared memory (DSM) techniques [1, 15, 7, 3] incurs a performance penalty as compared to an explicit message-passing version of the program. The fundamental reason for this performance disparity is a mismatch between the sharing granularity and protocols adopted in the DSM layer, and the consistency expectations of application threads ....
....applicability and flexibility as compared to hardware solutions. Starting from Ivy [10], a number of systems support a shared object space at either fixed granularity (e.g., TreadMarks [1], Shasta [14], and AURC [6]) or variable (object) granularity (e.g., SharedRegions [13], Midway [3], SAM [15], and CRL [7]). Recent research uses weak consistency protocols such as lazy release consistency [1] and entry consistency [3] that postpone propagation of object updates until program synchronization points. Although these protocols achieve good performance on small applications, their ....
D. J. Scales and M. S. Lam. The design and evaluation of a shared object system for distributed memory machines. In OSDI, pages 101--114, 1994.
....techniques to provide programmers with further control over communication. The performance study shows that these features are important to achieve good performance in an update based implementation of entry consistency. Distributed object systems such as Emerald [9], Amber [5], Orca [2], and SAM [17] offer a shared address space of objects. However, on Emerald and Amber objects are not replicated. Orca [2] uses compile time analysis to decide whether to replicate shared objects in all the processors or not to replicate them. Write operations are broadcast to all the processors using a ....
....decide whether to replicate shared objects in all the processors or not to replicate them. Write operations are broadcast to all the processors using a function shipping policy. DiSOM offers a more flexible programming model that provides finer grained control over communication. Like DiSOM, SAM [17] attempts to provide a model that gives programmers control over communication. SAM controls communication with a protocol based on single assignment semantics and with a migratory protocol. Both of SAM's protocols can be emulated by DiSOM and DiSOM's programming model is closer to the models ....
D. Scales and M. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In OSDI, November 1994.
....to 92 (for the CM 5). For ASP, the maximum efficiency ranges from 60 (for the PowerXplorer) to 88 (for the CM 5). 5. RELATED WORK. Orca can be regarded as an object based distributed shared memory (DSM) system. In the last decade a substantial number of other DSM systems have been built [5, 6, 8, 10, 14, 18, 22, 27]. The Orca programming model has several characteristics that make it different from other forms of DSM. Foremost, it uses object based DSM, instead of page based DSM. Second, it is implemented entirely in user space, with no changes to the operating system. Finally, our model is supported in a ....
D.J. Scales and M.S. Lam, "The Design and Evaluation of a Shared Object System for Distributed Memory Machines," Proc. First USENIX Symp. on Operating System Design and Implementation, pp. 101--114 (Nov. 1994).
....of these optimizations on performance. We describe the performance of our system at various levels, including the low level communication primitives, the language level primitives, and applications. Several more recent programming languages and libraries also support shared objects of some form [10, 12, 16, 19, 21, 26, 29], so we think our work is relevant to many other systems. The outline of the paper is as follows. In Section 2 we provide more background information about the hardware and software used in the case study. In Section 3 we describe an initial implementation of the Orca system on Myrinet. We also ....
D.J. Scales and M.S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. Proc. 1st Symp. on Operating System Design and Implementation, pages 101--114, November 1994.
....via an encapsulation mechanism. The shared data can only be accessed through language constructs, which allows the compiler and run time system to maintain mutual exclusion and coherence on the shared data. Other shared data object oriented parallel programming systems also exist, including SAM [Scales and Lam, 1994], Mentat [Grimshaw et al. 1996], and ABC++ [O'Farrell et al. 1996]. Although COOL (discussed previously) and μC++ [Buhr et al. 1992] are designed for shared-memory platforms, they do use the encapsulation features of C++ (and extensions) to control the access of shared data through a monitor ....
D.J. Scales and M.S. Lam. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. Proceedings of the 1st Symposium on Operating Systems Design and Implementation, pages 101--114, November 1994.
....[8] and multiple writers, is a software distributed shared memory (DSM) system for Unix machines, and Quarks [9] which supports a number of consistency protocols whilst using latency hiding to mask communication latency. Other approaches present sharing at the object level, for example SAM [19] and CRL [10]. The main difference between these systems and CFL is the granularity of the shared memory. While TreadMarks implements a global address space to share memory pages and CRL has a global address space for regions of memory, CFL is concerned with sharing individual variables. Each ....
D.J. Scales and M.S. Lam, The Design and Evaluation of a Shared Object System for Distributed Memory Machines, 1st Symp. Operating Systems Design and Implementation, 1994.
....memory associated with each processor. One of the Jade applications (the Volume Rendering application) actually fails to run on one platform because of this restriction. These problems highlight the advantages of systems that separate the units of allocation from the units of communication [Scales et al. 1994; Shoinas et al. 1994]. One alternative is to distribute fixed size pieces of objects across the memory modules and transfer pieces of objects on demand as processors access them. The shared memory implementation does this implicitly by using the shared memory hardware, which distributes pieces ....
....of parallel algorithms provide a simple, easy way to use parallel machines. The numerical analysis community has developed fast parallel implementations of common linear algebra routines [Dayde and Duff 1990]. Other researchers have developed a framework for implementing common data usage patterns [Scales and Lam 1994]. In the best case these systems provide the same advantages as data parallel languages. They preserve the abstraction of sequential execution by encapsulating the parallel computation inside routines invoked from a serial program. 7.3 Functional Languages. Functional languages such as Id [Arvind ....
Scales, D. and Lam, M. S. 1994. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the 1st USENIX Symposium on Operating Systems Design and Implementation. ACM, New York.
No context found.
Scales, D. J. and Lam, M. S. (1994). The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proc. of the First Symposium on Operating Systems Design and Implementation, pages 101--114.
No context found.
Scales, D. J., and Lam, M. S. The Design and Evaluation of a Shared Object System for Distributed Memory Machines. In Proc. of the First Symposium on Operating Systems Design and Implementation (November 1994), pp. 101--114.
No context found.
D. Scales and M. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proc. of the First Symp. on Operating Systems Design and Implementation (OSDI'94), pages 101--114, 1994.
No context found.
D. Scales and M. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proc. of the First Symp. on Operating Systems Design and Implementation (OSDI'94), pages 101-114, 1994.
No context found.
D. Scales and M. S. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the 1st USENIX Symposium on Operating Systems Design and Implementation. ACM, New York, Nov. 1994.
No context found.
D. Scales and M. S. Lam. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the First USENIX Symposium on Operating Systems Design and Implementation, Monterey, CA, November 1994.
No context found.
D. J. Scales and M. S. Lam. The design and evaluation of a shared object system for distributed memory machines. In OSDI94, pages 101--114, Monterey, CA, November 1994. USENIX Association.