40 citations found.
Bershad, B., Anderson, T., Lazowska, E., and Levy, H. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2), May 1991.

This paper is cited in the following contexts:

Explicit Network Scheduling - Black (1994)   (23 citations)

....In a variant called scheduler activations [Anderson90] the thread blocks but control of the CPU is returned to the process internal scheduler. The client may then at some later date block on or poll for the indication from the server that the operation is complete. This model is recommended in [Bershad91] for multiprocessor machines. That work also notes that this model is superior when threads are provided at the user level (either over a non threaded or a kernel threaded base) due to the lower synchronisation costs. Although this model has a higher cost for local RPCs when measured ....

B.N. Bershad, T.E. Anderson, E.D. Lazowska, and H.M. Levy. User-Level Interprocess Communication for Shared Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991. (p 48)


Do Faster Routers Imply Faster Communication? - Karamcheti, Chien (1994)   (6 citations)

....software protocol overhead. Researchers have explored a variety of approaches for reducing the cost of messaging layers. While substantial progress has been made, much of it has been made at the cost of reducing functionality. For example, techniques for speeding interprocess communication [2, 11, 26] have resorted to lower level (and more risky) communication primitives to achieve high performance. In parallel systems, a number of reduced messaging layers have also been developed. In fact, active messages, the basis for our study, is one such reduced layer. The lowest level primitive ....

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


The Structure of a Multi-Service Operating System - Roscoe (1995)   (44 citations)

....[Figure 2.2: Microkernel based operating system architecture] processes. In a microkernel system this overhead includes two context switches as opposed to a simple processor trap in kernel based system. Much work has gone into reducing the cost of this communication, for example [Hamilton93a, Bershad90, Bershad91, Hildebrand92]. Some systems have even migrated services back into the kernel for performance reasons [Bricker91]. 2.1.4 Other systems The incomplete taxonomy above classifies systems into those with zero, 1 or many entities providing operating system services. Naturally, the boundaries between classes are ....

....high throughput by amortising the cost of context switches over several invocations, in other words by having many RPC invocations from a domain outstanding. This separation of information transfer from control transfer is especially beneficial in a shared memory multiprocessor, as described in [Bershad91]. Of equal importance to Nemesis is that the coupling of data transfer and control transfer in tunnelling systems can result in considerable crosstalk between applications, and can seriously impede application specific scheduling. 5.2.4 Design Principles A good RPC system provides high ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-Level Interprocess Communication for Shared Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991. (pp 9, 87)


Network Interface - Angelos Bilas Edward

....generates code to marshal and unmarshal complex data types. The stub generator and runtime library were designed with SHRIMP in mind, so we believe they come close to the best possible RPC performance on the SHRIMP hardware. Buffer Management The design of ShrimpRPC is similar to Bershad's URPC [1]; the main difference is that URPC runs on shared memory machines while ShrimpRPC runs on the distributed memory SHRIMP system. Each RPC binding consists of one receive buffer on each side (client and server) with bidirectional import export mappings between them. When a call occurs, the ....

....harder to interface to the memory bus on a stock motherboard. Finally, different memory bus protocols may reduce or eliminate the DMA coherence penalty. We cannot predict which transfer mechanism will be better on future architectures. 8 Related Work Our approach is similar in some ways to URPC [1], since both exploit user level communication. URPC is built on top of a shared memory architecture while we use the distributed memory SHRIMP architecture. Bershad's LRPC [2] tries to optimize the kernel path for same machine RPC calls. Since we have eliminated the kernel entirely, LRPC does not ....

B. Bershad, T. Anderson, E. Lazowska, and H. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems 9, 2 (May 1991), 175-198.


The Raven Kernel: a Microkernel for Shared Memory Multiprocessors - Ritchie (1993)   (1 citation)

....implemented at the user level. This allows users to directly invoke IPC primitives without the added costs of crossing user kernel boundaries. A level of indirection is removed because instead of invoking the kernel, the user now directly communicates with the remote process. The URPC technique [BALL90] relies on pairwise shared memory between the client and server processes to pass message data. The message delivery system is controlled by low priority threads that poll the message queues looking for work. The threads only poll while the system is idle. While this ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. TR-90-05-07, University of Washington, July 1990.
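
The excerpt above describes URPC as passing message data through pairwise shared memory, with threads polling the message queues for work. A minimal sketch of that idea follows, assuming ordinary in-process queues as a stand-in for the shared region. The names Channel, client_send, server_poll, and client_poll_reply are hypothetical and do not come from the cited paper, and a reply of None is treated here as "nothing pending".

```python
from collections import deque


class Channel:
    """Hypothetical stand-in for a pairwise shared-memory region:
    one queue per direction between a client and a server."""

    def __init__(self):
        self.requests = deque()
        self.replies = deque()


def client_send(channel, proc_name, args):
    # Enqueue the call directly; no kernel crossing is modelled.
    channel.requests.append((proc_name, args))


def server_poll(channel, procedures):
    """Drain pending requests once and return how many were handled."""
    handled = 0
    while channel.requests:
        proc_name, args = channel.requests.popleft()
        channel.replies.append(procedures[proc_name](*args))
        handled += 1
    return handled


def client_poll_reply(channel):
    # Return the oldest reply, or None when no reply is pending.
    return channel.replies.popleft() if channel.replies else None


channel = Channel()
client_send(channel, "add", (2, 3))
client_send(channel, "add", (10, 20))
handled = server_poll(channel, {"add": lambda a, b: a + b})
first = client_poll_reply(channel)
second = client_poll_reply(channel)
```

Because the client only enqueues and later polls, several calls can be outstanding before the server runs, which is the separation of data transfer from control transfer that other excerpts on this page attribute to the paper.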


Communication Across Fault-Containment Firewalls on the SGI.. - Ghosh, Christie (1998)   (1 citation)

....recover from the faults that might arise due to fail stop of partitions. Efficient communication at the user level has been a subject of considerable research for quite some time. Efforts in this area range from the use of clever operating system features that avoid copying as far as practicable [1, 4] to the use of specialized atomic swap and fetch and op hardware [9]. 8 Conclusions We mentioned why it is important to be able to break up a large computer into autonomous partitions, and described features of the Origin 2000 that allow such partitions to be built. Describing the implementation ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, pages 175--198, May 1991.


Software Overhead in Messaging Layers: Where Does the Time Go? - Karamcheti, Chien (1994)   (32 citations)

....of reducing software protocol overhead. Researchers have explored several approaches for reducing the cost of messaging layers. While substantial progress has been made, much of it has been made at the cost of reducing functionality. For example, techniques for speeding interprocess communication [1, 21] have resorted to lower level (and more risky) communication primitives to achieve high performance. In parallel systems, a number of reduced messaging layers have also been developed. In fact, active messages, the basis for our study, is one such reduced layer. The lowest level primitive ....

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


Implementing References as Chains of Links - Maisonneuve, Shapiro, Collet (1992)   (6 citations)

....on the best protocol. If the referenced object appears to be in the same context, the maillon is transformed into a simple indirection maillon (its dereference procedure being an indirection to the local object). If it points to a different context on the same machine, a URPC or LRPC protocol [1] using shared memory can be used. Otherwise, a RPC like protocol will be used to carry the invocation across the network. The latter two imply the creation of a stub whose interface conforms to the target object's interface. This stub will be referenced by the data pointer so that it replaces the ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


System Support for Efficient Network Communication - Thekkath (1994)   (4 citations)

....the network access model. An advantage of the remote memory model, which provides protected, remote memory that can be shared between a client and server, is that it is natural to extend to the cross machine case the optimization techniques used for high performance same machine RPC, such as URPC [8]. Following the URPC approach, in our system the server exports stacks that are then imported by clients at bind time. On an RPC call, the client stub picks an available stack for the server and builds a call frame on that stack using the remote write operation. In the absence of ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


Early Experience with Message-Passing on the SHRIMP.. - Felten, Alpert.. (1996)   (23 citations)

....generates code to marshal and unmarshal complex data types. The stub generator and runtime library were designed with SHRIMP in mind, so we believe they come close to the best possible RPC performance on the SHRIMP hardware. Buffer Management The design of SHRIMP RPC is similar to Bershad's URPC [5]. Each RPC binding consists of one receive buffer on each side (client and server) with bidirectional import export mappings between them. When a call occurs, the client side stub marshals the arguments into its buffer, and then transmits them into the server's buffer. At the end of the arguments ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-Level Interprocess Communication for Shared Memory Multiprocessors. ACM Trans. Comput. Sys., 9(2):175--198, May 1991.


Wide-Address Spaces - Exploring the Design Space - Bartoli, Mullender, van der Valk (1993)   (7 citations)

....discuss the issues raised by these new architectures, and we will try to highlight the corresponding problems. 2 Shared Wide Address Spaces 2.1 Scale of the Address Space Sharing an address space among processes running on the same machine provides by itself some potential for better performance [Bershad et al. 1991; Koldinger et al. 1992]. According to the previous section, however, far bigger advantages should come by sharing an address space across multiple machines. The question that arises is deciding the size and kind of the group of machines sharing the address space. Obviously, a major requirement ....

Bershad et al. [1991] B. N. Bershad, T. E. Anderson, E. D. Lazowska and H. M. Levy, User-Level Interprocess Communication for Shared-Memory Multiprocessors, ACM Transactions on Computer Systems 9(2), May 1991.


The Design and Implementation of an Operating.. - Leslie, McAuley.. (1996)   (153 citations)

....particularly by amortising the cost of context switches over several invocations in other words by having many RPC invocations from a domain outstanding. This separation of information transfer from control transfer is especially beneficial in a shared memory multiprocessor, as described in [27]. The thread tunnelling model achieves very low latency by combining all components into one operation: the transfer of the thread from client to server, using the kernel to simulate the protected procedure calls implemented in hardware on, for example, Multics [28] and some capability systems ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy, "User-Level Interprocess Communication for Shared Memory Multiprocessors", ACM Transactions on Computer Systems, vol. 9, no. 2, pp. 175--198, May 1991.


Operating System Support For High-Speed Networking - Druschel (1994)   (16 citations)

....to be effective. This chapter presents a novel OS facility for the efficient management and transfer of I/O buffers across protection domain boundaries. 5.1 Motivation Optimizing operations that cross protection domain boundaries has received a great deal of attention recently [Kar89, BALL90, BALL91]. This is because an efficient cross domain invocation facility enables a more modular operating system design. For the most part, this earlier work focuses on lowering control transfer latency; it assumes that the arguments transferred by the cross domain call are small enough to be copied from ....

....appropriate fbuf is already cached, and when removing write permissions from the originator is unnecessary. Moreover, in the common case, no kernel involvement is required during cross domain data transfer. Our facility is therefore well suited for use with user level IPC facilities such as URPC [BALL91] and other highly optimized IPC mechanisms such as MMS [GA91]. 5.5 Implementation We have implemented and evaluated fbufs using an experimental platform consisting of CMU's Mach 3.0 microkernel [ABB 86] augmented with a network subsystem based on the University of Arizona's x-kernel ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


Fast RPC on the SHRIMP Virtual Memory Mapped Network Interface - Bilas and Felten (1996)   (6 citations)

....generates code to marshal and unmarshal complex data types. The stub generator and runtime library were designed with SHRIMP in mind, so we believe they come close to the best possible RPC performance on the SHRIMP hardware. Buffer Management The design of ShrimpRPC is similar to Bershad's URPC [1]; the main difference is that URPC runs on shared memory machines while ShrimpRPC runs on the distributed memory SHRIMP system. Each RPC binding consists of one receive buffer on each side (client and server) with bidirectional import export mappings between them. When a call occurs, the ....

.... back(CAUWB) Write through(CAUWT) Deliberate update(CDU) Write back(CDUWB) Write through(CDUWT) NoCopy(NC) Automatic update(NCAU) Write back(NCAUWB) Write through(NCAUWT) Deliberate update(NCDU) Write back(NCDUWB) Write through(NCDUWT) 9 Related Work Our approach is similar in some ways to URPC [1], since both exploit user level communication. URPC is built on top of a shared memory architecture while we use the distributed memory SHRIMP architecture. Bershad's LRPC [2] tries to optimize the kernel path for same machine RPC calls. Since we have eliminated the kernel entirely, LRPC does not ....

B. Bershad, T. Anderson, E. Lazowska, and H. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems 9, 2 (May 1991), 175-198.


A Survey of Multiprocessor Operating System Kernels - Mukherjee, Schwan, Gopinath (1993)   (1 citation)

....optimized for communication between address space on the same machine. LRPC combines the control transfer and communication model of capability systems with the programming semantics and large grained protection model of RPC. (Footnote 9: It is possible to have more than one kernel in a NUMA machine.) In [22], Bershad et al. propose another interprocess communication scheme, called User level Remote Procedure Call (URPC). URPC decouples processor allocation from data transfer and thread management by combining fast cross address space communication protocol using shared memory with lightweight threads ....

B. Bershad, T. Anderson, E. Lazowska, and H. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


Efficient Support for Incremental Customization of OS Services - Peter Druschel   (9 citations)

....network performance [9, 12]. Work on user level management of parallelism has separated processor allocation from user level thread scheduling and synchronization [3, 10]. This work is also driven mainly by performance concerns, but results in a decomposition that suits our goals. User level RPC [4] and fbufs [5] are examples of work that effectively elevate interprocess communication services to the application level. As a final point, the above works suggest that collocating OS implementation and application code has advantages beyond customization. First, the lack of a protection boundary ....

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


The Case for Application-Specific Operating Systems - Thomas   Self-citation (Anderson)

No context found.

Bershad, B., Anderson, T., Lazowska, E., and Levy, H. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2), May 1991.


Service without Servers - Maeda, Bershad (1993)   (3 citations)  Self-citation (Bershad)

....with the operating system. However, these systems provided no protection against rogue or buggy applications that crash the system or use the hardware to attack other systems. Other work spans file systems [Rees et al. 86, Bershad Pinkerton 88], scheduling [Anderson et al. 92], communication [Bershad et al. 91], and user level memory management [McNamee Armstrong 90, Harty Cheriton 92, Sechrest Park 91, Krueger et al. 93]. The work in extensible filesystems permits applications to extend the semantics of files on a per file basis. However, this work still leaves all resource scheduling decisions ....

Bershad, B. N., Anderson, T. E., Lazowska, E. D., and Levy, H. M. User-Level Interprocess Communication for Shared Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2):175--198, May 1991.


The Interaction of Architecture and Operating System.. - Anderson, Levy, Bershad, .. (1991)   (107 citations)  Self-citation (Bershad Anderson Lazowska Levy)

....such as branch delays and write buffers, work less well on some operating system code. Operating system designers must be aware that architectural trends will lead towards relatively more expensive system calls, and should look for mechanisms that avoid the kernel when possible (e.g. [Bershad et al. 90b]). 3 Virtual Memory The most basic use of virtual memory is to support address spaces larger than the physical memory. For this function, all that is needed is a level of indirection between virtual and physical addresses, provided through TLBs and page tables, plus the ability to fault on an ....

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy. User-level interprocess communication for shared-memory multiprocessors. Technical Report TR-90-05-07, Department of Computer Science and Engineering, University of Washington, July 1990.


Efficient Support for Multicomputing on ATM Networks - Thekkath, Levy, Lazowska (1993)   (31 citations)  Self-citation (Lazowska Levy)

....We now examine marshaling and transport in the context of our RPC implementation. 3.1.1 Marshaling Given the notion of protected, remote memory that can be shared between a client and server, it is natural to extend techniques used for high performance same machine RPC, such as LRPC [5] and URPC [6]. In our system, the server exports stacks that are then imported by clients at bind time. On an RPC call, the client stub picks an available stack for the server and builds a call frame on that stack using the remote write operation. In the absence of call by reference, the call frame is ....

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2), May 1991.


Using the Mach Communication Primitives in X11 - Michael Ginsberg (1993)   (3 citations)  Self-citation (Bershad)

....to that of an in kernel implementation because it eliminates the context switching overhead incurred by the out of kernel Unix system. In the second case, we have used shared memory for communication between X11 clients and servers, reducing the system's reliance on kernel communication primitives [Bershad et al. 91]. This approach yields substantial performance improvements. 1.1 Motivation Mach is a microkernel designed to provide a base operating system on which other operating systems such as Unix can be built. Two versions of Mach will be discussed in this paper. The first, Mach 2.5, includes the Mach ....

Bershad, B., Anderson, T., Lazowska, E., and Levy, H. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2), May 1991.


Anonymous RPC: Low-Latency Protection in a 64-Bit Address.. - Yarvin, Bukowski, Anderson   (20 citations)  Self-citation (Anderson)

....our purposes is remote procedure call (RPC) [Birrell Nelson 1984]. In RPC, a server process exports an interface to one of its procedures; any client process can then bind to the procedure as if it was linked directly into the client. Local RPC has been extensively studied [Bershad et al. 1990, Bershad et al. 1991, Schroeder Burrows 1990] and seems to be the most convenient communication paradigm for integrating software systems across domains. Druschel and Peterson optimized their RPC system mainly for data throughput; we feel that this goal has been achieved, and optimize our anonymous RPC system ....

....we do not know of any other optimized local RPC results on the same architecture. Instead, Table 2 compares our performance to that of four other RPC implementations running on other hardware: Mach RPC [Bershad et al. 1992], SRC RPC [Schroeder Burrows 1990], LRPC [Bershad et al. 1990], and URPC [Bershad et al. 1991]. These are all optimized RPC implementations; commercial RPC implementations are frequently another order of magnitude slower. 7 Related Work The most closely related work to ours is Druschel and Peterson's fbufs, packet buffers mapped in a shared anonymous space [Druschel Peterson 1992] ....

Bershad, B., Anderson, T., Lazowska, E., and Levy, H. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 9(2), May 1991.


A Hierarchical Protection Model for Protecting against.. - Shinagawa, Kono, Masuda (2003)

No context found.

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy, "User-level interprocess communication for shared memory multiprocessors," ACM Transactions on Computer Systems (TOCS), vol. 9, pp. 175--198, May 1991.


Fast Multi-Threading on Shared Memory Multiprocessors - Cordina (2000)   (5 citations)

No context found.

Bershad, B., Anderson, T., Lazowska, E. and Levy, H. User-Level Interprocess Communication for Shared Memory Multiprocessors. University of Washington, 1990. 88


HiPEC: High Performance External Virtual Memory Caching - Lee, Chen, Chang (1994)   (26 citations)

No context found.

Bershad, Brian N., Anderson, Thomas E., Lazowska, Edward D., and Levy, Henry M. User-Level Interprocess Communication for Shared Memory Multiprocessors. In ACM Transactions on Computer Systems, 9(2):175-198, May 1991.
