T. Eicken, D. Culler, S. Goldstein, and K. Schauser. Active messages: A mechanism for integrating communication and computation. In Proceedings of the 19th International Symposium on Computer Architecture, pages 256--266, Gold Coast, Australia, May 1992.
....in our application model, a manual analysis is required to model the packet arrival mechanism. The mapping must also indicate whether packet arrival is implemented with interrupts, polling, a hybrid interrupt/polling scheme [10], or special hardware supporting a scheme like active messages [5]. The Click system itself, when running in kernel mode on Linux, uses polling to examine DMA descriptors (a data structure shared by the CPU and the device for moving packets). This issue is of paramount importance in real systems, and is worthy of the attention it has received in the research ....
T. Eicken, D. Culler, S. Goldstein, and K. Schauser. Active messages: A mechanism for integrating communication and computation. In Proceedings of the 19th International Symposium on Computer Architecture, pages 256--266, Gold Coast, Australia, May 1992.
....in the application model, a manual analysis is required to model the packet arrival mechanism. The mapping must also indicate whether packet arrival is implemented with interrupts, polling, a hybrid interrupt/polling scheme [10], or special hardware supporting a scheme like active messages [5]. The Click system itself, when running in kernel mode on Linux, uses polling to examine DMA descriptors (a data structure shared by the CPU and the device for moving packets). This issue is of paramount importance in real systems, and is worthy of the attention it has received in the research ....
T. Eicken, D. Culler, S. Goldstein, and K. Schauser. Active messages: A mechanism for integrating communication and computation. In Proceedings of the 19th International Symposium on Computer Architecture, pages 256--266, Gold Coast, Australia, May 1992.
....a large body of related work in user level communication. VIA borrows its basic operation from U-Net [6], virtual interfaces to the network from application device channels [4], and remote memory operations from the Virtual Memory Mapped Communication (VMMC) [5] model and from Active Messages (AM) [7]. Like WSDLite, Fast Sockets [10] offers increased communication performance by collapsing protocol layers, using simple buffer management strategies, and by using receive posting to bypass data copying. Thekkath et al. proposed separating network control and data flow, and employed unused ....
....12 direct user level access to the network interface, but do not support simultaneous use by multiple applications. The HP Hamlyn network implements user level sends and receives in hardware [1]. ParaStation [13] provides unprotected user level access to the network interface. With Active Messages [7], each message contains the address of a user level handler that is executed upon message arrival with the message body as an argument. This allows the programmer and compiler to overlap communication and computation, thereby hiding latency. 7. Conclusions and Future Work For those applications ....
T. v. Eicken, D. E. Culler, S.C. Goldstein, and K. E. Schauser. Active Messages: A Mechanism for Integrating Communication and Computation. In Proceedings of the 19th International Symposium on Computer Architecture, pp. 256-266, 1992.
....alternative to the beta version of WSDP that we initially examined. VIA draws from several other research projects including application device channels [5], which provide the model for virtual interfaces to the network; and Virtual Memory Mapped Communication (VMMC) [6] and Active Messages (AM) [8], which provide the model for remote memory operations used in VIA. Other projects with similar goals to WSDLite and WSDP include Fast Sockets [11], which like WSDLite offers increased communication performance by collapsing protocol layers, using simple buffer management strategies, and by using ....
....direct user level access to the network interface, but do not support simultaneous use by multiple applications. The HP Hamlyn network implements user level sends and receives in hardware [2]. ParaStation [14] provides unprotected user level access to the network interface. With Active Messages [8], each message contains the address of a user level handler that is executed upon message arrival with the message body as an argument. This allows the programmer and compiler to overlap communication and computation, thereby hiding latency. 7. Conclusions and Future Work For those applications ....
T. v. Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: A Mechanism for Integrating Communication and Computation. In Proceedings of the 19th International Symposium on Computer Architecture, pp. 256-266, 1992.
....(with a stack size of 1704 bytes). Active Threads [24] is a user level thread library that includes support for migration. One of the main goals of the Active Threads package is performance. This goal is achieved by utilizing an efficient user level communication package based on active messages [6]. Their solution to the stack pointer problem is similar to ours. For SPARCstation 10 multiprocessor workstations connected by a Myrinet network interface, thread migration latency was reported to be about 1.1ms for 2KB stacks. 5.2. Fault Tolerance Costa et al. [5] implement a logging and ....
T. v. Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: A Mechanism for Integrating Communication and Computation. In Proceedings of the 19th International Symposium on Computer Architecture, pp. 256-266, 1992.
....range of parallel machines. SAM provides three main primitives: SAM Send Msg, SAM Bcast Msg, and SAM Sync. Host processors communicate using SAM Send Msg, calculate the next quantum duration using SAM Bcast Msg (that is, via broadcast messages) and synchronize using SAM Sync. Like Active Messages [23], a SAM message contains a virtual address of a handler that will be called at the receiving host processor. However, unlike active messages, SAM does not guarantee message reception until SAM Sync completes. When SAM Sync returns, SAM guarantees that all messages have been received and processed ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....messages and handle them in an appropriate manner. It may be tempting to not take such messages out of the network while they are not welcome: this, however, is not an option on most systems, because messages must constantly be drained from the network to avoid deadlock in the network fabric [27]. Message reordering in the network adds to the woes of a protocol programmer. For example, processors may appear to request copies of cache blocks which they already have, if a read request message overtakes an invalidation acknowledgment message in the network. The protocol might have to await ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....that generate and consume messages, latency through the network, and 2 . latency of processor interactions with a network interface (NI) that connects a computer with a network. The advent of high performance microprocessors with supercomputer-like clocks, lean software protocols (e.g. [127, 128]) and high speed reliable networks with tens of nanoseconds latency (e.g. [41]) have drastically reduced the impact of software protocols and networks on the overall latency of communication. Consequently, the third component, processor interactions with an NI, threatens to become a critical ....
....called a system area network or SAN [54, 8] (Appendix A). SANs improve performance in two ways. First, aggressive links and switches provide very high bandwidth and extremely low latency. Second, reliability properties of SANs allow systems to use lean communication layers (e.g. Active Messages [128]) instead of heavyweight and one-size-fits-all protocols (e.g. TCP/IP). Consequently, SANs help improve the performance of both network hardware (links and switches) and network software (communication protocols). Unfortunately, improvement in network and software protocols have exposed ....
[Article contains additional citation context not shown here]
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....work in user level communication. The VI specification borrows its basic operation from U-Net [13], virtual interfaces to the network from application device channels [11], and remote memory operations from the Virtual Memory Mapped Communication (VMMC) [12] model and from Active Messages (AM) [14]. Fast Sockets [22] offer increased communication performance by collapsing protocol layers, using simple buffer management strategies, and by using receive posting to bypass data copying. Thekkath et al. proposed separating network control and data flow, and employed unused processor opcodes ....
....direct user level access to the network interface, but do not support simultaneous use by multiple applications. The HP Hamlyn network implements user level sends and receives in hardware [7]. ParaStation [27] provides unprotected user level access to the network interface. With Active Messages [14], each message contains the address of a user level handler that is executed upon message arrival with the message body as an argument. This allows the programmer and compiler to overlap communication and computation, thereby hiding latency. Other work in the area of user level communication ....
T. v. Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser, "Active Messages: A Mechanism for Integrating Communication and Computation," in Proceedings of the 19th International Symposium on Computer Architecture, pp. 256-266, 1992.
....That is, synchronization threads are preferred over data threads. The model is not preemptive; synchronization and data threads are nonblocking and, once scheduled, execute to completion. Also note that the synchronization threads can be directly mapped into so-called active message handlers (see [10]). Within a frame, threads communicate with each other by sharing data in the frame, including the manipulation of counting semaphores (and potentially queueing another data thread for execution). Note that, without special optimizations, interthread communication does not occur through registers; ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, IEEE, May 1992, pages 256-266.
....through the abstraction layers. In most cases, applications access the network through a layer of messaging abstractions. We can distinguish between low level network access models and high level user messaging models. Network access models such as ADCs [14], U-Net [54], Active Messages [55], and Fast Messages [37] provide protected user access to the NI and serve as a consistent low level model across NIs. Applications can use them to access the network but likely they will prefer higher level messaging models such as Fbufs [13], MPI [16], or TCP/IP. Minimal messaging, by definition, ....
....makes DMA transfers even faster than CPU transfers with uncached memory accesses. Therefore, transfers in minimal messaging become even faster than transfers in single copy messaging. Minimal messaging was implemented within the Tempest messaging model [39] a variant of Berkeley Active Messages [55]. Best Case Throughput Latency. We want to determine the effectiveness of minimal messaging in the best case. Therefore, we shall measure the maximum possible benefit across message sizes. We are interested in two metrics, throughput and latency. To measure latency (round trip time) we ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....protocol thread access that conflicts with the block's tag is undefined. The type of protocol event determines the user level handler function that is executed by the protocol thread. The handler for a message arrival is chosen by the sender and encoded in the message header (as in Active Messages [6]). The handler for a timer expiration is specified when the timer is initialized. Handlers for page faults and block access faults are registered locally by the application. All page faults are serviced by a single handler. The handler invoked for a block access fault is determined by the ....
....models (and by Tempest's bulk data transfer operations) are inappropriate for these applications since both the management of memory buffers and the need to copy data into and out of these buffers add significant overhead. Tempest's fine grain messaging facility is based on Active Messages [6]. In the Active Message model, the first word of every message is the starting program counter of the handler to be executed at the receiver. Messages are queued and the handlers are executed serially by the protocol thread. 3.3.1 Node identifiers typedef implementation specific TPPI NodeId; ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active messages: a mechanism for integrating communication and computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....class of networks called a system area network or SAN [5]. SANs improve performance in two ways. First, aggressive links and switches provide very high bandwidth and extremely low latency. Second, reliability properties of SANs allow systems to use lean communication layers (e.g. Active Messages [12]) instead of heavyweight and one-size-fits-all protocols (e.g. TCP/IP). Consequently, SANs help improve the performance of both network hardware (links and switches) and network software (communication protocols). Unfortunately, improvements in network hardware and software are rarely delivered ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....programming abstractions. Some of these abstractions, such as shared virtual memory [5], communicate data at coarse granularity (e.g. a 4 Kbyte page) using conventional high overhead legacy TCP/IP protocols. Many abstractions, however, rely on low overhead messaging as in Active Messages [25] and employ fine grain protocols to exchange small amounts of data (e.g. 8-256 bytes) over the network [23,22,15]. Protocol handlers in such systems typically execute a small number of instructions to move data between the application's data structures and the network message queues, and ....
....distributed shared memory provides fine grain communication among processors. Low level communication occurs within a node using the snoopy cache coherent memory bus. A software protocol implements communication across SMP nodes using a fine grain messaging abstraction such as Active Messages [25]. The software protocol executes either on embedded network interface processors [16,14] or on the SMP commodity processors [15,23]. Figure 2 (left) depicts an example of a simple active message based protocol which performs a fetch add operation on a memory word. The handler takes as input ....
T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active messages: a mechanism for integrating communication and computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....Blizzard runs under an unmodified Solaris 2.4 kernel. Blizzard also supports more efficient fine grain access control, including optimized software and dedicated hardware. Again, the final difference is that Blizzard exploits multiprocessor nodes. Blizzard uses Tempest active messages [15,38,27]. Tempest communication differs from other active message systems [13,25,37] since it does not constrain a program to request response protocols and interrupts computation on message arrival. These unique aspects are necessary for coherence protocols, but complicate the communication layer. Even ....
....without kernel intervention and is used by most recent low latency communication layers [25,13,37,3]. For Myrinet hardware and Tempest active messages, it presents interesting hardware/software trade-offs. 6.1 Tempest Active Messages Tempest messages are similar to other active message models [38], but they differ in two respects: messages are not constrained to follow a request reply protocol and are delivered without explicit polling by an application program. The differences are necessary to use Tempest messages to implement transparent shared memory protocols. For example, a common ....
[Article contains additional citation context not shown here]
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....from) the SAN interface. LANs use very heavyweight software protocols such as TCP/IP, which make the conservative assumption that LANs, like the internet, are highly unreliable. For SANs, such overly conservative protocols can be replaced with lean communication layers such as Active Messages [54, 24]. Typically, LAN interfaces deliver messages to user applications via the operating system, which can be very expensive. For SANs, the latency through the operating system can be eliminated by providing applications with direct user level access to the network interface hardware. For example, the ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....in a private address space augmented by an optional shared segment. Shared memory and hybrid applications can use Tempest mechanisms (or Tempest shared memory libraries) to manage the shared address space. The four types of Tempest mechanism are: Active messages are short, low latency messages [23]. They are useful for sending control, synchronization, or short data messages. Upon receipt of an active message, the system invokes the handler specified by the message and passes two arguments: the sender's processor number and the message length. The handler reads the message body from the ....
....in Section 3. On the CM-5, the shared memory EM3D ran as fast as a native message passing version. 6 Related Work Several interfaces share Tempest's goal of providing portability among parallel machines. PVM [7] is a widely used, coarse grain message passing system. Berkeley's Active Messages [23] provides a portable interface for fine grain messages, but, unlike Tempest, no support for transparent caching. DSM systems, such as Rice's Munin [1] and Treadmarks [10], support shared memory, but since their coherence is limited to page granularity, they require more complex semantic models to ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proc. of the 19th Annual Inter. Symposium on Computer Architecture, pp. 256-- 266, May 1992.
....to use writeback caching, and we focus on fine grain user to user communication in which the receiving process may be notified without an interrupt. We differ from Remote Queues by being at a lower level of abstraction. Remote Queues provide a communication model similar to Active Messages [45], except extracting a message from the network and invoking the receive handler can be decoupled. Implementing Remote Queues with CNIs is straightforward and offers advantages over CM 5, Intel Paragon, MIT Alewife, and Cray T3D network interfaces. CNIs support cachable device registers for ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....shared memory, message passing, or hybrid (i.e. combination) applications. The four types of Tempest mechanisms are: Low Overhead Active Messages. Tempest supports an active message abstraction, in which each message specifies a destination node, handler address, and a string of arguments [25]. When a message arrives at its destination, it creates a thread that runs the handler atomically with respect to other message handlers. Nothing guarantees atomicity between a handler and the destination node's computation thread, except explicit (user level) synchronization. Bulk Data Transfer. ....
....a data structure that exhibits false sharing. 2.3 Implementing Tempest Tempest's messaging and virtual memory support are largely conventional. Active message abstractions can be implemented very efficiently with custom hardware [6, 18] but also have reasonable performance on existing machines [25]. Tempest's virtual memory mechanisms can be implemented as a user level library on a system that provides mmap() and munmap() or with custom kernel modifications [19]. The key challenge in implementing Tempest is sup3 Appears in: Supercomputing 94, Nov. 1994. Reprinted by permission of IEEE. ....
[Article contains additional citation context not shown here]
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....is similar to LAPSE [5], a direct execution simulator for message passing machines that runs on the Intel Paragon. One difference between the two simulators is that LAPSE models network contention. To provide a communication library for message passing programs, we ported the Active Message [22] layer from Thinking Machines CMMD library to the Wind Tunnel. The complete CMMD library runs as part of the target program, as it does on a CM-5. Since the active message code is heavily optimized assembly code that violates the SPARC ABI convention in ways that conflict with WWT's violations of ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....described in Section 1.3. 1.2 The Tempest interface Tempest provides a concrete, portable interface to the three DSM mechanisms identified in the previous section: messaging, local storage management, and memory access control. Tempest's messaging borrows from von Eicken's Active Messages [vECGS92]. Standard virtual address translation mechanisms are used for local storage management, as in software DSM systems [LH89]. The most innovative aspect of Tempest is its specification of fine grain access control, a feature that enables fine grain coherence and provides scalability to ....
....mechanisms required for distributed shared memory (DSM). Three mechanisms (messaging, local storage management, and access control) underlie nearly all DSM systems. Section 2.2 describes Tempest, a concrete, portable interface to these mechanisms. Tempest uses a variant of Active Messages [vECGS92] for messaging. Virtual address translation provides local storage management. Tempest's most innovative feature is fine grain access control. The next two sections show how Tempest can be used, first to provide application transparent shared memory (Section 2.3), then to optimize the performance ....
[Article contains additional citation context not shown here]
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a mechanism for integrating communication and computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....point-to-point messages. Low overhead messages are fundamental to the performance of most programming models. The Active Messages model, where a message specifies a user level handler to be invoked on its reception, provides an efficient building block for many paradigms, including shared memory [45]. With user level access to fast messages, compilers can exploit the statically determinable properties of data structures and program communication by explicitly communicating values. In addition, low latency message handling is critical for transparent shared memory performance. In Tempest, a ....
....node ID to a memory mapped register. Data words are moved to the send queue using stores or block transfers. The end of the message is signaled by a low order bit in the register address. On the receiver, the first data word is interpreted as the receive handler PC, as in Active Messages [45]. The receive handler must pull the remainder of the message from the receive queue. Scheduling on the NP is performed by a hardware-assisted dispatch loop [16]. The dispatch hardware constructs a handler PC in a dedicated register either by taking the first word of an incoming message or by using ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....machines (e.g. Typhoon), an MPP (Blizzard on a CM-5), a NOW (Blizzard on Wisconsin COW), and a NOW with selected hardware acceleration (Wisconsin COW with T0, designed with the help of Sun Microsystems). Tempest provides two classes of messages. First, active messages like Berkeley's [41] transfer control information (e.g. requests for data) and small amounts of data (e.g. a 32 byte cache block) [33]. Second, bulk data transfer primitives like CM-5 channels provide higher bandwidth for large messages, which can afford the higher start up cost. As defined so far, Tempest ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....page level protection mechanism. While the results of this paper are largely independent of whether this mechanism is implemented in hardware or software, the results in Section 6 come from a software technique [23]. High performance communication is performed via an active message abstraction [30]. Active messages are essentially very lightweight RPCs that are optimized for the case where processing nodes are co-scheduled; that is, where the destination node already has the correct context for the RPC. Message arrivals either cause interrupts or the processor(s) may poll the network ....
....a simple exception (which requires a similar path through the kernel) takes at least 60 to 200 microseconds for a round trip [26]. The alternative is periodic polling, which requires instrumenting the computation thread to periodically check for messages. This can be done either via a compiler [30] or by directly editing the executable file [16]. This approach requires a trade-off between latency and overhead: frequent polls decrease message latency but increase overhead. 3.2 Fixed Protocol Processor The Fixed policy dedicates one processor of a multiprocessor node to perform only protocol ....
Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: a Mechanism for Integrating Communication and Computation. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 256--266, May 1992.
....can generate specialized instruction sequences that use shared memory or drive message passing hardware more efficiently than general purpose communication libraries. Recent research suggests that specialized communication code can improve message passing performance by an order of magnitude [14, 31]. Fortran M performance also depends on the cost of process creation, scheduling, and termination operations. A preemptive scheduler is required so as to permit overlapping of computation and communication. Fortunately, these facilities are, increasingly, supported either at the operating system ....
....and termination operations. A preemptive scheduler is required so as to permit overlapping of computation and communication. Fortunately, these facilities are, increasingly, supported either at the operating system [38, 9] or hardware levels [21, 33, 11] or can be provided by a compiler [14]. 9 Related Work Programming notations for parallel scientific programming fall into three principal classes: coordination languages, message passing libraries, and data parallel extensions. Here, we discuss how Fortran M differs from each of these approaches, focusing in particular on the ....
von Eicken, T., Culler, D., Goldstein, S., and Schauser, K., Active messages: A mechanism for integrating communication and computation, Proc. 19th Intl Symp. Computer Architecture, ACM, 1992.