KUBIATOWICZ, AND AGARWAL. Anatomy of a Message in the Alewife Multiprocessor. In International Supercomputing Conference (1993).
....functionality not supported until recently by current-day microprocessors (in the form of cache coherence protocols, which can provide limited forms of access to the on-chip cache). [Figure 2.5: Alewife block diagram, showing the router, processor, cache, memory, and network interface and DMA] The Alewife machine [25] shares the AP1000's high-level architectural design. Both systems give the network interface access to both the cache and primary memory. However, as Figure 2.5 indicates, Alewife's network interface is additionally directly accessible by the CPU. There are thus three ways to send messages in ....
John Kubiatowicz and Anant Agarwal. The anatomy of a message in the Alewife multiprocessor. In Proceedings of the International Conference on Supercomputing, July 1993. Available from ftp://cag.lcs.mit.edu/pub/papers/anatomy.ps.Z.
....the cache in message sends and/or receives. An alternative design is to logically connect the network to primary memory, using DMA transfers for interprocessor communication. Such an approach is utilized by a wealth of machines, for instance the Intel Paragon [10], Cray T3D [6], MIT Alewife [11], Bull ECRC ICL Siemens EDS [17], Meiko CS 2 [9], and Caltech Mosaic C [13], as well as systems designed around the Inmos T9000 Transputer [12]. The tradeoff between the two interfaces is, at the highest level, parallel performance versus sequential performance. Transmitting messages between ....
John Kubiatowicz and Anant Agarwal. The anatomy of a message in the Alewife multiprocessor. In Proceedings of the International Conference on Supercomputing, July 1993. Available from cag.lcs.mit.edu/pub/papers/anatomy.ps.Z.
....file access latency using client-initiated RDMA is 20 to 40 lower than with using RPC. We note that this is a rough estimate based on measurements of a prototype with few optimizations. We plan for a more thorough evaluation in the future. NICs that are tightly coupled with the host [29], [30], [31], [14], [32] aim at lowering the NIC overhead as well as the overhead of the NIC interaction with the host for control and data transfer. Previous research [31] has pointed to the importance of NIC design for low-latency RPC communication. Scheduling delays included in TNullRPC can be ....
J. Kubiatowicz and A. Agarwal, "Anatomy of a Message in the Alewife Multiprocessor," Tech. Rep. LCS/TM-498, MIT, 1993.
....sharing of memory blocks (up to five remote readers) is supported in hardware; higher-degree sharing is handled in software by trapping the home processor. In addition to providing support for coherent shared memory, Alewife provides the processor with direct access to the interconnection network [61]. Efficient mechanisms are provided for sending and receiving both short (register-to-register) and long (memory-to-memory) messages. Using these message passing mechanisms, a processor can send a message in a few user-level instructions. A processor that receives such a message traps; user-level ....
J. Kubiatowicz and A. Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Conference on Supercomputing, pages 195--206, July 1993.
....completes. Because operation records are selected based on PID, however, intervening processes cannot corrupt the interrupted process's record. Instead, the interrupted process can simply resume its sequence when it is rescheduled to execute. Compared to the solution used in CM-5 [22] and Alewife [13], our approach does not require saving and restoring of the commands across interrupts. This is because MAGIC accumulates the memory-mapped operations within software records as opposed to a single hardware queue. [Footnote: This does not imply the completion of the requested operation. Completion is checked through predesignated memory locations in most protocols.] ....
....In addition, the programmability of MAGIC allows us to customize the command sequence protocol for various uses as opposed to providing a single hardwired protocol [13]. 3.2 Virtual Memory. User-level messaging requires support for virtual-to-physical address translations since the user process can only specify virtual addresses and MAGIC needs to directly access memory with physical addresses. We begin this section by briefly describing our approach for ....
[Article contains additional citation context not shown here]
John Kubiatowicz and Anant Agarwal. Anatomy of a message in the Alewife multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing, July 1993.
....type of deadlock, the system must extend the buffer space by copying messages to an overflow software queue in main memory [BCL 95, MKAK94] of the receiver. Earlier systems aggressively buffer messages when the sender blocks [SFL 94, BCL 95] or after it has been blocked for a timeout interval [KA93]. However, the extra copies this entails can potentially degrade performance [MFHW96]. Blizzard, instead, uses a conservative deadlock detection scheme, which only buffers messages when a deadlock may have occurred. For this purpose, each node n uses a conservative, local condition to detect ....
John Kubiatowicz and Anant Agarwal. Anatomy of a message in the Alewife multiprocessor. In Proceedings of the 1993 ACM International Conference on Supercomputing, 1993.
....The system is deadlocked. [Figure 3.4: Protocol Level Deadlock situations.] Once detected, there are two basic techniques to handle deadlock. The first technique attempts to avoid deadlock by using local memory to expand the buffer space as needed [49]. When a buffer fills and a possible deadlock situation is detected, packets are removed from the buffer and placed in a secondary buffer created in the local memory. The cache and directory controllers must then process packets from this secondary buffer until it is empty. This technique ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In 7th ACM International Conference of Supercomputing, 1993.
....a hardware-only implementation. In DASH [16], for example, some request-response dependencies are solved using separate meshes for requests and replies. The general deadlock problem can be solved by augmenting the hardware buffers in software. This technique is referred to as network overflow recovery [15]. A time-out mechanism signals a network overflow interrupt to the processor when the send buffer has been full for a predefined amount of time. The processor then moves the contents of the interrupt buffer to a software buffer allocated in memory, thereby freeing buffer space in the hardware. The ....
J. Kubiatowicz and A. Agarwal, "Anatomy of a Message in the Alewife Multiprocessor," In Proceedings of the 7th ACM International Conference on Supercomputing, pages 195-206, July 1993.
....ELAN using its own TLB. The size of the message, however, is 32 bytes, so overhead becomes larger for larger messages. Some machines were proposed to integrate cache coherent hardware and bulk data transfer mechanisms, for applications which require high throughput communications. The MIT Alewife [13, 14] integrates fine grain communication using cache coherent hardware and bulk data transfer using message passing. The Alewife machine, however, does not support virtual memory. The Alewife can send stride data by writing many address and size pairs into message headers, but this type of stride data ....
Kubiatowicz, J., and Agarwal, A. Anatomy of a message in the Alewife multiprocessor. In International Conference on Supercomputing (1993), pp. 195--206.
....chips can be connected with these links to create parallel systems of arbitrary topology. They also allow a VIRAM chip to be used along with a hard disk drive as the building block for scalable data servers like ISTORE [7]. The structure of the network interface is similar to those in the Alewife [24] and Fugu [26] systems. It is memory mapped as a virtual resource and allows applications to send short or long messages without invoking the operating system. Short messages can be created by storing data directly into a message buffer in the network interface. For long messages, one or more DMA ....
J. Kubiatowicz and A. Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing, July 1993.
....at MIT, is a system that implements this integrated approach. It combines a multithreaded processor with software-assisted, invalidate-based cache coherence [14]. It also provides a user-level interface to the underlying message passing hardware that is used to implement the coherent shared memory [54, 56, 57]. An Alewife processor has access to a single set of communication registers. A message is first described by writing values or pointer-length pairs to the registers. An atomic send instruction is used to send the message as a single unit to its destination. When a message arrives at its ....
....interface to a single network port, means that multiple message buffers must be maintained. There are less expensive ways to support message passing, such as the Alewife approach of providing a user level interface to the same network interface that is used for handling shared memory operations [56, 57]. There is also a push to make traditional network interfaces more efficient and (to an extent) programmable. There would be some loss of performance, compared to StreamLine, with these less integrated approaches. As for ease of use, there is the recurring debate about whether message passing is ....
John Kubiatowicz and Anant Agarwal. Anatomy of a message in the Alewife multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing, July 1993.
....a head pointer to read different words of a message from different addresses of the queue. Several existing NIs can be classified with this taxonomy. The Thinking Machines CM5 [124] NI is NI 2w since it exposes two words of a message to the receiver. Similarly, the Alewife [2] NI is NI 16w [62]. The network interface in T NG [22], which devotes 8 KB for an NI queue and consists of 64-byte cache blocks, is NI 128 Q. The T Jr NI [53] can be classified as CNI 0 Q m because it does not have a cache (hence 0). I call the T Jr NI a CNI, even though it does not have a cache, because it ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the 1993 ACM International Conference on Supercomputing, 1993.
....[Table fragment, column headers not recoverable: 4096, 3644, 4634; Coherent Widget, 32, 371, 22; Coherent Widget, 128, 476, 65; Coherent Widget, 4096, 3632, 1801] ....viewing message passing as a valid programming model in its own right, and not just as a bulk transfer mechanism that can overcome a shortcoming of DSM in moving large quantities of data. Alewife [7] represents one of the earliest hybrid distributed shared memory explicit message passing systems. Its approach was very invasive, requiring fabrication of a custom version of a SPARC [10] CPU. Message handling received little support, other than limited DMA capability, necessitating significant ....
Kubiatowicz, J., and Agarwal, A. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing (July 1993).
....as TCP and UDP, will soon dwarf the transmission time [2]. Examples of the fabrics we consider potentially viable are Fibre Channel [6] and R2 [5]. Our target systems, clusters of commodity workstations running essentially standard operating systems, rule out approaches such as those taken by Alewife [8], Typhoon [12], T [11], or MDP [3], which rely on custom processors and/or non-standard operating systems. Continued reliance on standard protocols can impose unnecessary communication costs. The services required by applications on these clusters are often far more modest than those provided by the ....
Kubiatowicz, J., and Agarwal, A. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing (July 1993).
....shared memory system performs quite well, enabling speedups comparable to or better than similarly scalable systems. In addition to providing support for coherent shared memory, Alewife provides the processor with direct access to the interconnection network for sending and receiving messages [24]. Efficient mechanisms are provided for sending and receiving both short (register-to-register) and long (memory-to-memory, block transfer) messages. Using Alewife's message passing mechanisms, a processor can send a message with just a few user-level instructions. A processor receiving such a ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Conference on Supercomputing, pages 195--206, July 1993.
....from the manager by sending data to other clients. The job of the metadata manager is tracking locations of file data blocks, and forwarding requests from clients to the appropriate destinations. Its functionality is similar to the directory in traditional DSM systems such as DASH [32] and Alewife [30]. Finally, the storage servers collectively provide the illusion of a striped network disk. They receive striped writes from clients. They also react to requests from managers by supplying data to the clients which have initiated the I/O operations. Our prototype runs on Sun SPARC and UltraSPARC ....
....grain than a per-file basis. Finally, client-to-client data transfers in xFS, while more efficient than passing all data through the server, introduce potential circular dependencies. The cache coherence protocol in xFS is similar to those seen in hardware DSM systems such as DASH [32] and Alewife [30]. But even minor modifications to these protocols can lead to subtle bugs [9]. Also, aspects of the cluster file system require protocol modifications that do not apply to DSM systems. For example, xFS must maintain reliable data storage in the face of node failures. A client must therefore write ....
Kubiatowicz, J., and Agarwal, A. Anatomy of a Message in the Alewife Multiprocessor. In Proc. of the 7th International Conf. on Supercomputing (July 1993).
....[2,3,8,25,26] On the other hand, coherent shared memory can have performance problems compared with alternative abstractions such as raw message passing, object based distributed languages, and remote procedure call or object invocation. These recognized performance problems motivate current [21,22] and future [9,23] systems that combine some form of message passing with shared memory. While integrating messages and memory can be a simple matter on a system based on slow but straightforward mechanisms, modern distributed memory systems based on relaxed consistency models allow implementors ....
....and also allows us to build tailored synchronization objects in the Carlsberg runtime system instead of having to use the fixed synchronization primitives normally provided in DSM systems like Munin and TreadMarks. The Alewife multiprocessor provides a hardware DSM combined with message passing [21,22] and multithreaded processors. The problems in this environment are simpler than those in Carlsberg since the system is based on a sequentially consistent memory model and messages are passed in the same FIFO pipelines as memory operations. This guarantees that the appropriate memory consistency ....
John Kubiatowicz and Anant Agarwal. Anatomy of a message in the Alewife multiprocessor. In Proceedings of the 1993 ACM International Conference on Supercomputing, pages 195--206, July 1993.
....problems by disabling caching and must maintain coherence at a finer grain. Client-to-client interactions in xFS, while more efficient, introduce potential circular dependencies. The cache coherence protocol in xFS is close to those seen in hardware DSM systems such as DASH [26] and Alewife [24]. But even minor modifications to these protocols can lead to subtle bugs. Also, reliability considerations of a file system require protocol modifications that do not apply to DSM systems. For example, a client must write its dirty data to storage servers before it can forward it to another ....
Kubiatowicz, J., and Agarwal, A. Anatomy of a Message in the Alewife Multiprocessor. In Proc. of the 7th Internat. Conf. on Supercomputing (July 1993).
No context found.
KUBIATOWICZ, AND AGARWAL. Anatomy of a Message in the Alewife Multiprocessor. In International Supercomputing Conference (1993).
....performs quite well, enabling speedups comparable to or better than other scalable hardware-based DSM systems [1, 49]. In addition to providing support for coherent shared memory, Alewife provides the processor with direct access to the interconnection network for sending and receiving messages [45]. Efficient mechanisms are provided for sending and receiving both short (register-to-register) messages and long (memory-to-memory, bulk data transfer) messages. In addition, messages combining both types of data can be sent: some elements of a message can be register-to-register, scalar values, ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Conference on Supercomputing, pages 195--206, July 1993.
....load and run programs via this interface. The code for instrumenting the statistics gathering facility is included as a part of the Alewife kernel and the statistics monitoring mode is activated by adding features to the host interface. The Alewife kernel supports a message passing interface [15] which is used to communicate between the host and the machine. Alewife also supports a timer interrupt facility which is used to interrupt processors at specified times to collect statistics for a certain interval. This feature of the Alewife architecture is utilized in QuickStep to provide ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Supercomputing Conference (ISC) 1993, Tokyo, Japan, July 1993.
....a 2-D mesh routing chip, and the CMMU, Communications and Memory Management Unit. Alewife supports sequential consistency, and maintains cache coherence using a single-writer, write-invalidate cache coherence protocol. Also, Alewife provides a fast user-level messaging interface with DMA capability [19]. DMA data in messages are locally coherent. 4.2 Support for MGS. 4.2.1 Software Virtual Memory. MGS requires a virtual memory system in order to implement software DSM. Alewife, however, is a single address space machine and does not support virtual memory. Our MGS prototype performs address ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Supercomputing Conference, Tokyo, Japan, July 1993.
....the shared memory and message passing mechanisms. To do so, the system provides forward progress guarantees to shared memory accesses in the face of message reception interrupts. In addition, the DMA engine maintains the coherence between the data in messages and the data in local caches [18]. C. Fine-Grain Computation. Given a fixed size data set, the granularity of computation (the time between events that require interprocessor communication) decreases as the number of processors in a system increases. A system that cannot handle small tasks ....
....and Memory Coherence and DRAM Control blocks comprise, respectively, the processor and memory portions of the cache coherence protocol. In addition, both blocks service requests from the Network Interface and DMA Control block, which provides user-level message passing with locally coherent DMA [18]. Since the processor and memory sides of the cache coherence protocol as well as the message passing interfaces share the same network queues, message passing and shared memory are integrated [16]. The Transaction Buffer is a 16-entry, fully associative data store that tracks outstanding cache ....
J. Kubiatowicz and A. Agarwal, "Anatomy of a message in the Alewife multiprocessor," in Proc. Int. Conf. Supercomputing, July 1993, pp. 95--106.
....At the source, messages are injected into the network at any rate up to and including the rate at which the network will accept them. The injection operation is atomic in that messages are committed to the network in their entirety; no partial packets are ever seen by the communication substrate [17]. Message injection can thus be viewed in the following fashion: inject(header, handler, word0, word1, ...). If resource contention prevents the network from accepting a given message, the corresponding inject operation blocks until successful. Alternatively, blocking can be avoided by using a ....
....from fast to buffered mode in response to exceptional conditions and support operation in buffered mode. Further discussion of buffering is deferred to Section 4.2. Send and Receive. The inject operation of the abstract model is decomposed into a two-phase process of describe and launch, as in [17]. To send a message, an application first writes all of the .... [Table fragment, columns "Interrupt/Trap" and "Event Signaled": message available: user interrupt, raised when a message is available for reading; mismatch available: interrupt, message available with mismatched GID (or all messages when divert mode is set)] ....
John Kubiatowicz and Anant Agarwal. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the International Supercomputing Conference, July 1993.
No context found.
Kubiatowicz, J., and Agarwal, A. Anatomy of a Message in the Alewife Multiprocessor. In Proceedings of the 7th ACM International Conference on Supercomputing (July 1993).