Jiun-Ming Hsu and Prithviraj Banerjee. A message passing coprocessor for distributed memory multicomputers. In Proceedings of Supercomputing '90, pages 720--729, November 1990.
....has finished so it can dispatch the arrived message. The main disadvantage of traditional network interface is that message passing costs are usually thousands of CPU cycles. One solution to the problem of software overhead is to add a separate processor on every node just for message passing [12, 8]. Recent examples of this approach are the Intel Paragon [9] and Meiko CS-2 [7]. The basic idea is for the compute processor to communicate with the message processor through either mailboxes in shared memory or closely coupled datapaths. The compute and message processors can then work in ....
....in message passing performance. Until recently, this interest was primarily restricted to network interfaces for multicomputers. An increasingly common multicomputer approach to the problem of user-level transfer initiation is the addition of a separate processor to every node for message passing [16, 10]. Recent examples are the Stanford FLASH [14], Intel Paragon [11], and Meiko CS-2 [9]. The basic idea is for the compute processor to communicate with the message processor through either mailboxes in shared memory or closely coupled datapaths. The compute and message processors can then work ....
....or workstations. The main disadvantage is that message passing costs are usually thousands of CPU cycles, with the best implementation [32] still requiring over 100 CPU cycles. One solution to the problem of software overhead is to add a separate processor on every node just for message passing [22, 13]. Recent examples of this approach are the Intel Paragon [14] and Meiko CS-2 [12]. The basic idea is for the compute processor to communicate with the message processor through either mailboxes in shared memory or closely coupled datapaths. The compute and message processors can then work in ....
....finished so that it can dispatch the arrived message. The main disadvantage of traditional network interfaces is that message passing costs are usually thousands of CPU cycles. One solution to the problem of software overhead is to add a separate processor on every node just for message passing [15, 10]. Recent examples of this approach are the Intel Paragon [12] and Meiko CS2 [9]. The basic idea is for the compute processor to communicate with the message processor through either mailboxes in shared memory or closely coupled datapaths. The compute and message processors can then work in ....
.... traditional network interfaces and thus their implementations of the NX message passing library manage communication buffers in the kernel [37, 35]. Current machines like the Intel Paragon and Meiko CS-2 attack software overhead by adding a separate processor on every node just for message passing [34, 25, 23, 22, 20]. This approach, however, does not eliminate the overhead of the software protocol on the message processor, which is still tens of microseconds in software overhead. Distributed systems offer a wider range of communication abstractions, including remote procedure call [8, 39, 4], ordered ....
....from the past and present are shown in Figure 2-1. The round-trip cost, further explained in Table 2.1, is roughly based on a two-way null remote procedure call (RPC) or a ping-pong operation. It is obtained by doubling the reported value when only the one-way cost is provided in the literature [19, 20, 21, 15, 22, 23, 24, 25, 26, 27]. Since an actual implementation for [18] does not exist, the round-trip cost is extrapolated from the specified overhead of assembling, sending and receiving a remote read message. On the horizontal axis, Figure 2-1 also shows that the systems employ a variety of mechanisms for robustness. ....
.... dedicated message buffers, and trusted message handlers are enforced through guarded pointers [13]. The high cost of operating system involvement in the message interface is evident in the original Intel iPSC/2, where 85% of the communication time for a short message is spent in software overhead [19]. Context switches between user and system mode alone account for 18% of that overhead. The overall latency is however reduced by almost 7× when a co-processor is added to relieve the operating system [19]. More recently, the Shrimp [25] system uses a user-level direct memory access (UDMA) ....
[Article contains additional citation context not shown here]
Jiun-Ming Hsu, Prithviraj Banerjee, "A Message Passing Coprocessor for Distributed Memory Multicomputers", in SuperComputing 1990, pp. 720--729.