14 citations found.
F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa. The Design and Implementation of Zero-Copy MPI using Commodity Hardware with a High Performance Network, pages 243--250. ACM SIGARCH ICS'98. July 1998.

This paper is cited in the following contexts:
Pipelined Scheduling of Tiled Nested Loops onto.. - Athanasaki.. (2002)

....a system call, the OS switches to kernel level and assumes the copying of data from user areas to kernel areas for protection. Nevertheless, modern network technologies (i.e. SCI, Myrinet, etc.) are mitigating this startup latency with optimized communication protocols (i.e. VIA) with Zero Copy [4], DMA support and UserLevel [3] characteristics. Not only these novel network interfaces are reducing the message startup latency, but they can also alleviate the communication burden from CPU. Current parallel applications should be rescheduled to exploit these enhanced features. The parallel ....

F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa. The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. In Proceedings of the International Conference on Supercomputing, pages 243--249, Melbourne, Australia, 1998.


MPICH/Madeleine: a True Multi-Protocol MPI for High.. - Aumage, Mercier, Namyst (2000)   (3 citations)

....than 16 KB with a sustained bandwidth of 80 MB/s and more. 5.4 BIP Myrinet We now expose performance evaluation for the Myrinet network. The protocol used was BIP [15]. Comparisons are made between raw Madeleine performance and two other MPI versions over Myrinet: MPI GM ([1]) and MPICH PM ([13]). The Myrinet hardware is the same in all cases, but the performance of MPICH PM were measured on RWC PC Cluster II, which is a Pentium Pro 200 MHz based cluster (and not a dual PentiumII 450 MHz based). The libraries used in these two other versions of MPI are respectively Myricom's GM (1.2.3 ....

Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, and Yutaka Ishikawa. The Design and Implementation of Zero-Copy MPI using Commodity Hardware with a High Performance Network, pages 243--250. ACM SIGARCH ICS'98. July 1998.


The design for a high performance MPI implementation.. - Prylli, Tourancheau, .. (1999)   (13 citations)

....A MPI implementation on top of the AM is available [3]. Fast messages (FM) [11] use the same RPC like system. They include a heavy flow control protocol. The designers of FM provide an implementation of MPI [9]. PM [8] is a fast communication layer for the Myrinet network. MPI was ported to PM [10]. GM [1] is the native API for the Myrinet hardware developed by Myricom. MPI is also available on top of GM. Virtual Interface Architecture (VIA) [5]. Computer industry leaders (Microsoft, Intel and Compaq) proposed a new software architecture for an efficient access to the network hardware from ....

Francis O'Carroll, Atsushi Hori, Hiroshi Tezuka, and Yutaka Ishikawa. The design and implementation of zero copy MPI using commodity hardware with a high performance network. ACM SIGARCH ICS'98, pages 243--250, July 1998.


MPICH/Madeleine: a True Multi-Protocol MPI for High.. - Aumage, Mercier, Namyst (2000)

....7: Comparison between ch mad, Madeleine, ScaMPI and SCI MPICH 5.4 BIP Myrinet We now expose performance evaluation for the Myrinet network. The protocol used was BIP [15]. Comparisons are made between raw Madeleine performance and two other MPI versions over Myrinet: MPI GM ([1]) and MPICH PM ([13]). The Myrinet hardware is the same in all cases, but the performance of MPICH PM were measured on RWC PC Cluster II, which is a Pentium Pro 200 MHz based cluster (and not a dual PentiumII 450 MHz based). The libraries used in these two other versions of MPI are respectively Myricom's GM (1.2.3 ....

Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, and Yutaka Ishikawa. The Design and Implementation of Zero-Copy MPI using Commodity Hardware with a High Performance Network, pages 243--250. ACM SIGARCH ICS'98. July 1998.


A Software Approach for Readout and Data Acquisition in .. - Antchev, Cano..

....highly efficient programs. E. Zero Copy Streams [Figure 6: Components encapsulate various different tasks, but expose the same clean and narrow interface.] The implementation of the I/O streams aims at achieving good performance. Therefore they rely on zero copy buffers wherever possible [16] [17] [18]. In general, driver interfaces [figure labels: operating system, OS provided buffers, zero copy buffer stream component, copy buffer stream component, UDP, RingBuf, control flow, data flow, FU output stream, RUM input stream] ....

F. O'Carroll, H. Tezuka, A. Hori and Y. Ishikawa, The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network, in Proceedings of the 1998 International Conference on Supercomputing (ICS 98), pages 243-250, ACM Press, Melbourne, Australia, 1998.


MPI derived data types support in VIRTUS - Cristaldi, Iannello

....to call subprograms without limitations. The latter solution, however, would not be easily portable to other platforms. 5 Discussion and related work Several efficient implementations of MPI exist on both MPPs [7] and clusters [11] including some that use recent versions of the MPICH library [2]. All these implementations achieve remarkable performance and demonstrate that the design of the MPI standard and of its probably most popular public domain implementation, MPICH, succeeded in attaining both high efficiency and wide portability. Our own porting of MPICH on the FM platform attain ....

F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa, "The design and implementation of zero copy MPI using commodity hardware with a high performance network", Procs. of International Conference on Supercomputing, Melbourne, Australia, July 13-17, 1998, pp. 243--250.


Challenges in Data Acquisition at the Beginning of the New.. - Gutleber (1999)   (1 citation)

....configuration settings to achieve good performance. Portability of data transmission can at least be guaranteed with this approach. Message passing libraries, such as MPI [21] provide communication abstraction and are even available in optimised versions for specific communication hardware [22]. They lack, however, two properties, which are important for data acquisition: first it is not possible to mix different communication devices, and second a presentation layer is missing. With object oriented programming languages becoming ubiquitous, we are able to address the abovementioned ....

F. O'Carroll, H. Tezuka, A. Hori and Y. Ishikawa, The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network, in Proceedings of the 1998 International Conference on Supercomputing (ICS 98), pages 243-250, ACM Press, Melbourne, Australia, 1998.


Improving the Communication Subsystem Performance of WARPED - Rajasekaran (1998)   (1 citation)

....MPI BIP obtains a bandwidth of 912Mbps and 12 µsecs one way message latency. Software is available via WWW from http: lhpca.univ lyon1.fr software distrib.html. PM RWCP parallel and distributed system software TSUKUBA laboratory researchers have developed a new communication library called PM [54] for Myrinet gigabit LAN card, that has a dedicated processor and on board memory to handle a communication protocol. PM was designed to obtain high performance communication and support a multi user environment. PM consists of a daemon process, and a run time routine for a programming language. ....

O'Carroll, F., Hori, A., Tezuka, H., and Ishikawa, Y. The design and implementation of zero copy MPI using commodity hardware with a high performance network. In ACM SIGARCH ICS'98 (July 1998), pp. 243--250.


Template Based Structured Collections - Nolte, Sato, Ishikawa (2000)   (6 citations)  Self-citation (Ishikawa)

....to our reduce() semantics. The MPI implementation we used is MPICH PM CLUMP [22, 18] which is a very efficient port of MPICH for Myrinet networks. [Figure 5. reduce() vs. MPICH PM CLUMP; x-axis: Number of Objects (128PE); series: par. red 2, mpi bc red, par. red 4. Page footer: Proceedings of IPDPS 2000, pp. 483-491, ISBN 0-7695-0574-0, (c) 2000 IEEE.] Figure 5 shows that even our very first implementation is by all means in the competitive range. In case of the 4-ary tree topology our implementation shows even better performance than the corresponding MPI implementation. 6.3 Scan ....

F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa. The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. In International Conference on Supercomputing '98, pages 243--250, July 1998.


TACO - Template Based Collections for Distributed.. - Nolte, Sato, Ishikawa   Self-citation (Ishikawa)

....3. Basic collective operations For comparison with a standard communication library we measured our reductions against collective operations of MPI. An MPI_Bcast() followed by an MPI_Reduce() (mpi bc reduce) is closest to our reduce() semantics. The MPI implementation we used is MPICH PM CLUMP [19, 17] which is a very efficient port of MPICH for Myrinet networks. Figure 3 shows that our very first implementation is by all means in the competitive range and shows even better performance than the corresponding MPI implementation. 5.2 Scan Operations In a scan operation all members of a collection ....

F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa. The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. In International Conference on Supercomputing '98, pages 243--250, July 1998.


Implementation and Evaluation of MPI on an SMP Cluster - Takahashi, O'Carroll.. (1999)   (6 citations)  Self-citation (Tezuka Hori Ishikawa)

....provided. This feature is used by the SCore D gang scheduler which realizes the multi processes environment. MPICH PM MPICH PM is an MPI library on top of PM, based on MPICH [8]. Using the PM remote memory write feature, MPICH PM achieves high bandwidth. MPICH PM has been developed for UP clusters [10]. MPICH PM uses the eager and the rendezvous protocols internally, supported by the MPICH implementation. The eager protocol pushes a message into the network as soon as an MPI send primitive is issued. When a message arrives at the receiver, MPICH stores the message into a temporary buffer if an ....

Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, and Yutaka Ishikawa. The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. In ICS'98, July 1998.


RWC PC Cluster II and SCore Cluster System.. - Ishikawa, Tezuka, .. (1999)   (8 citations)  Self-citation (O'Carroll Tezuka Hori Ishikawa)

....which is a restricted quantity resource under a paging memory system. Allocation of pinned down memory by multiple simultaneous requests for sending and receiving without a control can cause deadlock. MPICH PM, based on the MPICH implementation, has overcome this issue and achieves good performance [8]. MPICH PM CLUMP, the successor of MPICH PM, supports a cluster of multiprocessors, also called a CLUMP. Using MPICH PM CLUMP, the MPI legacy programs run on CLUMP without any modifications. For example, suppose a CLUMP consisting of 16 nodes each of which contains dual processors. MPICH PM CLUMP ....

Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, and Yutaka Ishikawa. The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. In International Conference on Supercomputing '98, pages 243--250, July 1998.


PM2: High Performance Communication Middleware for .. - Takahashi.. (2000)   (1 citation)  Self-citation (Hori Ishikawa)

....cluster installed in a lower level environment and then use a large configuration cluster at a computer center as a production run or testbed for further scalability optimization. However, because high performance communication libraries have been developed for specific network interfaces [1, 2, 3, 4, 5], the user program must be recompiled to use such a library when the program has previously been compiled for another library. 2. Since the users have a chance to access several clusters within their site (here we do not assume a global computing environment), it is a natural idea that those ....

Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, and Yutaka Ishikawa, "The Design and Implementation of Zero-Copy MPI Using Commodity Hardware with a High Performance Network," In International Conference on Supercomputing '98, pages 243--250, ACM, July 1998.


MPICH/Madeleine: a True Multi-Protocol MPI for High.. - Aumage, Mercier, Namyst (2001)   (3 citations)

No context found.

F. O'Carroll, H. Tezuka, A. Hori, and Y. Ishikawa. The Design and Implementation of Zero-Copy MPI using Commodity Hardware with a High Performance Network, pages 243--250. ACM SIGARCH ICS'98. July 1998.
