Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.
....unique goal [12, 25, 30, 31], but FM is distinguished by its hardware context (Myrinet) and high performance. The Fast Messages project focuses on optimizing the software messaging layer that resides between lower-level communication services and the hardware. It is available on both the Cray T3D [22, 23] and Myricom's Myrinet [6]. Using the Myrinet, FM provides MPP-like communication performance on workstation clusters. FM on the Myrinet achieves low-latency, high-bandwidth messaging for short messages, delivering 32 µs latency and 16 MBytes/s bandwidth for 128-byte packets (user level to ....
....SBus bandwidth, and hence are not critical performance factors. 3 The Fast Messages Approach 3.1 Illinois Fast Messages (FM) 1.0 Illinois Fast Messages (FM) is a high-performance messaging layer that is available on several parallel platforms (Cray T3D and workstation clusters) [22, 23]. The design goal of FM is to deliver network hardware performance to the application level with a simple interface. FM is appropriate for implementors of compilers, language runtimes, communications libraries, and in some cases application programmers. Function Operation FM send ....
Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.
....The runtime system exposes specialized versions [22] of important runtime primitives, such as remote method invocation and synchronization via futures [15], to the compiler to exploit compile-time information. In addition, communication is realized via low-overhead messaging layers: Fast Messages [23, 24] on the CRAY T3D and Active Messages [48] on the TMC CM-5. In addition to the standard features of the programming model, IC CEDAR also utilizes general placement directives for collections of objects (similar to map arrays in HPF) for spatial-based object distribution and grouping. 3. IC CEDAR ....
Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.
....between application code and the NIU is possible. The absolute performance of the message-passing mechanism in StarT Voyager, with message latency between 1 and 5 µs and bandwidth of over 100 MBytes/sec with messages of only 88 bytes, is superior to all but the very best supercomputers today [20, 22, 21, 9, 6, 12, 13, 24]. Shared Memory Performance Shared memory performance is highly dependent on cache miss ratios. By implementing S-COMA in hardware with HAL bits, StarT Voyager has the capability of having a 64 MB L3 cache, which will drastically reduce capacity and conflict misses in many programs. To give an ....
V. Karamcheti and A. Chien. FM: Fast Messaging on the Cray T3D. (Work at Concurrent Systems Architecture Group, Dept of Computer Science, University of Illinois at Urbana-Champaign. Available from http://www-csag.cs.uiuc.edu/projects/comm/fm.html in Jul 96).
....speed Location neighbor Overhead Overhead Bandwidth (MHz) latency (µs) (µs) (MByte/sec) (µs)
StarT Voyager (AM) 150 memory bus 1.25 0.13 0.51 140
CM-5 (CMAM) [30] 33 memory bus 5.7 1.5 1.25 10
Paragon (AM) [31] 50 memory bus 8.4 2.2 1.0 145
Meiko CS-2 (AM) [14] 66 memory bus 10.8 1.7 1.6 39
T3D (FM) [24] 200 memory bus 10 4 N.A.
HP Medusa (AM) [32] 99 graphics bus 10.15 N.A. N.A. 12
StarT Jr (AM) [20] 120 I/O bus 27 1.5 1.1 7
Sparc20 ATM (AM) [10] 50 I/O bus 33 3 14
SP-2 (AM) [10] 66 I/O bus 25.5 4 34
Sparc20 Myrinet (AM) [14] 50 I/O bus 15.7 2.0 2.6 20
Sparc20 Myrinet (FM) [35] 50 I/O bus 25 N.A. N.A. ....
....to other machines. Aside from StarT Voyager's numbers, the numbers in Figure 8 are reported by other researchers, and were measured either with Active Message (AM) [44] or Illinois Fast Messages (FM) [35], both of which are very fast message-passing libraries, and are taken from many articles [30, 32, 31, 14, 10, 20, 24, 35]. It is impossible to conduct a completely fair comparison because the machines are from different time periods, not all the measurements reported are for the same message size (varies from zero to several words of payload), and some implementations include sliding window protocols to deal with ....
V. Karamcheti and A. Chien. FM: Fast Messaging on the Cray T3D. (Work at Concurrent Systems Architecture Group, Dept of Computer Science, University of Illinois at Urbana-Champaign. Available from http://www-csag.cs.uiuc.edu/projects/comm/fm.html in Jul 96).
....speed Location neighbor Overhead Overhead Bandwidth (MHz) latency (µs) (µs) (MByte/sec) (µs)
StarT Voyager (raw) 150 memory bus 0.9 0.05 0.17 158
CM-5 (CMAM) [30] 33 memory bus 5.7 1.5 1.25 10
Paragon (AM) [31] 50 memory bus 8.4 2.2 1.0 145
Meiko CS-2 (AM) [15] 66 memory bus 10.8 1.7 1.6 39
T3D (FM) [24] 200 memory bus 10 4 N.A.
HP Medusa (AM) [32] 99 graphics bus 10.15 N.A. N.A. 12
StarT Jr (AM) [21] 120 I/O bus 27 1.5 1.1 7
Sparc20 ATM (AM) [11] 50 I/O bus 33 3 14
SP-2 (AM) [11] 66 I/O bus 25.5 4 34
Sparc20 Myrinet (AM) [15] 50 I/O bus 15.7 2.0 2.6 20
Sparc20 Myrinet (FM) [35] 50 I/O bus 25 N.A. N.A. ....
....to other machines. Aside from StarT Voyager's numbers, the numbers in Figure 9 are reported by other researchers, and were measured either with Active Message (AM) [44] or Illinois Fast Messages (FM) [35], both of which are very fast message-passing libraries, and are taken from many articles [30, 32, 31, 15, 11, 21, 24, 35]. It is impossible to conduct a completely fair comparison because the machines are from different time periods, not all the measurements reported are for the same message size (varies from zero to several words of payload), and some implementations include sliding window protocols to deal with ....
V. Karamcheti and A. Chien. FM: Fast Messaging on the Cray T3D. (Work at Concurrent Systems Architecture Group, Dept of Computer Science, University of Illinois at Urbana-Champaign. Available from http://www-csag.cs.uiuc.edu/projects/comm/fm.html in Jul 96).
....given request completes. 5 Experimental Results We empirically evaluated DPA using two applications from the SPLASH-2 suite [38], Barnes-Hut and FMM, on the Cray T3D [9], a distributed memory machine with support for one-sided remote put/get memory operations. We use Illinois Fast Messages (FM) [22, 26] for our message-passing implementation. We selected these applications because they rely on sophisticated PBDSs for efficiency, and previous studies [3, 14, 29, 34, 35, 37] show that locality optimizations are crucial to achieve good performance. Our codes are in ICC and are adapted from the ....
Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.
....unique goal [12, 25, 30, 31], but FM is distinguished by its hardware context (Myrinet) and high performance. The Fast Messages project focuses on optimizing the software messaging layer that resides between lower-level communication services and the hardware. It is available on both the Cray T3D [22, 23] and Myricom's Myrinet [6]. Using the Myrinet, FM provides MPP-like communication performance on workstation clusters. FM on the Myrinet achieves low-latency, high-bandwidth messaging for short messages, delivering 32 µs latency and 16 MBytes/s bandwidth for 128-byte packets (user level to ....
....SBus bandwidth, and hence are not critical performance factors. 3 The Fast Messages Approach 3.1 Illinois Fast Messages (FM) 1.0 Illinois Fast Messages (FM) is a high-performance messaging layer that is available on several parallel platforms (Cray T3D and workstation clusters) [22, 23]. The design goal of FM is to deliver network hardware performance to the application level with a simple interface. FM is appropriate for implementors of compilers, language runtimes, communications libraries, and in some cases application programmers. Function Operation FM send ....
Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.
....utilizes stack-based sequential execution and creates threads lazily only when required. The runtime system exposes a hierarchy of high-performance COOP primitives [11] to allow compile-time specialization. In addition, communication is realized via low-overhead messaging layers: Fast Messages [12, 13] on the CRAY T3D and Active Messages [24] on the TMC CM-5. 3 Application Structure Our implementation of SAMR code was originally written in C (making use of Fortran libraries) and then ported to the Illinois Concert system for parallel execution. We first discuss the basic structure of the ....
Vijay Karamcheti and Andrew A. Chien. FM---fast messaging on the Cray T3D. Available from http://www-csag.cs.uiuc.edu/papers/t3d-fm-manual.ps, February 1995.