13 citations found.
F. Petrini, S. Coll, E. Frachtenberg, and A. Hoisie "Performance Evaluation of the Quadrics Interconnection Network", Journal of Cluster Computing, 6(2): 125-142, April 2003


This paper is cited in the following contexts:
An Evaluation of Current High-Performance Networks - Bell, Bonachea, Cote.. (2003)   (6 citations)

....for a large class of multiprocessor systems: Convex, Cray, IBM, Intel, KSR, Meiko, nCUBE, NEC, SGI and TMC using a ping-pong benchmark. Luecke et al. [15] evaluate the communication performance of Linux and NT clusters, the Cray Origin 2000, IBM SP and Cray T3E. More recently, Petrini et al. [22] examine the performance of Quadrics networks using uni- and bi-directional ping benchmarks. A different school of thought adopts a more detailed model of the network performance. In 1993, Culler et al. [3] introduced the LogP performance model of parallel computation. Their model is built upon ....

F. Petrini, A. Hoisie, W. Feng, and R. Graham. Performance evaluation of the Quadrics interconnection network. In Workshop on Communication Architecture for Clusters, 2001.


An Evaluation of Current High-Performance Networks - Christian Bell Dan (2003)   (6 citations)

....for a large class of multiprocessor systems: Convex, Cray, IBM, Intel, KSR, Meiko, nCUBE, NEC, SGI and TMC using a ping-pong benchmark. Luecke et al. [14] evaluate the communication performance of Linux and NT clusters, the Cray Origin 2000, IBM SP and Cray T3E. More recently, Petrini et al. [21] examine the performance of Quadrics networks using uni- and bi-directional ping benchmarks. A different school of thought adopts a more detailed model of the network performance. In 1993, Culler et al. [3] introduced the LogP performance model of parallel computation. Their model is built upon ....

F. Petrini, A. Hoisie, W. Feng, and R. Graham. Performance evaluation of the Quadrics interconnection network. In Workshop on Communication Architecture for Clusters, 2001.


Ultra-High Performance Communication with MPI and the Sun.. - Sistare, Jackson (2002)   (2 citations)

....errors which have occurred since the previous barrier. 2.9 Comparison to Other Interconnects. The ability to expose local memory in a network address space is not new. It has been supported in hardware by a number of interconnects over the years, including SCI, Memory Channel [10], QsNET [11], and more recently the InfiniBand™ architecture [12]. Of these, a Sun Fire Link adapter is most similar to an SCI adapter in that remote operations are driven by programmed I/O (PIO). Both adapters support read and write transactions, whereas Memory Channel only supports remote writes. However, ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. A Performance Evaluation of the Quadrics Interconnection Network, In Workshop on Communication Architecture for Clusters 2001.


Performance Evaluation of the Quadrics Interconnection.. - Petrini, Frachtenberg, .. (2003)   (5 citations)  Self-citation (Petrini Hoisie)

....performance degradation. In this experiment we attempt to write into the same memory location on node 0 from an increasing number of processors (one per SMP). This test provides information on the behavior of a single I/O node when serving multiple simultaneous requests. Previous results [15] have shown that read and write operations provide no significant differences. The aggregate bandwidth plots are depicted in figure 16(a) for 1 MByte messages, using uniform and exponential time (T tag) and message size (S tag) distributions. The curves are approximately flat up to 32 nodes, ....

....are used interchangeably. 6. Other PCI buses running at 66 MHz rather than at 33 MHz, for example, those based on the Serverworks HE chipsets, don't suffer from these limitations and can provide almost the same communication performance when the buffers are placed in main or in Elan memory [15]. 7. A shortcut to indicate that the first half of the machine is devoted to computation (the second half will be allocated to I/O in this case) with the I/O nodes clustered in the last segment of the network. ....

F. Petrini, A. Hoisie, W. Chun Feng and R. Graham, Performance evaluation of the quadrics interconnection network, in: Workshop on Communication Architecture for Clusters (CAC'01), San Francisco, CA, April 2001.


Using Multirail Networks in High-Performance Clusters - Coll, Frachtenberg.. (2001)   (4 citations)  Self-citation (Petrini Hoisie)

.... with a programmable network interface called Elan [7] and a low-latency, high-bandwidth communication switch called Elite [8]. Elites can be interconnected in a fat-tree topology [4]. A recent performance evaluation of the QsNet shows that the network performance is seriously limited by the PCI bus [5]. In fact, the network can deliver almost MB/sec at user level ( MB/sec of raw bandwidth) but the PCI implementation can sustain only MB/sec, using the most efficient PCI chipset on the market. The presence of bidirectional traffic further degrades performance, limiting the ....

....which needs to process incoming control packets and perform the reservation protocol without interfering with the processors in the SMP. Fast response time in the NIC is essential to limit the overhead of this protocol for the protocol's overhead to be justified. This is the case of the QsNet [5], which is equipped with a thread processor that can read an incoming packet, do some basic processing and send a reply in as few as . Finally another dynamic allocation scheme is proposed, called hybrid, which allows bidirectionality for small messages, thus minimizing the protocol ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Static Allocation of Multirail Networks - Coll, Frachtenberg, Petrini.. (2001)   Self-citation (Petrini Hoisie)

.... a programmable network interface called Elan [7] and a low-latency, high-bandwidth communication switch called Elite [8]. Elites can be interconnected in a fat-tree topology [3]. A recent performance evaluation of the QsNet shows that the network performance is seriously limited by the PCI bus [5]. In fact, the network can deliver almost MB/sec at user level ( MB/sec of raw bandwidth) but the PCI implementation can sustain only MB/sec, using the most efficient PCI chipset on the market. A further performance degradation in the presence of bidirectional traffic limits the ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Using Multirail Networks in High Performance - Coll, Frachtenberg, Petrini.. (2003)   Self-citation (Petrini Hoisie)

.... a programmable network interface called Elan [10] and a low-latency, high-bandwidth communication switch called Elite [11]. Elites can be interconnected in a fat-tree topology [6]. A recent performance evaluation of the QsNet shows that the network performance is seriously limited by the PCI bus [8]. In fact, the network can deliver almost MB/sec at user level ( MB/sec of raw bandwidth) but the PCI implementation can sustain only MB/sec, using the most efficient PCI chipset on the market. The presence of bidirectional traffic further degrades performance, limiting the ....

....which needs to process incoming control packets and perform the reservation protocol without interfering with the processors in the SMP. Fast response time in the NIC is essential to limit the overhead of this protocol for the protocol's overhead to be justified. This is the case of the QsNet [8], which is equipped with a thread processor that can read an incoming packet, do some basic processing and send a reply in as few as . Finally another dynamic allocation scheme is proposed, called hybrid, which allows bidirectionality for small messages, thus minimizing the protocol ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Using Multirail Networks in High Performance Clusters - Coll, Frachtenberg.. (2001)   (4 citations)  Self-citation (Petrini Hoisie)

.... with a programmable network interface called Elan [6] and a low-latency, high-bandwidth communication switch called Elite [7]. Elites can be interconnected in a fat-tree topology [2]. A recent performance evaluation of the QsNet shows that the network performance is seriously limited by the PCI bus [4]. In fact, the network can deliver almost 340 MB/sec at user level (400 MB/sec of raw bandwidth) but the PCI implementation can sustain only 300 MB/sec, using the most efficient PCI chipset on the market. A further performance degradation in the presence of bidirectional traffic limits the ....

....control packets and perform the reservation protocol without interfering with the processors in the SMP. Fast response time in the NIC is essential to limit the overhead of this protocol, otherwise the overhead can only be justified by sending very large messages. This is the case of the QsNet [4], which is equipped with a thread processor that can read an incoming packet, do some basic processing and send a reply in as few as two µs. Finally another dynamic allocation scheme is proposed, called hybrid, which allows bidirectionality for small messages, thus minimizing the protocol ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Gang Scheduling with Lightweight User-Level Communication - Frachtenberg, Petrini.. (2001)   (1 citation)  Self-citation (Petrini Feng)

.... [Figure 1. Elan functional units] .... and analysis. Finally, we conclude and outline future work in Section 5. 2 Overview of QsNET and RMS. 2.1 Hardware. QsNET [12] consists of two building blocks: a programmable network interface called Elan [13] and a low-latency, high-bandwidth communication switch called Elite [14]. Elites can be interconnected in a fat-tree topology [9]. The network has several layers of communication libraries that provide trade-offs ....

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Performance Evaluation of the Quadrics Interconnection - Petrini, Coll.. (2001)   (5 citations)  Self-citation (Petrini Hoisie)

....to get a complete view of the network behavior. The patterns considered in this work are representative of real scientific applications in use at Los Alamos. One example of workload analysis is presented in [10] for SAGE (SAIC's Adaptive Grid Eulerian hydrocode), an important ASCI application. In [14] we analyzed the QsNET performance under specific load conditions to obtain the peak performance of the network and a baseline for further studies. In this paper we analyze the behavior of the Quadrics interconnect with a much broader set of workload conditions. The test bed for the network ....

.... size is depicted in Figure 11(d). 6. Other PCI buses running at 66 MHz rather than at 33 MHz, for example those based on the Serverworks HE chipsets, don't suffer from these limitations and can provide almost the same communication performance when the buffers are placed in main or in Elan memory [14]. [Plot (a): accepted load (MB/s) vs. offered load (MB/s); traffic pattern: uniform, 16 nodes; curves for T uniform/S uniform, T uniform/S exponential, T exponential/S uniform, T exponential/S exponential] ....

[Article contains additional citation context not shown here]

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Performance Evaluation of I/O Traffic and Placement .. - Coll, Petrini.. (2002)   Self-citation (Petrini Hoisie)

....of intelligent communication protocols, and fault tolerance. The work was supported by the U.S. Department of Energy through Los Alamos National Laboratory contract W-7405-ENG-36. 1. More information on the Quadrics network can be found at http://www.c3.lanl.gov/fabrizio/quadrics.html. In [13] we analyzed the QsNET performance under specific load conditions to obtain the peak performance of the network and a baseline for further studies. Since not only the efficient support for computation related traffic patterns but a good integration with the I/O subsystem is a key issue to ....

.... in Elan memory, in order to .... 6. Other PCI buses running at 66 MHz rather than at 33 MHz, for example those based on the Serverworks HE chipsets, don't suffer from these limitations and can provide almost the same communication performance when the buffers are placed in main or in Elan memory [13]. [Plots: (a) bidirectional ping bandwidth (MB/s) vs. message size (bytes), MPI; Elan3, Elan to Elan; Elan3, Main to Main; bidirectional ping latency vs. message size (bytes)] ....

[Article contains additional citation context not shown here]

Fabrizio Petrini, Adolfy Hoisie, Wu-chun Feng, and Richard Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters (CAC '01), San Francisco, CA, April 2001.


Symmetric Data Objects and Remote Memory Access.. - Nieplocha..

No context found.

F. Petrini, S. Coll, E. Frachtenberg, and A. Hoisie "Performance Evaluation of the Quadrics Interconnection Network", Journal of Cluster Computing, 6(2): 125-142, April 2003


Micro-Benchmark Level Performance Comparison of.. - Liu.. (2003)

No context found.

F. Petrini, A. Hoisie, W. chun Feng, and R. Graham. Performance Evaluation of the Quadrics Interconnection Network. In Workshop on Communication Architecture for Clusters 2001.

