35 citations found.
S. L. Scott, J. R. Goodman, and M. K. Vernon. Performance of the SCI Ring. pages 403--414, ACM Press, Gold Coast, Australia, May 1992.

This paper is cited in the following contexts:

Low-Latency SCI Systems - Gonzalez

....therefore provides a means to project network behavior in the absence of an expensive hardware testbed and without requiring the use of complex, computationally intensive simulative models. Analytical modeling of SCI has traditionally focused on cache coherency modeling [1] or queue modeling [16] of SCI components. Relatively little work exists for analytical modeling of SCI developed from an architectural perspective. Such a perspective is necessary to identify bottlenecks for various systems and provide insight into scalability and performance as a function of architectural system ....

Scott S., Goodman J., Vernon M., Performance of the SCI Ring, in: Proceedings of the 19th Annual International Symposium on Computer Architecture, Gold Coast, Australia, May 1992, pp. 403-414.


Methods for Performance Evaluation of Wormhole-Switched Networks - Nilsen (1998)

....Analytical work has its origin at the beginning of the century for telephone traffic [112]. Significant contributions were made in the mid sixties by [10, 91]. Today [92, 93] serve as classical sources for the computer communication community. Some modern references to analytical modeling are [18, 72, 139, 140]. A common characteristic for analytical methods is that various stochastic independence assumptions are imposed. In fact, this seems to pervade the entire field. The reason is simply that models become intractable otherwise. As an example, the renowned Independence Assumption [91] for arriving ....

SCOTT, S., GOODMAN, J., AND VERNON, M. Performance of the SCI ring. In Proc. of the 19th Annual Int. Symposium on Computer Architecture, ACM SIGARCH Computer Architecture News (1992), pp. 403--414.


Hardware Techniques To Improve The Performance Of The.. - Burger (1998)   (10 citations)

....as the IEEE/ANSI standard Scalable Coherent Interface [66, 111] seem well suited for this kind of operation. On a ring, operations are observed by all nodes if the sender is responsible for removing its own message. We envision a ring interconnect because of the high performance capability [101], but broadcast on a ring is complicated by the fact that operands originating at different processors are received at other nodes in different orders. A simple tag can sort out data to different addresses, but the issue is complicated when two accesses to the same datum are broadcast close in ....

Steven L. Scott, James R. Goodman, and Mary K. Vernon. Performance of the SCI Ring. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 403--414, May 1992.


Mechanisms for Efficient Shared-Memory, Lock-Based Synchronization - Kagi (1999)   (2 citations)

....which causes severe increases in simulation time [BW95]. Although I model contention at the target node interfaces, memory, and memory directories, using a constant network latency ignores contention in the network itself. To account for network contention, I use an analytical model [SGV92] (which takes the network load as a parameter) to derive a different average network latency for each benchmark. I estimate the aggregate network load from the traffic statistics of previous simulations and their total execution times. Since the network latency affects execution time and therefore ....

Steven L. Scott, James R. Goodman, and Mary K. Vernon. Performance of the SCI ring. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 403--414, May 1992.


Models of Machines and Computation for Mapping in Multicomputers - Norman, Thanisch (1993)   (55 citations)

....extremely difficult. Queueing theory based approaches have been successful to some degree in analysing the performance of more complex interconnects. Good examples are Dally's analysis of the k-ary n-cube [Dally 1990], Chittor and Enbody's work on the iPSC/860 [Chittor and Enbody 1992], and [Scott et al. 1992] on the proposed IEEE standard Scalable Coherent Interconnect [IEEE 1991]. We discuss work in this area that relates directly to the mapping problem in relevant sections. 5 No Task Precedence. The simplest and the most computationally tractable models of parallel computation are those where ....

Scott, S., Goodman, J., and Vernon, M. (1992). Performance of the SCI ring. In Proceedings of the 19th International Symposium on Computer Architecture, pages 403--414. IEEE Computer Society Press.


Bidirectional Ring: An Alternative to the Hierarchy of.. - Muhammad Jaseemuddin And (1995)

....of each network under many possible interesting situations, and it also makes the simulation of large scale networks manageable. In [1] real applications are used to analyze medium scale systems using a slotted ring and running different cache coherence protocols. A different approach is used in [7], where first an analytical model of the SCI ring is developed and then a simple synthetic workload is used to analyze and validate the analytical model with the simulation of the actual network. This section explains the environment and parameters of the simulator. Section 4.1 describes the ....

S.L. Scott, J.R. Goodman, and M.K. Vernon, Performance of the SCI Ring, Proc. ISCA, pp. 403-412, May 1992.


The Performance of SCI Memory Hierarchies - Roberto Hexsel Nigel (1994)   (1 citation)

....networks yield lower latency and higher bandwidth, especially for high dimensional networks. The optimal dimensionality of pipelined networks is higher than that of synchronous networks and they should be grown by increasing the dimensionality while keeping the radix unchanged. Scott et al., in [24], and Scott, in [26], present an analytical model of the SCI logical communication protocol. The model is based on M/G/1 queues and the ring is modeled as an open system. Their results indicate that the flow control mechanism is effective in preventing starvation and in reducing the effects of a ....

....to pages mapped to memory on other nodes are called remote references. Simulation Methodology. The simulator consists of an approximate model of the SCI link interfaces and of a detailed model of the distributed cache coherence protocol. The model of the ring interfaces is similar to those in [25, 24, 23] but rather than using statistical analysis, traffic related values are measured and directly influence the behaviour of the simulated system. The model of the cache coherence protocol mimics the typical set protocol as defined in [18]. The address sequences used to drive the simulator are ....

[Article contains additional citation context not shown here]

S L Scott, J R Goodman, and M K Vernon. Performance of the SCI ring. In Proc. 19th Int. Symp. on Comp. Arch., pages 403--414. ACM SIGARCH Comp Arch News 20(2), May 1992.


Integrating Reliable Memory in Databases - Ng, Chen (1998)   (6 citations)

....provides low bandwidth access to the data in memory while rebooting or during hardware failures. A more expensive solution is to eagerly replicate the file cache data on a remote computer [Papathanasiou97] using a high speed network such as Memory Channel [Gillett96], Scalable Coherent Interface [Scott92], or a LAN (Fig. 3b). This provides higher availability if one machine is permanently disabled. The goal of this paper is to explore how to use the Rio file cache to provide reliable memory for databases. Database systems traditionally encounter two problems in trying to use buffer caches managed ....

Scott SL, Goodman JR, Vernon MK (1992) Performance of the SCI Ring. In: Abramson D, Gaudiot J (eds) Proceedings of the 1992 International Symposium on Computer Architecture, ACM Press, pp 403--414


On Topology and Bisection Bandwidth of Hierarchical-ring.. - Ravindran, Stumm (1998)

....need not occur in a single network cycle. Our assumption that the network cycle time is a factor of two slower than the processor cycle time is justified from the fact that for a 5ns processor cycle time (200 MHz) our ring cycle time of 10ns is close to that used in SCI performance studies [6]. (Footnote 4: SCI specifies a ring cycle time of 2ns with 4 ring cycles required to transfer a packet from the input of one node to the input of the neighboring node.) [Table: Parameter, Value, Description; n, number of processors; b = 1, number of memory banks] ....

S. Scott, J. R. Goodman, and M. K. Vernon, "Performance of SCI ring," Proc. Intl. Symp. on Computer Architecture, pp. 403-414, 1992.
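
The cycle-time comparison quoted in the context above can be checked with a few lines of arithmetic. This is only a minimal sketch using the figures given in the excerpt (a 5 ns processor cycle, an assumed 10 ns ring cycle, and a 2 ns SCI ring cycle with 4 cycles per node-to-node transfer); the function and variable names are hypothetical and do not come from any of the cited papers.

```python
def clock_mhz(cycle_ns: float) -> float:
    """Clock frequency in MHz for a cycle time given in nanoseconds."""
    return 1000.0 / cycle_ns


def hop_time_ns(ring_cycle_ns: float, cycles_per_hop: int) -> float:
    """Time for a packet to go from one node's input to the next node's input."""
    return ring_cycle_ns * cycles_per_hop


# Figures quoted in the excerpt (names are illustrative only).
processor_cycle_ns = 5.0
assumed_ring_cycle_ns = 10.0
sci_hop_ns = hop_time_ns(ring_cycle_ns=2.0, cycles_per_hop=4)

print(clock_mhz(processor_cycle_ns))               # processor clock in MHz
print(assumed_ring_cycle_ns / processor_cycle_ns)  # ring-to-processor cycle ratio
print(sci_hop_ns)                                  # SCI node-to-node time in ns
```

Under these numbers the SCI node-to-node time works out to 8 ns, which is the sense in which the excerpt calls its 10 ns ring cycle "close".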


CC-NUMA Page Table Management and Redundant Linked List Based.. - Vlaovic

.... [2] For SCI systems, depending on the implementation, the bandwidth can be 1 Gbyte/second (GaAs chips and special cable up to 10 meters) or 1 Gbit/second (CMOS chips and fiber optical cables). The communication latency between two nodes is in the range of sub-microsecond to tens of microseconds [28]. Cache Coherence for Networks. For distributed systems, the issue of cache coherence was not addressed previously since it was seemingly not practical to maintain coherent caches among remote machines. As a result, large size data replication [Figure: CPU D (head), CPU C (mid), CPU B (mid), CPU A (tail)] ....

Steve Scott, James Goodman, and Mary Vernon. Performance of the sci ring. In Proceedings of 19th INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, May 1992.


LIGHTNING Network and Systems Architecture - Dowd, al. (1996)   (14 citations)

....a 16 bit word is transmitted on all 16 wavelengths, one bit per wavelength [6]. It is a deflection routed network and the receiving node is responsible to reassemble the words into the PDU and support ARQ. A single wavelength approach has also been suggested: Scalable Coherent Interface (SCI) [7, 8]. This IEEE standard is a multiple ring single wavelength based network. However, there are no inherent architectural limitations which would stop SCI from incorporating WDM in the future. SCI is designed to provide a coherent address space among the interconnected nodes. This paper briefly ....

S. Scott, J. Goodman, and M. K. Vernon, "Performance of SCI Ring," in Proc. 19th International Symposium Computer Architecture, pp. 403--414, May 1992.


Performance Issues in the Design of Hierarchical-ring and Direct .. - Ravindran (1998)

....need not occur in a single network cycle. Our assumption that the network cycle time is a factor of two slower than the processor cycle time is justified from the fact that for a 5ns processor cycle time (200 MHz) our ring cycle time of 10ns is close to that used in SCI performance studies [82]. (Footnote 9: All simulation results have confidence interval half widths of 1% or less at a 95% confidence level, except near saturation where the confidence interval half width may increase to a few percent.) 3.5 Program-driven Simulation. For our program-driven simulation, we simulate a cache coherent ....

S. Scott, J. R. Goodman, and M. K. Vernon, "Performance of SCI ring," Proc. Intl. Symp. on Computer Architecture, pp. 403-414, 1992.


The Performance of Cache-Coherent Ring-based Multiprocessors - Barroso, Dubois (1992)   (12 citations)

....that of the ring slots. With the exception of WATER8 and WATER16 the bus shows utilization levels above 50%, and is frequently saturated. In contrast, the slotted ring utilizations are always below 50%, and seldom above 30%. Considering that today's high speed buses are clocked at 20 to 100 nsec [27] and that, barring any breakthrough in bus technology, these values are expected to improve rather slowly, it is safe to say that new bus systems with high performance microprocessors will probably be limited to up to 8 processors. The evaluation results shown here also indicate that the slotted ....

....others have modeled the performance of ring LANs. Still in the context of distributed systems, Delp et al. proposed a token ring distributed shared memory system (Memnet [9]) with cache coherence maintained in hardware by means of a snooping-like coherence protocol. More recently Scott et al. [27] have analyzed the performance of the SCI ring, which is an implementation of the register insertion access control strategy. They model the ring as an M/G/1 queue and derive the expected latency of messages with respect to network throughput, assuming an exponentially distributed arrival of ....

S. Scott, J. Goodman and M. Vernon, "Performance of the SCI Ring", Proceedings of the 19th International Symposium on Computer Architecture, June 1985.


Performance Evaluation of the Slotted Ring Multiprocessor - Barroso, Dubois (1995)   (15 citations)

....which simulate the caches and protocols in detail. In the context of distributed systems, Delp et al. proposed a token ring distributed shared memory system (called Memnet [7]) with cache coherence maintained in hardware by means of a snooping-like coherence protocol. More recently Scott et al. [16] have analyzed the performance of the SCI ring, which is an implementation of the register insertion access control strategy. They model the ring as an M/G/1 queue and derive the expected latency of messages with respect to network throughput, assuming an exponential distribution for arrival ....

S. Scott, J. Goodman and M. Vernon, "Performance of the SCI Ring", Proceedings of the 19th International Symposium on Computer Architecture, pp. 403-414, Gold Coast, Australia, June 1992.
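
Several of the contexts above describe the cited paper as modeling the ring with M/G/1 queues and deriving expected message latency as a function of load. As a rough illustration of what such a model computes, the sketch below evaluates the standard textbook Pollaczek-Khinchine mean-value formula for a single M/G/1 queue. It is not the model of Scott et al., whose details are not given on this page; the function names and the example parameter values are assumptions made for illustration.

```python
def mg1_mean_wait(arrival_rate: float, mean_service: float,
                  second_moment_service: float) -> float:
    """Mean time spent waiting in queue for an M/G/1 queue.

    Pollaczek-Khinchine: W_q = lambda * E[S^2] / (2 * (1 - rho)),
    where rho = lambda * E[S] is the server utilization.
    """
    rho = arrival_rate * mean_service
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))


def mg1_mean_latency(arrival_rate: float, mean_service: float,
                     second_moment_service: float) -> float:
    """Mean total latency: queueing delay plus one service time."""
    return mean_service + mg1_mean_wait(
        arrival_rate, mean_service, second_moment_service)


# Illustrative values only: unit mean service time, load of 0.5.
# Exponential service has E[S^2] = 2 * E[S]^2; fixed-length service has E[S^2] = E[S]^2.
print(mg1_mean_latency(0.5, 1.0, 2.0))  # exponential service times
print(mg1_mean_latency(0.5, 1.0, 1.0))  # deterministic service times
```

The formula shows the general shape such analyses report: latency grows without bound as utilization approaches 1, and for a fixed load it is lower when service times are less variable.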


Performance Evaluation of Hierarchical Ring-Based Shared.. - Mark Holliday (1992)   (12 citations)

....the processor modules at the nodes of the network (that is, using a direct network), communication locality is exploited to reduce network traffic and memory latency. Second, bit-parallel, unidirectional, slotted rings have been found to be effective at maximizing link bandwidth in direct networks [26]. The advantages of unidirectional rings include: 1) with their point-to-point connections, they can run at high clock speeds, 2) it is easy to make full use of their bandwidth, 3) they provide a natural broadcast mechanism, and 4) they allow easy addition of extra nodes. Third, a single slotted ....

....we assume a ring cycle time of 10ns. We do so because we define ring cycle time as the time required for a packet to move from the input of one station to the input of the next station. Such a transfer need not occur in a single ring cycle. For example, a recent performance study of the SCI ring [26] assumes (with no contention) four ring cycles for a packet to traverse a station and the link to the next station. The assumption that a memory cycle takes 30 processor cycles follows values used in recent studies [10]. Figure 2(a) shows how efficiency varies with the request rate, for the base ....

S.L. Scott, J.R. Goodman, and M.K. Vernon. Performance of the SCI ring. In Proceedings of the 18th Annual International Symposium on Computer Architecture, pages 403--414, Gold Coast, Australia, May 1992.


Emulation of a Virtual Shared Memory Architecture - Raina (1993)   (3 citations)

....Queueing networks have been used extensively in the performance evaluation of multiprocessors. Bhandarkar [25] has used queueing theory to model multiprocessor memory interference. Yang [207] and Towsley [194] have used queueing networks to model multiple bus multiprocessors. More recently, Scott [166] has developed queueing network models for an SCI network. Also, Carreras [37] has developed a detailed model of the DDM, although his model is more specific to a bus based implementation of the DDM. 9.3 A simple analytic model. A queueing system consists of three components: the arrival ....

S. L. Scott, J. R. Goodman, and M. K. Vernon. Performance of the SCI Ring. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 403--414, ACM Press, Gold Coast, Australia, May 1992.


Kiloprocessor Extensions to SCI - Stefanos Kaxiras (1996)   (1 citation)

....shared memory multiprocessor systems. It defines both a network interface and a cache-coherence protocol. The network interface section of SCI defines a 1 Gbyte/s ring interconnect, and the transactions that can be generated over it. A performance analysis by Scott, Vernon, and Goodman [11] showed that an SCI ring can accommodate small numbers of high performance nodes, in the range of four to eight. To build larger systems, topologies constructed out of smaller rings must be used (e.g. k-ary n-cubes, multistage topologies) [13]. Topologies can be built either by using multiple ....

Steven L. Scott, James R. Goodman, Mary K. Vernon, "Performance of the SCI Ring." Proc. of the 19th Annual International Symposium on Computer Architecture, pp. 403--414, May 1992.


Simulating complex SCI topologies - Horn, Bothner, Linge, Kristiansen..

....Scalable Coherent Interface (SCI) Simulations, HIC Technology, Serial links, Topologies. I. Introduction. Nodes in an SCI topology are designed to form ringlets. However, ringlet structures are sensitive to hardware failures, and their peak load is limited; they are not truly scalable [1]. Having access to switches that enable traffic to be directed from one ringlet to another, one may form rather complex networks from quite small ringlets [2], [3], [4]. Ideally, every sending node in the topology should be capable of connecting to any destination node without blocking. We call ....

Steven L. Scott, James R. Goodman, and Mary K. Vernon, "Performance of the sci ring", in Proceedings IEEE ISCA 92, May 1992, pp. 403--414, Conference held in Queensland.


SCI Clustering through the I/O bus: A Performance and.. - Omang (1998)   (1 citation)

....implemented around an internal bus, the B link, which separates the host interface from the interface to SCI (see figure 1). Both cards use the same SCI interface, the LinkController (LC 1) chip, but have very different host interfaces. Low level performance of the Sbus SCI board is presented in [17]. Digital Equipment Corporation has announced Memory Channel [7] as a high speed interconnect for the PCI bus. Both SCI and Memory Channel differ from standard network technologies like Ethernet and ATM and from most other off-the-shelf network [Figure: B link, LC 1, SCI ring, B link to host bus interface] ....

....the inner loop. 4.3 Results for Scali HS. Results for the Scali HS are presented in tables 1 and 2. All numbers are made relative to the performance for size A on one node using one computing thread (processor) compiled with Apogee's C compiler (622.14s corresponding to 42.78 MFlops on Cray Y-MP 1 [17]). The apf77 entry in table 1 is

Config. and compiler   Processes (speedup): 1   2   3   4
apf77 MPI   0,82   1,60   2,37   3,08
apcc MPI    1,00   2,01   3,96
apcc SMP    1,00   2,02   2,70   4,03
cc SMP      1,08   2,13   2,84   4,24
TABLE 1: NPB EP SIZE A SMP PERFORMANCE

Config.   Processors (speedup): 4   8   16   32
Size S    3,92   7,51   12,48 ....

[Article contains additional citation context not shown here]

Steven L. Scott, James R. Goodman, and Mary K. Vernon. Performance of the SCI Ring. In Proceedings of 19th International Symposium on Computer Architecture, volume 20(2) of Computer Architecture News, pages 403--414, May 1992.


Simulation of the SCI Transport Layer on the Wisconsin Wind.. - Douglas Burger And (1995)   (2 citations)  Self-citation (Goodman)

....queues. For higher dimensions this is clearly prohibitively expensive, but the control is much simpler than for merging multiple physical queues. Such a merger would require allowing multiple high speed channels to simultaneously write variable sized packets into a single queue. A previous study [12] assumed that packets sent from a queue were placed into an active buffer to await the returning echo. The queues assumed here hold a sent packet in the queue while its transmission is pending. This makes the queue into more of an associative memory than a FIFO structure, but has the advantage of ....

Steven L. Scott, James R. Goodman, and Mary K. Vernon. Performance of the SCI Ring. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pages 403--414, May 1992.


Extending The Scalable Coherent Interface For Large-Scale.. - Johnson (1993)   (10 citations)  Self-citation (Goodman)

....9 A register ring [Stal84] holds promise for higher throughput than a backplane bus because the sequential broadcast of a bus is a bottleneck and the cycle time of a bus is limited by distributed capacitances and the speed of light. Given reasonable cycle times, Scott et al. [ScGV92] show that a single ring does indeed have better performance than a single bus, both for throughput and message delay. The KSR1 uses what its designers [Burk92] call an insertion modification token ring. .... the time such a standard is complete, the technology it standardized is usually ....

....is a continuation of that work, investigating the best topologies for large scale systems. However, the focus has shifted from buses, which exhibit severe physical and electrical limitations, to rings, which can provide much better performance (both throughput and delay) as shown by Scott et al. [ScGV92]. Traditional metrics of a topology include the average and worst case distance between nodes, though in practical networks an additional consideration is the bandwidth limitations of the busiest links and other resources. While these metrics have been well studied for the common topologies, the ....

Steven L. Scott, James R. Goodman, and Mary K. Vernon, "Performance of the SCI Ring," Proceedings of the Nineteenth Annual International Symposium on Computer Architecture 20, 2 (May 1992), 403-414.


Emulation of a Virtual Shared Memory Architecture - Raina (1993)   (3 citations)

No context found.

S. L. Scott, J. R. Goodman, and M. K. Vernon. Performance of the SCI Ring. pages 403--414, ACM Press, Gold Coast, Australia, May 1992.


Multistage Ring Network: An Interconnection Network for.. - Yoo, Park, Maeng

No context found.

S. L. Scott, J. R. Goodman, and M. K. Vernon, "Performance of the sci ring," in Proc. 18th Annu. Int. Symp. Comput. Architecture, pp. 403--414, 1992.


Identification And Optimization Of Sharing Patterns For Scalable.. - Kaxiras (1998)   (4 citations)

No context found.

Steven L. Scott, James R. Goodman, Mary K. Vernon, "Performance of the SCI Ring." In Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 403-414, May 1992.


Issues in the Design of Direct Multiprocessor Networks - Ravindran, Stumm (1997)

No context found.

S. Scott, J. R. Goodman, and M. K. Vernon, "Performance of SCI ring," Proc. Intl. Symp. on Computer Architecture, pp. 403-414, 1992.

CiteSeer.IST - Copyright Penn State and NEC