D. V. James et al., "Scalable Coherent Interface." IEEE Computer, vol. 23, no. 6, June 1990, pp. 74-77.


This paper is cited in the following contexts:
Dynamic Pointer Allocation for Scalable Cache Coherence.. - Simoni, Horowitz (1991)   (14 citations)  (Correct)

....of the size of main memory. There are also decentralized approaches that do not maintain all information pertaining to a given data block in a directory at main memory. One such organization maintains a distributed linked list of pointers indicating the caches with copies of a given memory block [10]. The pointers on the list are stored with the data at the caches, while a field indicating the first node on the list is kept in a standard directory structure at main memory. The advantage of this organization is that sufficient pointers are always available 2Another option, proposed by the ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. Computer, 23(6):74-77, June 1990.


Design And Analysis Of Update-Based Cache Coherence Protocols For .. - Glasco (1995)   (1 citation)  (Correct)

.... Protocols (INV) This dissertation compares update based protocols with three invalidate protocols: a centralized directory protocol (CD INV) that is similar to DASH [53], a singly linked distributed directory protocol (DD INV) [70], and a doubly linked distributed directory protocol (SCI INV) [45, 42], which is the IEEE standard protocol. This section gives a brief description of how these three invalidate based protocols operate. The invalidate based protocols require that the writing cache obtain exclusive ownership of the line. With different directory structures, the protocols differ in ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


The Network RamDisk: Using Remote Memory on Heterogeneous NOWs - Flouris, Markatos (1999)   (5 citations)  (Correct)

....accesses involve only memory and interconnection network transfers, they proceed at high speed. For example, while typical disk latency is around 10 ms, typical network latency is around 1 ms (for small data transfers). Modern interconnection networks provide latency as low as a few microseconds [5, 7, 14, 17, 20, 1]. Thus, Network RamDisks may result in significant performance improvements over magnetic disks, especially when application performance depends on latency. Besides performance, the Network RamDisk offers a high level of data reliability using either conventional methods such as data replication, ....

....version performs also quite well as we will see later in our experiments. All these versions of the Linux Network RamDisk client have been tested on top of a 155 Mbps ATM network. Lower latency and higher bandwidth networks like Gigabit ATM, Gigabit Ethernet, SCI (Scalable Coherent Interface) [17] or Myrinet [6] should offer even more promising performance, especially when faster communication protocols are used [28, 3, 1]. 3.2 The Network RamDisk Servers. The Network RamDisk server is a user level program listening to a socket and accepting connections from the NRD clients. Each client ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


Verifying Distributed Directory-based Cache Coherence.. - Pong, Nowatzyk.. (1995)   (7 citations)  (Correct)

....fixed number of pointers and a hardware supported overflow mechanism which keeps processing nodes sharing a data block in singly linked lists. Cache coherence protocols that use linked lists have been proposed by Thapar [18] and are also used in the Scalable Coherent Interface (SCI) protocol [7]. To verify the S3.mp protocol is very difficult because the linked lists are maintained by a distributed algorithm. The addition and deletion of nodes from the linked list reorganize the list. In addition to the complexity of maintaining linked lists, the S3.mp protocol behavior is unpredictable ....

....solution is to isolate the problem of verifying the integrities of the linked lists from the problem of verifying data coherence. We think that the techniques applied to the S3.mp cache coherence protocols are applicable to other linked list based cache coherence protocols, for example the SCI [7]. We have also demonstrated how to formulate the condition of data consistency in the context of relaxed memory consistency models. The approach in this paper only verifies the property of consistency, for which the state of a single memory block must be tracked. A more difficult problem is the ....

James et al., "Scalable Coherent Interface", IEEE Computer, June 90, Vol 23, No. 6, pp 71-82.


Page Placement For Non-Uniform Memory Access Time (NUMA) Shared .. - LaRowe, Jr. (1991)   (Correct)

....with copies of the line) or write invalidate protocol is used. Chaiken, Fields, Kurihara, and Agarwal consider the scalability of different directory based cache coherence protocols in [CFKA90a] and [CFKA90b]. The Scalable Coherent Interface (SCI) protocol, a proposed standard, is introduced in [JLGS90], and a similar protocol being developed at Stanford is described in [TD90]. The recent survey article by Stenstrom [Ste90] provides a nice overview of the best known snoopy and directory based cache coherence protocols, and would serve as a good starting point for the reader interested in ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable coherent interface. IEEE Computer, 23(6):74--77, June 1990. New Directions Report.


Hardware Techniques To Improve The Performance Of The.. - Burger (1998)   (10 citations)  (Correct)

....nature of a bus makes the bus an unlikely candidate for the high performance interconnect of the future. However, the demise of the bus has been much slower than predicted, and buses may persist for some time to come. Ring operations, such as the IEEE ANSI standard Scalable Coherent Interface [66, 111], seem well suited for this kind of operation. On a ring, operations are observed by all nodes if the sender is responsible for removing its own message. We envision a ring interconnect because of the high performance capability [101], but broadcast on a ring is complicated by the fact that ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


Example-Standard Contents - These Cover Sheets   (Correct)

....Copyright 1994 1996 IEEE. All rights reserved. This is an unapproved IEEE Standards Draft, subject to change. 5 1.7 Distributed directory structures 1.7. 1 Linear sharing lists To support a (nearly) arbitrary number of processors, the SCI coherence protocols are based on distributed directories[SciDir]. By distributing the directories among the caching processors, the potential capacity of the directory grows as additional caches are added and directory updates need not be serialized at the memory controller. The base coherence protocols are based on linear lists; the extended coherence ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi, "Scalable Coherent Interface," IEEE Computer, June 1990, Volume 23, No 6, 74-77.


Reactive Proxies: a Flexible Protocol Extension to Reduce.. - Talbot, Kelly   (Correct)

....The proxies approach is different because it does not use a fixed hierarchy: instead it allows requests for copies of successive data lines to be serviced by different proxies. Attempts have been made to identify widely shared data for combining, including the glow extensions to the sci protocol [8, 7]. glow intercepts requests for widely shared data by providing agents at selected network switch nodes. In their dynamic detection schemes, which avoid the need for programmers to identify widely shared data, agent detection achieves better results than the combining of [4] by using a sliding ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


A Survey of Verification Techniques for Cache Coherence Protocols - Pong, Dubois (1996)   (Correct)

....provided by a cache coherence protocol which defines a set of rules coordinating processors, cache controllers, and memory controllers. The verification of cache coherence protocols is an important subject which has been neglected for a long time. Many protocols have been proposed and implemented [6, 16, 30, 46, 57, 61, 88]; however, their correctness has never been formally validated. The main reason for this state of affair is that most existing protocols are relatively simple snooping protocols which use broadcast of updates or invalidations to keep data copies consistent. Their correctness can be established by ....

....rules coordinating processors, cache controllers, and memory controllers. In the CC UMA model, coherence is maintained through a snoopy protocol on a bus. In the CC NUMA model, coherence is maintained by directories centralized at the home node [16, 61] or distributed in a list linking all caches [57, 69]. Coherence can sometimes be maintained in software; however, in this paper, we only consider hardware based protocols. 2.1 Cache Coherence Protocols In all existing cache coherence protocols, several read only copies of the same memory location can exist in the system at the same time. When ....

[Article contains additional citation context not shown here]

James et al., "Scalable Coherent Interface", IEEE Computer, June 90, Vol 23, No. 6, pp. 71-82.


Highly Concurrent Cache Coherence Protocols - Williams, Reynolds, Jr. (1990)   (Correct)

....a significant bottleneck from Tang's protocol, the protocol still scales poorly since the size of the bit vector increases linearly with the number of PEs. The focus of much of the subsequent work on directory protocols has been on improving the scalability of the directory representation [Aga88, ArB84, CKA91, GWM90, Jam90, LiY90, OKN90, SiH91, Ste89, ThD91]. Although reducing the space complexity of the directory representation is an important problem, our focus is different: on improving the scalability of cache coherence protocols by increasing their concurrency. For simplicity, we assume the bit vector representation proposed by Censier and ....

D. V. James, et al., Scalable Coherent Interface, Computer 23,6 (June 1990), 74-77.


Shared Regions: A strategy for efficient cache management in.. - Sandhu (1995)   (2 citations)  (Correct)

....sending an invalidation message to the next cache in the list. Replacement of cache lines present a problem. In the SDD protocol, all entries in the list need to be invalidated in order to preserve the integrity of the list. The IEEE Scalable Coherent Interface (SCI) uses a doubly linked list [41]. For every cache block, each cache keeps a pointer to the next cache in the list and the previous one. The advantage is that replacements in the cache can be handled more efficiently than in the singly linked structure. Here, each cache communicates with its neighbors in the list to indicate that ....

D. James, A. Laundrie, S. Gjessing, and G. Sohi. Scalable Coherent Interface. In IEEE Computer, June 1990.
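
The doubly linked sharing list described in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not the SCI protocol itself: the names SharingList, CacheLinks, insert, remove and sharers are hypothetical, caches are identified by plain integers, and new sharers are assumed to be prepended at the head pointer held with the memory directory. Messages, states and races are ignored entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CacheLinks:
    """Per-cache pointers for one memory block (hypothetical layout)."""
    next_id: Optional[int] = None  # next cache in the sharing list
    prev_id: Optional[int] = None  # previous cache in the sharing list


class SharingList:
    """Sharing list for a single memory block.

    `head` models the pointer kept in the directory at main memory;
    `links` models the forward/backward pointers stored in each cache.
    """

    def __init__(self) -> None:
        self.head: Optional[int] = None
        self.links: Dict[int, CacheLinks] = {}

    def insert(self, cache_id: int) -> None:
        """Assumption: a new sharer is prepended at the head of the list."""
        if cache_id in self.links:
            return
        self.links[cache_id] = CacheLinks(next_id=self.head)
        if self.head is not None:
            self.links[self.head].prev_id = cache_id
        self.head = cache_id

    def remove(self, cache_id: int) -> None:
        """Replacement: only the two neighbours (or the head) are touched."""
        entry = self.links.pop(cache_id)
        if entry.prev_id is not None:
            self.links[entry.prev_id].next_id = entry.next_id
        else:
            self.head = entry.next_id
        if entry.next_id is not None:
            self.links[entry.next_id].prev_id = entry.prev_id

    def sharers(self) -> List[int]:
        """Walk the list from the head, as an invalidation would."""
        out: List[int] = []
        cur = self.head
        while cur is not None:
            out.append(cur)
            cur = self.links[cur].next_id
        return out
```

With the backward pointer available, remove() updates at most two neighbouring entries, whereas a singly linked list has no way to reach the predecessor without walking or invalidating the list, which is the replacement cost the excerpt contrasts.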


Parallelizing Appbt for a Shared-Memory Multiprocessor - Burger (1995)   (9 citations)  (Correct)

....since there are spatial dependencies along each dimensional axis. To test the scalability and performance of our solution, we simulated execution of Appbt on three different types of shared memory machines: dir N NB [2], a full bitmap directory protocol, and the Scalable Coherent Interface [1, 5], both with and without its pairwise sharing option on. The sizes of these simulated machines ranged from 1 to 128 processors, and we used both processor and memory-constrained scaling techniques [9] to evaluate scalability. 2 The Appbt algorithm. This section briefly describes both the problem and ....

....latencies of 100 cycles. All subsequent descriptions of machine behavior refer to the simulated target machine and not the host CM 5. Figure 4 shows speedups for three different systems; one running the dir N NB cache coherence protocol, and two running variants of the Scalable Coherent Interface [1, 5] cache coherence protocol. All experiments in this graph were run with a 24 x 24 x 24 point data set. [Figure 4. Constant data set speedups. Axes: number of processors, speedup. Series: Ideal, DirNNB, SCI PS, SCI.] Figure 5. Constant ....

[Article contains additional citation context not shown here]

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


CC-NUMA Page Table Management and Redundant Linked List Based.. - Vlaovic   (Correct)

....main memory, this access would become an instant bottleneck. Additionally, the reliability of such a scheme is in question, as a fault in the bit map would result in an incorrect sharing list. By distributing the directory, the bottleneck is assuaged, and the reliability weakness is not as great [18, 31, 24]. This type of design is called a distributed pointer protocol. In this type of system, a linked list is created dynamically, to reflect the sharing members at that particular time. The caches can insert and delete themselves from the linked list as necessary. This avoids including every node in ....

D.V. James, A.T. Laundrie, S. Gjessing, and G.S. Sohi. Scalable coherent interface. IEEE Computer, pages 74--77, June 1990.


Overview of Recent Supercomputers - van der Steen, Dongarra (1996)   (Correct)

....(SCI) should provide a point to point bandwidth of 200-1,000 Mbyte/s. It is in fact used in the HP Convex SPP 1200, but could also be used within a network of workstations. The SCI is much more than a simple bus and it can act as the hardware network framework for distributed computing, see [10]. The Omega network is a structure which is situated somewhere in between a bus and a crossbar with respect to potential capacity and costs. At this moment, of the commercially available machines, the IBM SP2, the Meiko CS 2, and the Cenju 3 use this network structure, but a number of ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable coherent interface. IEEE Computer, 23(6):74--77, June 1990. Scalable Coherent Interface, http://sunrise.scu.edu/.


Cache Coherence Using Local Knowledge - Darnell, Kennedy (1993)   (16 citations)  (Correct)

....some common bus, are now in common use for small scale systems [16, 18]; however, snoopy strategies are problematic for large scale machines because such machines cannot be based on a single, central broadcast medium for lack of sufficient bandwidth. Directory strategies [2, 8, 11, 19], in which a directory entry associated with each memory location (or cache line) indicates which processors have cached values for that location, seem more promising for large scale systems. However, directories can require large amounts of additional storage and directory maintenance operations ....

D. James, A. Laundrie, S. Gjessing, and G. Sohi. Scalable coherent interface. Computer, 23(6), June 1990.


DataScalar Architectures and the SPSD Execution Model - Burger, Kaxiras, Goodman (1996)   (Correct)

....however, such as on a ring or bus, they may be accomplished with only minor additional cost, though reliable delivery and error recovery are inevitably more complicated for broadcast operations. Ring operations, such as those defined in the IEEE standard Scalable Coherent Interface (SCI) [17, 28], seem particularly well suited for this kind of operation. On a ring, all operations are observed by all nodes on a ring if the sender is responsible for removing its own message. We envision a ring interconnect because of the high performance capability [26], but broadcast on a ring is complicated ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


Compiler Support for the Efficient Use of Cache Coherence.. - Trung Nguyen   (Correct)

....private caches, however, cache coherence must be enforced using either a hardware or software mechanism [21] Several schemes [12, 16, 30, 34, 36] have been proposed for bus based systems. For more scalable systems that use general interconnection networks, directory based hardware schemes [2, 3, 5, 15, 35, 38] and compiler assisted software schemes [7, 18, 17, 25, 37] have been suggested. Recently, several authors have proposed dynamically tagged directories [6, 14, 22, 23, 24, 30] in which pointers to processors with a copy of a memory block are allocated only when the block is actually cached. These ....

....memory references. Other factors, such as aliases, procedure calls, and unknown symbolic terms, can also reduce the precision of the dependence analysis. Additionally, if parallel tasks are scheduled dynamically, some temporal locality cannot be detected by the compiler. Directory based schemes [2, 3, 5, 15, 35, 38] , on the other hand, can perfectly disambiguate memory references at run time so that they invalidate only cache blocks that are actually stale. Unfortunately, directory based schemes require a large amount of memory to store the cache block sharing information. Several dynamically tagged ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable coherent interface. Computer, 23(6):74--77, June 1990.


Cache Consistency in Hierarchical-Ring-Based Multiprocessors - Keith Farkas (1992)   (5 citations)  (Correct)

....avoids the multiple traversal problem of the SCI protocol. In this paper, we propose a selective broadcast based cache consistency protocol that addresses the three complications listed above for a class of multiprocessors based on hierarchical rings. Ring based networks have been investigated [1, 6, 10, 11, 16] as a means for implementing high performance interconnection backplanes because they offer a number of advantages. Having point to point interconnections, large rings can be driven at very high clock rates. Rings also exhibit natural broadcast and ordering properties that facilitate the ....

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi. Scalable coherent interface. Computer, 23(6):74--77, June 1990.


Delayed Consistency And Its Effects On The Miss Rate .. - Dubois, Wang.. (1991)   (24 citations)  (Correct)

....we focus on block sizes of 16 words (64 bytes), which is the block size adopted in the SCI (Scalable Coherent Interface) protocol [12]. It is also a good choice for uniprocessor caches. The data access patterns of the four parallel programs used in our performance studies are summarized in Table 1. The programs are further described below. 1) SOR: 100 iterations of the Successive Over Relaxation iterative algorithm to solve .... [Footnote 1: Alternatively, in order to reduce the traffic, only the modified bytes could be sent to memory. However this optimization is probably not cost effective.]

D.V. James et al., "Scalable Coherent Interface," IEEE Computer, Vol. 23, No. 6, pp. 74-77, June 1990.


Extending The Scalable Coherent Interface For Large-Scale.. - Johnson (1993)   (10 citations)  (Correct)

....cache, which they call a sparse directory, in combination with coarse vectors. 1.3.4. Distributed Pointer Protocols. Instead of maintaining lots of cache pointers (or a bit map) at the memory, the sharing set can be distributed as a structure of cache lines that point to each other [JLGS90, ThDe90a, NiSt92]. Storage for these distributed pointer protocols (DP protocols) is dynamically allocated as caches are inserted and deleted, avoiding static allocation based on worst case assumptions. The directory (per line) stores one pointer at the memory and each cache line stores one or more pointers, ....

....one for requests and one for responses. The guarantee of forward progress, not just deadlock avoidance, is necessary for real time scheduling and/or any claims that an algorithm will terminate in a given number of steps. Third, there exists a request combining mechanism for DP protocols [JLGS90] that is much simpler and more general than previously proposed mechanisms for hardware combining. Software combining mechanisms have been proposed [YeTL87, YeTa90] and the first software combining algorithm for fetch and add is given by Goodman et al. [GoVW89]. Previous attempts at ....

[Article contains additional citation context not shown here]

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi, "Scalable Coherent Interface," IEEE Computer 23, 6 (June 1990), 74-77.


Scheduler-Conscious Synchronization - Kontothanassis, Wisniewski, Scott (1994)   (19 citations)  (Correct)

....either for the sake of fairness or to minimize contention. The algorithm's .... [Footnote 1: Some multiprocessors, especially the larger ones, provide more sophisticated hardware support for synchronization. Examples include the queued locks of the Stanford Dash machine [21], the QOLB (queue on lock bit) operation of the IEEE Scalable Coherent Interface [14], and the near constant time barriers of the Thinking Machines CM 5 and the Cray Research T3D. It is not yet clear whether the advantages of such special operations over simpler read modify write instructions are worth the implementation cost.]

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. Computer, 23(6):74--77, June 1990.


Memory Models - Kontothanassis, Scott (1996)   (1 citation)  (Correct)

....most complex of the large scale machines are those that maintain cache coherence in hardware, beyond the confines of a single bus. Examples of such machines include the commercially available Kendall Square KSR 1 and 2, the Convex Exemplar (based on the IEEE Scalable Coherent Interface standard [43]) and the Dash [52] Flash [48] and Alewife [3] research projects. Alternative approaches to cache coherence are discussed in section 0.4. A cache less dance hall machine is sometimes called an UMA (uniform memory access) multiprocessor. Several companies produced such machines prior to the RISC ....

....as COMA (cache only memory architecture) [40]. The Stanford Dash machine [52] uses the Theta(P^2) directory organization. The MIT Alewife machine has a limited number of pointers for each directory entry; it traps to software on overflow. The Convex Exemplar is based on the IEEE SCI standard [43], which maintains a distributed directory whose total space overhead is linear in the size of the system's caches. The newer Stanford Flash machine [48] has a programmable cache controller; its directory structure is not fixed in hardware. The KSR 1 and 2 are COMA machines; their proprietary ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. Computer, 23(6):74--77, June 1990.
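
The space argument in the excerpt above can be made concrete with a rough count of directory bits. The sketch below is illustrative only: the function names, the per-node block and cache-line counts, and the cost model (one presence bit per node per memory block for a full bit vector; two node-sized pointers per cache line for a linked list) are assumptions, not figures from any of the cited papers. The head pointer kept at memory for the linked-list scheme is reported separately because the excerpt's "linear in the size of the caches" claim concerns the cache-side pointers.

```python
import math


def full_map_bits(nodes: int, mem_blocks_per_node: int) -> int:
    """Full bit-vector directory: one presence bit per node for every
    memory block in the system, so the total grows as nodes squared."""
    total_blocks = nodes * mem_blocks_per_node
    return total_blocks * nodes


def pointer_width(nodes: int) -> int:
    """Bits needed to name one node (at least 1)."""
    return max(1, math.ceil(math.log2(nodes)))


def list_link_bits(nodes: int, cache_lines_per_node: int) -> int:
    """Cache-side cost of a doubly linked sharing list: a forward and a
    backward pointer per cache line. Proportional to total cache size."""
    return nodes * cache_lines_per_node * 2 * pointer_width(nodes)


def list_head_bits(nodes: int, mem_blocks_per_node: int) -> int:
    """Memory-side cost: one head pointer per memory block."""
    return nodes * mem_blocks_per_node * pointer_width(nodes)
```

Under these assumptions, doubling the node count from 64 to 128 with 1024 memory blocks per node multiplies full_map_bits by four, while list_link_bits with 128 cache lines per node grows only slightly faster than twofold (the pointer widens from 6 to 7 bits).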


Modelling and Validation of Shared Memory Coherency Protocols - Bennett, Field, Harrison (1996)   (Correct)

....purposes. This enables us to construct a well understood workload in order that direct quantitative comparisons between model behaviour and simulation behaviour can be undertaken. The validation exercise is inspired by a recent paper [4] which presents a very detailed model of the SCI protocol [7] but with no accompanying validation. This model was developed in conjunction with a team of commercial computer architects and the importance of validation in this context cannot be understated. Finally, we consider a real benchmark application, the MP3D particle simulation code from the ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, 1990.


Improving Memory Utilization in Cache Coherence Directories - Lilja, Yew (1993)   (3 citations)  (Correct)

....network [16, 23, 32] reduces the bottleneck, but it compounds the need for the caches by increasing the delay, and it exacerbates the coherence problem by eliminating the snooping medium. Hardware coherence schemes that dynamically determine which memory operations need coherence actions [3, 4, 7, 19] have access to memory addresses only as the program generates them. Since it is impossible for the hardware to predict how the blocks will be shared, they must track the state and sharing characteristics of every memory block referenced by the program. The number of memory bits needed to store ....

....unlike the compiler based schemes, the hardware schemes make coherence enforcement completely transparent to procedure calls and subroutines. These differences between compiler based and hardware directory coherence schemes are summarized in Table 1. In the conventional hardware directories [3, 4, 7, 19], pointer resources are statically associated with each block in the main memory fixing the total number of pointers to the size of the memory. Recently proposed dynamically tagged directories [9, 17, 26, 30] take advantage of the observation that only blocks that are actually cached in one or ....

[Article contains additional citation context not shown here]

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi, "Scalable Coherent Interface," Computer, Vol. 23, No. 6, pp. 74-77, June 1990.


Towards A Shared-Memory Massively Parallel Multiprocessor - Litaize, Mzoughi.. (1992)   (1 citation)  (Correct)

....modules or chained on the caches themselves. The various possible options are nowadays well known although the performance level is not exactly known for an architecture with a great number of processors [CFKA90] [ASHH88]. Scalability induces one to use some kind of Scalable Coherence Interface [JLGS90] [AKGJ90]. In this algorithm, used block pointers are chained over the caches, the last pointer being at the memory level. In this case coherency messages travel from cache to cache: the owner of a dirty block is recorded at the memory level and a flush request is sent along the right link, the last ....

D.V. James, A.T. Laundrie, S. Gjessing, G.S.Sohi : "Scalable Coherent Interface". Computer, June 90, pp. 74-78.


Concurrency Control in Asynchronous Computations - Williams (1993)   (9 citations)  (Correct)

....a significant bottleneck from Tang's protocol, the protocol still scales poorly since the size of the bit vector increases linearly with the number of PEs. The principal focus of the subsequent work on directory protocols has been on improving the scalability of the directory representation [Aga88, ArB84, CKA91, GWM90, Jam90, LiY90, OKN90, SiH91, Ste89, ThD91]. Although reducing the space complexity of the directory representation is an important problem, our focus is different: on improving the concurrency of cache coherence protocols. For simplicity, we assume the bit vector representation proposed by Censier and Feautrier [CeF78] but delta cache ....

D. V. James, et al., Scalable Coherent Interface, Computer 23,6 (June 1990), 74-77.


The Network RamDisk : Using Remote Memory on Heterogeneous NOWs - Michail Flouris (1998)   (5 citations)  (Correct)

....block accesses involve only memory and interconnection network transfers, they proceed with low latency and high bandwidth. For example, typical disk latency is around 10 ms, while typical network latency is around 1 ms. Modern interconnection networks provide latency as low as a few microseconds [4, 6, 14, 17, 19]. Thus, Network RamDisks may result in significant performance improvements over magnetic disks, especially when application performance depends on latency. The Network RamDisk, much like network memory file systems [1, 15] exploits network memory to avoid magnetic disk I O, but unlike these file ....

....section 2.3.3. This version performs also quite well as we will see later in our experiments. All these versions of the Linux Network RamDisk client have been tested on top of a 155 Mbps ATM network. Lower latency and higher bandwidth networks like Gigabit ATM, SCI (Scalable Coherent Interface) [17] or Myrinet [5] should offer even more promising performance, especially when faster communication protocols are used [26, 2]. 3.2 The Network RamDisk Servers. The Network RamDisk server is a user level program listening to a socket and accepting connections from the NRD clients. Each client is ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


A Performance Comparison of Fast Distributed Synchronization.. - Johnson (1994)   (Correct)

....use a fixed tree based on a Huffman code. Some shared memory synchronization algorithms can be easily modified to construct distributed synchronization algorithms. These algorithms include the MCS contention free lock [9] and cache coherence protocols such as the Scalable Coherent Interface (SCI) [3]. However, these algorithms require the use of a centralized lock manager. Only a little work has been done to make a performance study of distributed synchronization algorithms. Ricart and Agrawala [13] make a simulation study of an O(n) message passing algorithm. Chang, Singhal, and Liu [2] use ....

D.V. James, A.T. Laundrie, S. Gjessing, and G.S. Sohi. Scalable coherent interface. Computer, 23(6):74--77, 1990.


Lightweight Transactions on Networks of Workstations - Papathanasiou, Markatos (1998)   (4 citations)  (Correct)

....from 4 bytes to 1 Mbyte. (2) debit credit: processes banking transactions very similar to the TPC B. (3) order entry: a benchmark that follows TPC C and models the activities of a wholesale supplier. All our experiments were run on two PCs connected with an SCI interconnection network [20]. Each PC was equipped with a 133 MHz processor. 5.1 Performance Results. Figure 6 plots the transaction latency as a function of the transaction size. We see that for very small transactions, the latency that PERSEAS imposes is less than 14 s, which implies that our system is able to complete ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


Using Simple Page Placement Policies to Reduce the .. - Marchetti.. (1994)   (16 citations)  (Correct)

.... fall into two basic categories, termed CC NUMA (cache coherent, non uniform memory access) and COMA (cache only memory architecture). CC NUMA machines include the Stanford DASH [11], the MIT Alewife [1], and the Convex SPP 1000, based on the IEEE Scalable Coherent Interface standard [7]. COMA machines include the Kendall Square KSR 1 and the Swedish Data Diffusion Machine (DDM) [6]. COMA machines organize main memory as a large secondary or tertiary cache, giving them a performance advantage over CC NUMA machines when it comes to servicing capacity and conflict cache misses. ....

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. Scalable Coherent Interface. Computer, 23(6):74--77, Jun. 1990.


Hardware Support for Synchronization in the Scalable Coherent .. - Nagi Aboulenein (1994)   (8 citations)  Self-citation (Gjessing)   (Correct)

No context found.

D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi. "Scalable Coherent Interface". IEEE Computer, 23(6):74--77, June 1990.


Balancing Performance, Area, and Power in an On-Chip Network - Gold   (Correct)

No context found.

D. V. James et al., "Scalable Coherent Interface." IEEE Computer, vol. 23, no. 6, June 1990, pp. 74-77.


The Performance of SCI Memory Hierarchies - Roberto Hexsel Nigel (1994)   (1 citation)  (Correct)

No context found.

David V James et al. Scalable Coherent Interface. IEEE Computer, 23(6):74--77, June 1990.


Cache Coherence in Large-Scale Shared Memory Multiprocessors.. - Lilja (1993)   (34 citations)  (Correct)

No context found.

David V. James, Anthony T. Laundrie, Stein Gjessing, and Gurindar S. Sohi, "Scalable Coherent Interface," Computer, Vol. 23, No. 6, pp. 74-77, June 1990.

