| Sweazey, P. and Smith, A.J., "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus", Proc. of the 13th Int'l Symp. on Computer Architecture, pp. 414-423, 1986. |
....which processors have read-only or read-write access to data blocks that are in their caches. A processor's access to a cache block is determined by the state of that block in its cache, and this state is generally one of the five MOESI (Modified, Owned, Exclusive, Shared, Invalid) states [32]. Processors issue requests, such as Get Exclusive or Get Shared, to gain access to blocks. They can also lose access to blocks, either by choice (e.g., a cache replacement) or when another processor's request steals a block away. Many invalidate protocols maintain the invariant that there can ....
P. Sweazey and A.J. Smith, "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus," Proc. 13th Ann. Int'l Symp. Computer Architecture, pp. 414-423, June 1986.
....in the same order with respect to every processor. In later chapters we show how our proposals do not change these underlying mechanisms for cache coherence and thus maintain the correctness conditions for cache coherence protocols. 2.1.2.4 Cache coherence protocol mechanisms Sweazey and Smith [160] proposed the Modified, Owned, Exclusive, Shared, and Invalid (MOESI) classification of cache coherence protocols based on the stable states of a cache block. A cache block in stable state has valid data and the block is not waiting for any state transition to occur. Many cache coherence protocols ....
Paul Sweazey and Alan J. Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
....memory is becoming ubiquitous in the form of small-scale symmetric multiprocessors (SMPs). In these platforms, the processors are attached on the same memory bus. Both commodity microprocessors and system logic are specifically designed to support shared memory using bus-based coherence protocols [SS86]. Consequently, it becomes easy to build low-end shared memory systems since the fixed design costs are amortized over the large sales volumes of commodity microprocessors. Therefore, these systems are achieving cost-performance superior to their uniprocessor counterparts. Supporting shared ....
Paul Sweazey and Alan Jay Smith. A class of compatible cache consistency protocols and their support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414-423, June 1986.
....transactions, so as to highlight the similarities and differences between the protocols. Third, in Section 3.4, we discuss several issues relating to BASH, including livelock/deadlock, scalability, complexity, and verification. All three protocols are write-invalidate, use the MOSI states [29], allow processors to silently downgrade from S to I, support several transactions (e.g., get an S copy, get an M copy, writeback an M or O copy), and interact with the processors to support a consistency model. Our results assume sequential consistency. 3.1 A Snooping Protocol Traditional ....
P. Sweazey and A. J. Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
....environment. PSCR can simply use this information without the need of adding extra memory into the cache and the modification of program. The protocol has a reduced complexity since it has only five states, and it needs only an additional line on the bus compared to MOESI protocol scheme [Sweazey86]. I am not aware of other solutions that explicitly eliminate the overhead due to private data accesses. The selective invalidation mechanism allows PSCR to gain the benefits of an update mechanism in shared bus architectures. To show the effectiveness of PSCR, its performance is evaluated ....
.... can be found in [Archibald86]. Two new WU protocols have been defined for two special bus-based machines: on-chip multiprocessor [Takahashi96] and bus-based COMA [Lee94]. A first attempt to standardize protocols yielded the MOESI class of protocols, in order to implement them on a common platform [Sweazey86]. MESI is a MOESI class protocol, based on Goodman's Write-Once 4-state protocol [Goodman83]. It is implemented in most of the commercial high-performance microprocessors like AMD K5 and K6, the PowerPC series, the SUN UltraSparc, SGI R10000, Intel Pentium, Pentium Pro, Pentium II and ....
P. Sweazey and A. J. Smith, "A Class of Compatible Cache Consistency Protocols and Their Support by the IEEE Futurebus," 13th Int'l Symp. on Computer Architecture, pp. 414--423, June 1986.
....for writing the line back to main memory as well as for supplying the line directly to any other cache requesting it. In this protocol, the tag field of a memory line of a given cache can be in one of the following four states (line states are described according to the terminology found in [Sweazey et al. 86]): Invalid (I): The cache copy is not up to date. Non-modified Shared (S): The line has not been modified since it was loaded into this cache. Other caches may also have a copy; one of these copies might be in state O while others must be in state S. Modified Exclusive (M): The line is ....
SWEAZEY, P., AND SMITH, A. J. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. Proceedings of 13th Annual International Symposium on Computer Architecture, pp. 414--423, ACM/IEEE, Tokyo, June 1986.
....and process based software environments. This paper presents a new cache coherent protocol that can eliminate much of the seance communication that affects other protocols. The new protocol introduces new coherence states, termed T (for transfer) which can be added to the MOESI protocol [10] or can enhance other existing protocols. The use of the new T states, as will be explained further, may improve the performance of large multicache systems significantly. In addition, we examine the effect of the seance communication and the new cache coherence protocol under two software ....
P. Sweazey and A. J. Smith. "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus". In the 13th International Symposium on Computer Architecture, pp. 414-423, IEEE, June 1986.
....may result in erroneous system behavior. In multiprocessor systems the task of ensuring that cached copies of data are consistent falls upon the so-called consistency protocol. Traditional consistency protocols have required that all accessible copies of a datum be identical at all times [Smi82, SS86]. While a system implementing such a protocol is certainly correct, the discipline required to enforce such a property may unnecessarily constrain concurrency in the system. Our formulation of sequential consistency is based on the intuition that a cache-based memory system should be ....
....(specifically as to the atomicity of write operations) that limit its applicability to some multiprocessor architectures [SD87]. For systems in which this definition has a natural interpretation, coherence is usually the correctness condition cited. See, e.g., [Smi82, KEW 85, YYF85, SS86, Goo87]. As we understand the somewhat informal notion, memory systems that are cache coherent are behaviorally indistinguishable from serial memories, a stronger property than sequential consistency. This distinction is made precise in Section 3, below. The idea of using less restrictive ....
P. Sweazey and A. Smith. A class of compatible cache consistency protocols and their support by the IEEE Futurebus. In 13th International Symposium on Computer Architecture, pages 414--423, June 1986.
....transactions from each of 16 processors). We are investigating more parsimonious solutions. 3 Timestamp Snooping Protocols Conventional write-invalidate snooping protocols maintain a subset of the MOESI stable states M (Modified), O (Owned), E (Exclusive), S (Shared), and I (Invalid) [40] in response to transactions delivered in order (ordered broadcast) and at the same time (synchronous broadcast). Many implementations require processors in M or O to assert an owned signal that is logically ORed to inform memory not to respond. Similarly, processors in S or E can assert a ....
P. Sweazey and A. J. Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
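The owned-signal behaviour described in the preceding context can be sketched roughly as follows. This is only an illustrative model, not code from any of the cited papers: the names `State`, `owned_signal`, and `memory_should_respond` are hypothetical, and the sketch assumes each cache reports a single MOESI state for the block in question.

```python
from enum import Enum


class State(Enum):
    """The five MOESI stable states named in the context above."""
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def owned_signal(states):
    """Model the logically ORed 'owned' line (hypothetical helper).

    The line is asserted if any cache holds the block in M or O.
    """
    return any(s in (State.M, State.O) for s in states)


def memory_should_respond(states):
    """Memory supplies the data only when no cache asserts 'owned'."""
    return not owned_signal(states)


if __name__ == "__main__":
    snapshot = [State.I, State.O, State.S]
    print(owned_signal(snapshot), memory_should_respond(snapshot))
```

In this model a block held as O by one cache and S by another keeps memory silent, while a block held only in S or I lets memory respond.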
.... is equipped with a cache memory, it is necessary to maintain coherence among the caches such that any memory access is guaranteed to return the latest version of the data in the system [Censier and Feautrier, 1978]. Cache coherence can be enforced through a shared snooping bus [Goodman, 1983, Sweazey and Smith, 1986]. The basic idea is to rely on the broadcast nature of the bus to keep all the cache controllers informed of each other's activities so that they can perform the necessary operations to maintain coherency. A number of snooping cache coherence protocols have been proposed [Archibald and Baer, ....
....The basic idea is to rely on the broadcast nature of the bus to keep all the cache controllers informed of each other's activities so that they can perform the necessary operations to maintain coherency. A number of snooping cache coherence protocols have been proposed [Archibald and Baer, 1986, Sweazey and Smith, 1986]. They can be broadly classified into the write-invalidate scheme and the write-broadcast scheme. In both schemes, read requests are carried out locally if a valid copy exists in the local cache. For write requests, these two schemes work differently. When a processor updates a cache line, all ....
Sweazey, P. and Smith, A.J. 1986. A Class of Compatible Cache Consistency Protocols and their Support by IEEE Futurebus. Proc. of 13th Int'l Symp. on Computer Architecture. 414--423.
....at the barrier synchronization early is blocked until the last processor arrives at the synchronization point. The master processor (processor number 0) executes sequential portions of a program. The default simulation parameters are shown in Table 1b. We use the optimized coherence protocol [10, 14] for each architecture. We made the memory access time relatively large because we believe that the processor memory performance gap is widening, not narrowing. Here, note that the tag of the memory is assumed to be slow DRAM. We use the average memory access latency described in Section 4 (Eq. 1) ....
P. Sweazey, A.J. Smith, "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus," Proc. of the 13th ISCA, 1986.
....which processors have read-only or read-write access to data blocks that are in their caches. A processor's access to a cache block is determined by the state of that block in its cache, and this state is generally one of the five MOESI (Modified, Owned, Exclusive, Shared, Invalid) states [26]. Processors issue requests, such as Get Exclusive or Get Shared, to gain access to blocks. They can also lose access to blocks, either by choice (e.g., a cache replacement) or when another processor's request steals a block away. Many invalidate protocols maintain the invariant that there can ....
P. Sweazey and A. J. Smith. "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus." In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
....provided by a cache coherence protocol which defines a set of rules coordinating processors, cache controllers, and memory controllers. The verification of cache coherence protocols is an important subject which has been neglected for a long time. Many protocols have been proposed and implemented [6, 16, 30, 46, 57, 61, 88]; however, their correctness has never been formally validated. The main reason for this state of affair is that most existing protocols are relatively simple snooping protocols which use broadcast of updates or invalidations to keep data copies consistent. Their correctness can be established by ....
....verify the safety property of data consistency, a common approach is to show that all reachable global states are permissible [68]. Usually, the definition of a cache state carries some semantic interpretation of the cached copy. For the protocol considered in this paper as well as for many others [6, 88], a cache in the Shared state means that the cache has a clean copy consistent with the memory copy and that other caches may have a copy too. Similarly, a cache in the Dirty state indicates that it has the latest and sole cached copy. Therefore, global states (D,S), (S,D), and (D,D) are not ....
Sweazey, P. and Smith, A.J., "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus", Proc. of the 13th Int'l Symp. on Computer Architecture, pp. 414-423, 1986.
....while maintaining acceptable programmability. 2.3.2 Consistency Protocols In a VSM multiprocessor, data is cached in multiple private caches to reduce long delays to access memory. Memory consistency must be enforced using either a hardware or software mechanism. Several snoopy protocols [3, 61, 46, 36, 72, 65, 75] for systems with a broadcast medium such as a bus interconnection network have been proposed. For more scalable multiprocessors with general interconnection network between processors, directory-based protocols [2, 4, 14, 44, 73, 86] and compiler-assisted software protocols [20, 51, 57, 79] ....
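The permissibility argument quoted in the preceding context (a Dirty copy must be the sole cached copy) might be checked along these lines. This is a minimal sketch under stated assumptions, not the verification method of the citing paper: the function name `is_permissible` is hypothetical, and the extra state letter `"I"` for a cache holding no copy is an assumption, since the excerpt names only Shared and Dirty.

```python
def is_permissible(global_state):
    """Check one global state, given as a tuple of per-cache state letters.

    'D' = Dirty (latest and sole cached copy), 'S' = Shared (clean copy),
    'I' = no copy (assumed here; not named in the quoted excerpt).
    """
    dirty = global_state.count("D")
    shared = global_state.count("S")
    if dirty > 1:
        # Two caches cannot both hold the sole copy.
        return False
    if dirty == 1 and shared > 0:
        # A Dirty copy excludes any other cached copy.
        return False
    return True


if __name__ == "__main__":
    for gs in [("D", "S"), ("S", "D"), ("D", "D"), ("S", "S"), ("D", "I")]:
        print(gs, is_permissible(gs))
```

Under this reading, the three global states listed in the excerpt are rejected, while states such as (S,S) pass the check.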
....[3] uses write through strategy to keep a consistent global memory by broadcasting all write on the bus to invalidate any shared page. Improvements have been made to reduce the amount of broadcasting information by using a sophisticated mechanism to track the coherency status of cached data [36, 46, 61, 72, 75]. The main shortcoming of snoopy protocols is poor scalability due to broadcasting on a shared bus. It works well on a multiprocessor with a moderate number of processors. Directory Protocols Directory protocols are suggested for highly parallel multiprocessors. They use a directory to keep the ....
P. Sweazey and A. J. Smith. A class of compatible cache consistency protocols and their support by the IEEE Futurebus. In Proceedings of the 13th International Symposium on Computer Architecture, 1986.
....Figure 3-1. CDR Transfer Example. This figure shows CDR transfers between the CPU, CPU's cache, and a CNI, assuming write-allocate caches kept consistent by a MOESI write-invalidate coherence protocol [123]. Initially, the CPU polls a CDR to check the presence of a message. Assume this incurs a cache miss. This cache miss is satisfied by the CNI (instead of main memory) which indicates the absence of any message in the CNI. The CPU's subsequent polls to the CDR block (shaded region) is satisfied ....
Paul Sweazey and Alan Jay Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
....data network, as in the Sun E10000, and that memory is physically distributed among processors. We illustrate our ideas using a write-invalidate MOSI protocol. 2.1 Background: Snooping and Directories Consider snooping and directory protocols that implement a write-invalidate MOSI protocol [38] which allows silent replacement of shared blocks. Processors can hold each cache block in one of four states: M (Modified), O (Owned shared), S (Shared), and I (Invalid). Memory has four stable states, corresponding to the states of the processors: M (memory invalid with one processor in state M ....
Paul Sweazey and Alan Jay Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
....family of protocols. The changes necessary for write-update protocols such as the Firefly protocol [24] are not detailed here for space reasons. In the Berkeley protocol, a cache line can be in one of the four following states (line states are described according to the terminology found in [23]): 1. Invalid (I): The cache copy is not up to date. 2. Non-modified Shared (S): The line has not been modified since it was loaded into this cache. Other caches may also have a copy; one of these copies might be in state O while others must be in state S. 3. Modified Exclusive (M): The line is ....
Sweazey, P., and Smith, A. J. A class of compatible cache consistency protocols and their support by the IEEE Futurebus. In Proc. of 13th Annual International Symposium on Computer Architecture (Tokyo, June 1986), ACM/IEEE, pp. 414--423.
....when the set is purged (reduced to size one for writing). We next describe four classes of hardware cache coherence protocols: snooping, hierarchical, directory, and distributed pointer protocols. After that, we discuss related work in data structures. 1.3.1. Snooping Protocols Snooping protocols [Good83, RuSe84, PaPa84, KEWP85, SwSm86, ArBa86, ThSt87] rely on the broadcast capability of a shared bus. Each cache remembers to which sharing sets it belongs (implicit by the lines that are stored) as well as the notions of data validity, exclusiveness, and ownership for each cache line. All cache controllers observe every bus transaction, called ....
Paul Sweazey and Alan Jay Smith, "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus," Proceedings of the Thirteenth Annual International Symposium on Computer Architecture 14, 2 (June 1986), 414-423.
....in cache and only written back to memory when the line is replaced creates less network traffic than a write-through policy. This leads one of five possible states, graphed in Figure 1, that a cache line may be assigned. Collectively these five states are referred to as the MOESI cache states [15] and illustrate the ownership, validity and exclusivity attributes. Using the MOESI model both write-invalidate and write-update snoopy coherence protocols can be developed. Write-invalidate protocols maintain consistency of replicated data by invalidating all but one copy of a cached block each ....
P. Sweazey and A. Smith, "A class of compatible cache consistency protocols and their support by the IEEE Futurebus," Proc. 13th Annual Int. Symp. on Comp. Arch., 1986, pp. 414-423.
....to or from a CNI device. A CQ generalizes this concept into a contiguous region of coherent cache blocks. We describe the major issues in successfully exploiting CDRs and CQs. We describe their operation assuming write-allocate caches kept consistent by a MOESI write-invalidate coherence protocol [43]. 2.1 Cachable Device Registers Cachable Device Registers (CDRs) combine the traditional notion of memory-mapped device registers with the now ubiquitous bus-based cache coherence protocols supported by all major microprocessors. Reinhardt, et al. [39, 40] first proposed CDRs to communicate ....
Paul Sweazey and Alan Jay Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, 1986.
No context found.
Sweazey, P. and Smith, A.J., "A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus", Proc. of the 13th Int'l Symp. on Computer Architecture, pp. 414-423, 1986.
No context found.
P. Sweazey and A. J. Smith. A Class of Compatible Cache Consistency Protocols and their Support by the IEEE Futurebus. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 414--423, June 1986.
No context found.
P. Sweazey and A. J. Smith, "A class of compatible cache consistency protocols and their support by the IEEE Futurebus," in Proceedings of the 13th Annual International Symposium on Computer Architecture, ISCA, pp. 414--423, 1986.
CiteSeer.IST - Copyright Penn State and NEC