155 citations found.
Archibald, J. and Baer, J-L., Cache coherence protocols: Evaluation using a multiprocessor simulation model, ACM Trans. Computer Systems 4 (4), Nov. 1986, 273-298.


This paper is cited in the following contexts:


Algorithmic Verification of Invalidation-based Protocols - Bozzano, Delzanno   (Correct)

....in the models of cache coherence protocols given in [15, 16] each individual process (cache) is modeled via a finite-state automata, obtained by forgetting cache identifiers and by considering a single cache line and a single memory location. However, the original formulation of these protocols (see [5]) as well as of other invalidation based protocols (see [20]) depends on several parameters like cache lines or entries in a page table. To obtain more concrete models, we need specification languages in which individual processes are allowed to carry along information ranging over a possibly in ....

P. A. Archibald, J. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. TOCS 4(4): 273-298. 1986.


A Study on Memory-Based Communications and Synchronization in.. - Matsumoto (2001)   (Correct)

....of research gradually shifted to distributed memory multiprocessors. In order to obtain the benefits of shared memory, even from distributed memory multiprocessors, distributed shared memory (DSM) mechanisms became the subject of extensive study from the end of the 1980s. Although snoop cache [15, 3] mechanisms are usually adopted for small scale tightly coupled multiprocessors, directory based caching schemes are used in distributed memory multiprocessors with hardware based DSM mechanisms. This is because snoop cache mechanisms require some method for the broadcasting of memory ....

J. Archibald and J. L. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Trans. Computer Systems, 4(4):273--298, 1986.


Out-of-Order Execution in Sequentially Consistent Shared-Memory.. - Hu, Xia   (Correct)

....there in the order they are issued. Each simulation was run for 20,000 cycles. During each run, the simulator collects detailed statistics about the simulated scheme. Counters for events which serve as performance indicators are provided. The main system performance metric is the system power [2, 15] which is computed as the sum of processor utilizations in the system. 5.1 Trace Generation Since no suitable real multiprocessor traces are available to us, we adopted the proven synthetic workload model developed in [4] and improved in [2] and [15]. The workload model generates a unique ....

....main system performance metric is the system power [2, 15] which is computed as the sum of processor utilizations in the system. 5.1 Trace Generation Since no suitable real multiprocessor traces are available to us, we adopted the proven synthetic workload model developed in [4] and improved in [2] and [15]. The workload model generates a unique stream for each processor in the system. The reference stream of each processor is the merging of a private stream and a shared stream. Each time the workload module is called by the processor module for next reference, it generates a shared ....

J. Archibald and J. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model", ACM Transactions on Computer Systems, pp. 273--298, Vol. 4, November 1986.


Specifying and Verifying a Broadcast and a.. - Sorin, Plakal.. (2000)   (5 citations)  (Correct)

....the literature are often visually accessible, but they have not been sufficiently detailed for purposes of implementation or verification. In academia, protocol specifications tend to be high level because a complete, detailed specification may not be necessary for the goal of publishing research [5], [8], [15]. In industry, low level, detailed specifications are necessary and exist, but, to the best of our knowledge, none have been published in the literature. Moreover, these detailed specifications often match the hardware closely, which complicates verification and limits alternative ....

J. Archibald and J.-L. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model," ACM Trans. Computer Systems, vol. 4, no. 4, pp. 273-298, Nov. 1986.


Implementing Shared Memory On Large-Scale Multiprocessors - Parthasarathy (1992)   (1 citation)  (Correct)

....problem [5]. Any system allowing multiple copies of a data item to exist at the same time must solve this problem. Various hardware schemes have been proposed to solve the coherence problem. They can be classified as either snooping cache protocols or directory based protocols. Snooping protocols [3] require inexpensive broadcast, limiting their scalability. Directory based protocols do not impose any restrictions on the interconnection network and, therefore, can be used in large scale parallel machines. Examples of such protocols are the fullmap scheme [5, 12], the chained list scheme [6] ....

....on the existing system state. Since the protocol directly affects the memory latencies experienced by a processor, it is a critical issue in the design of a high performance large scale shared memory machine. A variety of hardware cache coherence protocols have been proposed in the literature [3, 5, 11, 18, 6]. As mentioned in Chapter 1, they can be classified as either snooping cache protocols or directory based protocols. Regardless of their classification, most cache coherence protocols provide several basic capabilities. These are to: • Allow multiple copies of a data block to exist. • ....

[Article contains additional citation context not shown here]

J. Archibald and J. Baer, "Cache coherence protocols: Evaluation using a multiprocessor simulation model," ACM Transactions on Computer Systems, pp. 273--298, November 1986.


Cache Characterization and Performance Studies Using Locality.. - Sorensen (2003)   (Correct)

....length or accuracy, so synthetic trace generation was the only way to get long, somewhat reasonable traces. Some models created at this time include the Independent Reference Model [20], [21], the Distance and Distance Strings Models [22], [36], the Partial Markov Model [32] and the Stack Model [37]. Each of these models has a finite number of input parameters that can either be invented from thin air or pulled from a real trace. More recently, other models have been proposed, such as the Piecewise Independent Stochastic Process Model [38] and the Random Walk Model [11]. These models use ....

J. Archibald and J-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4(4):273-298, November 1986.


A Methodology for Formal Design of Hardware.. - Eisner.. (2000)   (1 citation)  (Correct)

....Most cache coherence protocols depend on the ability of cache controllers to observe bus transactions of other processors in the system. This process of observation is known as snooping. The following explanation of the Illinois cache coherence protocol [12] is based on the explanation in [1]. Each cache holds a cache state per block. The cache state is one of: 1. Invalid: the data for this block is not cached 2. Valid Exclusive: the data for this block is valid, clean (identical to the data held in main memory) and is the only cached copy of the block in the system 3. Shared: the ....

J. Archibald and J. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4(4):273-298, November 1986.


Effect of Virtual Channels and Memory Organization on.. - Kumar, Bhuyan (1996)   (Correct)

....both for message passing and shared memory environments. However, the model is not appropriate for cache coherent systems, where a number of invalidation messages are generated to maintain coherence among the caches and the main memory. On the other hand, analytical and simulation models, such as [11, 12], capture the cache coherence traffic in detail, but here we concentrate on wormhole routing with virtual channels. Also, we wish to study the execution of real applications in this paper. In a related work [13] the performance of the multistage interconnection network in Cedar multiprocessor ....

J. Archibald and J.-L. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model," ACM Transactions on Computer Systems, vol. 4, no. 4, pp. 273--298, November 1986.


Comparative Evaluation of Latency Reducing and.. - Gupta, Hennessy.. (1991)   (103 citations)  (Correct)

....offset any performance gains expected from the use of parallelism. Techniques that can help to reduce or hide these latencies are essential for achieving high processor utilization. To cope with the large latencies, several different architectural techniques have been proposed. Coherent caches [3, 4, 18, 30] allow shared read write data to be cached and significantly reduce the memory latency seen by the processors. Relaxed memory consistency models [1, 5, 8] hide latency by allowing buffering and pipelining of memory references. Prefetching techniques [11, 16, 21, 23] hide the latency by bringing ....

....technique for reducing latencies in uniprocessors. Their use in multiprocessors, however, is complicated by the fact that the caches need to be kept coherent. While the coherence problem is easily solved for small bus based multiprocessors through the use of snoopy cache coherence protocols [4], the problem is much more complicated for large scale multiprocessors that use general interconnection networks. As a result, some existing large scale multiprocessors do not provide caches (e.g. BBN Butterfly [25]), others provide caches that must be kept coherent by software (e.g. IBM RP3 ....

[Article contains additional citation context not shown here]

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Trans. Comput. Syst., 4(4):273-298, 1986.


CICO: A Practical Shared-Memory Programming Performance Model - Larus, Chandra, Wood (1993)   (4 citations)  (Correct)

....2 CICO Shared Memory Performance Model Unlike message passing programs, communication in shared memory programs is not easily identifiable. In cache coherent, shared memory computers, interprocessor communication occurs when a memory reference misses in a cache and the hardware coherence protocol [2, 5] requests a copy of the referenced datum from another processor, which may cause outstanding copies to be invalidated. This mechanism offers several advantages: caches dynamically adapt to a program's reference pattern, the replicating data retains the same address as the original, and the ....

J. Archibald and J.-L. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Transactions on Computer Systems, 4(4):273-298, 1986.


Fine-Grain Distributed Shared Memory on Clusters of Workstations - Schoinas (1997)   (3 citations)  (Correct)

....which allows the implementation of shared memory coherence protocols as user level libraries. The Tempest interface has been specifically designed to support fine grain shared memory coherence protocols. Typically, such protocols are modeled after hardware based invalidate directory protocols [AB86]. They transfer data in small, cache block sized quantities. Moreover, they are blocking protocols, in which the computation is suspended on access violations until the protocol events (e.g. misses) have been processed. Therefore, Tempest requires low overhead messages to provide low latency ....

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4(4):273-298, 1986.


A Comparison of Software and Hardware Synchronization - Mechanisms For Distributed   (Correct)

....this by allowing shared data and synchronization protocols to be maintained using the protocol best suited to the way the programming is accessing the data. For example, data that is being accessed primarily by a single processor would likely be handled by a conventional write invalidate protocol [2], while data being heavily shared by multiple processes, such as global counters or edge elements in finite differencing codes, would likely be handled using a delayed write update protocol [3]. Similarly, locks will be handled using appropriate locking protocol in hardware, while more complex ....

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4(4):273--298, November 1986.


A Combination of Scalable Caching Methods for a Weakly.. - Zamanifar, Nash, Dew (1996)   (Correct)

....a caching system is the cache coherency problem. This relates to the fact that the copies of a variable resident in the caches of multiple processors must be invalidated when the value of variable is updated, in order to maintain the consistency of the system. Snoopy based cache coherency schemes [15, 3, 19] are limited to small scale multiprocessors, because of the limited bandwidth of the shared bus. Hardware solutions to the cache coherency problem for multiprocessors with point to point connections more commonly employs a directory based scheme [27, 2, 6, 18, 16]. Due to the increased complexity ....

J. Archibald and J. L. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Transactions on Computer Systems, 4(4):273--298, November 1986.


Automated Inductive Verification of Parameterized Protocols - Roychoudhury, Ramakrishnan (2001)   (1 citation)  (Correct)

....RISC #dirty # 2 6#8 503 146 Tree cache #bus with data # 2 9#9 178 18 Table 1. Summary of protocol verification results In the table, we have used the following notational shorthand: #s denotes the number of processes in local state s. Mesi and Berkeley RISC are single bus broadcast protocols [3, 11, 12]. Illinois is a single bus cache coherence protocol with global conditions which cannot be modeled as a broadcast protocol [8, 23]. Tree cache is a binary tree network which simulates the interactions between the cache agents in a hierarchical cache coherence protocol [27]. The running times of ....

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4, 1986.


A Methodology for Formal Design of Hardware Control with.. - Cindy Eisner Russ (2000)   (1 citation)  (Correct)

....Most cache coherence protocols depend on the ability of cache controllers to observe bus transactions of other processors in the system. This process of observation is known as snooping. The following explanation of the Illinois cache coherence protocol [PP84] is based on the explanation in [AB86]. Each cache holds a cache state per block. The cache state is one of: 1. Invalid: the data for this block is not cached 2. Valid Exclusive: the data for this block is valid, clean (identical to the data held in main memory) and is the only cached copy of the block in the system 3. Shared: the ....

J. Archibald, and JL. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model", ACM Transactions on Computer Systems, Vol. 4, No. 4, November 1986, pp. 273-298.


Automated Inductive Verification of Parameterized Protocols - Roychoudhury, Ramakrishnan (2001)   (1 citation)  (Correct)

....RISC #dirty 2 6:8 503 146 Tree cache #bus with data 2 9:9 178 18 Table 1. Summary of protocol verification results In the table, we have used the following notational shorthand: #s denotes the number of processes in local state s. Mesi and Berkeley RISC are single bus broadcast protocols [3, 11, 12]. Illinois is a single bus cache coherence protocol with global conditions which cannot be modeled as a broadcast protocol [8, 23]. Tree cache is a binary tree network which simulates the interactions between the cache agents in a hierarchical cache coherence protocol [27]. The running times of ....

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4, 1986.


Study on Kernel Level Scheduling in SSS-CORE: A General-Purpose.. - Nobukuni   (Correct)

....part of a process memory space. In addition, this makes much clear to what characteristics an application using the frequency table has as for memory reference. The disadvantage is that ordering of memory accesses cannot be expressed. Re computing the distribution of frequency at each time tick [1] requires too much computing power. The distribution of reference rate to pages within a memory space will not change during a simulation. The rate of access to local space and shared space is controlled by per process parameter. This way, distribution of frequency rate within both spaces can be ....

J. Archibald and J.-L. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Trans. on Computer System, Vol. 4, No. 4, pp. 273--298, November 1986.


Unknown - Implementing Coherent Memory   (Correct)

No context found.

Archibald, J. and Baer, J-L., Cache coherence protocols: Evaluation using a multiprocessor simulation model, ACM Trans. Computer Systems 4 (4), Nov. 1986, 273-298.


Emulation of a Virtual Shared Memory Architecture - Raina (1993)   (3 citations)  (Correct)

No context found.

J. Archibald and J. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Transactions on Computer Systems, 4:273--298, 1986.


Characterization of TCC on Chip-Multiprocessors - Austen Mcdonald Jaewoong (2005)   (Correct)

No context found.

J. Archibald and J. L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, pages 273--298, Nov. 1986.


Formal Automatic Verification of Cache Coherence in.. - Pong, Dubois (2000)   (2 citations)  (Correct)

No context found.

Archibald, J. and Baer, J.-L. "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model", ACM Trans. on Computer Systems, Vol. 4, No. 4, Nov. 1986, pp. 273-298.


A Comparison of Software and Hardware Synchronization.. - Carter, Kuo, Kuramkote (1996)   (9 citations)  (Correct)

No context found.

J. Archibald and J.-L. Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, November 1986.


A Study on Memory-Based Communications and Synchronization in.. - Matsumoto (2001)   (Correct)

No context found.

J. Archibald and J. L. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Trans. Computer Systems, 4(4):273--298, 1986.


A Memory Coherence Technique for Online Transient - Error Recovery Of   (Correct)

No context found.

Archibald, J., and J.-L. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model," ACM Trans. Computer Systems, pp. 273-298, Nov. 1986.


On the Automated Verification of Parameterized Concurrent.. - Delzanno   (Correct)

No context found.

P. A. Archibald, J. Baer. Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model. ACM Transactions on Computer Systems 4(4): 273-298. 1986.


CiteSeer.IST - Copyright Penn State and NEC