14 citations found.
S. L. Min and J. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992. 37

This paper is cited in the following contexts:
Compiling Techniques for Improving Decoupled Virtual Shared Memory.. - Zhu   (Correct)

....46, 36, 72, 65, 75] for systems with a broadcast medium such as a bus interconnection network have been proposed. For more scalable multiprocessors with general interconnection network between processors, directory based protocols [2, 4, 14, 44, 73, 86] and compiler assisted software protocols [20, 51, 57, 79] have been suggested. Recently, dynamically tagged directory protocols [15, 41, 55, 56] have evolved from previous directory based schemes. Snoopy Protocols Snoopy protocols are also called bus based protocols. All processors in the system can observe any memory access by snooping on the bus. ....

S. L. Min and J. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992. 37


Cache Coherence Using Local Knowledge - Darnell, Kennedy (1993)   (16 citations)  (Correct)

....knowledge at run time. If no global knowledge is shared at run time, then coherence must rely on locally collected knowledge plus whatever global knowledge was collected at compile time. Previous local knowledge strategies have been referred to in the literature as software [4, 6] or hardware [14] strategies depending on whether most of the work was done in software or hardware. We consider all of these strategies to be similar and refer to them collectively as local strategies. A local strategy will likely never result in a globally optimal hit ratio for a processor because some useful ....

....are also lost to inter epoch reuse. The use of an epoch bit alone would not suffice to prevent this. For PEI to achieve good results, arrays, or in the worst case, each dimension of an array, must occupy an amount of memory equal to a power of two. 3.4 Time Stamping Time stamping (TS) strategies [4, 14] are more effective at preserving reuse than any of the previously mentioned strategies. For a given quality of compiler analysis, it is impossible to achieve a better hit ratio with any other local strategy. The trade off is that they require several extra bits per cache line, extra bits per ....

[Article contains additional citation context not shown here]

S. Min and J. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, Jan. 1992.


Compiler Support for the Efficient Use of Cache Coherence.. - Trung Nguyen   (Correct)

....using either a hardware or software mechanism [21]. Several schemes [12, 16, 30, 34, 36] have been proposed for bus based systems. For more scalable systems that use general interconnection networks, directory based hardware schemes [2, 3, 5, 15, 35, 38] and compiler assisted software schemes [7, 18, 17, 25, 37] have been suggested. Recently, several authors have proposed dynamically tagged directories [6, 14, 22, 23, 24, 30] in which pointers to processors with a copy of a memory block are allocated only when the block is actually cached. These directories maintain a cache of pointers in each memory ....

....implemented our algorithm for marking scalar references only. All array references are conservatively marked as needing a directory pointer. We also briefly discuss how to adapt our algorithm to mark array references, given array data flow information. 1.1 Background Software coherence schemes [7, 18, 17, 25, 37] analyze the source program at compile time to predict memory reference behavior and potential cache incoherence. Software schemes typically insert special instructions in the program to invalidate a cache line before it is referenced, or to clear the whole cache at appropriate times. However, due ....

S. L. Min and J. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Eliminating Stale Data References through Array Data-Flow Analysis - Choi, Yew (1996)   (Correct)

....in many research machines [13, 17, 18] the directory based hardware coherence protocols have been studied to enforce the cache coherence. However, those protocols require complex and expensive hardware cache directory controllers. Several compiler directed coherence schemes have been proposed [7, 11, 19, 20]. In this approach, cache coherence is maintained locally without directory hardware, avoiding the complexity and the overhead associated with the hardware directories. Although the performance of such schemes have been demonstrated through simulations [6, 19, 20], most of those studies assume ....

....schemes have been proposed [7, 11, 19, 20]. In this approach, cache coherence is maintained locally without directory hardware, avoiding the complexity and the overhead associated with the hardware directories. Although the performance of such schemes have been demonstrated through simulations [6, 19, 20], most of those studies assume either perfect compile time analysis or analytic models [1] without real compiler implementations. It is still unknown how effectively the compiler can detect potential stale references and what kind of performance can be obtained by using a real compiler. In this ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Performance Evaluation of the Late Delta Cache.. - de Supinski.. (1996)   (Correct)

....the models proposed. Much research has been conducted using simulations to evaluate cache coherence protocols directly. Researchers have used both synthetic workloads, generally based on the workload model developed by Archibald and Baer [ArB86], and memory traces to generate a reference stream [EgK88, BMR89, MiB92, ASH88]. Still others have simulated an entire computer system in software [CKA91] or used the execution driven approach, where actual programs are run on the host computer and memory references are trapped to a memory hierarchy simulation as necessary [DGH90, CDK94]. We have evaluated the late delta ....

Min, S.L. and J.L. Baer, "Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps," IEEE Transactions on Parallel and Distributed Systems, Vol. 3, No. 1, pp.25-44, 1992.


Hardware And Compiler Support For Cache Coherence In Large-Scale.. - Choi (1996)   (5 citations)  (Correct)

....time analysis to detect possible stale data accesses and to invalidate stale cache entries. Although the performance of such schemes has been demonstrated through simulations, most of the studies assume either perfect compile time analysis or analytical models without real compiler implementations [1, 10, 27, 38, 40, 41]. It is still unknown how effectively the compiler can detect potentially stale references and what kind of performance can be obtained using a real compiler. Also, most of the compiler directed coherence schemes proposed to date have not addressed the real cost of the required hardware ....

....support. We call them hardware supported compiler directed (HSCD) coherence schemes, which is distinctly different from a pure hardware directory scheme or a pure software scheme. Several studies have compared the performance of directory schemes and some recent HSCD schemes. Min and Baer [40] compared the performance of a directory scheme and a timestamp based scheme, assuming infinite cache size and single word cache lines. Lilja [37] compared the performance of the version control scheme [15] and directory schemes, and analyzed the directory overhead of several implementations. Both ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


A Timestamp-based Selective Invalidation Scheme for.. - Yuan, Melhem, Gupta   (Correct)

.... software controlled cache coherence schemes, the methods based on the concept of timestamps are more effective in preserving cache lines across task boundaries than other software methods [4, 9]. The early timestamp based methods, such as the version control method [3] and the timestamp method [10], use an explicit timestamp table to store the current version number for each variable. They also use an additional field in each cache line to store the version number of the cache line. Although the cache performance of these methods approaches that of the hardware schemes, the maintenance of ....

....to increase the CVNs. The cache line is valid only if its BVN is bigger than or equal to the corresponding variable's CVN. The version control method is effective in preserving the reuse of cache lines. The major limitation of this method is the hardware and runtime overhead. The timestamp method [10] is not discussed here because it is similar to the version control method. The TS1 Method In TS1, one additional bit, referred to as the epoch bit, is required for each cache line. The compiler determines the levels and the variables modified in each level. The epoch bit is reset at the end of ....

S. Min and J. Baer. "Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps." IEEE Trans. on Parallel and Dist. Systems, 3(1):25-44, Jan. 1992.
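
The validity rule quoted in the Yuan, Melhem, and Gupta excerpt above (a cache line is valid only if its BVN is greater than or equal to the corresponding variable's CVN) can be illustrated with a minimal sketch. The excerpt does not expand the abbreviations, so the code keeps BVN and CVN as opaque integer version numbers. The class name, field names, and table layout are hypothetical choices for illustration, not taken from the cited papers.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    variable: str  # hypothetical: name of the variable this line caches
    bvn: int       # version number stored in the cache line's extra field


def is_valid(line, cvn_table):
    """Apply the quoted rule: valid only if BVN >= the variable's CVN.

    cvn_table is an assumed dict mapping each variable name to its
    current version number (the "explicit timestamp table" in the excerpt).
    """
    return line.bvn >= cvn_table[line.variable]


# Illustrative values only.
cvn_table = {"A": 3, "B": 5}
fresh = CacheLine(variable="A", bvn=3)  # BVN equals CVN, so the line is reusable
stale = CacheLine(variable="B", bvn=4)  # BVN is behind CVN, so the line is stale
```

Here `is_valid(fresh, cvn_table)` holds and `is_valid(stale, cvn_table)` does not. How and when the version numbers are advanced is outside what the excerpt states and is not modeled.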


Hardware and Compiler-Directed Cache Coherence in Large-Scale.. - Choi, Yew (1996)   (1 citation)  (Correct)

....compiler support. We call them hardware supported compiler directed (HSCD) coherence schemes, which is distinctly different from a pure hardware directory scheme and a pure software scheme. Several studies have compared the performance of directory schemes and some recent HSCD schemes. Min and Baer [31] compared the performance of a directory scheme and a timestamp based scheme assuming infinite cache size and single word cache lines. Lilja [28] compared the performance of the version control scheme [13] with directory schemes, and analyzed the directory overhead of several implementations. Both ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Compiler and Hardware Support for Cache Coherence in Large-Scale .. - Choi, Yew (1996)   (3 citations)  (Correct)

....for these data. Furthermore, data movement instructions are provided so that the programmer can explicitly move data between the cluster and global memories. By using these software mechanisms, coherence can be maintained for globally shared data. Several compiler directed cache coherence schemes [5, 6, 7, 9, 11, 15] have been recently proposed. These schemes give better performance, but demand more hardware and compiler supports than the previous schemes. They require a more precise program analysis to maintain coherence on a reference basis [5, 9] instead of a program region basis compared to the previous ....

....support. We call them hardware supported compiler directed (HSCD) coherence schemes, which is distinctly different from a pure hardware directory scheme and a pure software scheme. Several studies have compared the performance of directory schemes and some recent HSCD schemes. Min and Baer [15] compared the performance of a directory scheme and a timestamp based scheme assuming an infinite cache size and single word cache lines. Lilja [14] compared the performance of the version control scheme [6] with directory schemes, and analyzed the directory overhead of several implementations. ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Software Caching and Computation Migration in Olden - Carlisle, Rogers (1995)   (52 citations)  (Correct)

....sees the most recently completed write to that location. Sequential consistency may be provided by the underlying system, or guaranteed by having invalidations in the code. A coherence mechanism that requires no communication is called a local knowledge scheme. Prior local knowledge schemes (e.g. [12, 15, 31]) have relied on the compiler specifically inserting code to invalidate the local cache. We implement a local knowledge scheme in Olden using the runtime system, by having each processor invalidate its entire cache upon receiving a migration. Since Olden's synchronization mechanisms require that ....

....and Scott [25] implement release consistency using the operating system's virtual memory mechanisms. Most papers that use the term software coherence have referred to the insertion of invalidation instructions by a compiler. Darnell and Kennedy [15], Cheong and Veidenbaum [12], and Min and Baer [31] propose a variety of local coherence mechanisms for FORTRAN. A comparison of software and hardware schemes was done by Adve et al. [1]. Olden's coherence scheme is an adaptation of these ideas to our programming model. We obtain relaxed consistency by performing coherence events only at ....

S. Min and J. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Trans. on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Eliminating Stale Data References through Array Data-Flow Analysis - Choi, Yew (1996)   (Correct)

....the usefulness of caches for remote memory access. For example in Cray T3D, lack of cache coherence mechanism forces each cache line loaded by a remote read not to be cached (by an uncacheable load instruction) or to be flushed [16]. Several compiler directed coherence schemes have been proposed [6, 8, 9, 12, 18]. In this approach, cache coherence is maintained locally without directory hardware, avoiding the complexity and the overhead associated with the hardware directories. Although the performance of such schemes have been demonstrated through simulations, most of those studies assume either perfect ....

....avoiding the complexity and the overhead associated with the hardware directories. Although the performance of such schemes have been demonstrated through simulations, most of those studies assume either perfect compile time analysis or analytic models without real compiler implementations [1, 18]. It is still unknown how effectively the compiler can detect potential stale references and what kind of performance can be obtained by using a real compiler. In this paper, we develop and implement a compiler algorithm on Polaris [19] parallelizing compiler to test the feasibility and the ....

S. L. Min and J.-L. Baer. Design and analysis of a scalable cache coherence scheme based on clocks and timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Compiler Analysis for Cache Coherence: Interprocedural Array.. - Choi, Yew (1996)   (1 citation)  (Correct)

....analysis to detect possible stale data accesses and to invalidate stale cache entries. Although the performance of such schemes have been demonstrated through simulations, most of those studies assume either perfect compile time analysis or analytical models without real compiler implementations [1, 6, 17, 23, 25, 26]. It is still unknown how effectively the compiler can detect potential stale references and what kind of performance can be obtained by using a real compiler. In this paper, we develop and implement both intraprocedural and interprocedural compiler algorithms on the Polaris parallelizing compiler ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Compiler and Hardware Support for Cache Coherence in Large-Scale .. - Choi, Yew (1996)   (3 citations)  (Correct)

....compiler support. We call them hardware supported compiler directed (HSCD) coherence schemes, which is distinctly different from a pure hardware directory scheme and a pure software scheme. Several studies have compared the performance of directory schemes and some recent HSCD schemes. Min and Baer [29] compared the performance of a directory scheme and a timestamp based scheme assuming infinite cache size and single word cache lines. Lilja [26] compared the performance of the version control scheme [14] with directory schemes, and analyzed the directory overhead of several implementations. Both ....

S. L. Min and J.-L. Baer. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps. IEEE Transactions on Parallel and Distributed Systems, 3(1):25--44, January 1992.


Comparison of Hardware and Software Cache Coherence Schemes - Adve, Adve, Hill, Vernon (1991)   (37 citations)  (Correct)

....outcomes, while the hardware takes responsibility for ensuring correctness when a prediction is wrong. Although we have specifically modeled the scheme described by Cytron et al., we believe our results apply equally to the Fast Selective Invalidation scheme [ChV88] and to the timestamp based [MiB90b] and version control schemes [Che90]. The Fast Selective Invalidation scheme has been shown to be very similar to the Cytron et al. scheme in terms of compile time analysis and exploiting temporal locality. The timestamp based and version control schemes have been shown to perform better than the ....

S. L. MIN and J. BAER, Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps, Submitted for Publication, 1990.

CiteSeer.IST - Copyright Penn State and NEC