42 citations found.
D. Kroft, "Lockup-free instruction fetch/prefetch cache organization," Proceedings of the 8th Annual International Symposium on Computer Architecture, pp. 81-87, 1981.

This paper is cited in the following contexts:

Effective Compile-Time Analysis for Data Prefetching in Java - Cahoon (2002)

....high-performance architectures contain several simple hardware mechanisms for hiding memory hierarchy access costs. Early cache designs allowed only a single outstanding memory access to occur. Thus, all memory accesses stalled the processor until completed. Kroft introduced lockup-free caches [62] to enable multiple concurrent memory accesses. Lockup-free caches permit non-blocking loads that do not stall the processor until a future instruction references the data. Lockup-free caches require a mechanism, such as miss status handling registers (MSHRs), to maintain information about pending ....

David Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, pages 81--87, May 1981.


Data Locality Optimizations for Multigrid Methods on Structured.. - Weiß

....memory. This is a severe problem since the microprocessor could execute up to 200 floating-point instructions during that time. This problem is called latency problem. Researchers have developed several techniques, like software and hardware prefetching [CKP91, MLG92, CB94], non-blocking caches [Kro81, SF91], stream buffers [Jou90, PK94], multithread

Processor        Bandwidth     Out of Order   Cache (I / D / L2)
Sun Ultra 3      4.8 Gbyte/s   none           32 K / 64 K
Intel Pentium 4  3.2 Gbyte/s   126 ROPs       12 K / 8 K / 256 K
Alpha 21264B     2.7 Gbyte/s   80 instr       64 K / 64 K
....

D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organisation. In Proceedings of the 8th Annual International Symposium on Computer Architecture, pages 81--87, May 1981.


Improving the Performance of - Heterogeneous Dsms Via (2000)

....in terms of the size of hardware structures dedicated to ILP exploitation. The heterogeneous ILP parameters investigated in this paper are issue rate, instruction window size, number of arithmetic (ALU), floating-point (FPU), and address units, and maximum number of outstanding cache misses (MSHRs [9]). Heterogeneity in the memory subsystem is modeled in terms of the size and speed of caches. The HDSMs under study have three levels, with 2, 4 and 10 nodes in levels 1, 2 and 3, respectively. The machine is configured as a processor and memory hierarchy [1] the number of processing elements ....

Kroft, D. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Proc. 8 International Symposium on Computer Architecture, 1981.


Techniques Utilizing Memory Reference Characteristics for Improved .. - Wong

....DRAM core access latency continues to improve more slowly than increases in processor speed. Several microarchitecture techniques have been developed to hide or tolerate memory latency. Some of them are purely hardware based, such as lockup-free caches, multithreading, and value speculation [Kroft 81, Alverson 90, Lipasti 96]. These techniques can be quite successful for hiding small latencies such as those between the on-chip cache (L1) and a closely integrated second-level cache (L2). However, in a performance study of the Pentium Pro, Bhandarkar and Ding specifically point out that ....

D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Architecture, pages 81--87, May 1981.


Speculation-Based Techniques for Lockfree Execution of Lock-Based .. - Rajwar (2002)

....with blocking caches, a second request cannot be issued until the first outstanding request is serviced. In processors with non-blocking caches, subsequent requests (secondary misses) to a block that already has a request outstanding (the primary miss) for it, are merged with the primary miss [88]. We first reproduce the example shown above again in Figure 4 8. The coherence protocol chains for two cache blocks, A and B, are shown. The protocol chain for any coherence block is always rooted at a stable block; in the figure the stable state is the modified (M) state of the MOESI ....

David Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Proceedings of the Eighth Annual International Symposium on Computer Architecture, pages 81--87, May 1981.


Runahead Execution: An Alternative to Very Large.. - Mutlu, Stark.. (2003)   (9 citations)

....an identical machine with a 384-entry instruction window. 2. Related work Memory access is a very important long-latency operation that has concerned researchers for a long time. Caches [29] tolerate memory latency by exploiting the temporal and spatial reference locality of applications. Kroft [19] improved the latency tolerance of caches by allowing them to handle multiple outstanding misses and to service cache hits in the presence of pending misses. Software prefetching techniques [5, 22, 24] are effective for applications where the compiler can statically predict which memory ....

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, 1981.


Just Say No: Benefits of Early Cache Miss Determination - Memik, Reinman.. (2003)   (2 citations)

....of other caching structures such as the TLBs. 5. RELATED WORK Related work falls into a group of studies conducted for reducing the negative effects of cache misses. Arguably the most important technique to reduce cache miss penalty is the non-blocking caches, also called the lock-up free caches [7]. Non-blocking caches do not block after a cache miss, being able to provide data to other requests. Sohi and Franklin [13] discuss a multi-port non-blocking L1 cache. Farkas and Jouppi [3] explore alternative implementations of the non-blocking caches. Farkas et. al [4] studies the usefulness of ....

D. Kroft. Lock-up Free Instruction Fetch/Prefetch Cache Organization. In Proc. of 8 International Symposium on Computer Architecture, May 1981.


Single Region vs. Multiple Regions: A Comparison of Different.. - Hsu, Kremer (2002)   (2 citations)

....provides a cycle-accurate simulation environment for a modern out-of-order superscalar processor with 5-stage pipelines and fairly accurate branch prediction mechanism. The memory extensions model the limitedness of non-blocking caches through finite miss status holding registers (MSHRs) [12]. Bus contention and arbitration at all levels are also taken into account. Table 1 gives the simulation parameters used in the experiments. The DVS extensions introduce a new speed-setting instruction. The speed-setting instruction takes as argument an integer that specifies the desired CPU ....

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th International Symposium on Computer Architecture, pages 81--87, May 1981.


Sharing Speculation: A Mechanism for Low-Latency.. - Desikan, Huh, Burger, .. (2003)   Self-citation (May)

....value of its result to all the consumers. The version numbers and the commit bits enable only version numbers and commit bits to be forwarded inside the grid, when the speculation is correct, instead of actual data values. Hence, to implement sharing speculation, the caches or the MSHRs [8] in the system would need logic to use the selective re-execution mechanism implemented in the GPA, to inject speculative values into the processor. It is worth noting that if mis-speculation recovery overhead is sufficiently low, as in the GPA, then it is always better to speculate, since waiting ....

David Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the Eighth International Symposium on Computer Architecture, pages 81--87, May 1981.


Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)

No context found.

D. Kroft, "Lockup-free instruction fetch/prefetch cache organization," Proceedings of the 8th Annual International Symposium on Computer Architecture, pp. 81-87, 1981.


Cache Refill/Access Decoupling for Vector Machines - Batten, Krashinsky.. (2004)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In ISCA-8, pages 81--87, May 1981.


Cache Filtering Techniques to Reduce the Negative Impact of .. - Onur Mutlu Hyesoon (2004)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th Intl. Symposium on Computer Architecture, pages 81--87, 1981.


Analytic Evaluation of - Shared-Memory Architectures Daniel

No context found.

D. Kroft, "Lockup-Free Instruction Fetch/Prefetch Cache Organization," Proc. Eighth Int'l Symp. Computer Architecture, pp. 81-87, May 1981.


Next-Generation Memory Systems - Wang (2004)

No context found.

David Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the Eighth International Symposium on Computer Architecture, pages 81--87, May 1981.


Latency Tolerant Architectures - Bennett (1998)   (2 citations)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proc. Eighth Symposium on Computer Architecture, pages 81--87, May 1981.


Cache-Conscious Allocation of Pointer-Based Data.. - Hallberg, Palm, Brorsson (2003)

No context found.

David Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the Eighth International Symposium on Computer Architecture, pages 81--87. ACM, SIGARCH, May 1981.


Permission to Make Digital Or Hard Copies of All Or Part.. - Personal Or Classroom

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the Eighth Annual International Symposium on Computer Architecture, pages 81--87, May 1981.


Improving the Speed vs. Accuracy Tradeoff for Simulating.. - Durbhakula (1998)

No context found.

D. Kroft. Lockup-free Instruction Fetch/Prefetch Cache Organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, 1981.


Improving Latency Tolerance of Multithreading through - Decoupling Joan-Manuel..

No context found.

D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Proc. of the 8th. Int. Symp. on Comp. Architecture. May 1981, pp 81-87.


Hardware Optimizations Enabled by a Decoupled Fetch Architecture - Reinman (2001)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In 8th Annual International Symposium of Computer Architecture, pages 81--87, May 1981.


Memory Dependence Prediction - Andreas Ioannis Moshovos

No context found.

D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Proc. ISCA-8, May 1981.


Efficient Remapping Mechanisms for an Adaptable Memory System - Zhang (2002)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In pages 81-87, Honolulu, Hawaii, May 1981.


Coherence Buffer: An Architectural Support for.. - Sarojadevi, Nandy.. (2002)

No context found.

David Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, pages 81-87, May 1981.


Data Memory Alternatives for Multiscalar Processors - Scott Breach Vijaykumar (1997)   (4 citations)

No context found.

D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, pages 81--87, 1981.


Performance Implication of Fine-Grained Synchronization in .. - Merino, Vlassov, al.

No context found.

Kroft, D.: "Lockup-Free Instruction Fetch/Prefetch Cache Organization", 25 years of the International Symposia on Computer Architecture (selected papers), Association for Computing Machinery, August 1998, pages 20-21

CiteSeer.IST - Copyright Penn State and NEC