8 citations found.
R. Gupta and C.-H. Chi, "Improving Instruction Cache Behavior by Reducing Cache Pollution," Proceedings of the

This paper is cited in the following contexts:
Optimizing the Instruction Cache Performance of the.. - Torrellas, Xia, Daigle (1995)   (38 citations)

....caches, the caching behavior of systems code needs to be understood better and improved. Improving the performance of instruction caches has been addressed by many researchers. It has been shown that it is feasible to reduce the misses in applications by improving the layout of the code in memory [1, 12, 13, 15, 16, 17, 18, 21, 22, 24]. The techniques proposed are based on repositioning or replicating code at the procedure or basic block level, usually to reduce cache conflicts or utilize the cache lines better. In most cases, these techniques perform quite well, speeding up applications by 5-30% or even more. The schemes where ....

....of code that gets repositioned or replicated is the procedure [1, 24] tend to be less effective. This is because a procedure has parts that are invoked frequently and parts that are not. As a result, it is not the optimal unit to handle. Instead, the schemes that move or replicate basic blocks [12, 13, 15, 16, 18, 21, 22] are the most effective ones. Among these schemes, McFarling's technique [16] uses a profile of the conditional, loop, and routine structure of the program. With this information, he places the basic blocks so that callers of routines, loops, and conditionals do not interfere with the callee ....

[Article contains additional citation context not shown here]

R. Gupta and C.-H. Chi. Improving Instruction Cache Behavior by Reducing Cache Pollution. In Proceedings of Supercomputing 1990, pages 82--91, November 1990.


Compile Time Instruction Cache Optimizations - Abraham Mendlson Shlomit (1994)   (9 citations)

....into the cache instructions either when they are used only once before being purged from the cache, or because they might conflict with other code in the loop. Such instructions are left in the main memory (non-cachable). The effect of inlining procedures on the performance is discussed in [9]. In [4], code fragments are repositioned so that the cache line will not be polluted (will not contain instructions that are never referenced). All of these techniques were found to be successful mostly for small caches. Our algorithm works in two phases. In the first one, instructions executed in a loop ....

R. Gupta and C. Chi-Hung. Improving instruction cache behavior by reducing cache pollution. In Proc. of Supercomputing '90, pages 82--91, New-York, November 1990.


SPAID: Software Prefetching in Pointer- and.. - Lipasti, Schmidt.. (1995)   (38 citations)

.... Efforts to improve instruction cache behavior of programs have their roots in methods to improve paging behavior of main memory [HG71, Fer74, Har88]. A popular area of research has been repositioning of code sections by the compiler, both at the basic block level and the procedure level [HC89, GC90, PH90, CMH91, Wu92]. Some such methods operate on the executable after link time, allowing intermingling of basic blocks from different procedures [Hei94b, Hei94a], while others take into account the branch prediction architecture of the hardware [CG94]. McFarling [McF91] has investigated the use ....

Rajiv Gupta and Chi-Hung Chi. Improving instruction cache behavior by reducing cache pollution. In Proceedings of Supercomputing '90, pages 82--91, New York, November 1990.


Analysis of Techniques to Improve Protocol Processing.. - Mosberger, Peterson.. (1996)   (43 citations)

....the gaps introduced by the micro-positioning approach do cost extra i-cache bandwidth. We have not found a single instance where aligning function entry points or similar gap-introducing techniques would have improved end-to-end latency. This is in stark contrast with the findings published in [GC90] where i-cache optimization focused on functions with a very high degree of locality. So it may be that micro-positioning suffers because of the i-cache bandwidth wasted on loading gaps. Third, the DEC 3000/600 workstations used in the experiments employ a large second-level cache. It may be the ....

Rajiv Gupta and Chi-Hung Chi. Improving instruction cache behavior by reducing cache pollution. In Proceedings Supercomputing '90, pages 82--91. IEEE, 1990.


SCOUT: A Path-Based Operating System - Mosberger (1997)   (16 citations)

....do cost extra memory bandwidth. This hypothesis is corroborated by the fact that we have not found a single instance where aligning function entry points or similar gap-introducing techniques would have improved end-to-end latency. Note that this is in stark contrast with the findings published in [43], where i-cache optimization focused on functions with a very high degree of locality. So it may be that micro-positioning suffers because of the memory bandwidth wasted on loading gaps. Third, the DEC 3000/600 workstations used in the experiments employ a large second-level cache. It may be the ....

Rajiv Gupta and Chi-Hung Chi. Improving instruction cache behavior by reducing cache pollution. In Proceedings Supercomputing '90, pages 82--91, New York, NY, November 1990. IEEE.


Protocol Latency: MIPS and Reality - David Mosberger (1995)   (2 citations)

....the gaps introduced by the micro-positioning approach do cost extra i-cache bandwidth. We have not found a single instance where aligning function entry points or similar gap-introducing techniques would have improved end-to-end latency. This is in stark contrast with the findings published in [GC90], where i-cache optimization focused on functions with a very high degree of locality. Third, the DEC 3000/600 workstations used in the experiments employ a large second-level cache. It may be the case that the initial i-cache misses also missed in the second-level cache. On the other hand, ....

Rajiv Gupta and Chi-Hung Chi. Improving instruction cache behavior by reducing cache pollution. In Proceedings Supercomputing '90, pages 82--91. IEEE, 1990.


Analysis of Techniques to Improve Protocol Processing Latency - Mosberger (1996)   (43 citations)

....do cost extra memory bandwidth. This hypothesis is corroborated by the fact that we have not found a single instance where aligning function entry points or similar gap-introducing techniques would have improved end-to-end latency. Note that this is in stark contrast with the findings published in [11], where icache optimization focused on functions with a very high degree of locality. So it may be that micro-positioning suffers because of the memory bandwidth wasted on loading gaps. Third, the DEC 3000/600 workstations used in the experiments employ a large second-level cache. It may be the ....

R. Gupta and C.-H. Chi. Improving instruction cache behavior by reducing cache pollution. In Proceedings Supercomputing '90, pages 82--91. IEEE, 1990.


Design of Trace Caches for High Bandwidth Instruction Fetching - Sung

No context found.

R. Gupta and C.-H. Chi, "Improving Instruction Cache Behavior by Reducing Cache Pollution," Proceedings of the
