Abraham Mendlson, Shlomit S. Pinter, and Ruth Shtokhamer. Compile time instruction cache optimizations. In Compiler Construction, pages 404--418, April 1994.
....the relative weight of the fall through and taken paths of a branch. In pack, the algorithm naively packs frequently executed basic blocks. All of the above algorithms utilize execution frequencies and/or transitional counts to guide code repositioning. Two other approaches discussed in [78] and [79] reorganize code based on compile time information. In [79], code replication is performed based on the structure of the control flow graph (augmented with loop and procedure call information). The graph is partitioned into subgraphs, smaller or equal in size to the cache, using heuristics. In ....
....a branch. In pack, the algorithm naively packs frequently executed basic blocks. All of the above algorithms utilize execution frequencies and/or transitional counts to guide code repositioning. Two other approaches discussed in [78] and [79] reorganize code based on compile time information. In [79], code replication is performed based on the structure of the control flow graph (augmented with loop and procedure call information). The graph is partitioned into subgraphs, smaller or equal in size to the cache, using heuristics. In [78], a similar approach to [52] is presented, where a call ....
A. Mendlson, S. Pinter, and R. Shtokhamer. Compile-Time Instruction Cache Optimizations. In Proceedings of the International Conference on Compiler Construction, pages 404--418, April 1994.
....[Hei94b, Hei94a] while others take into account the branch prediction architecture of the hardware [CG94]. McFarling [McF91] has investigated the use of cache parameters in selecting procedures to be inlined. Mendlson et al. have shown how to avoid conflict misses of instructions in loops [MPS94]. Improvements to the data cache performance of programs have primarily been limited to scientific code that operates on loops. Many of these investigations focus on analysis of data utilization to guide program transformations, particularly on loops, to improve data locality [ASKL81, GJG88, ....
Abraham Mendlson, Shlomit S. Pinter, and Ruth Shtokhamer. Compile time instruction cache optimizations. ACM Computer Architecture News, 22(1):44--51, March 1994.
....eliminate first generation cache conflicts, even when the popular subgraph size is larger than the instruction cache, by using the color mapping and the unavailable set of colors. Mendlson et al. also examined using static information (i.e. locations of program loops) to avoid cache conflicts [19]. Their work was at the instruction level of a program and both replicates and rearranges code in order to provide an improved program layout. They use the concept of an abstract cache to remap program segments. They applied their algorithm to the gzip program, and obtained a 95% reduction in the ....
A. Mendlson, S.S. Pinter, and R. Shtokhamer. Compile time instruction cache optimizations. Computer Architecture News, pages 44--51, 1993.
....Chains are formed by merging nodes, laying them out next to each other until the entire graph is processed. A number of related techniques have been proposed, focusing on mapping loops [6], operating system code [10], traces [9], and activity sets [7]. Two other approaches discussed in [8] and [11] reorganize code based on compile time information. The profile guided algorithms described above use calling frequencies to weight a graph and guide placement [3], [6], [9], [10]. Our first approach also uses calling frequencies but improves performance by intelligently placing procedures in the ....
....in spirit with the approach described in [5], with some differences that will be highlighted in Section V. Also, our graph coloring algorithm works at a finer level of granularity (cache line size instead of cache size [6], [11]) and can avoid conflicts encountered when either forming chains with the closest is best heuristic [3] or dealing with subgraphs having a size larger than the cache. This paper is organized as follows. In Sections II and III we describe our graph construction algorithms. Section II describes an ....
A. Mendlson, S. Pinter, and R. Shtokhamer, "Compile-Time Instruction Cache Optimizations," in Proceedings of the International Conference on Compiler Construction, April 1994, pp. 404--418.
....reordering. McFarling examined improving instruction cache performance by not caching infrequently used instructions and by performing code reordering compiler optimizations [18]. Mendlson et al. also examined using static information (i.e. locations of program loops) to avoid cache conflicts [19]. Their work was at the instruction level of a program and both replicates and rearranges code in order to provide an improved program layout. Torrellas, Xia and Daigle [28] (TXD) also described an algorithm for code layout for operating system intensive workloads. Their work takes into ....
A. Mendlson, S.S. Pinter, and R. Shtokhamer. Compile time instruction cache optimizations. Computer Architecture News, pages 44--51, 1993.
....section). If our scheme was implemented over code organized in descending order of usage, the improvement would have been much higher than what we obtained here. 4 Other Related Research. Optimizations based on compile time information have been observed to be very effective in the past [9], [31], [30], [41], [42], [47]. If the compiler can find instructions (or groups of instructions) that need to be in the cache at the same time, the performance can be improved by placing the instructions so that they do not map into the same cache block. Chang et al. [9] developed a technique based on ....
....cache hit ratio. McFarling's technique uses a profile of the conditional, loop and routine structure of the program. McFarling places the basic blocks in such a way that callers of routines, loops, and conditionals do not interfere with the callee routines or their descendants. Mendlson et al. [30] employ code replication based on static information to eliminate conflicts. Temam and Drach [41] used simple software information on the temporal/spatial locality of array references, as provided by current data locality optimizing algorithms, to significantly increase cache performance. ....
A. Mendlson, S. Pinter, and R. Shtokhamer, "Compile Time Instruction Cache Optimizations," in Computer Architecture News, pp. 44--51, March 1994.
....the remaining portion of X) placing X1 with respect to Y is only constrained by the size of X1, and the same is true for placing X2 with respect to Z. 7. Comparison to previous work. There has been a considerable amount of work done on code positioning for improved instruction cache performance [2, 3, 4, 6, 7, 8, 9, 10, 11]. We next discuss some of this work, as it relates to our algorithm. In [9], McFarling proposes a basic block remapping algorithm which captures control flow in the form of a Directed Acyclic Graph. The algorithm partitions the graph, paying special attention to loop nodes, with the goal of ....
....with sizes smaller than the cache (activity sets). A search function is used to guide an iterative process to find a minimum of the cost function. Their search function is chosen so that a large number of combinations are covered on each experiment. Two other approaches discussed in [3] and [10] reorganize code based on compile time information. In [10], code replication is performed based on the control flow of the graph (augmented with loop and procedure call information). The authors also attempt to partition the graph into subgraphs, smaller or equal in size to the cache using ....
A. Mendlson, S. Pinter, and R. Shtokhamer. Compile-time instruction cache optimizations. In Proceedings of the Int. Conference on Compiler Construction, pages 404--418, April 1994.
No context found.
Abraham Mendlson, Shlomit S. Pinter, and Ruth Shtokhamer. Compile time instruction cache optimizations. In Compiler Construction, pages 404--418, April 1994.
No context found.
A. Mendlson, S. Pinter, and R. Shtokhamer, "Compile Time Instruction Cache Optimizations," in Computer Architecture News, pp. 44--51, March 1994.