A.H. Hashemi, D.R. Kaeli, and B. Calder. Procedure Mapping using Static Call Graph Estimation. In Proceedings of the Workshop on Interaction between Compiler and Computer Architecture, February 1997.
....on the relative weight of the fall through and taken paths of a branch. In pack, the algorithm naively packs frequently executed basic blocks. All of the above algorithms utilize execution frequencies and/or transitional counts to guide code repositioning. Two other approaches discussed in [78] and [79] reorganize code based on compile time information. In [79], code replication is performed based on the structure of the control flow graph (augmented with loop and procedure call information). The graph is partitioned into subgraphs, smaller or equal in size to the cache, using ....
....reorganize code, based on compile time information. In [79], code replication is performed based on the structure of the control flow graph (augmented with loop and procedure call information). The graph is partitioned into subgraphs, smaller or equal in size to the cache, using heuristics. In [78], a similar approach to [52] is presented, where a call graph is constructed statically and weighted based on program estimation. The weighting process takes into account loops and recursive calls. A cache line coloring algorithm, similar to the one introduced in [52], is used to guide procedure ....
A.H. Hashemi, D.R. Kaeli, and B. Calder. Procedure Mapping using Static Call Graph Estimation. In Proceedings of the Workshop on Interaction between Compiler and Computer Architecture, February 1997.
....is applied only to popular procedures, but could be easily extended for the remaining procedures in the program. Basic block reordering requires weighting the control flow edges of a procedure. We use profile data to perform weighting, but compile time generated edge weights could also be used [8]. Modifying the order of basic blocks requires a fix up step to guarantee correct program semantics. Since we are simulating the reordering step, we have detected all different cases and marked every basic block with a label. The label uniquely defines each case or denotes that no fix up was ....
A. Hashemi, D. Kaeli, and B. Calder. Procedure Mapping using Static Call Graph Estimation. In Proceedings of the Workshop on Interaction between Compiler and Computer Architecture, February 1997.
.... We show how we can use them to compare the temporal content between the CGO, CMG and the Temporal Relationship Graph (TRG) as described by Gloy et al. in [5]. There has been a considerable amount of work done on code repositioning for improved instruction cache performance [3], [5], [6], [7], [8], [9], [10]. In the following section we discuss some of this work, as it relates to our work here. A. Related Work Pettis and Hansen [3] employ procedure and basic block reordering as well as procedure splitting based on frequency counts to minimize instruction cache conflicts. The layout of a ....
....strategy. Chains are formed by merging nodes, laying them out next to each other until the entire graph is processed. A number of related techniques have been proposed, focusing on mapping loops [6], operating system code [10], traces [9], and activity sets [7]. Two other approaches discussed in [8] and [11] reorganize code based on compile time information. The profile guided algorithms described above use calling frequencies to weight a graph and guide placement [3], [6], [9], [10]. Our first approach also uses calling frequencies but improves performance by intelligently placing ....
A.H. Hashemi, D.R. Kaeli, and B. Calder, "Procedure Mapping using Static Call Graph Estimation," in Proceedings of the Workshop on Interaction between Compiler and Computer Architecture, February 1997.
....the remaining portion of X) placing X1 with respect to Y is only constrained by the size of X1, and the same is true for placing X2 with respect to Z. 7. Comparison to previous work There has been a considerable amount of work done on code positioning for improved instruction cache performance [2, 3, 4, 6, 7, 8, 9, 10, 11]. We next discuss some of this work, as it relates to our algorithm. In [9] McFarling proposes a basic block remapping algorithm which captures control flow in the form of a Directed Acyclic Graph. The algorithm partitions the graph, paying special attention to loop nodes, with the goal of ....
....modules with sizes smaller than the cache (activity sets). A search function is used to guide an iterative process to find a minimum of the cost function. Their search function is chosen so that a large number of combinations are covered on each experiment. Two other approaches discussed in [3] and [10] reorganize code based on compile time information. In [10], code replication is performed based on the control flow of the graph (augmented with loop and procedure call information). The authors also attempt to partition the graph into subgraphs, smaller or equal in size to the cache ....
A. Hashemi, D. R. Kaeli, and B. Calder. Procedure Mapping using Static Call Graph Estimation. In Proceedings of the Workshop on Interaction between Compiler and Computer Architecture, February 1997.