P. A. Steenkiste and J. L. Hennessy, `A simple interprocedural register allocation algorithm and its effectiveness for LISP', ACM TOPLAS, 11, (1), 1--32 (1989).


This paper is cited in the following contexts:


Light Weight Optimizations for Reducing Hot Saves and.. - Of Callee-Saved Registers   (Correct)

....assigns global variables to dedicated registers and assigns local variables, that are not referenced by concurrent paths, to the same registers according to their liveness in the call and control flow graphs of the program. This method requires a large number of registers. Steenkiste and Hennessy [18] show a simpler inter-procedural analysis that allocates registers for procedures using a depth-first traversal on the call graph of the program. Chow [1] presents a one-pass inter-procedural register allocation scheme by using bottom-up processing of the procedures call graph as in [18]. Several ....

....Hennessy [18] show a simpler inter-procedural analysis that allocates registers for procedures using a depth-first traversal on the call graph of the program. Chow [1] presents a one-pass inter-procedural register allocation scheme by using bottom-up processing of the procedures call graph as in [18]. Several post-link tools such as ALTO [13] and SPIKE [2, 3] also try to attack this problem. Similar to [1], they distinguish between rarely (cold) and frequently (hot) executed paths within each function and try to avoid execution of instructions in the hot paths at the expense of the colder ....

P. Steenkiste and J. Hennessy, "A Simple Interprocedural Register Allocation Algorithm and its Effectiveness for LISP", ACM Transactions on Programming Languages and Systems, Volume 11, Number 1, pp. 1-32, January 1989.


Performance Tradeoffs In Multithreaded Processors - Agarwal (1991)   (38 citations)  (Correct)

....special instruction that is not strictly tied to the procedure call. In our design, a process does not use multiple register windows. Several studies have shown that single process frames, combined with register allocation methods, can achieve comparable performance to register windows (e.g. see [18, 19]). Our hardware modifications will improve SPARC's switching efficiency and allow multiple context partitioning of the registers in the floating point coprocessor as well. See [10] for more details. A processor that permits rapid context switching and fast trap handling (facilitated by the same ....

P. A. Steenkiste and J. L. Hennessy. A Simple Interprocedural Register Allocation Algorithm and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1-32, January 1989.


Fusion-Based Register Allocation - Lueh (1997)   (1 citation)  (Correct)

....f tries to avoid using the same callee-save registers used by g and h such that save/restore operations of the registers become redundant and can therefore be eliminated. Steenkiste develops an inter-procedural register allocator in the context of a LISP compiler [58]. Because LISP programs tend to spend most of their time in the bottom of the call graph, register allocation is performed from the leaves to the root over the call graph, like Chow's approach [17]. Different registers (not used by the descendants of the current function in the call graph) are assigned ....

P.A. Steenkiste and J.L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Code Compression Techniques for Embedded Systems - Nyström, Runeson, Sjödin   (Correct)

....the number of register to register copies through coalescing [8] we expect that the overhead of procedure calls can be kept low. To reduce the overhead of saving and restoring registers at procedure calls, it will probably be worthwhile to consider an interprocedural register allocation technique [1,17,16,10]. An ambitious compiler for an embedded processor could of course apply code compression at both the intermediate code and machine code. Code compression applied to intermediate code can recognize repeated occurrences of large program fragments, will not miss opportunities due to the effects of ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Systems for Late Code Modification - Wall (1992)   (26 citations)  (Correct)

....chose the register variables during linking and modified the object modules being linked to reflect this choice. Register allocation is a fairly high-level optimization, however, and other approaches have been taken, such as monolithic compilation of source modules or intermediate language modules [3,10,20] or compilation with reference to program summary databases [19]. Optimization removes unnecessary operations; instrumentation adds them. A common form of machine-level instrumentation is basic block counting. We transform a program into an equivalent program that also counts each basic block as ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. ACM Transactions on Programming Languages and Systems 11 (1), pp. 1-32, January 1989.


Experience with a Software-Defined Machine Architecture - Wall (1991)   (6 citations)  (Correct)

....restoring these registers at entry and exit. In practice this means that it keeps a local in a register only if the local is used more than twice. The allocator uses frequency estimates or a dynamic profile to make that judgement. 3.3.9. Comparison with Steenkiste's allocation method. Steenkiste [35,36] independently developed an approach similar to ours. He does not do allocation in the linker and therefore has no need for annotations or module rewriting, but his algorithm for allocating registers to variables is much the same. The major difference is that Steenkiste does not use frequency ....

....was, we might have taken the trouble to make the debugger smarter. Our greatest success was the technique of link time code modification. By itself, register allocation at link time may be overkill; very global optimization by monolithic compilation of source files or of intermediate code files [12,21,36,45] or perhaps by reference to persistent program data bases [34] could still turn out to be a better tradeoff. But our machinery for code modification led us to develop a wide variety of tools for performance analysis at the source and machine levels. Most of these tools actually require very little ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. ACM Transactions on Programming Languages and Systems 11 (1), pp. 1-32, January 1989.


Back End Issues for Modern Microprocessors: The State of the Art - Faxén (1997)   (Correct)

....interprocedural register allocation and can be seen as a flexible combination of the above; for registers which the code for bar uses, the caller saves convention is used with callee saves used for the rest. Interprocedural register allocation as described here has been used by several authors [30, 12, 4, 28]. This method essentially moves register saving and restoring up in the call graph; this improves the running time of the program if it spends most of its time lower down. If that is not the case, care has to be taken so that the code generated high up in the call graph does not become too ....

P. A. Steenkiste and J. L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Interprocedural Register Allocation for Lazy Functional Languages - Boquist (1995)   (8 citations)  (Correct)

....if we allocate registers for one procedure at a time in a bottom-up traversal of the procedure call graph, and avoid using registers used by the descendant procedures, we do not need to save and restore registers around procedure calls. Such an algorithm was developed by Steenkiste and Hennessy [SH89, Ste91] and implemented in the context of the PSL (Portable Standard Lisp) compiler [GBJ82]. The main motivation for the bottom-up algorithm was the observation that many Lisp programs "spend most of their time in the bottom of the call graph". If the allocator ran out ....

....is then calculated as n^d where d is the depth of the procedure where the use occurs. This will result in larger costs for procedures in the bottom of the call graph. It is in line with Steenkiste and Hennessy's observation that "programs spend most of their time in the bottom of the call graph" ([SH89], for Lisp programs). The actual value of n is not particularly important; what is important is that some regions of the code are more weighted than other regions. As above, 10 is a number that seems to work well in practice. 4.3.4 Simplify. Our simplify phase is a variant of Briggs' optimistic ....

[Article contains additional citation context not shown here]

Peter A. Steenkiste and John L. Hennessy. A Simple Interprocedural Register Allocation and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.
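
The bottom-up scheme described in the contexts above (allocate one procedure at a time from the leaves of the call graph upward, avoiding registers used by descendant procedures) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `bottom_up_allocate`, the per-procedure `demand` counts, and the assumption of an acyclic call graph are all hypothetical choices made for illustration, and a procedure that finds too few free registers simply receives fewer here, since the quoted text is cut off before it says what the real allocator does in that case.

```python
from typing import Dict, List, Set


def bottom_up_allocate(
    call_graph: Dict[str, List[str]],
    demand: Dict[str, int],
    num_registers: int,
) -> Dict[str, Set[int]]:
    """Give each procedure registers that none of its descendants use.

    Assumes an acyclic call graph (no recursion); this is an
    illustrative simplification, not the published algorithm.
    """
    assigned: Dict[str, Set[int]] = {}
    # Registers used by a procedure together with all of its descendants.
    used_below: Dict[str, Set[int]] = {}

    def visit(proc: str) -> None:
        if proc in assigned:
            return
        busy: Set[int] = set()
        for callee in call_graph.get(proc, []):
            visit(callee)  # callees are allocated before their callers
            busy |= used_below[callee]
        free = [r for r in range(num_registers) if r not in busy]
        assigned[proc] = set(free[: demand.get(proc, 0)])
        used_below[proc] = busy | assigned[proc]

    for proc in call_graph:
        visit(proc)
    return assigned


if __name__ == "__main__":
    graph = {"main": ["f", "g"], "f": ["h"], "g": ["h"], "h": []}
    wants = {"main": 2, "f": 1, "g": 1, "h": 2}
    print(bottom_up_allocate(graph, wants, num_registers=8))
```

In this example `f` and `g` end up sharing a register because neither is a descendant of the other, while `main` is pushed onto registers that nothing below it touches, so no saves or restores are needed around its calls.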


Systems for Late Code Modification - Wall (1991)   (26 citations)  (Correct)

....chose the register variables during linking and modified the object modules being linked to reflect this choice. Register allocation is a fairly high-level optimization, however, and other approaches have been taken, such as monolithic compilation of source modules or intermediate language modules [15] or compilation with reference to program summary databases [14]. Optimization removes unnecessary operations; instrumentation adds them. A common form of machine-level instrumentation is basic block counting. We transform a program into an equivalent program that also counts each basic block as ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. ACM Transactions on Programming Languages and Systems 11 (1), pp. 1-32, January 1989.


Minimum Cost Interprocedural Register Allocation - Kurlander, Fischer (1996)   (6 citations)  (Correct)

....then allocated registers based on the total frequency in which their members are referenced. Wall's allocator may not find the best allocation with respect to his model, since he allows locals infrequently referenced to be grouped together with locals frequently referenced. Steenkiste and Hennessy [SH89] design an interprocedural register allocator for LISP programs. Their approach allocates registers to locals in a bottom-up fashion over the call graph. Since they find that LISP programs tend to spend their time in the leaf procedures of a call graph, their method first allocates registers in ....

....benefit of allocating a register to a candidate, these candidates are always allocated a register. Figure 10 compares the execution time improvement of adding our minimum cost interprocedural register allocator with spills with Steenkiste and Hennessy's bottom-up interprocedural register allocator [SH89] to gcc. The benchmarks are compiled at optimization level O2 with loop unrolling enabled. Results from a sample of SPEC92 benchmarks are presented. Both interprocedural register allocators find a significant improvement on benchmark doduc, as this benchmark has procedures with many registers live ....

[Article contains additional citation context not shown here]

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. Transactions on Programming Languages and Systems, pages 1--30, January 1989.


A Comparison of Scalable Superscalar Processors - Bradley Kuszmaul   (Correct)

.... with an infinite instruction window and good branch prediction [8]. Patel, Evers and Patt demonstrate significant parallelism for a 16-wide machine given a good trace cache [14]. Patt et al. argue that a window size of 1000s is the best way to use large chips [15]. Steenkiste and Hennessy [18] conclude that certain compiler optimizations can significantly improve a program's performance if many logical registers are available. And, although the past cannot guarantee the future, the number of logical registers has been steadily increasing over time. The amount of parallelism available ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


A Typed Functional Language for Expressing Register Usage - Agat (1998)   (2 citations)  (Correct)

....capable of expressing the register behaviour of such lambda terms. Our language has strong flavours of assembler and is capable of expressing the register assignments resulting from any register allocation algorithm, including that of inter-procedural, graph-colouring-based register allocation [SH89, Ste91]. Terms in our language are annotated with registers and stack slots to determine their semantics and type. While still being a lambda calculus, the language can express register transfers and other low-level operations. In our system, types propagate information about where in the ....

.... estimate how function argument and suspended computations drift from construction site to consumption sites [BJ96]. Based on the observation that many Lisp programs spend most of their time in the bottom of the call graph, Steenkiste and Hennessy have developed the bottom-up register allocation [SH89, Ste91]. This approach uses the entire program to compute its call graph, but register allocation is then made on a per-procedure basis starting at the bottom nodes of the call graph and moving up. Information on the register behaviour of children in the call graph is propagated up and then used ....

Peter A. Steenkiste and John L. Hennessy. A Simple Interprocedural Register Allocation and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Issues in Register Allocation by Graph Coloring - Lueh (1996)   (Correct)

....tries to avoid using the same callee-save registers used by its lower regions on the call graph such that save/restore operations of the registers become redundant and can therefore be eliminated. Steenkiste develops an inter-procedural register allocation in the context of a LISP compiler [16]. Because LISP programs tend to spend most of their time in the bottom of the call graph, register allocation is performed from the leaves to the root over the call graph, like Chow's approach [8]. Different registers (not used by the descendants of the current function in the call graph) are assigned ....

P.A. Steenkiste and J.L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Accurate Static Branch Prediction by Value Range Propagation - Patterson (1995)   (38 citations)  (Correct)

....calls in most programs significantly reduces the effectiveness of the register set. Register windows are a hardware solution to this problem, but interprocedural register allocation which takes into account the probabilities of function calls can make much better use of a given register set [Wall86, Wall88, Wall91, SteenkisteHennessy89]. In addition to the three optimizations mentioned above, several traditional high level optimizations can also benefit from knowledge of frequently executed paths by using tail duplication to create what are effectively larger basic block structures [ChangMahlkeHwu91] A similar mechanism can ....

Peter A. Steenkiste and John L. Hennessy. A Simple Interprocedural Register Allocation Algorithm and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems 11(1), January 1989, pages 1-32.


Interprocedural Register Allocation for Lazy Functional Languages - Boquist (1995)   (8 citations)  (Correct)

....are used to decide what variables should be allocated to registers. An observation used by Wall, and by most other interprocedural allocators, is that local variables of procedures that cannot be active at the same time can be allocated to the same registers. Chow [14] and Steenkiste and Hennessy [31] present methods where the allocation is done on one procedure at a time, in contrast to Wall's method, but interprocedural information is used to reduce the procedure call and return overhead. By compiling the procedures according to a bottom-up ordering of the procedure call graph, and avoid ....

....(SCCs) of the procedure call graph to decide if a call can be recursive. We need only save and restore local variables for calls inside the same SCC. There are some variations on where the save and restore instructions can be placed. In Figure 5 we show two different ways, as used by Steenkiste [31] and Wall [33] respectively. Each node represents a procedure and is marked with its register usage. Edges represent calls and are marked with the register save operations done before the call. [Figure 5: Different ways to handle ....]

[Article contains additional citation context not shown here]

Peter A. Steenkiste and John L. Hennessy. A Simple Interprocedural Register Allocation and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.
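
The SCC test quoted in the second context above (a call can be recursive only if caller and callee sit in the same strongly connected component, so only those calls need saves and restores of locals) can be sketched as follows. This is a hedged illustration rather than the citing paper's code. It uses Tarjan's algorithm, which the quoted text does not name, and the function names `scc_ids` and `calls_needing_saves` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def scc_ids(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each procedure to an SCC id using Tarjan's algorithm."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    component: Dict[str, int] = {}
    next_id = 0

    def visit(node: str) -> None:
        nonlocal next_id
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                low[node] = min(low[node], low[succ])
            elif succ in on_stack:
                low[node] = min(low[node], index[succ])
        if low[node] == index[node]:  # node is the root of an SCC
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component[member] = next_id
                if member == node:
                    break
            next_id += 1

    for node in graph:
        if node not in index:
            visit(node)
    return component


def calls_needing_saves(graph: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Return the call edges whose endpoints share an SCC (possibly recursive)."""
    comp = scc_ids(graph)
    return {
        (caller, callee)
        for caller, callees in graph.items()
        for callee in callees
        if comp[caller] == comp[callee]
    }


if __name__ == "__main__":
    g = {"main": ["a"], "a": ["b"], "b": ["a", "leaf"], "leaf": []}
    print(sorted(calls_needing_saves(g)))
```

Here only the mutually recursive pair `a` and `b` share a component, so the calls from `main` and the call to `leaf` are treated as non-recursive.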


Quantifying Behavioral Differences Between C and C++ Programs - Calder (1994)   (47 citations)  (Correct)

....and calling conventions are at the core of many architectural optimizations; for example, the Berkeley RISC architecture proposed using rotating register windows, in part because that project relied on compiler implementations that, although dated, were in wide use at the time. Later research [51, 47] indicated that register windows were less advantageous when more sophisticated compile or link time analysis could be performed. Initially, we felt that register windows would benefit C++ programs more than C programs, because the complexity of interprocedural analysis in the presence of indirect ....

P.A. Steenkiste and J.L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


Whole-Program Optimization for Time and Space Efficient Threads - Grunwald, Neves (1996)   (11 citations)  (Correct)

....modify, and reassemble a linked program binary that we took advantage of. This paper examines the effect of procedure cloning, quantifies the amount of unnecessary spill code, and applies interprocedural register analysis to the optimization of context switches. Wall [22] and Steenkiste [21] describe active global register optimizations that re allocate registers using interprocedural information. This work showed that assigning registers at link time rather than at compile time can result in much better register utilization. Both the storage management and context switch ....

P.A. Steenkiste and J.L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


APRIL: A Processor Architecture for Multiprocessing - Agarwal (1990)   (186 citations)  (Correct)

....switching and rapid trap handling because most of the state of a process (i.e. its 24 local registers) can be switched with a single cycle instruction. Although we are not using multiple register windows for procedure calls within a single thread, this should not significantly hurt performance [25, 24]. To implement coarse grain multithreading, we use two register windows per task frame a user window and a trap window. The SPARC processor chosen for our implementation has eight register windows, allowing a maximum of four hardware task frames. Since the SPARC does not have multiple program ....

P. A. Steenkiste and J. L. Hennessy. A Simple Interprocedural Register Allocation Algorithm and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


An Experimental Study of Several Cooperative Register.. - Norris, Pollock (1995)   (8 citations)  (Correct)

....24, 4, 14, 32]. The goal of an ambitious register allocator is to allocate the machine's physical registers to program values to minimize the number of run time memory accesses. Register allocation techniques are either local [22], global [11, 10, 9, 30, 8, 20, 27, 23, 19], or interprocedural [33, 31], depending on whether the allocator attempts an assignment of registers to values within basic blocks in isolation of other basic blocks, across basic blocks of a procedure, or across procedure boundaries, respectively. [Footnote: This work was partially supported by NSF under grant CCR 9300212.] The ....

Peter Steenkiste and John Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. ACM Transactions on Programming Languages and Systems, January 1989.


The GRIN Project: A Highly Optimising Back End for Lazy.. - Boquist, Johnsson (1996)   (4 citations)  (Correct)

....eliminate evals and thunks, to do unboxing, and update elimination. Register allocation. To our knowledge, interprocedural register allocation has not been applied previously to code generated from a lazy functional language. It has been applied to other kinds of languages though, e.g. to Lisp [SH89] and to C [Cho88, Wal86]. 14 Conclusions and further work. Our preliminary results look very promising, but there is a lot of implementation work that needs to be done before we can say if our back end really can be made practical. We cannot yet say how our interprocedural approach will scale up ....

Peter A. Steenkiste and John L. Hennessy. A Simple Interprocedural Register Allocation and Its Effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


A Register Allocation Framework Based on Hierarchical.. - Hendren, Gao, Altman, .. (1993)   (34 citations)  (Correct)

....For example, when allocating registers interprocedurally it is beneficial to allocate a minimal number of registers to each procedure using such a solution. This reduces the amount of register saving required at procedure call time, and can also improve interprocedural register allocation [10]. 2. Using the information captured by interval graphs, we have developed a two step approach for solving Problem 2. This approach makes effective use of the optimal solution of Problem 1 to minimize the spilling cost. As we show in Section 4, this is particularly important for programs in which ....

....subscripted variables. The subscripted variables are allocated a set of registers that form a register pipeline. Eisenbeis et al. proposed a method based on cyclic scheduling for optimizing register usage on the Cray 2 [19]. Interprocedural register allocation has been studied by a number of people [20, 10, 21]. For example, Steenkiste and Hennessy have developed an algorithm for interprocedural register allocation where a procedure interference graph is constructed. Each node in the graph is a procedure of the program. Two procedures which are active at the same time are adjacent in the procedure ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for LISP. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


A New Fast Algorithm for Optimal Register Allocation in.. - Lelait, Gao, Eisenbeis (1998)   (2 citations)  (Correct)

....For example, when allocating registers interprocedurally it is beneficial to allocate a minimal number of registers to each procedure using such a solution. This reduces the amount of register saving required at procedure call time, and can also improve interprocedural register allocation [20]. When performing global register allocation, it is often useful to do the allocation hierarchically, i.e. it is useful to know the minimum register budget needed for a particular code section (i.e. loops) as an input to the overall register allocation decision. Optimal register allocation ....

Peter A. Steenkiste and John L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


An Experiment with Inline Substitution - Cooper, Hall, Torczon (1991)   (32 citations)  (Correct)

No context found.

P. A. Steenkiste and J. L. Hennessy, `A simple interprocedural register allocation algorithm and its effectiveness for LISP', ACM TOPLAS, 11, (1), 1--32 (1989).


Quantifying Behavioral Differences Between C and C++ Programs - Calder, Grunwald, Zorn (1995)   (47 citations)  (Correct)

No context found.

P.A. Steenkiste and J.L. Hennessy. A simple interprocedural register allocation algorithm and its effectiveness for Lisp. ACM Transactions on Programming Languages and Systems, 11(1):1--32, January 1989.


CiteSeer.IST - Copyright Penn State and NEC