Optimizing Memory Accesses Using Advanced Compile-Time Memory Disambiguation Techniques
Abstract:
Load latency observed by dependent instructions can significantly delay the initiation of subsequent computation. Therefore, various load optimization techniques that expedite the arrival of loaded data can effectively reduce a program's execution time. Memory-to-register promotion performed by the compiler is the most effective optimization since it completely bypasses the memory system. An alternative compile-time approach is to schedule a load earlier to partially or completely hide its latency. However, the applicability of both techniques is usually limited by aliases, function boundaries, and the number of registers. In this paper, we report the performance improvement enabled by advanced compile-time memory disambiguation. We have designed and implemented a new interprocedural pointer analysis framework in the IMPACT compiler. With the improved quality of compile-time memory disambiguation, register promotion and instruction scheduling can be done in a more aggressive manner. The effectiveness of the new compilation strategy has been comprehensively evaluated using the complete SPECint92 and SPECint95 benchmark suites, which have long been conservatively optimized due to the lack of static memory disambiguation technology able to accommodate these programs' size and intensive usage of pointers. The experimental results show that advanced memory disambiguation translates to a peak speedup of 1.58 over commonly practiced intraprocedural memory disambiguation, with an average speedup of 1.19.
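To illustrate why aliases limit register promotion, the following C fragment is a minimal sketch and is not taken from the paper or from the IMPACT compiler. The function names `scale_and_sum` and `scale_and_sum_promoted` are hypothetical. The second function is the hand-written equivalent of what a compiler could produce from the first, assuming pointer analysis has proven that `sum` never points into `dst`.

```c
#include <assert.h>
#include <stddef.h>

/* Unpromoted form: *sum is loaded and stored on every iteration, because
   the store to dst[i] might modify *sum as far as the compiler can tell. */
void scale_and_sum(int *dst, const int *src, size_t n, int *sum)
{
    assert(dst && src && sum);
    for (size_t i = 0; i < n; i++) {
        dst[i] = 2 * src[i];
        *sum += dst[i];
    }
}

/* Promoted form (hypothetical compiler output written by hand): *sum lives
   in the local variable acc for the whole loop. This is only equivalent to
   the function above if sum does not alias any element of dst. */
void scale_and_sum_promoted(int *dst, const int *src, size_t n, int *sum)
{
    assert(dst && src && sum);
    int acc = *sum;              /* single load before the loop */
    for (size_t i = 0; i < n; i++) {
        int v = 2 * src[i];
        dst[i] = v;
        acc += v;                /* no memory traffic for the accumulator */
    }
    *sum = acc;                  /* single store after the loop */
}
```

When `sum` points to storage disjoint from `dst`, both functions leave the same values in memory. If a caller passes `&dst[1]` as `sum`, the two versions produce different results, which is why a compiler without sufficiently precise disambiguation must keep the conservative form.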
Citations
| 213 | The superblock: an effective technique for VLIW and superscalar compilation – Hwu, Mahlke, et al. - 1993 |
| 5 | Pinline: A profile-driven automatic inliner for the IMPACT compiler – Cheng - 1997 |

