J. Llosa, M. Valero, E. Ayguade, and A. Gonzalez, "Modulo scheduling with reduced register pressure," IEEE Transactions on Computers, vol. 47, no. 6, pp. 625--638, 1998.
....for VLIW architectures and numerical programs. Our workbench is composed of 1180 loops that account for 78% of the execution time of the Perfect Club [3]. The loops have been obtained using the experimental tool Ictneo [2] and software pipelined using Hypernode Reduction Modulo Scheduling [15,16], a register-pressure-sensitive heuristic that achieves near-optimal schedules. Register allocation has been performed using the wands-only strategy and end-fit with adjacency ordering [22]. When a loop requires more than the available number of registers, spill code is added and the loop is ....
J. Llosa, M. Valero, E. Ayguadé, and A. González. Modulo Scheduling with reduced register pressure. In IEEE Trans. on Computers, vol. 47, no. 6, pp. 625-638, June 1998.
No context found.
J. Llosa, M. Valero, E. Ayguade, and A. Gonzalez. Modulo Scheduling with Reduced Register Pressure. IEEE Transactions on Computers, 47(6):625--638, 1998.
....and resource requirements. For loops with high trip counts, the II can be used to approximate the overall runtime of the loop. High register bus pressure caused by inter-cluster communications and high register pressure (i.e. many operands live concurrently) can dramatically increase the II [26]. In this work we look to provide a scheduling approach that .... Figure 1. Clustered VLIW microarchitecture (an L1 cache, register buses, and clusters 1 to n, each with a local register file and functional units). Register values are communicated through inter-cluster register ....
....have been proposed in an attempt to find near-optimal schedules. Past heuristic-based approaches have had different goals: increasing throughput [20, 32], minimizing register pressure [9, 8], reducing the effect of cache misses, or improving several objectives simultaneously [8, 19, 26, 34]. All of these studies focused on modulo scheduling algorithms targeting unified (i.e. non-partitioned) architectures. A comparison of some of these techniques can be found in [5]. There are several works related to acyclic code scheduling for clustered architectures [3, 7, 10, 21, 30]. The most ....
J. Llosa, M. Valero, E. Ayguade, and A. Gonzalez. Modulo Scheduling with Reduced Register Pressure. IEEE Transactions on Computers, 47(6):625--638, 1998.
....in order to decide when each operation is scheduled. The algorithm is similar to Iterative Modulo Scheduling in the sense that it uses a limited amount of backtracking, possibly ejecting operations already scheduled to make room for a new one. Hypernode Reduction Modulo Scheduling (HRMS) [24,25] is a heuristic strategy that tries to shorten loop-variant lifetimes without sacrificing performance. The main contribution of HRMS is the node ordering strategy. The ordering phase sorts the nodes before scheduling them, such that only predecessors or successors of a node can be scheduled ....
J. Llosa, M. Valero, E. Ayguade, and A. Gonzalez. Modulo scheduling with reduced register pressure. In IEEE Trans. on Computers, vol. 47, no. 6, pages 625-638, June 1998.
No context found.
J. Llosa, M. Valero, E. Ayguade, and A. Gonzalez, "Modulo scheduling with reduced register pressure," IEEE Transactions on Computers, vol. 47, no. 6, pp. 625--638, 1998.
No context found.