Application-specific processing on a general-purpose core via transparent instruction set customization (2004)

by Nathan Clark, Manjunath Kudlur, Hyunchul Park, Scott Mahlke, Krisztián Flautner
In Proceedings of the International Symposium on Microarchitecture
http://cccp.eecs.umich.edu/papers/ntclark-micro04.pdf

Abstract:

Application-specific instruction set extensions are an effective way of improving the performance of processors. Critical computation subgraphs can be accelerated by collapsing them into new instructions that are executed on specialized function units. Collapsing the subgraphs simultaneously reduces the length of computation as well as the number of intermediate results stored in the register file. The main problem with this approach is that a new processor must be generated for each application domain. While new instructions can be designed automatically, there is a substantial amount of engineering cost incurred to verify and to implement the final custom processor. In this work, we propose a strategy for transparent customization of the core computation capabilities of the processor without changing its instruction set. A configurable array of function units is added to the baseline processor that enables the acceleration of a wide range of dataflow subgraphs. To exploit the array, the microarchitecture identifies subgraphs at run time and replaces them with new microcode instructions that configure and utilize the array. We compare the effectiveness of replacing subgraphs in the fill unit of a trace cache versus using a translation table during decode, and evaluate the tradeoffs between static and dynamic identification of subgraphs for instruction set customization.
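
The collapsing idea in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware mechanism: the instruction format, the set of supported operations, the `cca:` opcode naming, and the `max_ops` limit are all assumptions made for illustration. It fuses runs of adjacent, dependent simple operations into one pseudo-instruction, and only when each intermediate value is dead outside the run, so the intermediate register writes disappear.

```python
from dataclasses import dataclass

# Assumed set of operations the configurable array could execute (hypothetical).
SUPPORTED_OPS = {"add", "sub", "and", "or", "xor", "shl"}


@dataclass(frozen=True)
class Instr:
    dest: str      # destination register name
    op: str        # opcode mnemonic
    srcs: tuple    # source register names


def collapse_chains(block, live_out=frozenset(), max_ops=4):
    """Fuse adjacent dependent supported ops into single pseudo-instructions.

    An instruction joins the current chain only if it reads the previous
    result and that result is read nowhere after it and is not live-out.
    The check is conservative: it compares register names only.
    """
    out = []
    i = 0
    while i < len(block):
        chain = [block[i]]
        if block[i].op in SUPPORTED_OPS:
            j = i + 1
            while j < len(block) and len(chain) < max_ops:
                prev, nxt = chain[-1], block[j]
                later_reads = {s for ins in block[j + 1:] for s in ins.srcs}
                if (nxt.op in SUPPORTED_OPS
                        and prev.dest in nxt.srcs
                        and prev.dest not in later_reads
                        and prev.dest not in live_out):
                    chain.append(nxt)
                    j += 1
                else:
                    break
        if len(chain) > 1:
            # External inputs are sources not produced earlier in the chain.
            inputs, defined = [], set()
            for ins in chain:
                for s in ins.srcs:
                    if s not in defined and s not in inputs:
                        inputs.append(s)
                defined.add(ins.dest)
            fused_op = "cca:" + "+".join(ins.op for ins in chain)
            out.append(Instr(chain[-1].dest, fused_op, tuple(inputs)))
        else:
            out.append(chain[0])
        i += len(chain)
    return out


if __name__ == "__main__":
    block = [
        Instr("r1", "add", ("r2", "r3")),
        Instr("r4", "shl", ("r1", "r5")),
        Instr("r6", "xor", ("r4", "r2")),
        Instr("r7", "load", ("r6",)),
    ]
    for ins in collapse_chains(block):
        print(ins)
```

In the example block, the `add`, `shl`, and `xor` form one dependent chain whose intermediate results `r1` and `r4` are never read again, so they collapse into a single fused instruction with inputs `r2`, `r3`, `r5` and output `r6`; the `load` is left untouched. Passing `live_out=frozenset({"r1"})` prevents the `add` from being absorbed, since its result would still be needed. A real design would also have to bound the number of inputs and outputs per subgraph, which this sketch does not model.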
