by Timothy Sherwood, Mark Oskin, Brad Calder
In CASES ’04: Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
http://www.cse.ucsd.edu/~calder/abstracts/../papers/CASES-04-Sherpa.pdf
Abstract:
Application-specific processors offer the potential of rapidly designed logic specifically constructed to meet the performance and area demands of the task at hand. Recently, there have been several major projects that attempt to automate the process of transforming a predetermined processor configuration into a low-level description for fabrication. These projects either leave the specification of the processor to the designer, which can be a significant engineering burden, or handle it in a fully automated fashion, which completely removes the designer from the loop. In this paper we introduce a technique for guiding the design and optimization of application-specific processors. The goal of the Sherpa design framework is to automate certain design tasks and provide early feedback to help the designer navigate the architecture design space. Our approach is to decompose the overall problem of choosing an optimal architecture into a set of sub-problems that are, to first order, independent. For each sub-problem, we create a model that relates performance to area. From these models, we build a constraint system that can be solved using integer linear programming techniques, and arrive at an ideal parameter selection for all architectural components. Our approach takes only a few minutes to explore the design space, allowing the designer or compiler to see the potential benefits of optimizations rapidly. We show that the expected performance using our model correlates strongly with detailed pipeline simulations, and present results showing design tradeoffs for several different benchmarks.
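The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each architectural component contributes independently to total stall cycles, and it picks one option per component so that total cycles are minimized within an area budget. The component names, area costs, and cycle counts are all invented for illustration and are not taken from the paper. Exhaustive enumeration stands in for the integer linear programming solver the paper uses; it is only practical for a handful of options.

```python
from itertools import product

# Hypothetical per-component options: (label, area_cost, stall_cycles).
# All numbers are invented for illustration; none come from the paper.
OPTIONS = {
    "icache": [("4KB", 2, 900), ("8KB", 4, 500), ("16KB", 8, 300)],
    "dcache": [("4KB", 2, 1200), ("8KB", 4, 700), ("16KB", 8, 450)],
    "bpred": [("static", 1, 600), ("bimodal", 2, 350), ("gshare", 4, 250)],
}


def choose_configuration(options, area_budget):
    """Pick one option per component, minimizing total stall cycles.

    Assumes component costs are additive (first-order independence).
    Brute-force search is a stand-in for an ILP solver.
    Returns (total_cycles, total_area, {component: label}) or None
    if no combination fits within area_budget.
    """
    names = sorted(options)
    best = None
    for combo in product(*(options[name] for name in names)):
        area = sum(opt[1] for opt in combo)
        cycles = sum(opt[2] for opt in combo)
        if area > area_budget:
            continue  # violates the area constraint
        if best is None or cycles < best[0]:
            choice = {name: opt[0] for name, opt in zip(names, combo)}
            best = (cycles, area, choice)
    return best


result = choose_configuration(OPTIONS, area_budget=10)
print(result)
```

Rerunning `choose_configuration` with a different `area_budget` gives a quick view of how the preferred configuration shifts as area is traded for performance, which is the kind of early feedback the abstract describes.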