On the Limits of Program Parallelism and Its Smoothability
by Kevin B. Theobald, Guang R. Gao, Laurie J. Hendren
Proceedings of Micro-25
http://www.sable.mcgill.ca/~hendren/ftp/acaps/memo40.ps.gz
Abstract:
Recent studies of the so-called "limits on instruction parallelism" in application programs have reported limits that are surprisingly low (3-7 instructions) for well-known benchmark programs. In this paper, we report results of a new study of instruction-level parallelism and the smoothability of this parallelism. In addition to showing a strikingly high limit of parallelism for an oracle machine model, we also study the following new aspects of parallelism and smoothability.

Parallelism Limits: In addition to confirming some recently reported results (i.e. by Wilson and Lam [LW92]), our work also provides answers to the following important questions for architects and compiler writers, which were left open:
- What are the most important characteristics of the oracle machine models?
- What happens if we allow each test program to run to full completion instead of stopping after a limited number of instructions?
- Do these results apply to other real programs (run to completion) in addition to the selected benchmark programs?
- How do various restrictions on the use and reuse of memory impact the potential parallelism?

Smoothability: In our study, smoothability is measured quantitatively and compared for a number of
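As an illustration of what a parallelism limit under an oracle model can mean, the following is a minimal sketch and not the authors' simulator. It assumes unit latency for every instruction, perfect control-flow prediction, unlimited issue width, and that only true (read-after-write) data dependences constrain the schedule. The trace format and the name `oracle_parallelism` are hypothetical, introduced only for this example.

```python
def oracle_parallelism(trace):
    """Average parallelism of a dynamic trace under an idealized oracle model.

    trace: list of (dest, sources) pairs in program order, where dest is the
    name of the value an instruction writes and sources are the names it reads.
    Assumptions (not from the paper): unit latency, unlimited issue width,
    perfect branch prediction, only true data dependences are honored.
    """
    ready = {}   # value name -> cycle in which its latest producer executes
    depth = 0    # length of the critical path, in cycles
    for dest, sources in trace:
        # An instruction runs one cycle after its latest-arriving operand.
        cycle = 1 + max((ready.get(s, 0) for s in sources), default=0)
        ready[dest] = cycle
        depth = max(depth, cycle)
    # Average parallelism = instructions executed / cycles needed.
    return len(trace) / depth if depth else 0.0


# Hypothetical four-instruction trace: a and b are independent,
# c needs both, d needs c, so the critical path is three cycles long.
example = [("a", []), ("b", []), ("c", ["a", "b"]), ("d", ["c"])]
print(oracle_parallelism(example))
```

Dividing the instruction count by the critical-path length gives an upper bound on speedup for this idealized machine; restricting memory reuse or stopping the trace early, as the questions above discuss, would change the dependence graph and therefore the bound.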
Citations
333 | Limits of instruction-level parallelism – Wall, 1991
254 | APRIL: a processor architecture for multiprocessing – Agarwal, Lim, et al., 1990
205 | Limits of control flow on parallelism – Lam, Wilson, 1992
182 | Available instruction-level parallelism for superscalar and superpipelined machines – Jouppi, Wall, 1989
138 | Fine-grain parallelism with minimal hardware support: A compiler-controlled threaded abstract machine – Culler, Sah, et al., 1991
76 | Dynamic dependency analysis of ordinary programs – Austin, Sohi, 1992
74 | On the Number of Operations Simultaneously Executable in Fortran-Like Programs and Their Resulting Speedup – Kuck, Muraoka, et al., 1972
72 | Measuring the Parallelism Available for Very Long Instruction Word Architectures – Nicolau, Fisher, 1984
62 | Single instruction stream parallelism is greater than two – Butler, Yeh, et al., 1991
59 | Measuring parallelism in computation-intensive scientific/engineering applications – Kumar, 1988
44 | Can dataflow subsume von Neumann computing? – Nikhil, Arvind, 1989
41 | Architecture of a Message-Driven Processor – Dally, 1987
25 | A Mechanism for Efficient Context Switching – Nuth, Dally, 1991
6 | MASA: A Multithreaded Processor Architecture for Parallel Symbolic Computing – Halstead Jr., Fujita, 1988
1 | The Tera computer system – Alverson et al., 1990