Abstract:
This paper presents new architectural concepts for uniprocessor system design. They result in a uniprocessor that conforms to the data-driven (i.e., dataflow) computation paradigm. We show that this processor, named the D²-CPU (Data-Driven CPU), follows the natural flow of programs, minimizes redundant (micro)operations, lowers hardware cost, and reduces power consumption. We assume that programs are developed in a graphical or equivalent language that explicitly shows all data dependencies. Instead of giving the CPU the privileged right to decide which instructions to fetch in each cycle (as in CPUs with a program counter), instructions enter the CPU when they are ready to execute, or when all of their operands will become available within a few clock cycles. This way, the application-knowledgeable algorithm, rather than the application-ignorant CPU, is in control. The CPU is used simply as a resource, as it should be. This approach improves performance and eliminates many of the redundant operations that plague conventional processor designs, such as the first memory cycle of instruction fetching, which is part of every instruction cycle, and instruction and data prefetching for instructions that are not always needed. A comparative analysis of our design against conventional designs indicates that it is capable of better performance and simpler programming. Finally, a VHDL implementation is used to demonstrate the viability of this approach.
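The firing rule described in the abstract (an instruction enters the CPU only once all of its operands are available) can be illustrated with a minimal software sketch. This is not the paper's hardware design or its VHDL implementation; the function `run_dataflow`, the graph encoding, and the example expression are assumptions made purely for illustration.

```python
from collections import deque
import operator


def run_dataflow(nodes, inputs):
    """Execute a dataflow graph without a program counter (illustrative only).

    nodes:  maps an instruction name to (operation, [operand names]).
    inputs: maps externally supplied operand names to their values.
    Returns the final value table and the order in which instructions fired.
    """
    values = dict(inputs)
    missing = {}    # instruction -> operands it is still waiting for
    consumers = {}  # operand name -> instructions that consume it

    for name, (_, operands) in nodes.items():
        pending = {o for o in operands if o not in values}
        missing[name] = pending
        for o in pending:
            consumers.setdefault(o, []).append(name)

    # Instructions whose operands are all present may enter the "CPU".
    ready = deque(name for name, pending in missing.items() if not pending)
    fired = []

    while ready:
        name = ready.popleft()
        op, operands = nodes[name]
        values[name] = op(*(values[o] for o in operands))
        fired.append(name)
        # The new result may complete the operand set of other instructions.
        for consumer in consumers.get(name, []):
            missing[consumer].discard(name)
            if not missing[consumer]:
                ready.append(consumer)

    return values, fired


# Hypothetical example: (a + b) * (c - d), listed in a deliberately
# "wrong" textual order to show that data availability drives execution.
example_nodes = {
    "prod": (operator.mul, ["sum", "diff"]),
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["c", "d"]),
}
example_inputs = {"a": 1, "b": 2, "c": 7, "d": 3}
```

In this sketch the execution order is determined by the data dependencies of the graph, not by the order in which the instructions are listed, which mirrors the idea of the algorithm rather than the CPU being in control.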