28 citations found.
M. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI", ACM Doctoral Dissertation Award, MIT Press, 1984.


This paper is cited in the following contexts:


Scheduling Time-Constrained Instructions on Pipelined.. - Leung, Palem, Pnueli

.... runs in O(n^2 log n α(n) + ne) time, where n is the number of instructions to be scheduled, and e is the number of edges in the input DAG, and is guaranteed to find feasible schedules for arbitrary basic blocks of code, for such early RISC machines as the IBM 801 [36], the Berkeley RISC [26], and Stanford MIPS [22] processors. Also, our algorithm can in the same time bound find feasible schedules whenever such schedules exist, for basic blocks whose data dependence graphs are monotone interval orders [34; 35]. The rest of the paper is structured as follows. In Section 2 we provide ....

Katevenis, M. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


The Implementation and Application of Micro Rollback in.. - Tamir, Tremblay, Rennels (1988)

....and the cache in section IV. III. Support for Micro Rollback in a VLSI RISC As part of our investigation of the cost of hardware support for micro rollback, we are designing and implementing a VLSI processor capable of micro rollback. We chose to start with the Berkeley RISC II processor [4] and determine the area overhead and performance penalty for adding to it the ability to perform micro rollback. The process of saving the state of a RISC processor and the method used to rollback in one cycle are described below. The state to be saved and restored is located in the register file ....

....is a possibility that a write into the register file occurs. It is impractical to preserve the state of the file for N cycles by replicating it N times (for example, using shift registers) due to the large area occupied by even one copy of the on chip register file (40% of chip area in RISC II) [4]. We propose an alternative method which minimizes the extra hardware while still allowing a rollback of up to N cycles to be executed in one cycle. High level Description: Whenever the processor writes data to one of its registers, the full address of ....

[Article contains additional citation context not shown here]

M. G. H. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI," CS Division Report No. UCB/CSD 83/141, University of California, Berkeley, CA (October 1983).


Performance Tradeoffs In Multithreaded Processors - Agarwal (1991)   (38 citations)

....register frames act as a software controlled cache of process contexts. The current implementation of APRIL using a SPARC processor [15] modified to support coarse multithreading is called SPARCLE. The register set is divided into several frames that are conventionally used as register windows [16, 17] for speeding up procedure calls. SPARC permits the use of these frames for context switching because the frame pointer is incremented in software by a special instruction that is not strictly tied to the procedure call. In our design, a process does not use multiple register windows. Several ....

M. Katevenis. Reduced Instruction Set Computer Architectures for VLSI. Ph.D. Thesis, Computer Science Division (EECS) UCB/CSD 83/141, University of California at Berkeley, October 1983.


A Fast Algorithm for Scheduling Instructions with Deadline.. - Wu, Jaffar, Yap (2000)   (1 citation)

....time of instruction v_j must be at least 2 cycles later than the completion time of v_i. When a RISC machine is executing a program, it may execute NOPs (No Operations) due to the latencies. The latencies vary in machines. For several RISC machines such as IBM 801 [21] and the Berkeley pipelined [23], the maximum latency is one machine cycle. For some other RISC machines such as IBM RISC System/6000 [24], the maximum latency is more than one cycle. Latencies complicate instruction scheduling. The general problem for scheduling instructions in a basic block so that the maximum completion time ....

Katevenis, M. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


Fault-Tolerance for High-Performance Multi-Module VLSI.. - Tremblay, Tamir (1989)

....for N cycles by simply replicating the storage N times and copying the current values to the oldest replica every cycle. This method can be very costly in terms of area. For example, the area of one copy of the register file in the Berkeley RISC II processor takes up 33% of the total chip area [5]. We have previously proposed [12] an alternative technique that takes advantage of the fact that only one register out of the set can be modified every cycle. An example of this technique is shown in Figure 4. Every time a write is performed, the data and its full register address ....

....State Register 3. A Delayed Write Buffer for Infrequently Modified Registers The method used to roll back the register file, described in section 2, is based on the fact that writes into the register file can occur every cycle. For example, many RISC processors complete an instruction every cycle [4, 5]. Because it is possible to have N writes during N cycles, a Delayed Write Buffer of depth N is necessary. In some modules, writes to the register file may occur at a lower rate than once per cycle, so a smaller DWB may be sufficient. For example in the Motorola 68881 most instructions take ....
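
The excerpts above only state that each write is recorded with its data and full register address, and that a Delayed Write Buffer (DWB) of depth N allows a rollback of up to N cycles. The following is a minimal software sketch of that idea, not the authors' hardware design. The class name, the commit-on-overflow policy, and the read-forwarding behaviour are all assumptions made for illustration.

```python
from collections import deque


class RollbackRegisterFile:
    """Hypothetical model of a register file fronted by a delayed write buffer.

    Assumption: writes sit in the buffer for `depth` cycles before they
    reach the register file, so the last `depth` cycles can be undone.
    """

    def __init__(self, num_regs, depth):
        self.regs = [0] * num_regs
        self.depth = depth
        # One entry per cycle, oldest on the left.
        # An entry is (address, value), or None for a cycle with no write.
        self.dwb = deque()

    def cycle(self, write=None):
        """Advance one cycle, optionally recording a write (address, value)."""
        self.dwb.append(write)
        if len(self.dwb) > self.depth:
            oldest = self.dwb.popleft()
            if oldest is not None:
                address, value = oldest
                self.regs[address] = value  # commit the oldest write

    def read(self, address):
        """Return the newest value, checking buffered writes first."""
        for entry in reversed(self.dwb):
            if entry is not None and entry[0] == address:
                return entry[1]
        return self.regs[address]

    def rollback(self, cycles):
        """Discard the writes of the most recent `cycles` cycles."""
        if not 0 <= cycles <= len(self.dwb):
            raise ValueError("cannot roll back further than the buffer holds")
        for _ in range(cycles):
            self.dwb.pop()
```

In this sketch a rollback only drops buffer entries and never touches the register file itself, which is why a single copy of the file suffices. When writes occur less often than once per cycle, the buffer could hold fewer entries, in line with the excerpt's remark that a smaller DWB may be sufficient.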

[Article contains additional citation context not shown here]

M. G. H. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI," CS Division Report No. UCB/CSD 83/141, University of California, Berkeley, CA (October 1983).


Architectural Design and Analysis of a VLIW Processor - Arthur Abnous And (1991)   (2 citations)

....approach allows the architect to address design problems with a combination of hardware and software solutions and results in improved performance. 1.1 Pipelining and RISC processors One of the most important goals of RISC (Reduced Instruction Set Computer) processors is efficient pipelining [7, 8]. The instruction sets of RISC processors are simple and are designed with efficient pipelining and decoding in mind. Because of the simplicity of the instruction set, the underlying hardware of a RISC processor is simple and can run at high speeds. Because of efficient pipelining, the CPI factor ....

....Design problems are addressed with a combination of hardware and software techniques after evaluating hardware/software trade-offs. Simulation results are used to verify design decisions. This approach has been demonstrated to be very effective for VLSI processor design by RISC research efforts [7, 8]. 3.1 Processor Configuration Operations executed by a processor can be divided into three types. Each type of operation is executed by a corresponding type of hardware execution unit: 1. Control Transfer (CT) operations 2. Load Store (LS) operations 3. Arithmetic Logic (AL) operations Based ....

M. G. H. Katevenis, Reduced Instruction Set Computer Architectures for VLSI, MIT Press, Cambridge, Mass., 1985.


Microarchitectural and Compile-Time Optimizations for.. - Kalamatianos (2000)   (1 citation)

....and compiler design. Architectural support in the form of microarchitectural mechanisms and instruction set architecture (ISA) extensions, as well as many compiler optimizations, have been motivated by studies performed on programs written in procedural languages such as C and/or Fortran [1, 2]. Program behavior [3, 4] has played and continues to play an important role in the evolution of modern architectures [5, 6]. Design decisions in several machines such as the IBM 801, Stanford MIPS and Berkeley RISC projects were guided by trends found in a set of C, Pascal and FORTRAN programs ....

.... 2]. Program behavior [3, 4] has played and continues to play an important role in the evolution of modern architectures [5, 6]. Design decisions in several machines such as the IBM 801, Stanford MIPS and Berkeley RISC projects were guided by trends found in a set of C, Pascal and FORTRAN programs [2, 7]. The IBM System/360 ISA has been influenced by frequently used operations in COBOL and FORTRAN applications, such as string manipulation and floating point operations [8]. Most recently, microprocessor performance has been evaluated based on the SPEC92 and SPEC95 benchmarks, a mixture of C and ....

Manolis Katevenis. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, 1985.


A Micropipelined ARM - Furber, Day, Garside, Paver, Woods (1993)   (34 citations)

....the practical difficulties such as exact exceptions and backwards instruction set compatibility. 2. BACKGROUND 2.1. The ARM The ARM processor, originally developed at Acorn Computers Ltd in the UK in 1983-85, was the first commercial RISC and was inspired principally by the original Berkeley [3] and Stanford [4] work. ARM uses a load store architecture and a register oriented instruction set [5]. It is characterised by being very small, simple (the original silicon used 25,000 transistors), low-cost and low power (the current ARM6 macrocell delivers 100 MIPS/Watt) and has been chosen as ....

M. G. H. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI", MIT Press, Cambridge, MA (1985).


A fast algorithm for scheduling time-constrained instructions.. - Leung, al. (1998)   (4 citations)

.... that of Garey and Johnson on scheduling tasks with release times and deadlines on two identical processors [8]. Running in O(n^3 α(n)) time, our algorithm is guaranteed to find feasible schedules for arbitrary basic blocks of code, for such RISC machines as the IBM 801 [18], the Berkeley RISC [13], and Stanford MIPS [11] processors. Also, our algorithm can in the same time bound find feasible schedules whenever such schedules exist, for basic blocks whose data dependence graphs are monotone interval orders [16, 17]; in this case, the RISC processor can have a variable number m of RISC ....

M. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


Scheduling Time-Constrained Instructions on Pipelined.. - Leung, Palem, Pnueli

.... runs in O(n^2 log n α(n) + ne) time, where n is the number of instructions to be scheduled, and e is the number of edges in the input DAG, and is guaranteed to find feasible schedules for arbitrary basic blocks of code, for such early RISC machines as the IBM 801 [31], the Berkeley RISC [21], and Stanford MIPS [17] processors. Also, our algorithm can in the same time bound find feasible schedules whenever such schedules exist, for basic blocks whose data dependence graphs are monotone interval orders [29; 30]. The rest of the paper is structured as follows. In Section 2 we provide ....

Katevenis, M. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


A fast algorithm for scheduling time-constrained.. - Leung, Palem, Pnueli (1998)   (4 citations)

.... of Garey and Johnson on scheduling tasks with release times and deadlines on two identical processors [8]. Running in O(n^3 α(n)) time, our algorithm is guaranteed to find feasible schedules for arbitrary basic blocks of code, for such RISC machines as the IBM 801 [18], the Berkeley RISC [13], and Stanford MIPS [11] processors. Also, our algorithm can in the same time bound find feasible schedules whenever such schedules exist, for basic blocks whose data dependence graphs are monotone interval orders [16, 17]; in this case, the RISC processor can have a variable number m of RISC ....

M. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


Fred: An Architecture for a Self-Timed Decoupled Computer - Richardson, Brunvand (1995)   (4 citations)

....the producer consumer relationship of the queues is violated. Fred's dispatch logic will detect these cases, and take an exception before an instruction sequence is issued that would result in deadlock. 4.1 Instruction Set Choosing an instruction set for a RISC processor can be a complex task [9, 8, 10]. Rather than attempt to design a new instruction set from scratch, an instruction set from an existing commercial RISC processor was adapted. Much of the Fred instruction set is taken directly from the Motorola 88100 instruction set [12]. However, Fred does not implement all the 88100 ....

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architectures for VLSI. MIT Press, 1985.


Architectural Considerations in Self-Timed Processor Design - Richardson (1996)

....[Figure 4.1: Fred block diagram] 4.3. Instruction Set Choosing an instruction set for a RISC processor can be a complex task [21,20,23]. Rather than attempt to design a new instruction set from scratch, much of the Fred instruction set was taken directly from the Motorola 88100 instruction set [26]. However, Fred does not implement all of the 88100 instructions, and several of Fred's instructions do not correspond to any ....

M. G. H. Katevenis, Reduced Instruction Set Computer Architectures for VLSI. MIT Press, 1985.


The Interaction of Architecture and Operating System.. - Anderson, Levy, Bershad, .. (1991)   (107 citations)

.... as one of several available interfaces, as is done, for example, on V [Cheriton et al. 90], Mach [Golub et al. 90], and Topaz [Thacker et al. 88]. Unfortunately, modern operating systems and architectures have evolved somewhat independently: • While simulation and measurement studies (such as [Katevenis 85]) have been used to guide hardware design tradeoffs, these studies have tended to overlook the operating system. The problem is partly technological: most early program tracing tools were unable to trace operating system code. But the amount of information overlooked can be huge. In trace driven ....

M. G. H. Katevenis. Reduced Instruction Set Computer Architectures for VLSI. The MIT Press, Cambridge, Massachusetts, 1985.


Quantifying Behavioral Differences Between C and C++ Programs - Calder, Grunwald, Zorn (1995)   (47 citations)

.... Early studies of program behavior [3, 4, 5, 6] have guided architectural design, and the importance of measurement and simulation has permeated architectural design philosophy [7]. In particular, the IBM 801, Berkeley, and Stanford RISC projects were all guided by studies of C and FORTRAN programs [8]. More recent studies have used the SPEC program suite. This set of programs, widely used to benchmark new hardware platforms and compiler implementations, consists of a mixture of C and FORTRAN programs [9, 10]. More recently, object oriented programming, and specifically the language C++, has ....

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. ACM Doctoral Dissertation Award Series. MIT Press, 1985.


Relating Static and Dynamic Machine Code Measurements - Davidson, Rabung, Whalley (1992)   (1 citation)

....evaluation, computer architecture, dynamic measurements, static measurements, machine design. I. Introduction Static measurements of program code at machine level are generally thought to be useful for determining textual space needs while dynamic measurements can be used in performance evaluation [10, 18]. Making either type of measurement is conceptually straightforward with access to assembly code, but the implementation of dynamic measuring techniques is more difficult and time consuming since it may involve simulation, tracing, or compiler modification. This difficulty often results in using a ....

M. G. H. Katevenis, Reduced Instruction Set Computer Architectures for VLSI, PhD Dissertation, University of California, Berkeley, CA, 1983.


Fast Accurate Instruction Fetch and Branch Prediction - Calder, Grunwald (1994)   (8 citations)

....(2) is not really needed. If we can determine the instruction type (1) and the destination address via other means, we may be able to dispense with the BTB. 6.1 Computing the Branch Target Traditional branch architectures use a PC relative displacement; Figure 2(a), modeled after the diagrams in [9], schematically illustrates the process. In the encodings, information in lightly outlined boxes is provided or computed at execution time; for example, in Figure 2(a) the PC is available during execution. Heavily outlined boxes show the information provided by the branch instruction in Figure ....

....Each branch can directly address instructions at address PC − 2^(n−1) − 1 : PC + 2^(n−1). For simplicity, we assume the program counter is always aligned on instruction boundaries, since we are chiefly concerned with architectures with fixed width instructions. Katevenis [9] proposed several branch encodings where the branch displacement field contains the least significant bits of the branch target address. Figure 2(b) shows one such encoding. Here, the sign bit for the offset and the carry for the addition of the lower bits are computed by the compiler (or linker) ....

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. ACM Doctoral Dissertation Award Series. MIT Press, 1985.


Quantifying Behavioral Differences Between C and C++ Programs - Calder (1994)   (47 citations)

.... programs [13, 49]. Early studies of program behavior [2, 10, 15, 29] guided the architectural design, and the importance of measurement and simulation has permeated architectural design philosophy [22]. In particular, the Berkeley and Stanford RISC were guided by studies of C and FORTRAN programs [27]. More recent studies have used the SPEC program suite. This set of programs, widely used to benchmark new hardware platforms and compiler implementations, consists of a mixture of C and FORTRAN programs [45, 46]. More recently, object oriented programming, and specifically the language C++, has ....

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. ACM Doctoral Dissertation Award Series. MIT Press, 1985.


Branch Prediction Architectures for 64-bit Address Space - Brad Calder (1993)   (1 citation)

....shows the branch encoding, while the right hand side shows the instructions that can be addressed by each encoding. For simplicity, we assume the program counter is always aligned on instruction boundaries, since we are chiefly concerned with architectures with fixed width instructions. Katevenis [7] proposed several branch encodings where the branch displacement field contains the least significant bits of the branch target address. Figure 3(b), modelled after the diagrams in [7], shows one such encoding. Here, the sign bit for the offset and the carry for the addition of the lower bits are ....

....instruction boundaries, since we are chiefly concerned with architectures with fixed width instructions. Katevenis [7] proposed several branch encodings where the branch displacement field contains the least significant bits of the branch target address. Figure 3(b), modelled after the diagrams in [7], shows one such encoding. Here, the sign bit for the offset and the carry for the addition of the lower bits are computed by the compiler (or linker) and encoded in the instruction. The lower bits can be immediately used to index a cache; concurrent with the cache fetch, the higher order bits are ....
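
The encoding described in the excerpt above can be sketched as follows. This is an illustrative reconstruction from the excerpt only, not the exact field layout from Katevenis's dissertation. The function names, the n-bit field width, and the rule "high bits = PC high bits + carry - sign" are assumptions chosen so that the arithmetic works out for displacements in the range -2^n to 2^n - 1.

```python
def encode_branch(pc, target, n):
    """Compile- or link-time step (hypothetical helper).

    Returns (low_bits, sign, carry) for a branch located at address pc:
    low_bits are the n least significant bits of the target itself,
    sign is 1 for a backward branch, and carry is the carry out of
    adding the low n bits of pc and of the displacement.
    """
    displacement = target - pc
    if not -(1 << n) <= displacement < (1 << n):
        raise ValueError("target out of range for this encoding")
    mask = (1 << n) - 1
    sign = 1 if displacement < 0 else 0
    carry = ((pc & mask) + (displacement & mask)) >> n
    low_bits = target & mask
    return low_bits, sign, carry


def decode_branch(pc, low_bits, sign, carry, n):
    """Fetch-time step (hypothetical helper).

    The low bits are usable immediately, for example as a cache index.
    Only the high-order bits need a small adjustment of -1, 0, or +1.
    """
    high_bits = (pc >> n) + carry - sign
    return (high_bits << n) | low_bits
```

For example, with n = 8, a branch at 0x1234 to 0x1300 encodes as low bits 0x00 with sign 0 and carry 1, and a branch from 0x1234 back to 0x11F0 encodes as low bits 0xF0 with sign 1 and carry 0. In both cases decoding never needs a full-width addition, which is the property the excerpt attributes to this family of encodings.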

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. ACM Doctoral Dissertation Award Series. MIT Press, 1985.


The Precomputed Branch Architecture - Calder, Grunwald (1999)

....[Figure 3: Alternate Branch Methods; panel (c): Precomputed Branch (Not Sign Extended)] 3.3 Computing the Branch Target Traditional branch architectures use a PC relative displacement; Figure 3(a), modeled after the diagrams in [20], schematically illustrates the process. In the encodings, information in lightly outlined boxes is provided or computed at execution time; for example, in Figure 3(a) the PC is available during execution. Heavily outlined boxes show the information provided by the branch instruction the ....

....Each branch can directly address instructions at address PC − 2^(n−1) − 1 : PC + 2^(n−1). For simplicity, we assume the program counter is always aligned on instruction boundaries, since we are chiefly concerned with architectures with fixed width instructions. Katevenis [20] proposed several branch encodings where the branch displacement field contains the least significant bits of the branch target address. Figure 3(b) shows one such encoding. Here, the sign bit for the offset and the carry for the addition of the lower bits are computed by the compiler (or linker) ....

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architecture for VLSI. ACM Doctoral Dissertation Award Series. MIT Press, 1985.


Time in general-purpose control systems: The Control Time - Protocol And An

No context found.

M. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI", ACM Doctoral Dissertation Award, MIT Press, 1984.


Issues in the Convergence of Control with Communication.. - Graham, Baliga, Kumar

No context found.

M. Katevenis, "Reduced Instruction Set Computer Architectures for VLSI", ACM Doctoral Dissertation Award, MIT Press, 1984.


Fred: An Architecture for a Self-Timed Decoupled Computer - Richardson, Brunvand (1995)   (4 citations)

No context found.

Manolis G. H. Katevenis. Reduced Instruction Set Computer Architectures for VLSI. MIT Press, 1985.


Instruction Scheduling with Timing Constraints on a Single.. - Wu, Jaffar, Yap (2000)   (2 citations)

No context found.

Katevenis, M. Reduced Instruction Set Computer Architecture for VLSI. MIT Press, Cambridge, Mass., 1984.


Loop Optimization Techniques On Multi-Issue Architectures - Kaiser

No context found.

M. G. H. Katevenis, Reduced Instruction Set Computer Architectures for VLSI, ACM Doctoral Dissertation Award, The MIT Press, 1984.



CiteSeer.IST - Copyright Penn State and NEC