57 citations found.
J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proceedings of the 12th Annual International Symposium on Computer Architecture, June 1985.

This paper is cited in the following contexts:

Low-Complexity Reorder Buffer Architecture - Kucuk, Ponomarev, Ghose (2002)   (1 citation)

....1. INTRODUCTION Contemporary superscalar microprocessors rely on aggressive execution reordering mechanisms to maximize the number of instructions committed per cycle. One of the main dynamic instruction scheduling artifacts used in such datapath designs is the Reorder Buffer (ROB) [17], which guarantees the recovery to a precise state when interrupts occur. The ROB is also used to handle branch mispredictions. It is typically implemented as a circular FIFO queue with head and tail pointers. Entries are made at the tail of the ROB in program order for each of the co dispatched ....

Smith, J. and Pleszkun, A., "Implementation of Precise Interrupts in Pipelined Processors", in Proc. of Int'l. Symposium on Computer Architecture, pp. 36--44, 1985.


Dynamic SimpleScalar: Simulating Java Virtual Machines - Eliot (2003)

....because it did not handle exceptions. We thus implemented precise interrupts in DSS for exceptions, to attain correct timing and program behavior in DSS. There are several methods we could have used to implement precise interrupts, such as a reorder buffer, a history buffer, or a future file [21]. As do many current microarchitectures, we use a reorder buffer to simulate the timing effects of precise interrupts. As we described previously, DSS checks for exceptions after each instruction, and if one is found, it flushes all entries in the reorder buffer after the faulting instruction. ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th International Symposium on Computer Architecture, pages 36--44, June 1985.


Speculation-Based Techniques for Lockfree Execution of Lock-Based .. - Rajwar (2002)

....loads or stores [129] Other thread level speculations proposals followed the Multiscalar work [61, 156] 2.4.2 Handling speculative state Buffering speculative register state is well studied and supported in modern processors either in the form of checkpoints, history buffers, or future files [153]. Nearly all proposals for speculative execution discussed earlier use local buffers to store speculative updates to memory. Knight [86] used the confirm cache local to each processor to store uncommitted data. Herlihy and Moss [66] used a transactional cache to track and buffer speculative ....

....versioning cache [52] to perform memory disambiguation and store speculative memory updates. Gharachorloo et al. [45] used the processor reorder buffer to track speculatively issued loads and the coherence protocol to check for violations. Ranganathan et al. [143] used a history buffer [153] to store speculatively retired instructions. These two schemes do not update memory speculatively. Gniady et al. [48] use a special buffer, the Store History Queue, to buffer speculative updates to memory. 2.4.3 Detecting violations Nearly all techniques discussed above that speculate on ....

[Article contains additional citation context not shown here]

James E. Smith and Andrew R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Access Time and Power Characteristics of Various Future File.. - Law, Lee

....of the most recently completed and pending assignments to each register, relative to the end of the known instruction sequence, regardless of which instructions have been issued or completed. [6] Figure 3. Reorder Buffer Reorder Buffer The reorder buffer was first proposed by Smith and Pleszkun [5] as a method for providing precise interrupts in processors with out of order execution. Today this method is useful not only for exception and interrupt recovery, but also for recovering from branch mispredictions. Using this method, the in order state is kept in the register file, and the ....

....lookahead state [Figure 3] The architectural state is found by combining the register file and the ROB. Implementation of the ROB is straightforward with the use of a first in first out (FIFO) queue. The ROB also overcomes the main drawbacks of the History Buffer presented by Smith and Pleszkun in [5]. The history buffer takes several cycles to restore the in order state to the register file, but the ROB leaves the in order state intact in the register file. Thus, after an exception the ROB can simply discard its contents after the faulting instruction and instruction fetching can continue. ....

[Article contains additional citation context not shown here]

J. E. Smith and A. R. Pleszkun, "Implementation of precise interrupts in pipelined processors" In Proceedings of the 12th Annual International Symposium on Computer Architecture, pp. 36-44, June 1985.


Diagonal Registers: Novel Vector Register File Design for High.. - Hanounik (2000)

....been shown that implementing diagonal registers is as easy as adding new ports to VRF. Many modern processors increase the number of ports in register file to boost their performance by a small amount. The new ports could serve an added function units or to improve the interrupt handling system [54]. The diagonal registers improve the performance of some two dimensional applications by at least 100%, and their cost is compensated by adjusting the driver circuitry of VRF and adding new buffers if necessary. In general, the 5% overhead of the diagonal registers found in the simulation can be ....

Smith, J. E., Pleszkun, A.R., "Implementation of precise interrupts in pipelined processors", International Symposium on Computer Architecture Conference, 1985


Speculative Lock Elision: Enabling Highly Concurrent.. - Rajwar, Goodman (2001)   (16 citations)

....of future research. 5.2 Buffering speculative state To recover from an SLE misspeculation, register and memory state must be buffered until SLE is validated. Speculative register state. Two simple techniques for handling register state are: 1. Reorder buffer (ROB) Using the reorder buffer [35] has the advantage of using recovery mechanisms already used for branch mispredictions. However, the size of the ROB places a limit on the size of the critical section (in terms of dynamic instructions) 2. Register checkpoint: This may be of dependence maps (there may be certain restrictions on ....

J.E. Smith and A.R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


[11]). We recently proposed a mechanism called - Predicating Which Provides

....show that the simple VLIW machine slightly outperforms the superscalar machine, while the VLIW machine with predicating achieves a significant speedup of 1.41x over the superscalar machine. 1 Introduction Current high end microprocessors exhibit good performance through superscalar techniques [9][10][12] A superscalar machine dynamically schedules instructions from an instruction window on a predicted control path to exploit instruction level parallelism (ILP) Speculative execution is essential for instruction scheduling so that the scheduler can exploit ILP beyond basic block boundaries. ....

....out of order execution machine with support for speculative execution. Register renaming is used to avoid output and antidependences. Reservation stations [12] are provided for each function unit to check operand availability and issue instructions in parallel. A reorder buffer [10] is used to maintain the correct machine state. With these mechanisms in conjunction with dynamic branch prediction, the superscalar machine fetches a single instruction stream and schedules instructions in the stream so that pipelines never stall. ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," In Proc. 12th Int. Symp. on Computer Architecture, pp.36-44, June 1985.


A Novel Renaming Scheme to Exploit Value Temporal.. - Jourdan, Ronen.. (1998)   (19 citations)

....execution was first implemented in the IBM 360 90 [Ande67] Later studies in the early 70s formalized the concept of out of order execution as in [Tjad70][Kell75] The concept was revisited and extended in the 80s [Weiss84][Patt85] None of these studies tackled the problem of exceptions. [Smit85] first presented hardware schemes to manage exceptions precisely. Two major renaming structures were introduced: the reorder buffer and the history buffer. The reorder buffer provides physical locations to store the result of each instruction. Results update the processor state in this scheme once ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors", in Proceedings of the 12th International Symposium on Computer Architecture, 1985.


Code Reordering and Speculation Support for Dynamic.. - Nystrom, Barnes.. (2001)   (6 citations)

....handlers. 2.1 Hardware Speculation Mechanisms Some out of order execution processors preserve precise exceptions by deferring the commits of speculative instructions. Memory and register modifications are buffered in their program order into a retirement structure called a reorder buffer [21]. Instructions in the buffer are not allowed to affect state until all older instructions have completed and committed. Checkpoint repair mechanisms [12] have also been proposed to periodically preserve state. At checkpoints, copies of the register file are made, while between checkpoints, lists ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


On Precise Interrupts - Mayan Moudgill, Stamatis (1996)   (4 citations)

....normal execution after processing the interrupt. The hardware must provide mechanisms that enable an interrupt handler to accomplish all these tasks. The way by which most processors make it possible for interrupt handlers to accomplish all these functions is by implementing precise interrupts [9, 3]. The definition of a precise interrupt is derived from execution on a sequential architecture. In a sequential architecture, instructions are issued serially. An instruction is allowed to run to completion before the next one is issued. If any instruction interrupts, the interrupt is reported ....

....so on. When all instructions in the buffer have been processed, the register file has been restored to the required, precise, state. Most of the out of order completion, precise interrupt schemes described in the literature, such as the future file, in order buffer and reorder buffer mechanisms [9] are based on the idea of keeping around multiple copies of any overwritten register. The precise state is recovered by discarding all values written after the interrupting instruction, and restoring the register state from the remaining values. They basically differ in implementation cost and ....

J.E. Smith and A.R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Data-Flow Prescheduling for Large Instruction Windows in.. - Michaud, Seznec (2001)   (10 citations)

....Section 7 gives some directions for future research. 2. Background and related works The issue buffer is the hardware structure materializing the instruction window. Instructions wait in the issue buffer until they are ready to be launched to the execution units. Unlike the reorder buffer [14], instructions can be removed from the issue buffer soon after issuing, to make room for new instructions. The two main phases of instruction issue are the wakeup phase and the selection phase [11] The wake up phase determines which instructions have their data dependencies resolved. The ....

J.E. Smith and A.R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, 1985.


Improving Latency Tolerance of Multithreading through.. - Parcerisa, González (1999)   (1 citation)

....All memory instructions are dispatched to the AP. The IQ allows the AP to execute ahead of the EP, providing the necessary slippage between them to hide the memory latency. Exceptions are kept precise by means of a reorder buffer, a graduation mechanism, and the register renaming map table [13, 28]. Other decoupled architectures [27] had chosen to steer memory instructions to both units to allow copying data from the load queue to registers. Since preliminary studies showed that such code expansion would significantly reduce the performance, we implemented dynamic register renaming, which ....

J.E. Smith, A.R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proc. of the 12th. Int. Symp. on Computer Architecture, June 1985, 36-44.


MPS: Miss-path Scheduling for Multiple-issue Processors - Banerjia, Sathaye.. (1998)   (1 citation)

....several preceding branches. When speculation is performed, speculated instructions must be prevented from retiring their results when their control dependent branches are mispredicted. Hardware support identical to that used by speculative out of order issue designs can be used to accomplish this [13], [14], [15] IV. Details of a miss path scheduler The previous section introduced some of the hardware structures required for an MPS implementation. The data stored in the def use table and the reservation table are used to make scheduling decisions. This section details the requirements on ....

....in Section III B. If speculation is used, a mechanism is required to prevent incorrectly speculated instructions from retiring their results. A method that is well suited for a speculative miss path scheduler is a reorder buffer with a future file to supplement the architectural register file [13]. Slots are allocated in the reorder buffer in original program order (this is preserved by the scheduler and stored with the individual instructions in the cache) The central issue in speculating instructions is choosing which instructions to speculate, a decision that relies on predicting ....

J. E. Smith and A. Pleszkun, "Implementation of precise interrupts in pipelined processors," in Proc. 12th Ann. Int'l Symp. Computer Architecture, Boston, MA, June 1985.


Load Latency Tolerance In Dynamically Scheduled Processors - Srinivasan, Lebeck (1999)   (32 citations)

....of order, the processor is able to tolerate some long latency operations including cache misses with almost no overall performance degradation. To find enough independent instructions, most processors employ sophisticated branch prediction mechanisms [11, 29] and allow speculative execution [19, 12], committing results only when the true outcome of a branch is known. However, limitations due to finite resources, data dependencies and imperfect branch prediction, render the processor unable to tolerate the latencies of some long latency operations. These operations are likely to degrade ....

James E. Smith and Andrew R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 17--19, 1985. IEEE Computer Society TCA and ACM SIGARCH. Computer Architecture News, 13(3), June 1985.


A 20MHz CMOS Reorder Buffer for a Superscalar Microprocessor - Lenell, Wallace..

....and then updates the register file with the results in the original program order. Results are written to the register file in order by operating the reorder buffer in FIFO fashion. As entries in the reorder buffer reach the bottom of the FIFO, the completed results are written to the register file [3]. The organization of a reorder buffer in a superscalar processor is shown in Figure 1. Each decoded instruction is allocated an entry at the top of the reorder buffer. During allocation, the instruction's destination register identifier and a unique tag identifier for the instruction are entered ....

James Smith and Andrew Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Handling Floating-Point Exceptions in Numeric Programs - Hauser (1996)   (5 citations)

....cycle and retiring operations out of order the trapping hardware itself becomes more costly to implement and can itself add to the time needed to perform operations on ordinary floating point numbers [Hennessy and Patterson 1990; Hwu and Patt 1987; Johnson 1991; Smith and Pleszkun 1985; Sohi and Vajapeyam 1987] Hence, incorporating gradual underflow into a processor's arithmetic involves engineering tradeoffs that are becoming increasingly uncomfortable. Recently, one manufacturer has decreed that subnormal numbers will be supported on their processors only in a degraded mode ....

Smith, J. E. and Pleszkun, A. R. 1985. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture. IEEE Computer Society Press, Silver Springs, Md., 36--44.


An Evaluation of Memory Consistency Models for.. - Parthasarathy..

....addition to fetching the data into the cache. A write prefetch can be issued in SC when the actual write operation is delayed due to consistency requirements. Additionally, with ILP processors, writes cannot be issued till they reach the head of the instruction window, to ensure precise exception [38]. Store prefetches can be used in both SC and RC to initiate ownership requests for all such store operations in the instruction window. Figure 2.3(a) demonstrates the benefits with hardware prefetching from the instruction window on SC. As the figure shows, hardware prefetching is an effective ....

....instruction retires (graduates [25]) when it is complete and when all preceding instructions (by program order) have retired. A write in a release consistent system retires when its address and value are resolved, and when all previous instructions have retired. To guarantee precise interrupts [38], writes are not issued into the memory system until they reach the head of the instruction window. We use the SPARC V9 MEMBAR [39] instructions (memory barriers or memory fences) to enforce ordering of memory operations as required by the consistency model. The processor also uses a two bit ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the International Symposium on Computer Architecture, 1985.


Hardware Techniques To Improve The Performance Of The.. - Burger (1998)   (10 citations)

....in the dispatch stage of the pipeline. The execution core of sim outorder is derived from the Register Update Unit (RUU) [113] depicted in Figure 2 6. The RUU is a centralized structure that effectively acts as a combined register renaming unit, reservation station pool [123] and reorder buffer [110, 115]. The RUU is implemented as a circular queue, with head and tail pointers. The tail pointer is advanced as new instructions are dispatched to the RUU, and the head moves as the oldest instructions are committed to the architectural state. Operands are stored in the RUU, and are identified with ....

James E. Smith and Andrew R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


VLIW Processors: Efficiently Exploiting Instruction Level.. - Rudd (1999)

....an arbitrary order; however, the sequential execution model defines the visible behavior of a processor. The sequential execution model precisely defines this ordering for both normal as well as exceptional execution. The sequential execution model was originally described by Smith and Pleszkun [49] in the context of managing exceptions; here we restate their definition in the context of the precision of execution ordering. Exceptional execution is merely a special case of normal execution where an exception handler must be executed in the middle of the normal execution of an instruction ....

James E. Smith and Andrew R. Pleszkun. Implementation of precise interrupts in pipelined processor. In The 12th Annual International Symposium on Computer Architecture, pages 36--44, Boston, June 1985. IEEE Computer Society TCA and ACM SIGARCH.


Multiple-Block Ahead Branch Predictors - Seznec, Jourdan, Sainrat, Michaud (1996)   (32 citations)

....in the issue buffer may be issued out of order when all their operands are available, and a max dependent selection mechanism as described in [1] is used when more than one instruction compete for the same functional unit access. To enforce precise interrupt management, a history buffer [23] similar to the active list of the R10000, records the previous mappings discarded by the renaming process during the dispatch stage. Checkpoints [9] of the map table (architectural state) are established at every branch in order to recover from branch misprediction in one cycle, regardless of the ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proceedings of the 12th Annual International Symposium on Computer Architecture, June 1985.


Miss Path Speculative Scheduling For High Issue Rates - Banerjia, Sathaye, Menezes, ..

....as superblock scheduling [12] When such speculation is performed, speculated instructions must be prevented from retiring their results when their dominating branches are mispredicted. Hardware support identical to that used by speculative out of order issue designs can be used to accomplish this [13], [14], [15] 4 Details of a miss path scheduler The previous section introduced some of the hardware structures required for miss path scheduling. The data stored in the def use table and the reservation table are used to make scheduling decisions. Additional logic is needed to interpret the ....

....in Section 3.2. If speculation is used, a mechanism is required to prevent incorrectly speculated instructions from retiring their results. A method that is well suited for a speculative miss path scheduler is a reorder buffer with a future file to supplement the architectural register file [13]. Slots are allocated in the reorder buffer in original program order (this is preserved by the scheduler and stored with the individual instructions in the cache) ....

J. E. Smith and A. Pleszkun, "Implementation of precise interrupts in pipelined processors," in Proc. 12th Ann. International Symposium Computer Architecture, (Boston, MA), June 1985.


Evaluation Of Some Superscalar And VLIW Processor Designs - Holm (1992)

.... 360 91 [6, 7] Keller provides abstract models for out of order execution [8] Hwu demonstrates the feasibility of out of order design of the complete HPSm single chip micro architecture [9] The model used in this thesis most closely resembles the work of Johnson [10] who uses a reorder buffer [11]. In each cycle, the number of instructions fetched equals the issue rate. Unlike the previous hardware models, these instructions do not have to be independent. On the next cycle, these instructions are assigned tags and placed in the issue window. The issue window buffers instructions until they ....

J.E. Smith and A.R. Pleszkun, "Implementation of precise interrupts in pipelined processors," in The 12th Annual Symposium on Computer Architecture, June 1985.


Reducing Cache Miss Rates Using Prediction Caches - Bennett, Flynn (1996)   (3 citations)

.... It is a four issue, dynamically scheduled processor, with register renaming, branch prediction, speculative execution, and precise interrupts [HP90] Instructions issue out of order, as their operands become available, and a reorder buffer is used to restore the precise state after an interrupt [SP85, Joh91]. The load store buffer has 32 entries, and the reorder buffer is 64 entries long. For comparison, the P6 has 40 reorder buffer entries, and the PA 8000 has 56. More detailed information on the benchmarks, processor model, and simulation environment are available in [BF95] 4.3 The memory ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In 12th International Symposium on Computer Architecture, pages 36--44, June 1985.


Efficient Instruction Sequencing with Inline Target Insertion - Hwu, Chang (1990)   (5 citations)

....empirical results on the performance and stability of using profile information in compile time code restructuring. The first three issues were not addressed by McFarling and Hennessy [29] The second issue was not addressed by previous studies of hardware support for precise interrupt [18] [40]. In order to address these issues, we have specified a compiler and pipeline implementation method for Delayed Branches with Squashing. We refer to this method as Inline Target Insertion to reflect the fact that the compiler restructures the code by inserting predicted successors of branches ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors", Proceedings of the 12th Annual Symposium on Computer Architecture, June 1985.


Latency Tolerance For Dynamic Processors - Bennett, Flynn (1996)   (1 citation)

.... It is a four issue, dynamically scheduled processor, with register renaming, branch prediction, speculative execution, and precise interrupts[HP90] Instructions issue out of order, as their operands become available, and a reorder buffer is used to restore the precise state after an interrupt[SP85, Joh91]. The load store buffer has 32 entries, and the reorder buffer is 64 entries long. For comparison, the P6 has 40 reorder buffer entries, and the PA 8000 has 56. More detailed information on the benchmarks, processor model, and simulation environment are available in [BF95] 2.3 The memory ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In 12th International Symposium on Computer Architecture, pages 36--44, June 1985.


A Buffer-Oriented Methodology for Microarchitecture.. - Utamaphethai, Blanton, Shen (1999)   (2 citations)

....order to ensure precise exception handling. Pipelines of superscalar processors can be typically divided into an in order frontend, an out of order execution core and an in order backend. During the last stage of the in order frontend, an entry is allocated for an instruction in a reorder buffer [27]. The execution of the instruction is then performed in the out of order core. When the instruction finally completes or transfers its speculative state into permanent machine state in the in order backend, the associated reorder buffer entry is deallocated in program order. In essence, a reorder ....

J. E. Smith and A. R. Pleszkun. "Implementation of Precise Interrupts in Pipelined Processors," Proc. of the International Symposium on Computer Architecture, pp. 36--44, Jun. 1985.


Systematic Computer Architecture Prototyping - Conte (1992)   (3 citations)

....that it is impossible to restart execution after an interrupt. This occurs when the registers and memory are left in a partial (i.e. non sequentially consistent) state. Although this problem exists for FICO class processors, its solution is simpler than what is required for FOCO class processors [46], [47] Techniques have since been developed to provide precise interrupts and coherent state for FOCO designs, including checkpoint repair and history reorder buffer approaches [47], [48] This thesis assumes that instructions always retire in program order and that the processor uses either ....

J. E. Smith and A. Pleszkun, "Implementation of precise interrupts in pipelined processors," in Proc. 12th Ann. Int'l. Symp. Computer Architecture, (Boston, MA), June 1985.


Exploring Configurations of Functional Units in an.. - Stéphan Jourdan (1995)   (13 citations)

....oldest first where the entry which holds the oldest dispatched instruction has priority over the others. To manage interrupts precisely, an entry associated to each dispatched instruction is enqueued in a reorder buffer. However, operand values or tags are always obtained during the decoding [SmPl85]. The reorder buffer maintains the initial program order, and its size defines the upper bound of the number of instructions which can be simultaneously processed after their dispatch. As mentioned below, this upper bound is the size of the lookahead window. When an instruction is executed, the ....

J.E. Smith, A.R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors", Proceedings of the 12th Annual International Symposium on Computer Architecture, June 1985


An Investigation of the Performance of Various.. - Jourdan, Sainrat.. (1995)   (2 citations)

....we avoid most of the possible degradation of the issuing performance by choosing the oldest first priority since we are not investigating issuing policies. When an instruction finishes execution, its result is recorded in the update buffer. Examples of the update buffer are the reorder buffer [12] and the backup files of the checkpoint repair mechanism [5] To enforce precise interrupt management, an instruction is completed only when all prior instructions cannot produce an interrupt anymore. Instructions capable of generating interrupts are conditional branches, divides, and memory ....

J.E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proceedings of the 12th Annual International Symposium on Computer Architecture, June 1985.


The Latency Hiding Effectiveness of Decoupled.. - Parcerisa, González (1998)

....bus which completes one transaction each 2 CPU cycles (i.e. every cache line keeps the bus busy during 4 cycles) by overlapping several transactions. To maintain precise exceptions we assume that there exists an elementary reorder buffer, a graduation mechanism and some exception recovery hardware [12, 24] for the AP. The recovery hardware for the EP is greatly simplified by just preventing the EP from issuing ahead of uncompleted AP instructions (including conditional branches) As far as the AP executes ahead of the EP, this constraint saves lots of hardware complexity at the expense of very ....

....is delivered by the EP to the Condition Queue. This kind of LODs are removed by enabling the AP to execute speculatively instructions beyond one or more branches. In case one of the branches is found to be mispredicted, the hardware must be able to recover the state previous to the branch [12, 24]. The cost of this hardware depends on the particular implementation and the speculation depth, which is the number of unresolved EP branches beyond which the instruction issue mechanism stalls. We have assumed a speculation depth of 4, which is the same as the MIPS R10000 [33] and the PowerPC 620 ....

J.E. Smith, A.R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proc. of the 12th. Int. Symp. on Computer Architecture, June 1985, 36-44.


A Parallel Architecture for Serializable Production Systems - Amaral (1994)   (Correct)

....as the arrival of the external tokens that caused these changes. The Output Buffer of Figure 7.1 is included in the representation of the Re Order Buffer in Figure 7.2. The idea of using buffers to overcome synchronization problems is borrowed from superscalar and superpipelined architectures [15, 75, 77]. 7.3 In Order Buffer: The purpose of the In Order Buffer is to prevent two conflicting tokens from proceeding to the fi units while allowing non conflicting tokens to be processed without delay. Two tokens are conflicting if one of them enables and the other disables the same production. To ....

J. E. Smith and A. R. Pleszkun, Implementation of precise interrupts in pipelined processors, in Proc. 12th Annual International Symposium on Computer Architecture, June 1985, pp. 36--44.


Register Renaming and Dynamic Speculation: an.. - Moudgill, Pingali.. (1993)   (17 citations)  (Correct)

....This may change the instruction's status from waiting to ready. The result is also sent to the register file. If the tag in the output register is the same as that of the instruction, the value is written to the register file. Tomasulo's algorithm must be extended to implement precise interrupts [9]. Johnson [5] describes one such extension, based on the future file mechanism [9]. The registers described above correspond to the future file. There is also a duplicate register file called the in order file, and a reorder buffer. Instructions issue as described, using the future file 1 . ....

....sent to the register file. If the tag in the output register is the same as that of the instruction, the value is written to the register file. Tomasulo's algorithm must be extended to implement precise interrupts [9]. Johnson [5] describes one such extension, based on the future file mechanism [9]. The registers described above correspond to the future file. There is also a duplicate register file called the in order file, and a reorder buffer. Instructions issue as described, using the future file 1 . Additionally, results are written to the reorder buffer. The reorder buffer is a FIFO ....

[Article contains additional citation context not shown here]

J.E. Smith and A.R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Recent Advances in Memory Consistency Models for Hardware .. - Adve, Pai, Ranganathan (1999)   (6 citations)  (Correct)

....once the instruction's data dependences are resolved, even if previous instructions are still blocked. To maintain precise exceptions, however, instructions are retired from the instruction window and all changes to architectural state (e.g. registers and memory) are made in program order [7]. Non blocking loads. Many current processors do not block on a load, but can continue executing independent instructions (including other loads) while one or more loads (including cache misses) are outstanding. To maintain precise interrupts, a load does not leave the instruction window until it ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," in Proc. 12th Intl. Symp. on Computer Architecture, 1985.


A Split Data Cache for Superscalar Processors - Boleyn, Debardelaben, Tiwari..   (Correct)

....policy when no empty block is available; and has a miss penalty of 16 cycles. The cache can continue to service cache hits during cache refills until a second cache miss occurs. In order to eliminate storage conflicts through register renaming, each operational unit contains a reorder buffer [Smith85]. Each reorder buffer holds sixteen entries, and a maximum of four results can be written into the reorder buffer on any given cycle. In addition, four results per cycle can be written from the reorder to the corresponding register file. This machine represents an aggressive superscalar ....

J. E. Smith and A. R. Pleszkun. "Implementation of Precise Interrupts in Pipelined Processors", Proceedings of the 12th Annual International Symposium on Computer Architecture, pp. 36-44, June 1985.


SuperDLX - A Generic Superscalar Simulator - Moura (1993)   (7 citations)  (Correct)

....uses a variant of Tomasulo's algorithm, to rename load destination floating point registers [Gro90]. Other more complex (but more powerful) implementations use some type of buffering device which provides extra storage for instruction results. An example of such device is the reorder buffer [SP85], a FIFO queue where issued instructions place their results. While instructions complete out of order, the buffer reorders the results, so that they can be written in strict program order to the register file. There are variants of the reorder buffer: for example, the register update unit, ....

....there are different techniques for branch repairs, which rely on post issue instruction invalidation (instruction flush). Such invalidation is possible when the instruction has not modified the processor state. Array based schemes exist to support instruction invalidation: the reorder buffer [SP85], history buffer [SP85], future file [SP85], and the DCAF of the Metaflow architecture. They hold the results of branch dependent instructions until the branch outcome is safely determined. When instructions can be undone, branch prediction is well integrated in the out of order issue scheme, as ....

[Article contains additional citation context not shown here]

J.E. Smith and A.R. Pleszkun. Implementation of precise interrupts in pipelined processors. Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Limits of Control Flow on Parallelism - Lam, Wilson (1992)   (154 citations)  (Correct)

....an instruction may generate unwanted side effects. These side effects must be discarded if the branch prediction is incorrect. Both hardware and software techniques can be used to implement speculative execution. Various hardware structures have been proposed to support speculative execution [7, 11, 13, 16]. These structures store the results of the speculative instructions until the branch direction is determined. If the branch prediction was correct the results are committed, otherwise they are discarded. Hardware scheduling, however, is limited by the fact that an instruction simply cannot ....

J. E. Smith and A. R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


A Comparison of Superscalar and Decoupled Access/Execute.. - Farrens, Ng, Nico (1993)   (2 citations)  (Correct)

.... however, this requires the use of hardware techniques such as Tomasulo's algorithm [Toma67] in addition to register renaming to eliminate dependencies among instructions [SoVa87]. The adoption of out of order execution also raises the question of how to deal with interrupts, branches, and exceptions [SmPl88]. Architectural queues also support out of order execution between the two processors with respect to the original program sequence. The register renaming effect of removing storage conflicts allows data fetch instructions to complete before their corresponding positions in the original sequence ....

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors", IEEE Transactions on Computers, vol. 37, no. 5 (May 1988), pp. 562-573.


Comparing Static and Dynamic Scheduling on Superscalar Processors - Lo (1995)   (Correct)

....the instruction stream) begins. In order to support precise interrupts, the processor must save state that corresponds to the sequential model when an interrupt occurs. Without appropriate hardware support, speculative execution fails to adhere to sequential program execution. Smith and Pleszkun [SP85] described several techniques for implementing precise interrupts in pipelined processors. These designs have been extended to support precise interrupts as well as out of order execution in modern microprocessors. It is interesting to note that one of these mechanisms, the reorder buffer, can be ....

....windows of code are assigned to stages, and handled by the Issue and Execution (IE) units within the stages. Each stage has its own first level instruction cache (L1 IC), IE unit, and register file, known as a future file (FF). The future file is similar to that proposed by Smith and Pleszkun [SP85] and acts as a working register file that assists recovery when incorrect branch prediction occurs. A primary motivation behind the ESW paradigm is that it addresses the aforementioned decode bottleneck on wide issue machines. By passing basic windows to stages, instruction decoding and data ....

James E. Smith and Andrew R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings, 12th Annual International Symposium on Computer Architecture, pages 36--44. IEEE, June 1985.


Two Techniques to Enhance the Performance of Memory.. - Kourosh Gharachorloo (1991)   (51 citations)  (Correct)

....the instructions are fetched and decoded in program order. In addition, the processor allows execution of instructions past unresolved conditional branches. A branch target buffer (BTB) [16] is incorporated into the instruction cache to provide conditional branch prediction. The reorder buffer [22] used in the architecture is responsible for several functions. The first function is to eliminate storage conflicts through register renaming [12]. The buffer provides the extra storage necessary to implement register renaming. Each instruction that is decoded is dynamically allocated a location ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 1985.


Performance Study of a Concurrent Multithreaded Processor - Tsai, Jiang, Ness, Yew (1998)   (19 citations)  (Correct)

....[Figure 2. The block diagram of a thread processing element] executed out of order when their operands are available. To support speculative execution and in order instruction completion, the instruction dispatch and completion unit uses a reorder buffer [9] to buffer instruction results before they are committed. The reorder buffer also serves as a rename buffer to provide later instructions with uncommitted results which they are flow (read after write) dependent on. Each thread processing unit also has a communication unit for transferring ....

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th International Symposium on Computer Architecture, pages 36--44, June 1985.


Data Memory Alternatives for Multiscalar Processors - Scott Breach Vijaykumar (1997)   (4 citations)  Self-citation (Smith)   (Correct)

No context found.

James E. Smith and Andrew R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36--44, June 17--19, 1985.


Boosting Beyond Static Scheduling in a Superscalar Processor - Smith, Lam, Horowitz (1990)   (63 citations)  Self-citation (Smith)   (Correct)

....perform memory disambiguation at run time, thereby allowing loads and stores to bypass each other when advantageous. Storage conflicts in the original code are eliminated at run time by performing register renaming in the hardware [10]. Register renaming is implemented by using a reorder buffer [19] associated with each register file. The reorder buffer provides the additional storage necessary to implement register renaming. For example, when an instruction is decoded, MATCH dynamically allocates a location in the reorder buffer for this instruction's result and the instruction's ....

J.E. Smith and A.R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors." Proceedings of the 12th Annual International Symposium on Computer Architecture (June 1985), pp. 36-44.


Efficient Superscalar Performance Through Boosting - Michael Smith Mark (1992)   (48 citations)  Self-citation (Smith)   (Correct)

....movement into safe speculative execution, and thus a compiler alone cannot support the general movement of instructions above their control dependent branch. There are numerous hardware techniques that allow dynamic schedulers to safely move any instruction above its control dependent branch [15][21]. The basis of all these techniques is the inclusion of extra buffering in the hardware which holds the effects of the speculative operations 1 . The sequential state of the machine is defined as that machine state that is not a result of any speculative operation, and conversely, the ....

....superscalar processor with speculative execution support. The dynamic scheduler is functionally equivalent to our base superscalar machine. The dynamic scheduler fetches and decodes two instructions per cycle. It uses a total of 30 reservation station locations [28] and a 16 entry reorder buffer [21] to implement out of order execution with speculation, and it uses a 2048 entry, 4 way set associative branch target buffer to predict branches. It has the same number of functional units as our statically scheduled machine, but since the dynamically scheduled machine uses reservation stations, it ....

J.E. Smith and A.R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proc. 12th Int. Symp. on Computer Architecture, pp. 36--44, June 1985.


Architectural Support For Compile-Time Speculation - Smith (1994)   (4 citations)  Self-citation (Smith)   (Correct)

....On a correct prediction, the hardware updates the machine state with the speculative results; on an incorrect prediction, the hardware discards the speculative results. Researchers have proposed a number of buffering schemes for supporting speculative execution in hardware [17][18][29]. Researchers usually couple hardware assisted speculative execution with hardware scheduling because hardware assisted speculative execution requires more information than is found in a typical instruction stream. That is, hardware assisted speculative execution requires the instruction scheduler ....

.... branch target buffer) and each assumes perfect register renaming (i.e. instruction issue is only limited by true data dependences and resource conflicts). The out of order instruction scheduler uses a total of 30 reservation station locations [34] and it has a 16 entry reorder buffer [29] to support speculative execution. There are enough locations in these structures to guarantee that the machine never stalls waiting for a buffer location.

            in order            out of order
            2 issue   4 issue   2 issue
awk         1.10      1.16      1.59
compress    1.16      1.28      1.66
eqntott     1.12      1.24      1.52
espresso    1.12      1.24      1.66
....

Smith, J. and Pleszkun, A., "Implementation of precise interrupts in pipelined processors," Proc. of 12th Annual Int. Symp. on Comp. Arch., June 1985, pp. 36--44.


The Effects of Mispredicted-Path Execution on Branch - Prediction Structures..   (Correct)

No context found.

J. E. Smith and A. R. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proceedings of the 12th Annual International Symposium on Computer Architecture, June 1985.


Checkpoint Processing and Recovery: - Towards Scalable Large   (Correct)

No context found.

J. E. Smith and A. R. Pleszkun. Implementation of precise interrupts in pipelined processors. In Proceedings of the 12th pages 36--44, June 1985.


High-Performance Frontends for Trace Processors - Jacobson (1999)   (Correct)

No context found.

J. E. Smith, A. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," in International Symposium on Computer Architecture, pp. 36-44, June 1985.


Latency Tolerant Architectures - Bennett (1998)   (2 citations)  (Correct)

No context found.

J.E. Smith and A.R. Pleszkun. Implementation of precise interrupts in pipelined processors. In 12th International Symposium on Computer Architecture, pages 36--44, June 1985.


Improving Latency Tolerance of Multithreading through - Decoupling Joan-Manuel..   (Correct)

No context found.

J.E. Smith, A.R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proc. of the 12th. Int. Symp. on Computer Architecture, June 1985, 36-44.


CiteSeer.IST - Copyright Penn State and NEC