Results 21 - 30 of 8,430
DBMSs on a modern processor: Where does time go?
- in VLDB, 1999
"... Recent high-performance processors employ sophisticated techniques to overlap and simultaneously execute multiple computation and memory operations. Intuitively, these techniques should help database applications, which are becoming increasingly compute and memory bound. Unfortunately, recent studie ..."
Cited by 246 (26 self)
use a memory resident database. Using simple queries we find that database developers should (a) optimize data placement for the second level of data cache, and not the first, (b) optimize instruction placement to reduce first-level instruction cache stalls, but (c) not expect the overall execution
An Adaptive Issue Queue for Reduced Power at High Performance
- 2000
"... Increasing power dissipation has become a major constraint for future performance gains in the design of microprocessors. In this paper, we present the circuit design of an issue queue for a superscalar processor that leverages transmission gate insertion to provide dynamic low-cost configurability ..."
Cited by 61 (6 self)
of size and speed. A novel circuit structure dynamically gathers statistics of issue queue activity over intervals of instruction execution. These statistics are then used to change the size of an issue queue organization on-the-fly to improve issue queue energy and performance. When applied to a fixed
A scalable instruction queue design using dependence chains
- in Proceedings of the 29th Annual International Symposium on Computer Architecture, 2002
"... Increasing the number of instruction queue (IQ) entries in a dynamically scheduled processor exposes more instruction-level parallelism, leading to higher performance. However, increasing a conventional IQ’s physical size leads to larger latencies and slower clock speeds. We introduce a new IQ desig ..."
Cited by 68 (0 self)
design that divides a large queue into small segments, which can be clocked at high frequencies. We use dynamic dependence-based scheduling to promote instructions from segment to segment until they reach a small issue buffer. Our segmented IQ is designed specifically to accommodate variable
Limits on Multiple Instruction Issue
- in Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, 1989
"... This paper demonstrates that highly-optimized, non-scientific applications also contain ample instruction-level concurrency to sustain an execution rate of two instructions per clock cycle. However, the cost requirements necessary to provide the instruction bandwidth needed by the instruction executi ..."
Cited by 113 (6 self)
General
"... Speculatively issued instructions may be particularly sensitive to increases in pipeline depth. Our results indicate that as pipeline depth increases, speculation increases the percentage of issue queue instructions that are waiting to be potentially re-issued in case of a mis-speculation. To compen ..."
Energy-Efficient Issue Queue Design
- in IEEE Transactions on VLSI Systems, 2003
"... Abstract—The out-of-order issue queue (IQ), used in modern superscalar processors is a considerable source of energy dissipation. We consider design alternatives that result in significant reductions in the power dissipation of the IQ (by as much as 75%) through the use of comparators that dissipate ..."
Cited by 13 (2 self)
Optimization of Instruction Fetch Mechanisms for High Issue Rates
- in 22nd Annual International Symposium on Computer Architecture, 1995
"... Recent superscalar processors issue four instructions per cycle. These processors are also powered by highly-parallel superscalar cores. The potential performance can only be exploited when fed by high instruction bandwidth. This task is the responsibility of the instruction fetch unit. Accurate bra ..."
Cited by 133 (4 self)
Memory Dependence Prediction using Store Sets
- 1998
"... For maximum performance, an out-of-order processor must issue load instructions as early as possible, while avoiding memory-order violations with prior store instructions that write to the same memory location. One approach is to use memory dependence prediction to identify the stores upon which a l ..."
Cited by 211 (2 self)
A Large, Fast Instruction Window for Tolerating Cache Misses
"... Instruction window size is an important design parameter for many modern processors. Large instruction windows offer the potential advantage of exposing large amounts of instruction level parallelism. Unfortunately, naively scaling conventional window designs can significantly degrade clock cycle ti ..."
Cited by 109 (1 self)
... e.g., cache miss) cannot execute until that source operation completes. These instructions are moved out of the conventional, small issue queue to a much larger waiting instruction buffer (WIB). When the long latency operation completes, the instructions are reinserted into the issue queue. In this paper, we
A systematic methodology to compute the architectural vulnerability factors for a high performance microprocessor
- in International Symposium on Microarchitecture, 2003
"... Single-event upsets from particle strikes have become a key challenge in microprocessor design. Techniques to deal with these transient faults exist, but come at a cost. Designers clearly require accurate estimates of processor error rates to make appropriate cost/reliability trade-offs. This paper ..."
Cited by 197 (12 self)
that a fault in that particular structure will result in an error. A structure's error rate is the product of its raw error rate, as determined by process and circuit technology, and the AVF. Unfortunately, computing AVFs of complex structures, such as the instruction queue, can be quite involved