The predictability of data values
- In Proceedings of the 30th International Symposium on Microarchitecture, 1997
"... ..."
(Show Context)
Highly Accurate Data Value Prediction using Hybrid Predictors
- 1997
"... Data dependences (data flow constraints) present a major hurdle to the amount of instruction-level parallelism that can be exploited from a program. Recent work has suggested that the limits imposed by data dependences can be overcome to some extent with the use of data value prediction. That is, wh ..."
Cited by 211 (3 self)
Abstract: Data dependences (data flow constraints) present a major hurdle to the amount of instruction-level parallelism that can be exploited from a program. Recent work has suggested that the limits imposed by data dependences can be overcome to some extent with the use of data value prediction. That is, when an instruction is fetched, its result can be predicted so that subsequent instructions that depend on the result can use this predicted value. When the correct result becomes available, all instructions that are data dependent on that prediction can be validated. This paper investigates a variety of techniques to carry out highly accurate data value predictions. The first technique investigates the potential of monitoring the strides by which the results produced by different instances of an instruction change. The second technique investigates the potential of pattern-based two-level prediction schemes. Simulation results of these two schemes show improvements over the existing method of predicting the last outcome. In particular, some benchmarks show improvement with the stride-based predictor and others show improvement with the pattern-based predictor. To do uniformly well across benchmarks, we combine these two predictors to form a hybrid predictor. Simulation analysis of the hybrid predictor shows its overall prediction accuracy to be better than that of the component predictors across all benchmarks.
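To make the hybrid scheme concrete, the following is a minimal sketch of a per-instruction (PC-indexed) predictor that combines a stride component with a last-value component through a 2-bit saturating chooser. The last-value component stands in for the paper's two-level pattern-based predictor to keep the sketch short; the table layout, chooser width, and update policy are illustrative assumptions, not the paper's design.

```python
# Minimal hybrid value predictor sketch (illustrative, not the paper's design).
class HybridValuePredictor:
    def __init__(self):
        # pc -> {'last': last result, 'stride': last delta, 'chooser': 2-bit counter}
        self.table = {}

    def predict(self, pc):
        e = self.table.get(pc)
        if e is None:
            return None  # cold entry: no prediction
        # chooser >= 2 selects the stride component, otherwise last-value
        return e['last'] + e['stride'] if e['chooser'] >= 2 else e['last']

    def update(self, pc, actual):
        e = self.table.get(pc)
        if e is None:
            self.table[pc] = {'last': actual, 'stride': 0, 'chooser': 0}
            return
        stride_correct = (e['last'] + e['stride'] == actual)
        last_correct = (e['last'] == actual)
        # Train the 2-bit saturating chooser toward the component that was right.
        if stride_correct and not last_correct:
            e['chooser'] = min(3, e['chooser'] + 1)
        elif last_correct and not stride_correct:
            e['chooser'] = max(0, e['chooser'] - 1)
        e['stride'] = actual - e['last']
        e['last'] = actual

# With a strided result sequence the chooser migrates to the stride component:
p = HybridValuePredictor()
for v in (10, 14, 18, 22, 26):
    p.update(0x400, v)
print(p.predict(0x400))  # 30, assuming the stride of 4 continues
```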
Dynamic instruction reuse
- In Proceedings of the 24th Annual International Symposium on Computer Architecture (ISCA’97), 1997
"... ..."
Dynamic Speculation and Synchronization of Data Dependences
- In Proceedings of the 24th International Symposium on Computer Architecture, 1997
"... Data dependence speculation is used in instruction-level parallel (ILP) processors to allow early execution of an instruction before a logically preceding instruction on which it may be data dependent. If the instruction is independent, data dependence speculation succeeds; if not, it fails, and the ..."
Cited by 185 (22 self)
Abstract: Data dependence speculation is used in instruction-level parallel (ILP) processors to allow early execution of an instruction before a logically preceding instruction on which it may be data dependent. If the instruction is independent, data dependence speculation succeeds; if not, it fails, and the two instructions must be synchronized. The modern dynamically scheduled processors that use data dependence speculation do so blindly (i.e., every load instruction with unresolved dependences is speculated). In this paper, we demonstrate that as dynamic instruction windows get larger, significant performance benefits can result when intelligent decisions about data dependence speculation are made. We propose dynamic data dependence speculation techniques: (i) to predict if the execution of an instruction is likely to result in a data dependence mis-speculation, and (ii) to provide the synchronization needed to avoid a mis-speculation. Experimental results evaluating the effectiveness of the proposed techniques are presented within the context of a Multiscalar processor.
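A minimal sketch of the non-blind speculation decision, assuming a table of per-load-PC saturating counters: loads that have recently mis-speculated are held back rather than speculated. The threshold, counter width, and decay policy are assumptions, and the paper's synchronization mechanism between the conflicting store and load is not modeled here.

```python
# Illustrative mis-speculation predictor for non-blind dependence speculation.
class DependencePredictor:
    def __init__(self, threshold=2, max_count=3):
        self.counters = {}           # load PC -> saturating mis-speculation counter
        self.threshold = threshold
        self.max_count = max_count

    def should_speculate(self, load_pc):
        # Speculate only while past behavior suggests the load is independent.
        return self.counters.get(load_pc, 0) < self.threshold

    def record_outcome(self, load_pc, mis_speculated):
        c = self.counters.get(load_pc, 0)
        if mis_speculated:
            self.counters[load_pc] = min(self.max_count, c + 1)  # grow caution
        elif c > 0:
            self.counters[load_pc] = c - 1  # decay so changed behavior is tracked
```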
Clustered Speculative Multithreaded Processors
- 1999
"... In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability purposes. A main feature of the proposed microarchitecture is its capability to spawn speculative threads from a single-thread application at run-time. T ..."
Cited by 180 (10 self)
Abstract: In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability purposes. A main feature of the proposed microarchitecture is its capability to spawn speculative threads from a single-thread application at run-time. These speculative threads use otherwise idle resources of the machine. Spawning a speculative thread involves predicting its control flow as well as its dependences with other threads and the values that flow through them. In this way, threads that are not independent can be executed in parallel. Control-flow, data value and data dependence predictors particularly designed for this type of microarchitecture are presented. Results show the potential of the microarchitecture to exploit speculative parallelism in programs that are hard to parallelize at compile-time, such as the SpecInt95. For a 4-thread unit configuration, some programs such as ijpeg and li can exploit an average degree of parallelism of more than 2 threads per cycle. The average degree of parallelism for the whole SpecInt95 suite is 1.6 threads per cycle. This speculative parallelism results in significant speedups for all the SpecInt95 programs when compared with a single-thread execution.
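A conceptual sketch of the spawn-and-validate flow, assuming hypothetical helpers (value_predictor, run_thread): a speculative thread starts with predicted live-in values and its work is kept only if those predictions match what the predecessor actually produced. The clustered hardware, control-flow prediction, and dependence prediction are not modeled.

```python
# Spawn a speculative thread using predicted live-in register values
# (value_predictor and run_thread are assumed, hypothetical helpers).
def spawn_speculative_thread(spawn_pc, live_in_regs, value_predictor, run_thread):
    # Predict the values that will flow into the new thread so it can start
    # before its predecessor finishes producing them.
    predicted = {r: value_predictor.predict((spawn_pc, r)) for r in live_in_regs}
    speculative_result = run_thread(spawn_pc, predicted)
    return predicted, speculative_result

def resolve(spawn_pc, predicted, actual_live_ins, speculative_result, run_thread):
    # Commit only if every predicted live-in matched the actual value.
    if all(predicted[r] == actual_live_ins[r] for r in predicted):
        return speculative_result                 # speculation succeeded
    return run_thread(spawn_pc, actual_live_ins)  # squash and re-execute
```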
Selective value prediction
- In Proceedings of the 26th Annual International Symposium on Computer Architecture, 1999
"... Value Prediction is a relatively new technique to increase instruction-level parallelism by breaking true data dependence chains. A value prediction architecture produces values, which may be later consumed by instructions that execute speculatively using the predicted value. This paper examines sel ..."
Cited by 138 (13 self)
Abstract: Value Prediction is a relatively new technique to increase instruction-level parallelism by breaking true data dependence chains. A value prediction architecture produces values, which may be later consumed by instructions that execute speculatively using the predicted value. This paper examines selective techniques for using value prediction in the presence of predictor capacity constraints and reasonable misprediction penalties. We examine prediction and confidence mechanisms in light of these constraints, and we minimize capacity conflicts through instruction filtering. The latter technique filters which instructions put values into the value prediction table. We examine filtering techniques based on instruction type, as well as giving priority to instructions belonging to the longest data dependence path in the processor’s active instruction window. We apply filtering both to the producers of predicted values and the consumers. In addition, we examine the benefit of using different confidence levels for instructions using predicted values on the longest dependence path.
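A minimal sketch of the selective approach, assuming last-value prediction with type-based filtering and a per-entry saturating confidence counter: only admitted instruction types occupy the table, and a prediction is consumed only above a confidence threshold. The paper's criticality-based (longest-dependence-path) filtering is not shown, and the admitted types, counter width, and threshold are assumptions.

```python
# Selective value prediction sketch: filter table occupancy by instruction
# type, and gate prediction use behind a saturating confidence counter.
class SelectiveValuePredictor:
    def __init__(self, admitted_types=('load', 'alu'), use_threshold=3):
        self.table = {}              # pc -> {'value': last result, 'conf': counter}
        self.admitted_types = admitted_types
        self.use_threshold = use_threshold

    def predict(self, pc, insn_type):
        if insn_type not in self.admitted_types:
            return None              # filtered out: never occupies a table entry
        e = self.table.get(pc)
        if e is not None and e['conf'] >= self.use_threshold:
            return e['value']        # confident last-value prediction
        return None                  # low confidence: do not speculate

    def update(self, pc, insn_type, actual):
        if insn_type not in self.admitted_types:
            return
        e = self.table.setdefault(pc, {'value': actual, 'conf': 0})
        if e['value'] == actual:
            e['conf'] = min(7, e['conf'] + 1)  # 3-bit saturating confidence
        else:
            e['conf'] = 0                      # misprediction resets confidence
            e['value'] = actual
```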
Reconfigurable Caches and their Application to Media Processing
- In Proceedings of the 27th Annual International Symposium on Computer Architecture, 2000
"... High performance general-purpose processors are increasingly being used for a variety of application domains -- scientific, engineering, databases, and more recently, media processing. It is therefore important to ensure that architectural features that use a significant fraction of the on-chip tra ..."
Cited by 126 (3 self)
Abstract: High performance general-purpose processors are increasingly being used for a variety of application domains -- scientific, engineering, databases, and more recently, media processing. It is therefore important to ensure that architectural features that use a significant fraction of the on-chip transistors are applicable across these different domains. For example, current processor designs often devote the largest fraction of on-chip transistors (up to 80%) to caches. Many workloads, however, do not make effective use of large caches; e.g., media processing workloads which often have streaming data access patterns and large working sets. This paper proposes a new reconfigurable cache design. This design enables the cache SRAM arrays to be dynamically divided into multiple partitions that can be used for different processor activities. These activities can benefit applications that would otherwise not use the storage allocated to large conventional caches. Our design involves relativ...
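An illustrative sketch of the partitioning idea at way granularity: the ways of a set-associative array are reassigned among processor activities, for example conventional caching versus a buffer for streaming media data. Way-granularity repartitioning is an assumed simplification of the paper's SRAM-array partitioning.

```python
# Sketch of dynamic cache partitioning: assign the ways of a set-associative
# array to different activities (assumed simplification of the paper's design).
class ReconfigurableCache:
    def __init__(self, num_ways=4):
        self.num_ways = num_ways
        self.partitions = {'cache': list(range(num_ways))}  # all ways to caching

    def repartition(self, allocation):
        # allocation maps each use to a way count, e.g. {'cache': 2, 'stream_buffer': 2}
        assert sum(allocation.values()) == self.num_ways
        ways = iter(range(self.num_ways))
        self.partitions = {}
        for use, count in allocation.items():
            self.partitions[use] = [next(ways) for _ in range(count)]

cache = ReconfigurableCache(num_ways=4)
cache.repartition({'cache': 2, 'stream_buffer': 2})
print(cache.partitions)  # {'cache': [0, 1], 'stream_buffer': [2, 3]}
```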
Value Profiling
- In MICRO-97, 1997
"... Identifying variables as invariant or constant at compile-time allows the compiler to perform optimizations including constant folding, code specialization, and partial evaluation. Some variables, which cannot be labeled as constants, may exhibit semi-invariant behavior. A semiinvariant variable is ..."
Cited by 117 (5 self)
Abstract: Identifying variables as invariant or constant at compile-time allows the compiler to perform optimizations including constant folding, code specialization, and partial evaluation. Some variables, which cannot be labeled as constants, may exhibit semi-invariant behavior. A semi-invariant variable is one that cannot be identified as a constant at compile-time, but has a high degree of invariant behavior at run-time. If run-time information were available to identify these variables as semi-invariant, they could then benefit from invariant-based compiler optimizations. In this paper we examine the invariance found from profiling instruction values, and show that many instructions have semi-invariant values even across different inputs. We also investigate the ability to estimate the invariance for all instructions in a program from only profiling load instructions. In addition, we propose a new type of profiling called Convergent Profiling. Estimating the invariance from loads and converg...
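A minimal sketch of value profiling, assuming an unbounded per-PC table: the profiler counts the values each profiled instruction produces and reports invariance as the fraction of executions yielding the single most frequent value. Real profilers bound the table (e.g., to the top N values per instruction); that detail is omitted here.

```python
# Value profiling sketch: measure per-instruction invariance of result values.
from collections import Counter

class ValueProfiler:
    def __init__(self):
        self.counts = {}   # pc -> Counter of observed result values
        self.total = {}    # pc -> number of profiled executions

    def observe(self, pc, value):
        self.counts.setdefault(pc, Counter())[value] += 1
        self.total[pc] = self.total.get(pc, 0) + 1

    def invariance(self, pc):
        # 1.0 means the instruction always produced the same value; values near
        # 1.0 across inputs mark semi-invariant candidates for specialization.
        if self.total.get(pc, 0) == 0:
            return 0.0
        (_, top_count), = self.counts[pc].most_common(1)
        return top_count / self.total[pc]
```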
Focusing processor policies via critical-path prediction
- In Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001
"... ..."
(Show Context)
HLS: Combining Statistical and Symbolic Simulation to Guide Microprocessor Designs
- 2000
"... As micro processors continue to evolve, many optimizations reach a point of diminishing returns. We introduce HLS, a hybrid processor simulator which uses statistical models and symbolic execution to evaluate design alternatives. This simulation methodology allows for quick and accurate contour maps ..."
Cited by 101 (0 self)
Abstract: As microprocessors continue to evolve, many optimizations reach a point of diminishing returns. We introduce HLS, a hybrid processor simulator which uses statistical models and symbolic execution to evaluate design alternatives. This simulation methodology allows quick and accurate contour maps of the performance space spanned by design parameters to be generated. We validate the accuracy of HLS through correlation with existing cycle-by-cycle simulation techniques and current generation hardware. We demonstrate the power of HLS by exploring design spaces defined by two parameters: code properties and value prediction. These examples motivate how HLS can be used to set design goals and individual component performance targets.
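A toy sketch in the spirit of statistical simulation, under assumed statistics and latencies: program behavior is reduced to an instruction mix and a cache hit rate, a synthetic instruction stream is drawn from those distributions, and a simple sequential model estimates IPC. Sweeping a parameter then maps a slice of the design space; HLS's symbolic execution and validated models are far richer than this.

```python
# Statistical-simulation toy: draw a synthetic instruction stream from measured
# distributions and estimate IPC with a simple sequential timing model.
# All statistics and latencies below are illustrative assumptions.
import random

def estimate_ipc(n=100_000, load_frac=0.3, hit_rate=0.9,
                 hit_latency=1, miss_latency=20, alu_latency=1, seed=0):
    rng = random.Random(seed)
    cycles = 0
    for _ in range(n):
        if rng.random() < load_frac:   # synthetic load instruction
            cycles += hit_latency if rng.random() < hit_rate else miss_latency
        else:                          # synthetic ALU instruction
            cycles += alu_latency
    return n / cycles

# Sweeping a parameter produces a contour-style view of one design axis:
for hr in (0.80, 0.90, 0.95):
    print(f"hit rate {hr:.2f}: IPC ~ {estimate_ipc(hit_rate=hr):.2f}")
```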