| A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, October 1994. |
....One approach is to increase the amount of cache available on chip. For some applications, the on-chip L1 caches will suffice [Lee 98]. For other applications such as commercial or database programs, the working data set is too large to fit in the L2 cache and will still need to access main memory [Maynard 94, Perl 96, Barroso 98]. Note that the L1 cache size is constrained by the desire to have a small access latency and cannot be made arbitrarily large. Similar in spirit to increasing the amount of cache available on chip are designs that integrate microarchitectures and main memory. The idea is to ....
....hierarchies cost effective. Another current trend is that caches are increasing their geometry in size and set associativity. While some smaller applications can be accommodated with relatively small L1 caches [Romer 96, Lee 98], many applications still exist that do not fit in larger L2 caches [Maynard 94, Perl 96], making research on how to efficiently use caches relevant. As caches become larger and more set associative, strategies that emphasize the reuse of lines already present in the cache might be more important than those that specifically target less associative caches. The increase in ....
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting Characteristics and Cache Performance of Technical and Multi-User Commercial Workloads. In 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145--156, October 1994.
....(while second-level instruction cache misses are not important). These results therefore suggest that database developers should pay more attention to the data placement (layout) in the second-level cache, and also focus on optimising the critical path for the instruction cache. An earlier study [13] uses traces and simulations from an IBM Power architecture to contrast the differences between technical and commercial workloads. Six commercial applications (including TPC benchmarks, file servers, etc.) are compared to eight technical and scientific applications (which included computational ....
A.M. Grizza Maynard, C.M. Donnelly and B.R. Olszewski, Contrasting characteristics and cache performance of technical and multi-user commercial workloads, in Proceedings of the sixth international conference on Architectural support for programming languages and operating systems, San Jose, California, 1994.
....Consequently, each resumed thread often suffers additional delays while repopulating the processor caches with its evicted working set. As the processor-memory speed gap [12] continues to increase, cache misses are becoming an increasingly important factor of a server's performance. Research [15] has shown that this gap affects commercial database server performance more significantly than it affects other engineering, scientific, or desktop applications. The reason is that database applications access the memory subsystem far more often than desktop or engineering workloads. Moreover, ....
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, October 1994.
....batches, as well as a discussion of the architectural ramifications of these new types of workloads. 1. Introduction For many years, researchers have understood the importance of studying workload characteristics in order to evaluate their impact on current and future systems architecture [6, 18, 20, 27, 31]. Most of these previous application studies have focused on the detailed behavior of single applications, whether sequential or parallel. For example, the caching behavior of the SPEC workloads has long been a topic of intense scrutiny [7, 12] and the communication characteristics of parallel ....
A. M. G. Maynard, C. Donnelly, and B. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 1994.
....Additional front back ends or mediators [Wie92] add to the communication and CPU overhead. Several database researchers have indicated the need for a departure from traditional DBMS designs [Be 98, CW00, SZ 96] due to changes in the way people store and access information online. Research [MDO94] has shown that the ever-increasing processor-memory speed gap [HP96] affects commercial database server performance more than other engineering, scientific, or desktop applications. Database workloads exhibit large instruction footprints and tight data dependencies that reduce instruction level ....
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. "Contrasting Characteristics and Cache Performance of Technical and Multi-user Commercial Workloads." In Proc. ASPLOS-6, 1994.
....prefetchers. 3.4.1. Instruction Prefetchers Instruction miss rates are traditionally thought of as less important to processor performance than data miss rates. Recent research by Maynard et al., however, shows that they may be just as important or maybe even more important than data references [19]. While memory references of instructions can be treated just as data references from a prefetching perspective, they exhibit a more predictable access pattern that can be exploited. Fortunately, many L1 caches are separated between data and instructions, making it easier to incorporate different ....
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski, "Contrasting characteristics and cache performance of technical and multi-user commercial workloads," Proceedings of the 6th Annual International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 145-156, October 1994.
....code. In 1988, Agarwal, Hennessy, and Horowitz [1] modified the microcode of the VAX 8200 to trace both user and system references and to study alternative cache organizations. Later studies were trace-based. Some researchers relied on intrusive instrumentation of the OS and user-level workloads [16, 48] to obtain traces; while such instrumentation can capture all memory references, it perturbs workload execution [16]. Other studies employed bus monitors [26], which have the drawback of capturing only memory activity reaching the bus. To overcome this, some have used a combination of ....
....monitors [78, 88, 79, 14]. As an example of more recent studies, Torrellas, Gupta, and Hennessy [78] measured L2 cache misses on an SMP of MIPS R3000 processors; they report sharing and invalidation misses and distinguish between user and kernel conflict misses. Maynard, Donnelly, and Olszewski [48] looked at a trace-driven simulation of an IBM RISC System/6000 to investigate the performance of different cache configurations over a variety of commercial and scientific workloads. Their investigation focused on overall memory system performance and distinguishes between user and kernel ....
MAYNARD, A., DONNELLY, C., AND OLSZEWSKI, B. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (October 1994).
....processing, business decision support, and e-business applications are driving the development of powerful server systems. From an execution point of view, commercial workloads are different from technical workloads and present more vigorous demands on the memory and storage subsystems [1, 2, 3]. In fact, studies [4] that analyzed transaction processing workloads indicate that systems spend a significant fraction of the execution time waiting for I/O devices to access the data. Once the data is brought to main memory, the processor uses a substantial amount of the remaining execution ....
....6 looks at other ideas proposed in the literature. Finally, Section 7 concludes with a highlight of the most significant contributions. 2 Background and Motivation The memory hierarchy present in today's computer systems is designed to benefit from the use of locality. But it has been shown [1, 3, 5, 6, 25] that many commercial workloads incur high miss rates. Given the large sizes of the working sets used by this type of workload and the current size of the caches, temporal locality is not exploited as successfully as it is in other workloads like SPECint [26]. When it comes to spatial locality, ....
A.M. G. Maynard, C. M. Donnelly, and B. R. Olszewski, "Contrasting characteristics and cache performance of technical and multi-user commercial workloads," in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 145-156, Oct. 4-7 1994.
....benchmarks from SPECint2000, a suite of more traditional workloads. We run these benchmarks on two IBM PowerPC microarchitectures, the RS64 III and the POWER3 II. 2. Related Work Commercial workloads have been increasing in importance, and efforts have been made to understand their behavior [2,11,8,7,16,1]. Most of these studies have been focused on applications written in C or C++, in particular OLTP, DSS, and web server applications. Java has also been a popular subject of research. The majority of Java studies use SPECjvm98 [17,9,15], which is a client benchmark suite. SPECjvm98 has been ....
A.M.G. Maynard, C.M. Donnelly and B.R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, October 1994, pp. 145-156.
....on two IBM PowerPC microarchitectures, the RS64 III and the POWER3 II, to determine the performance characteristics of multithreaded Java server applications. 2. Related Work Commercial workloads have been increasing in importance, and efforts have been made to understand their behavior [2,11,8,7,16,1]. Most of these studies have been focused on applications written in C or C++, in particular OLTP, DSS, and web server applications. Java has also been a popular subject of research. The majority of Java studies use SPECjvm98 [17,9,15], which is a client benchmark suite. SPECjvm98 has been ....
A.M.G. Maynard, C.M. Donnelly and B.R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, October 1994.
....servers. While applications such as decision support (DSS) and Web index search have been shown to be relatively insensitive to memory system performance [2], a number of recent studies have underscored the radically different behavior of online transaction processing (OLTP) workloads [2, 6, 7, 15, 17, 19, 25]. In general, OLTP workloads lead to inefficient executions with a large memory stall component and present a more challenging set of requirements for processor and memory system design. This behavior arises from large instruction and data footprints and high communication miss rates that are ....
....into the interaction between the code layout optimizations and the application behavior without involving potentially complex interactions between the application and operating system footprints. Previous studies have shown that OLTP applications exhibit significant operating system activity [17, 2]. The interactions between the application and operating system instruction streams are analyzed in this section. Figure 12(a) shows the number of instruction cache misses for the combined instruction streams of the unoptimized application and the operating system. The two dotted curves show the ....
[Article contains additional citation context not shown here]
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145--156, Oct 1994.
....of [14] we estimate the miss rate to the L2 cache for commercial applications (TPC-A and disk-to-disk sort) at around 5% and for technical/scientific applications at less than 0.2%. The L2 cache miss rates of Java workloads are very close to the miss rates of commercial applications reported in [28] (2.7% for TPC-A and 1.2% for TPC-C) and [4] (2.7% for TPC-B). 4.4 The effectiveness of the L2 data cache Fig. 14 shows how effective the L2 data cache is for reducing the number of times a request has to go all the way to memory. This graph plots the percentage of L1 data cache misses that miss ....
A. Maynard, C. Donnelly, and B. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proc. of ASPLOS VI, Oct. 1994.
....by the CPL. The only benchmarks with a significant percentage of operating system instructions are the desktop applications, database applications, and the ############ multiple program trace. It has been noted previously that database programs contain a large percentage of operating system code [12, 13]. We observe that desktop applications also share this characteristic. The final two columns represent the average number of dynamic instructions that are executed by the application and operating system before switching privilege levels. Operating system code is characterized by short sequences ....
A. M. Maynard, C. Donnelly, and B. Olszewski, "Contrasting characteristics and cache performance of technical and multi-user commercial workloads," in Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 145--155, Oct 1994.
....of this server include symmetric multiprocessor systems (SMP) as well as cluster servers. When we look at the execution behavior of commercial workloads, we observe commercial workloads are different than technical workloads and present more vigorous demands to the memory and storage subsystems [3, 4, 5]. In fact, studies that analyzed transaction processing workloads indicate that systems spend around 90% of the time waiting for the I/O devices to access the data [6]. Once the data are brought to memory, the processor uses between 25% and 45% of the execution time handling memory accesses [7] ....
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski, "Contrasting characteristics and cache performance of technical and multi-user commercial workloads," in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 145-156, Oct. 4-7 1994.
....These techniques reduce data cache misses, and are orthogonal to the goal of CGP, which tries to reduce I-cache misses. CGP may be implemented on top of these cache-conscious algorithms. It is only recently that researchers have examined the performance impact of architectural features on DBMSs [1, 12, 25, 10, 19, 9, 11, 14]. Their results show that database applications have large instruction and data footprints and exhibit more unpredictable branch behavior than benchmarks that are commonly used in architectural studies (e.g., SPEC). Database applications have fewer loops and suffer from frequent context switches, ....
A. Maynard, C. Donnelly, and B.R. Olszewski. Contrasting Characteristics and Cache Performance of Technical and Multi-user Commercial Workloads. In Proceedings 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145--156, October 1994.
No context found.
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, October 1994.
No context found.
A. Maynard, C. Donnelly, B. Olszewski, "Contrasting Characteristics and Cache Performance of Technical and Multi-User Commercial Workloads", Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 145-156, 1994.
No context found.
A. Maynard, et al. "Contrasting characteristics and cache performance of technical and multi-user commercial workloads. " In Proc. of the 6th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 145--156, October 1994.
No context found.
A. Maynard, C. Donnelly, and B. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proc. of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145--56, October 1994.
No context found.
A. M. Maynard, C. M. Donnelly and B. R. Olszewski, "Contrasting Characteristics and Cache Performance of Technical and Multi-User Commercial Workloads", in Proc. ASPLOS'94, Oct. 1994, pp. 145-155.
No context found.
A. M. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145-156, San Jose, CA USA, Oct. 1994.
No context found.
A. Maynard, C. Donnelly and B. Olszewski, "Contrasting Characteristics and Cache Performance of Technical and Multi-user Commercial Workloads", Proc. of the 6th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pg 145-156, Oct. 1994.
No context found.
A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 145--156, October 1994.
No context found.
A.M.G. Maynard, C.M. Donnelly, and B.R. Olszewski, "Contrasting Characteristics and Cache Performance of Technical and Multi-user Commercial Workloads," Proc. 6th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 94), ACM Press, 1994, pp. 145-156.
No context found.
Maynard, A. M., Donnelly, C. and Olszewski, B. Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, ACM, 145-156, 1994.
CiteSeer.IST - Copyright Penn State and NEC