J. C. Mogul and A. Borg, "The Effect of Context Switches on Cache Performance," Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 75--84, April 1991.
....invalidations introduce an additional miss rate component. In this paper, we focus on the multiprogramming component of the miss rate. Understanding this component is important because significant evidence exists showing that it is the chief determinant of the miss rate in large caches [6, 7]. While the model for the multiprogramming component of misses developed in [4] can be used to yield fast estimates of cache performance, it is not simple, and hence does not offer much .... [Notation table: S = number of cache sets (or rows); u = working set size (in blocks) of each process; p = number of processes or the ....]
....predicted cache miss rates with those obtained from trace driven simulation for various cache sizes and with different degrees of multiprogramming. Trace driven simulation [15] is a popular technique for cache evaluation, especially since address traces for various workloads are publicly available [16, 7]. This section first describes our simulation methodology and then analyzes the results. 6.1 Simulation Methodology Our experiments with the model will use both real traces of multiprogrammed workloads and replicated traces. Replicated traces are like multiprogrammed traces, but they are ....
Jeffrey C. Mogul and Anita Borg. The Effect of Context Switches on Cache Performance. In ASPLOS-IV Proceedings, pages 75--84, April 1991.
....threshold between hardware based and software based protection depends on the amount of work done in the service, but the evolution of technology will favor hardware based protection schemes. The first two assertions are not surprising. The third is different from what earlier work [Chen 93][Mogul 91][Bersh 92] implies, and is probably a result of changes in hardware architectures. The fourth is counter intuitive since it seems to say that sharing is better than copying for small data objects, but not for larger ones. The final one is most significant and, perhaps, most controversial. ....
....although CPU performance has improved significantly over the last few years, operating system performance, especially context switch times, have clearly not kept pace. His work did not analyze the reasons for this behavior. The impact of context switches on the cache performance is analyzed in [Mogul 91]. This work demonstrated that the cost of cache refills can dominate the overall cost of a context switch. Chen and Bershad [Chen 93] analyzed the effect of the system software decomposition on the memory subsystem performance. The work argues that the separation of operating system functionality ....
J. Mogul, A. Borg, The Effect of Context Switches on Cache Performance, 4th Int'l Conf. Architectural Support for Programming Languages and Operating Systems, ACM, pp. 75-84, 1991.
....to set up correct and representative scenarios to be measured. If for instance the worst case execution time (WCET) is to be measured, one must set up an execution path that leads to the WCET. Execution time and other performance issues can either be statically analyzed [1, 2, 3, 4], simulated [5, 6], or measured directly on the target system [4, 7]. The advantage of static methods is that they are safe if the system model and analysis method are correct and compatible with each other. The hard part is to add complex structures into the model like pipelining, cache memories, DMA and other ....
....was measured to 86.9 µs, which means that the total preemption delay is 282.4 µs. In relative terms the major part of the context switch cost, or 195.5/282.4 = 69%, is cache-related. It is quite interesting that the CRPD is almost the same compared to Mogul and Borg's measurements a decade ago [5], which were 10-400 µs. During this time the processors have become orders of magnitude faster, and this means that the CRPD has grown in relative terms. The method to get the CRPD is practicable to get a safe value that is directly usable in a scheduling algorithm. Even if the value is ....
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, Santa Clara, CA, USA, April 1991.
....however, focuses on whether the modeling of multiprogramming affects predictor behavior. Since we find that it typically does not, it is unnecessary (and prohibitively expensive) to obtain IPC measurements. Some similar work has explored caching in the face of context switching. Mogul and Borg [16] examined how cache hit rate varies after a context switch and noted that the cost of context switches, in terms of how it affects cache performance, can guide cache design. Hwu and Conte [10] studied the worst case susceptibility of programs to context switching. In their study, they develop ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. Tech. Note TN-16, DEC WRL, Dec. 1990.
....Tunix interleaves the traces generated by multiple tasks into a global trace buffer that is periodically emptied by a trace processing program. These researchers also experimented with instrumenting the Tunix kernel itself, although they do not report any results obtained from these traces [Mogul91]. Chen continued this work by porting a version of epoxie to a MIPS based DECstation running Ultrix and Mach 3.0 to produce traces from single task workloads including the user level X and BSD servers, and the kernel itself [Chen93a; Chen93c]. As a rule, static code instrumentation cannot handle ....
Mogul, J. C. and Borg, A. The effect of context switches on cache performance. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, Santa Clara, California, ACM, 75-84, 1991.
....among concurrent processes. of cache sizes that are large enough to have a considerable number of conflicts but not large enough to hold all the working sets. However, these models work only for long enough time quanta, and require information that is hard to collect on line. Mogul and Borg [12] studied the effect of context switches through trace driven simulations. Using a timesharing system simulator, their research shows that system calls, page faults, and a scheduler are the main sources of context switches. They also evaluated the effect of context switches on cycles per ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. In the fourth international conference on Architectural support for programming languages and operating systems, 1991.
....how overall system performance is affected. Our work, however, focuses on whether the modeling of multiprogramming affects predictor behavior. Since we find that it typically does not, it is unnecessary (and prohibitively expensive) to obtain IPC measurements. Similar work by Mogul and Borg [16] has explored caching in the face of context switching. They examined how cache hit rate varies after a context switch and noted that the cost of context switches, in terms of how it affects cache performance, can guide cache design. Hwu and Conte [10] studied the worst case susceptibility of ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. Tech. Note TN-16, DEC WRL, Dec. 1990.
....accurate results, but simulation time is often long. Hardware monitoring can dramatically speed up the process [26]; however, it is limited to the particular cache configuration. As a result, both simulations and hardware monitoring can only be used to evaluate the effect of context switches [14, 10]. Moreover, simulations and monitoring rarely provide intuitive understanding, making it difficult to improve cache performance. To provide both performance prediction and insight into improving performance, analytical cache models are required. We use our model to determine the best cache ....
....are noticeable for a mid range of cache sizes that are large enough to have a considerable number of conflicts but not large enough to hold all the working sets. However, these models work only for long enough time quanta, and require information that is hard to collect on line. Mogul and Borg [14] studied the effect of context switches through trace driven simulations. Using a timesharing system simulator, their research shows that system calls, page faults, and a scheduler are the main sources of context switches. They also evaluated the effect of context switches on cycles per ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. In the fourth international conference on Architectural support for programming languages and operating systems, 1991.
.... this problem is to add code specially designed to execute the loop a constant few iterations (Hwu calls this type of structure a superblock in [75]). There is also a secondary cost of loop unrolling in some architectures caused by the additional cache misses due to the increased code size [115][116][40][171]. The efficiency of loop unrolling quickly drops in relation to the size of original loop inefficiency and the unroll count. It is easy to see why this is the case. Each additional time the loop is unrolled, the idle portion of one iteration is removed. The idleness reduces at the rate ....
.... [Figure residue: maximum insns versus Unroll Size, one curve per cache size from 128w to 64k] 4.4 Context Switch Effects. In [116] Mogul and Borg find a performance degradation due to context switching of 1 to 7% depending on the program mix and cache design. This study, and another by Steenkiste [171], show that the additional effect of having a larger code size when context switching might be 10% of cost of context ....
J. C. Mogul, A. Borg, The Effect of Context Switches on Cache Performance, Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 1991, vol. 19, pp. 75-84.
....CPU resources. Industrial developers are today in great need of real values to implement new software based products on high-performance processors. To the best of our knowledge the only work that has been presented to measure the CRPD is Mogul and Borg's trace driven simulation of a UNIX system [9]. Mogul and Borg measured the delay (#)to### # ##### of a task. The traces were however not taken from a real time system and all the time slices were of equal size. The cache memories are today larger and more complex than those on which the simulations were performed. Performance estimation on cache ....
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, Santa Clara, CA, USA, April 1991.
....Thus past studies have shown significant dependence of overall performance on cache misses and we observe that cache misses could be 15 to 35% higher if operating system activity was considered. Others have studied the effect of context switches on cache performance in non-x86 environments [16, 17]. Rosenblum et al. studied the execution of some applications under the effects of a full system simulation [8]. SimOS has intensified research using full system traces; however, it has not been particularly helpful for the x86 domain. Evers et al. analyzed the effect that context switches have ....
J. C. Mogul and A. Borg, "The effect of context switches on cache performance," Tech. Rep. TN-16, Digital Western Research Lab, Palo Alto, CA, USA, Dec 1990.
....0.040 [Table 10.3: Reduction in application throughput due to clock interrupts as a function of frequency and cache state] .... of re-establishing the working set can be almost 3 ms, an enormous penalty when compared to the in-kernel context switch cost on the order of 10 µs. In 1991 Mogul and Borg [62] showed that the cache performance cost of a context switch could dominate overall context switch performance. Furthermore, they speculated that in the future the increasing cost of memory accesses in terms of CPU cycle times would make the impact of the cache performance part of context switch ....
Jeff Mogul and Anita Borg. The Effect of Context Switches on Cache Performance. In Proc. of the 4th International Conf. on Architectural Support for Programming Languages and Operating Systems, pages 75--84, Santa Clara, CA, April 1991.
....He studies the effects on the cache of the interaction between the kernel process and user processes in a VAX system. His results show that about 51% of the overall cache misses of user processes are caused by interference induced by the kernel process. Similar research was done by Mogul and Borg [12]. They quantify the additional cache misses caused by context switching in a UNIX environment. Their results show up to 8% difference in Clocks Per Instruction (CPI) between the system with a private cache for each process and the system where a single cache is shared by all the processes. ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, 1991.
....cache misses. This is because preemption due to unrelated, even lower priority, activities can occur frequently and at arbitrary instants. Not only does this result in loss of CPU capacity to unnecessary context switches, it also increases the likelihood of disturbing the footprint in the cache [30], unless the cache is suitably partitioned [31]. This is particularly true for preemption caused by external events such as network interrupts. One can account for the cache miss penalty due to preemption via careful schedulability analysis [32], but frequent preemption still degrades available ....
J. Mogul and A. Borg, "The effect of context switches on cache performance," in Proc. Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 75--84, April 1991.
....element (PE) it is executing upon. Depending on the underlying OS and hardware platform, performing a context switch may involve dozens to hundreds of instructions due to the flushing of register windows, instruction and data caches, instruction pipelines, and translation look aside buffers [26]. Synchronization mechanisms are necessary to serialize access to shared objects (such as messages, message queues, protocol context records, and demultiplexing tables) related to protocol processing. Certain methods of parallelizing protocol stacks incur significant synchronization overhead from ....
J. C. Mogul and A. Borg, "The Effects of Context Switches on Cache Performance," in Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), (Santa Clara, CA), ACM, Apr. 1991.
....fine grained parallel programs. These latencies are due to both communication and synchronization among parallel computations. Counteracting the benefits of multithreading is the cost of context switching, which includes both direct overhead and the indirect cost of impaired cache performance [MB91]. Two different approaches to lowering the costs of frequent context switches are bringing the programming model closer to the architecture (as is done by Active Messages [vECGS92]) and bringing the architecture closer to the programming model's needs (as is done by the MIT J Machine [DFK ....
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Santa Clara, California, April 1991.
....which is organized as a 4 way set associative with line size of 32 words. The time slice for a trace to interleave is 50,000 clock cycles and the probability that the task will migrate at the end of its time slice is 5%. Since only a few of the previous papers regarding cache coherent protocols [8] take into consideration the effect of process migration, we start our comparison by looking at how the performance of the system changes as a function of the number of processors in a system where no migration is assumed. Later on we will present the effect of the task migration on these ....
J. C. Mogul and A. Borg. "The Effect of Context Switches on Cache Performance". In the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 75-84, April 1991.
....usage appropriately for each application. In [Ander91] the technique is called scheduler activations and in [Tucker93] it is called process control. The significant negative performance impact of thread imbalance on these commodity processor based parallel processing systems was identified in [Ander91,Tucker93,MoBo90]. In [Tucker93] the negative performance impact was broken down into its component causes and carefully measured using the SPLASH [SPLASH] benchmark applications. These issues are explored further in Chapter 2. Recent work in [YueLilja95,YueLilja95a,LiuLilja96] is most closely related to the ....
....overall efficiency is improved if the program with poor speedup operates with fewer threads. As the speed of CPUs has increased, along with the reliance on data resident in cache, the problem of a context switch corrupting cache has become an increasing performance impact. In [MoBo90], when a compute bound process was context switched on a cache based system, the performance of the application was significantly impacted for the next 100,000 cycles after the process regained the CPU. The context switch still had a small negative impact on performance up to 400,000 cycles after ....
J. C. Mogul and A. Borg, The Effect of Context Switches on Cache Performance, DEC Western Research Laboratory TN-16, Dec., 1990. http://www.research.digital.com/wrl/techreports/abstracts/TN-16.html
.... is a function of the design of the network adapter and the packet input mechanism adopted [151, 173]. Further, since the communication subsystem typically shares processing resources with application threads, additional scheduling and context switching overheads, and associated cache misses [124], may be incurred. Additional sources of overhead include buffer management, timer management, and error recovery mechanisms such as retransmissions. As highlighted in the remaining chapters of this dissertation, support for QoS sensitive handling of data imposes new overheads and demands on ....
....cache misses. This is because preemption due to unrelated, even lower priority, activities can occur frequently and at arbitrary instants. Not only does this result in loss of CPU capacity to unnecessary context switches, it also increases the likelihood of disturbing the footprint in the cache [124], unless the cache is suitably partitioned [110]. This is particularly true for preemption caused by external events such as network interrupts. One can account for the cache miss penalty due to preemption via careful schedulability analysis [105], but frequent preemption still degrades available ....
J. Mogul and A. Borg, "The effect of context switches on cache performance," in Proc. Int'l Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 75--84, April 1991.
....no means to control the quantity of resource handed out. In order to provide fine grained timeliness guarantees to applications which are latency sensitive, higher rates of context switching are unavoidable. The effects of context switches on cache and memory system performance are analysed in [Mogul91]. It is shown that a high rate of context switching leads to excessive numbers of cache and TLB misses reducing the performance of the entire system. The use of a single address space in Nemesis removes the need to flush a virtually addressed cache on a context switch, and the process ID fields ....
Jeffrey C. Mogul and Anita Borg. The Effect of Context Switches on Cache Performance. In Proceedings of the 18th International Symposium on Computer Architecture, 1991. (p 35)
....the response will be handled by the leader, thereby minimizing context switching and synchronization overhead. One drawback with the FIFO promotion protocol, however, is that the thread that is promoted next is the thread that has been waiting the longest, thereby minimizing CPU cache affinity [5, 17]. Thus, it is likely that state information, such as translation lookaside buffers, register windows, instructions, and data, residing within the CPU cache for this thread will have been flushed. • Specific order: This ordering is common when implementing a bound thread pool, where it is ....
J. C. Mogul and A. Borg, "The Effects of Context Switches on Cache Performance," in Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), (Santa Clara, CA), ACM, Apr. 1991.
....increments a counter. The instrumentation is usually done on the executable file, although it can also be done on the assembly file. While there are some differences among the different tools, this broad category includes, for example, ATOM [16] and EEL QPT [12] as well as many other older systems [7, 9, 10, 13]. The new tools have been highly tuned and run very efficiently. However, they are designed to instrument a single application. Some of them, like ATOM, have been applied to a uniprocessor operating system. However, these are hardly the right tools to capture complete traces of a ....
J. Mogul and A. Borg. The Effect of Context Switches on Cache Performance. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, April 1991.
....3) Unnecessary context switch overhead, and 4) Corruption of caches due to context switches. In addition, they provide an excellent survey of related work. As the speed of CPUs has increased, the problem of a context switch corrupting cache has become an increasing performance impact. In [3], when a compute bound process was context switched on a cache based system, the performance of the application was significantly impacted for the next 100,000 cycles after the process regained the CPU. The context switch still had a small negative impact on performance up to 400,000 cycles after ....
....It is important to note that this phenomenon is not unique to the Exemplar architecture. This performance effect is due to the nature of cache based parallel processors. It is well known that cache misses can dramatically affect the performance of today's high performance RISC processors [3]. In a multi threaded, parallel processing system experiencing thread imbalance, the probability of cache misses is very high. Other RISC, cache based parallel processing systems exhibit the same behavior under thread imbalance [5]. Thread imbalance induced cache thrashing is a significant ....
J. C. Mogul and A. Borg, The Effect of Context Switches on Cache Performance, DEC Western Research Laboratory TN-16, Dec., 1990. http://www.research.digital.com/wrl/techreports/abstracts/TN-16.html
....problems is by causing important data to be flushed as a side effect of context switching. This effect becomes more noticeable the smaller the memory, and is therefore especially relevant for caches. A number of studies have demonstrated the adverse effect of context switching on cache performance [408, 249, 598]. Affinity scheduling, where threads are scheduled back onto previously used PEs so as to benefit from data that may still reside in their caches, tries to counter this effect [598, 157, 38, 544, 576, 106]. It has even been proposed that affinity hints be included in the programming language ....
J. C. Mogul and A. Borg, "The effect of context switches on cache performance". In 4th Intl. Conf. Architect. Support for Prog. Lang. & Operating Syst., pp. 75--84, Apr 1991.
....reduce or eliminate false sharing. However, this is successful only if the program structure is statically deducible. Context Switches: Studies by Mogul and Borg have shown that context switching is responsible for a significant number (40-80%) of cache misses immediately following the switch [MB91]. Since context switching is inevitable in multitasking operating systems, the only possible solution is to use larger or more set associative caches that can simultaneously accommodate more than one working set. Thus, remote memory accesses are unavoidable, and with long latencies, even a few ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. Fourth Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, April 1991.
....from one PE to another, leaving data behind, compromise the locality of reference [294, 232, 233]. The other is by causing important data to be flushed as a side effect of context switching. This effect becomes more noticeable the smaller the memory, and is therefore especially relevant for caches [249, 152, 348]. Affinity scheduling, where threads are scheduled back onto previously used PEs so as to benefit from data that may still reside in their caches, tries to counter this effect [348, 86, 25, 328, 339, 57]. Effect on communication locality: Communication is an important aspect of parallel ....
J. C. Mogul and A. Borg, "The effect of context switches on cache performance". In 4th Intl. Conf. Architect. Support for Prog. Lang. & Operating Syst., pp. 75--84, Apr 1991.
....the same cache. Code and data from the kernel can interfere with those of user processes and vice versa, and user processes can interfere with each other. Previous work suggests that the cost of cache interference between processes can dwarf all other costs associated with context switches [1] [9] [11]. Cache misses are sometimes classified by the three C's model, which decomposes cache behavior into compulsory, capacity, and conflict misses. In a multiprocessor, there is also a fourth C, coherency misses. One way to view the interference resulting from contention amongst multiple ....
....them to record both user and kernel references, the maximum size of their traces was limited to approximately a million references per benchmark. Furthermore, the VAX architecture examined in the study, while popular at the time, does not reflect contemporary system designs. Mogul and Borg [9] measured the effect of context switches on cache performance. They computed the cache cost of context switching on a DECstation 5000 to be in the range of 10 to 400 µs. They compared this to a measured OS context switch time of 70 µs [10] and an estimated minimum cost of 7.4 µs [2], which shows that ....
MOGUL, J. C., AND BORG, A. The effect of context switches on cache performance. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Santa Clara, CA, Apr. 1991), pp. 75--84.
....efficiently. When an operating system deschedules one process, and starts another running, the assumption of locality, on which good cache performance depends, may be violated because the instructions and data of the newly scheduled process may no longer be in the cache or caches. Mogul and Borg [45] used address traces from a variety of real workloads to study the impact of having to periodically reload the working set of a process into the processor cache, owing to the process being descheduled and then later re-scheduled. They found cache reload overheads of up to 8% of the execution ....
J. C. Mogul, A. Borg, "The Effect of Context Switches on Cache Performance", Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991, pp. 75-84.
....are constantly experiencing cold start. As illustrated by Figure 3, miss ratios for the SPEC benchmarks are considerably below those for any workloads with significant OS activity. Similar differences in cache performance between compute bound and multiprogrammed environments are reported in [Mogu91]. The SPEC floating point benchmark miss ratios are quite close to the DTMRs, the data from [Agar88], and the VAX 11/780 measurements, and for large cache sizes are also very close to the Amdahl 470 user program miss ratios. The SPEC integer benchmark miss ratios are lowest. 4. Conclusions The ....
J. C. Mogul, and Anita Borg, "The Effects of Context Switches on Cache Performance," Proc. ASPLOS-IV, April, 1991, Santa Clara, CA, pp. 75-84.
....providing sufficient information for the system to manage the data transfer by itself without requiring user process execution. Thus, context switch operations for data transfer are eliminated. Context switches consume CPU resources, degrade cache performance by reducing locality of reference [MB91], and affect the performance of virtual memory by requiring TLB invalidations [BALL90]. Additional motivations for addressing the issue of context switching performance are mentioned in [PCMI91], [CJRS89], and [CHKM88]. [Footnote 1] A similar technique is used in the 4.3BSD Reno NFS Implementation ....
Jeffrey Mogul and Anita Borg. The effect of context switches on cache performance. Proc. ASPLOS-IV, pages 75--84, April 1991.
....as opposed to just the average miss ratio. In other words cache misses tend to occur in bursts between which are relatively long periods virtually free of misses. Figure 4: Cache behaviour after a context switch. To determine the extent of the impact of multiprogramming on cache performance, Mogul and Borg [1991] looked at how the cache hit rate varies after a context switch. Address traces from a multi tasking operating system were generated and fed on the fly to a cache simulator. By marking the output of the simulation whenever a context switch occurred and then aggregating miss rate time reload ....
....time) are optimally used. However, context switching impacts negatively on caches and this effect needs to be understood, particularly as an increased rate of context switching is advocated in this research. The reason for bad cache performance in the presence of context switches in the study by Mogul and Borg [1991] is due to conflict misses between processes and capacity misses because the cache is too small to hold the working sets of all the processes. From the study by Thiebaut and Stone [1987], increasing the cache size (to accommodate more process footprints) should help to solve this problem. In ....
J Mogul and A Borg. The Effect of Context Switches on Cache Performance, Proc. 4th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, Santa Clara, CA, 1991, pp 75-84.
....however a rare occurrence since SML programs tend to do few assignments (see Section 4.3) and most writes are to sequential locations. 2. Ignoring the effects of context switches and system calls. Context switches (especially those caused by system calls) can affect cache performance significantly [36]. We ignore this because it is an operating system issue that affects all programs, not just programs that are allocation intensive. 3. Pessimistic simulation of partial word writes. Most memory subsystems use a word as the smallest addressable unit and also maintain error checking information on ....
....and 64 entry TLB corresponds closely to the DECstation 5000/200 with the following important differences: • The simulations ignored the effects of context switches and system calls. Thus, actual program runs suffered more data and instruction cache misses than those reported by the simulations [36]. • The simulations assumed a virtual address = physical address mapping. Kessler and Hill [29] show that random mapping (as used in the actual runs) can have many more conflict misses than a careful mapping (such as that assumed by the simulations). Thus, the actual runs probably suffered more ....
Mogul, J. C., and Borg, A. The effect of context switches on cache performance. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Santa Clara, California, Apr. 1991), pp. 75--84.
....wall clock delays that may be caused by time sharing, the context switches themselves take a small amount of time and may result in partial drainage of the cache and translation buffers (footnote 1), giving a higher miss rate than expected. Context switching in relation to cache performance is studied in [24]. (Footnote 1: Context switching, or merely the handling of interrupts, also affects branch prediction because the instruction stream is broken.) 4. Garbage collection produces delays in the execution. Again the cache is affected because this activity involves transfers to and ....
J. C. MOGUL AND A. BORG, The effect of context switches on cache performance, in Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, Santa Clara (1991), 75--84.
....process scheduler, and the effects of time dilation on scheduler policy are minimized by focusing on single process and client server workloads where context switches are driven by the applications and not by the scheduler policy. Similar techniques were employed by Agarwal [2] and Mogul and Borg [20]. Mazieres and Smith [19] describe another multitasking tracing tool based on the QPT instrumentation tool [18] that performs late code modification. Unlike Chen [4], their research is interested in the analysis and evaluation of I/O bound applications such as network applications. Therefore, they ....
J. C. Mogul and A. Borg, "The effect of context switches on cache performance, " Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, (Santa Clara, CA), 1991, pp. 75-84.
....opposed to the 2N cells that would be required if each cell were separately acknowledged. 5.4 Control Transfer. Context switching causes a significant portion of the overhead in RPC [58]; in addition, there is a substantial impact on processor performance due to cache misses after a context switch [49]. An RPC call typically requires four context switches: switching the client out, switching the server in, switching the server out, and finally switching the client back in. Two of these (switching the client or the server out) can be overlapped with the transmission of the packet. Systems with ....
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 75--84, April 1991.
....a number of the major problems with having too many threads including: 1. Preemption during spin lock critical section, 2. Preemption of the wrong thread in a producer-consumer relationship, 3. Unnecessary context switch overhead, and 4. Corruption of caches due to context switches (also see [4]). The general topic of scheduling for parallel loops is one that is well studied. The basic approach of these techniques is to partition the iterations of a parallel loop among a number of executing threads in a parallel process. The goal is to have balanced execution times on the processors ....
J. C. Mogul and A. Borg, The Effect of Context Switches on Cache Performance, DEC Western Research Laboratory TN-16, Dec., 1990. http://www.research.digital.com/wrl/techreports/abstracts/TN-16.html
....are rarely a shared resource 2. Egalitarian scheduling policies have made real time awkward 3. Interrupt handling overhead is large (for example, a save restore of the RISC System/6000's registers is 256 bytes versus the 48 byte ATM payload) and effects a significant reduction in cache [17] effectiveness. Full interrupt service per ATM cell would severely limit the workstation's network bandwidth. 4. The general solution to this problem is to use more aggressive I/O device management policies and scheduling strategies. An example would be using an interrupt only as an event indicator. ....
Jeffrey C. Mogul and Anita Borg, "The effect of context switches on cache performance," in Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV), Santa Clara, CA (April 8-11, 1991), pp. 75-85.
....The next problem associated with conventional coscheduling is the cache reloading effect. When the processes in the system are sharing the service in a round robin manner, the caches may lose any useful contents related to an earlier computation, which will result in a very low hit ratio [6, 11, 13]. Cache size has greatly been increased recently with the rapid progress of VLSI technology, which may alleviate the cache reloading effect to some extent. When there is a large number of processes in the system, however, the cache reloading effect can still be significant. To alleviate this ....
J. C. Mogul and A. Borg, The effect of context switches on cache performance, Proceedings of 4th International Conference on Architect. Support for Prog. Lang. and Operating Systems Apr. 1991, pp.75-84.
....basis; these address spaces are largely a protection mechanism, and the individual virtual address space provides this protection at a considerable performance penalty, e.g. when context switches are required. Context switches are traditionally an expensive operation and are performed quite often [26]. One way to reduce this expense is to form extremely lightweight threads, and this is one UPWARDS approach. The UPWARDS interprocess communication mechanism is shared memory, upon which other mechanisms such as message passing or RPC can be constructed. We have shown experimentally, for example, ....
J. C. Mogul and A. Borg. The effect of context switches on cache performance. In Proc. Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV), Santa Clara, CA, April 1991.
....instruction, than on Ultrix. In the context of our traces, we explore a number of popular assertions about the memory system behavior of modern operating systems, paying special attention to the effect that Mach's microkernel architecture has on system performance. Our results indicate that many, but not all of the assertions are true, and that a .... past experiences [16], extrapolated microbenchmarks [9, 31] and extensive measurements of real systems running real programs [3, 4, 5, 14, 15, 28, 35, 36]. Our evaluation relies on combined system and user memory reference traces generated through software instrumentation of ....
....that many, but not all of the assertions are true, and that a few, while true, have only negligible impact on real system performance. .... software instrumentation of the systems running a broad selection of workloads. Previous trace based studies have focused on variations in memory system structure [2, 3, 4, 10, 13, 28, 32], multiprocessors and multiprocessor workloads [35, 36] or subcomponents of the memory system [30]. In contrast, our goal is to explore the impact of operating system struc .... This research was sponsored in part by the Advanced Research Projects Agency, Information ....
[Article contains additional citation context not shown here]
Jeffrey C. Mogul and Anita Borg. The Effect of Context Switches on Cache Performance. The Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April, 1991, pp. 75-84.
No context found.
J. C. Mogul and A. Borg, "The Effect of Context Switches on Cache Performance," Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 75--84, April 1991.
No context found.
J. C. Mogul and A. Borg, "The Effects of Context Switches on Cache Performance," in Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), (Santa Clara, CA), ACM, Apr. 1991.
No context found.
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Proceedings of Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Santa Clara, CA), pages 75--84, 1991.
No context found.
J. C. Mogul and A. Borg. The effect of context switches on cache performance. In Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, pages 75--84. ACM Press, 1991.
No context found.
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In Proceedings of Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Santa Clara, CA), pages 75--84, 1991.
No context found.
Jeffrey C. Mogul and Anita Borg. The effect of context switches on cache performance. In ASPLOS [ASPLOS1991], pages 75--84.
No context found.
J. C. Mogul and A. Borg. The Effect of Context Switches on Cache Performance. In the 4th Int'l Conference on Architectural Support for Programming Languages and Operating Systems, pp. 75-84, April 1991.
No context found.
Mogul, J.C., and Borg, A. "The Effect of Context Switches on Cache Performance". ACM ASPLOS-IV, Sigplan Not. 26,4 (April 1991),75-84.
No context found.
J. C. Mogul and A. Borg, "The Effects of Context Switches on Cache Performance," in Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), (Santa Clara, CA), ACM, Apr. 1991.
No context found.
Jeffrey C. Mogul and Anita Borg, "The Effect of Context Switches on Cache Performance", Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Operating Systems Review, vol. 25, no. Special Issue, pp. 75-84, Santa Clara, California, April 1991.