6 citations found.
XIA, C., AND TORRELLAS, J. Improving the data cache performance of multiprocessor operating systems. In Proceedings of the International Symposium on High-Performance Computer Architecture (February 1996).


This paper is cited in the following contexts:
An Analysis of Software Interface Issues for SMT Processors - Redstone (2002)   (1 citation)

....such instrumentation can capture all memory references, it perturbs workload execution [16]. Other studies employed bus monitors [26], which have the drawback of capturing only memory activity reaching the bus. To overcome this, some have used a combination of instrumentation and bus monitors [78, 88, 79, 14]. As an example of more recent studies, Torrellas, Gupta, and Hennessy [78] measured L2 cache misses on an SMP of MIPS R3000 processors; they report sharing and invalidation misses and distinguish between user and kernel conflict misses. Maynard, Donnelly, and Olszewski [48] looked at a ....

XIA, C., AND TORRELLAS, J. Improving the data cache performance of multiprocessor operating systems. In Proceedings of the International Symposium on High-Performance Computer Architecture (February 1996).


An Analysis of Operating System Behavior on a.. - Redstone, Eggers, Levy (2000)   (9 citations)

....while such instrumentation can capture all memory references, it perturbs workload execution [7]. Other studies employed bus monitors [12], which have the drawback of capturing only memory activity reaching the bus. To overcome this, some have used a combination of instrumentation and bus monitors [5, 39, 46, 40]. As an example of more recent studies, Torrellas, Gupta, and Hennessy [39] measured L2 cache misses on an SMP of MIPS R3000 processors; they report sharing and invalidation misses and distinguish between user and kernel conflict misses. Maynard, Donnelly, and Olszewski [25] looked at a ....

C. Xia and J. Torrellas. Improving the data cache performance of multiprocessor operating systems. In Proceedings of the Second International Symposium on High-Performance Computer Architecture, February 1996.


Performance Issues for Multiprocessor Operating Systems - Gamsa, Krieger, Parsons.. (1995)

....both for sharing misses and upgrade misses. 3 Some results suggest that there is often sufficient locality that the locks can generally be assumed to reside in the cache [30]. However, later work (by the same author) suggests that coherence traffic due to locking is a significant problem [35]. The effects of false sharing may well explain this apparent contradiction. 4 Even the Digital 8400 multiprocessor, which is expressly optimized for read miss latency, has a 50 percent higher latency than Digital's earlier, lower performance, uniprocessors [10] in early work on Mach on the ....

....system data structure must take into account the expected access pattern, degree of sharing, and synchronization requirements. Nevertheless, we have found a number of principles and design strategies that have repeatedly been useful. These include principles previously proposed by us and others [7, 32, 33, 35], refinements of previously proposed principles [14] to address the specific needs of system software, and a number of new principles. 3.1 Structuring data for caches When frequently accessed data is shared, it is important to consider how the data is mapped to hardware cache lines and how the ....

[Article contains additional citation context not shown here]

Chun Xia and Josep Torrellas. Improving the data cache performance of multiprocessor operating systems. To appear in HPCA-2, 1996.


Instruction Prefetching of Systems Codes With Layout Optimized.. - Chun Xia (1996)   (8 citations)  Self-citation (Xia Torrellas)

....use a multiprocessor to capture a larger range of systems activity, including multiprocessor scheduling and cross-processor interrupts. In this section, we discuss the hardware and software setup used and the workloads traced. More details on the setup and workload characteristics can be found in [12, 14]. 2.1 Hardware and Software Setup We gather the traces from a 4-processor bus-based Alliant FX/8 multiprocessor. The operating system running in the machine is a slightly modified version of Alliant's Concentrix 3.0. Concentrix is multithreaded, symmetric, and is based on Unix BSD 4.2. We use a ....

C. Xia and J. Torrellas. Improving the Data Cache Performance of Multiprocessor Operating Systems. In Proceedings of the 2nd International Symposium on High-Performance Computer Architecture, pages 85--94, February 1996.


Exploiting Multiprocessor Memory Hierarchies For Operating Systems - Xia (1996)   (1 citation)  Self-citation (Xia)

....corresponding solutions; Chapter 7 combines all effective optimization schemes together and discusses the cost-performance trade-offs among them; and, finally, Chapter 8 concludes this work and discusses issues to be considered in future work 1. 1 Most part of Chapter 3 to 6 can be found in [42, 45, 44]. The electronic form of this thesis and papers is available at http://www.csrd.uiuc.edu/iacoma/iacomapapers.html. Chapter 2 Experiment Method and Setup 2.1 Methodology This thesis study is conducted using empirical methodology. We carry out a series of experiments using the following ....

C. Xia and J. Torrellas. Improving the Data Cache Performance of Multiprocessor Operating Systems. In Proceedings of the 2nd International Symposium on High-Performance Computer Architecture, pages 85--94, February 1996.


Comprehensive Hardware and Software Support for Operating.. - Xia, Torrellas (1999)   (4 citations)  Self-citation (Xia Torrellas)

....this question directly or completely. A large group of researchers have examined the cache performance of the operating system without focusing much on proposing optimizations [1, 2, 5, 6, 9, 10, 11, 12]. There is some work specifically focused on optimizing the performance of the operating system [13, 15, 16]. However, it examines part of the problem only, for example, instruction accesses only or prefetching only. The combined effect of all the optimizations proposed is unknown, especially for very advanced multiprocessor memory hierarchies. Finally, there is a lot of other work proposing ....

....and Traffic Occur To understand why the operating system exercises the memory hierarchy in this way, we examine address traces of the operating system. In the following, we discuss the instruction and data access patterns observed. A more detailed discussion can be found in two previous papers [13, 15]. 3.2.1 Instruction Access Patterns It is well known that the instruction access patterns in the operating system are different from those in typical technical applications. In the former, tight loops account for a relatively small fraction of the execution time. A close examination of operating ....

[Article contains additional citation context not shown here]

C. Xia and J. Torrellas. Improving the Data Cache Performance of Multiprocessor Operating Systems. In Proceedings of the 2nd International Symposium on High-Performance Computer Architecture, pages 85--94, February 1996.


CiteSeer.IST - Copyright Penn State and NEC