Results 1 - 6 of 6
Disco: Running commodity operating systems on scalable multiprocessors
ACM Transactions on Computer Systems, 1997
Cited by 164 (6 self)
In this paper we examine the problem of extending modern operating systems to run efficiently on large-scale shared memory multiprocessors without a large implementation effort. Our approach brings back an idea popular in the 1970s, virtual machine monitors. We use virtual machines to run multiple commodity operating systems on a scalable multiprocessor. This solution addresses many of the challenges facing the system software for these machines. We demonstrate our approach with a prototype called Disco that can run multiple copies of Silicon Graphics' IRIX operating system on a multiprocessor. Our experience shows that the overheads of the monitor are small and that the approach provides scalability as well as the ability to deal with the non-uniform memory access time of these systems. To reduce the memory overheads associated with running multiple operating systems, we have developed techniques where the virtual machines transparently share major data structures such as the program code and the file system buffer cache. We use the distributed system support of modern operating systems to export a partial single system image to the users. The overall solution achieves most of the benefits of operating systems customized for scalable multiprocessors yet it can be achieved with a significantly smaller implementation effort.
Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP
IEEE Transactions on Parallel and Distributed Systems, 2000
Cited by 11 (2 self)
Exploiting cache locality of parallel programs at runtime is a complementary approach to a compiler optimization. This is particularly important for those applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted architecture-dependent hints, our system, called Cacheminer, reorganizes and partitions a parallel loop using the memory-access space of its execution. Through effective runtime transformations, our system maximizes the data reuse in each partitioned data region assigned in a cache, and minimizes the data sharing among the partitioned data regions assigned to all caches. The executions of tasks in the partitions are scheduled in an adaptive and locality-preserved way to minimize the execution time of programs by trading off load balance and locality. We have implemented the Cacheminer runtime ...
Integrating Complete-System and User-level Performance/Power Simulators: The SimWattch Approach
In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS), 2003
Cited by 9 (0 self)
Most applications driving the advancements in microarchitecture and memory system research have a non-negligible interaction with the operating system. Yet, most architectural investigations are based on user-level simulators in which operating system activity is not modelled. This has motivated us to design SimWattch, a microarchitectural modeling infrastructure. SimWattch is based on Simics -- a system-level simulation tool -- and Wattch (SimpleScalar extended with power modeling) -- a flexible user-level simulation tool. As a result, it can analyze performance and power dissipation in microarchitectures at the cycle level for complex workloads running on commodity operating systems.
BEE3: Revitalizing Computer Architecture Research
2009
Cited by 7 (2 self)
In recent years, advances in computer architecture have slowed dramatically with most simulation results demonstrating only incremental architectural innovation. This is further exacerbated by increased processor and system complexity spurred by a seemingly unlimited number of transistors at the computer architect’s disposal. Computer architects produce a myopic view of their systems through the lens of slow, highly-detailed software simulation or fast, coarse-grained software simulation, with fidelity always in question. By leveraging silicon technology scaling in Field Programmable Gate Arrays (FPGAs), hardware can be used to accelerate simulation, emulation, or prototyping of systems. Furthermore, because the base components are reconfigurable, the same system can be used for a variety of research projects, amortizing the cost, both in dollars and in learning time. In this paper, we present the third generation of the Berkeley Emulation Engine or BEE3 system. We demonstrate a new collaboration methodology between academia and industry and compare the industrial and academic system design process. The BEE3 is a production multi-FPGA system with up to 64 GB of DRAM and several I/O subsystems that can be used to enable faster, larger and higher fidelity computer architecture or other systems research. Using a widely available hardware platform also facilitates a software community that can generate and share software modules, thereby enabling rapid system development for computer architecture research.
Processor Models for Retargetable Tools
Proceedings of the Eleventh IEEE International Workshop on Rapid Systems Prototyping, 2000
Cited by 6 (0 self)
This paper describes a methodology for developing processor-specific tools such as assemblers, disassemblers, processor simulators, compilers, etc., using processor models in a generic way. The processor models are written in a language called Sim-nML [1] which is powerful enough to capture the instruction set architecture of a processor. We describe a few tools in this paper which can be retargeted to any processor using the high-level Sim-nML model of the processor.
unknown title
This paper describes a methodology for developing processor-specific tools such as assemblers, disassemblers, processor simulators, compilers, etc., using processor models in a generic way. The processor models are written in a language called Sim-nML [1] which is powerful enough to capture the instruction set architecture of a processor. We describe a few tools in this paper which can be retargeted to any processor using the high-level Sim-nML model of the processor.

