An Efficient Meta-lock for Implementing Ubiquitous Synchronization
, 1999
Cited by 60 (1 self)
Programs written in concurrent object-oriented languages, especially ones that employ thread-safe reusable class libraries, can execute synchronization operations (lock, notify, etc.) at an amazing rate. Unless implemented with utmost care, synchronization can become a performance bottleneck. Furthermore, in languages where every object may have its own monitor, per-object space overhead must be minimized. To address these concerns, we have developed a meta-lock to mediate access to synchronization data. The meta-lock is fast (lock + unlock executes in 11 SPARC™ architecture instructions), compact (uses only two bits of space), robust under contention (no busy-waiting), and flexible (supports a variety of higher-level synchronization operations). We have validated the meta-lock with an implementation of the synchronization operations in a high-performance product-quality Java™ virtual machine and report performance data for several large programs.
Integrating Kernel Activations in a Multithreaded Runtime System on top of Linux
, 2000
Cited by 16 (5 self)
Clusters of SMP machines are frequently used to perform heavy parallel computations, and the concepts of multithreading have proved suitable for exploiting SMP architectures. Generally, the programmer uses a thread library to write this kind of program. Such a library schedules the threads or asks the OS to do it, but both of these approaches have problems. Anderson et al. have introduced another approach which relies on cooperation between the OS scheduler and the user application using activations and upcalls. We have modified the Linux kernel and adapted the Marcel thread library (from the programming environment PM²) to use activations. Improved performance was observed and problems caused by blocking system calls were removed.
Linux Kernel Activations To Support Multithreading
- In Proc. 18th IASTED International Conference on Applied Informatics (AI 2000)
, 2000
Cited by 15 (4 self)
This paper describes a modification to the Linux operating system kernel that implements a mechanism called "scheduler activations" for supporting user-level multithreading. The key idea is to allow a user-level thread package to implement its own scheduling and to provide a mechanism whereby the kernel will notify this package whenever it (the kernel) makes a scheduling decision that affects one of the kernel-level threads utilized by the package. Based on this notification, the user-level scheduler decides which user-level threads to run. Tests show that a scheduler adapted to utilize this new mechanism gives performance comparable to the best performance of previous versions of the same scheduler and to kernel-level threads.
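The notification flow described in this abstract can be illustrated with a toy, single-threaded simulation. This is only a sketch under stated assumptions, not the paper's kernel interface: the class `UserLevelScheduler` and the methods `upcall_blocked` and `upcall_unblocked` are hypothetical names, and the FIFO policy is chosen purely for illustration.

```python
from collections import deque


class UserLevelScheduler:
    """Toy user-level thread package that reacts to kernel notifications.

    All names here are hypothetical; a real implementation receives
    upcalls from a modified kernel rather than ordinary method calls.
    """

    def __init__(self, threads):
        self.ready = deque(threads)  # runnable user-level threads
        self.blocked = set()         # user-level threads waiting in the kernel
        self.running = None          # thread currently on the processor
        self.trace = []              # order in which threads were dispatched
        self._dispatch()

    def _dispatch(self):
        # The scheduling policy lives in user space; here it is plain FIFO.
        self.running = self.ready.popleft() if self.ready else None
        if self.running is not None:
            self.trace.append(self.running)

    def upcall_blocked(self):
        # The kernel reports that the running thread blocked,
        # for example inside a system call.
        if self.running is None:
            return
        self.blocked.add(self.running)
        self._dispatch()

    def upcall_unblocked(self, thread):
        # The kernel reports that a previously blocked thread may continue.
        self.blocked.discard(thread)
        self.ready.append(thread)
        if self.running is None:
            self._dispatch()


sched = UserLevelScheduler(["A", "B"])
sched.upcall_blocked()        # A blocks, so the user-level scheduler picks B
sched.upcall_unblocked("A")   # A becomes runnable again while B keeps running
sched.upcall_blocked()        # B blocks, so A is dispatched again
```

The point of the sketch is that the decision about which thread runs next is taken in `_dispatch`, in user space, while the kernel only reports events.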
A Thread-Aware Debugger with an Open Interface
- In ACM International Symposium on Software Testing and Analysis
, 2000
Cited by 5 (2 self)
While threads have become an accepted and standardized model for expressing concurrency and exploiting parallelism for the shared-memory model, debugging threads is still poorly supported. This paper identifies challenges in debugging threads and offers solutions to them. The contributions of this paper are threefold. First, an open interface for debugging as an extension to thread implementations is proposed. Second, extensions for thread-aware debugging are identified and implemented within the GNU Debugger to provide additional features beyond the scope of existing debuggers. Third, an active debugging framework is proposed that includes a language-independent protocol to communicate between debugger and application via relational queries, ensuring that the enhancements of the debugger are independent of actual thread implementations. Partial or complete implementations of the interface for debugging can be added to thread implementations to work in unison with the enhanced debugger wit...
Providing a Linux API on the Scalable K42 Kernel
Cited by 4 (4 self)
K42 is an open-source research kernel targeted for 64-bit cache-coherent multiprocessor systems. It was designed to scale up to multiprocessor systems containing hundreds or thousands of processors and to scale down to perform well on 2- to 4-way multiprocessors. K42's goal was to re-design the core of an operating system, but not an entire application environment. We wanted to use a commonly available interface with a large established code base. Because Linux is open source and widely available, we chose to support its application environment by supporting the Linux API and ABI. There were some interesting complications as well as advantages that arose from K42's structure because our implementation of the Linux application environment was done primarily in user space, had to interface with K42's object-oriented technology, and used fine-grained locking. Other research systems efforts directed at achieving a high degree of scalability and maintainability exhibit similar structural characteristics. In this
A Decoupled Architecture for Application-Specific File Prefetching
, 2002
Cited by 4 (0 self)
Data-intensive applications such as multimedia and data mining programs may exhibit sophisticated access patterns that are difficult to predict from past reference history and are different from one application to another. This paper presents the design, implementation, and evaluation of an automatic application-specific file prefetching (AASFP) mechanism that is designed to improve the disk I/O performance of application programs with such complicated access patterns. The key idea of AASFP is to convert an application into two threads: a computation thread, which is the original program containing both computation and disk I/O, and a prefetch thread, which contains all the instructions in the original program that are related to disk accesses. At run time, the prefetch thread is scheduled to run sufficiently far ahead of the computation thread, so that disk blocks can be prefetched and put in the file buffer cache before the computation thread needs them. Through a source-to-source translator, the conversion of a given application into two such threads is made completely automatic. Measurements on an initial AASFP prototype under Linux show that it provides as much as 54% overall performance improvement for a volume visualization application.
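The two-thread idea in this abstract can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the AASFP implementation: `run_with_prefetch`, `read_block`, and `lookahead` are hypothetical names, a dictionary stands in for the file buffer cache, and the computation loop simply waits on that dictionary, whereas in AASFP the computation thread issues its own reads and finds them already cached.

```python
import threading


def run_with_prefetch(access_pattern, read_block, lookahead=4):
    """Run a computation loop while a prefetch thread stays ahead of it.

    Hypothetical sketch: `read_block` is any callable that fetches one
    block, and `lookahead` (>= 1) bounds how far the prefetcher may lead.
    """
    cache = {}                      # stands in for the file buffer cache
    cond = threading.Condition()
    consumed = 0                    # accesses the computation has finished
    peak = 0                        # largest number of blocks cached at once

    def prefetcher():
        # Holds only the I/O-related part of the program: the access pattern.
        nonlocal peak
        for i, block_id in enumerate(access_pattern):
            with cond:
                # Stay at most `lookahead` accesses ahead of the computation.
                cond.wait_for(lambda: i - consumed < lookahead)
            data = read_block(block_id)   # slow read happens outside the lock
            with cond:
                cache[i] = data
                peak = max(peak, len(cache))
                cond.notify_all()

    worker = threading.Thread(target=prefetcher)
    worker.start()

    results = []
    for i, _ in enumerate(access_pattern):
        with cond:
            cond.wait_for(lambda: i in cache)   # block already prefetched?
            data = cache.pop(i)
            consumed = i + 1
            cond.notify_all()
        results.append(data)                    # the "computation" step

    worker.join()
    return results, peak


results, peak = run_with_prefetch([3, 1, 4, 1, 5], lambda b: b * 10, lookahead=2)
```

Bounding the lead with `lookahead` keeps the cache from filling with blocks that are not needed yet, which is one plausible reading of "sufficiently far ahead".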
Extending the Linux Kernel with Activations for Better Support of Multithreaded Programs and Integration in PM²
, 1999
Cited by 2 (0 self)
Nowadays, clusters of SMP machines are widely used to perform heavy parallel computations. The new concepts around multithreading have proved suitable for the SMP architecture. Generally, the programmer uses a thread library to write this kind of program. Such a library schedules the threads or asks the OS to do it, but both of these approaches still have problems. We introduce here another approach which relies on cooperation between the OS scheduler and the user application using activations and upcalls. This approach was presented in an article[2] written in 1989. We implemented this model in Linux and adapted the thread library Marcel (from the programming environment PM²) to use activations. Observed performance improved: there was no loss in speed, and the problems caused by blocking system calls were removed.
Efficient Dynamic Parallelism with OpenMP on Linux SMPs
- In Proc. of the 2000 International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas
, 2000
Cited by 1 (1 self)
In this paper we present an integrated environment for the efficient support of dynamic parallelism with OpenMP on top of Linux-based SMPs. This environment consists of an OpenMP-compliant Fortran77 compiler, a run-time threads library and a modified Linux kernel. The functionality provided by our run-time threads library is used by the NanosCompiler, which converts OpenMP Fortran77 programs to equivalent Fortran77 programs with calls to the library. The NanosCompiler-generated applications use a shared arena as a communication path with the OS kernel. This kind of communication facilitates the support of dynamic parallelism, resulting in performance scalability under multiprogramming. In order to evaluate the efficiency of our approach, we have used a subset of an OpenMP implementation of the NAS benchmarks. We compared the performance of our environment with that of OmniMP. OmniMP is a free source-to-source compiler that converts OpenMP programs written in C or Fortran77 to equivalent C programs using POSIX threads. Our environment achieves up to 6.3 times higher throughput in the presence of multiprogramming. Moreover, it performs better even on dedicated machines.
Application-Specific File Prefetching For Multimedia Programs
- In Proc. Int. Conf. on Multimedia and Expo
, 2000
This paper describes the design, implementation, and evaluation of an automatic application-specific file prefetching mechanism that is designed to improve the I/O performance of multimedia programs with complicated access patterns. The key idea of the proposed approach is to convert an application into two threads: a computation thread, which is the original program containing both computation and disk I/O, and a prefetch thread, which contains all the instructions in the original program that are related to disk accesses. At run time, the prefetch thread is scheduled to run far ahead of the computation thread, so that disk blocks can be prefetched and put in the file system buffer cache before the computation thread needs them. A source-to-source translator is developed to automatically generate the prefetch and computation thread from a given application program without any user intervention. We have successfully implemented a prototype of this automatic application-specific file pr...
Simulation of Embedded Micro-Kernels over Pthreads
This work describes the design and implementation of a simulation environment for an open-source embedded micro-kernel and an intuitive user interface to complement it. The study stresses the suitability of POSIX Threads (Pthreads) to resemble micro-kernel operations in the simulation environment. It specifies the prerequisites for using Pthreads as a means to resemble embedded task execution and suggests an I/O-based representation of device information. The experience gained with a sample implementation stresses the importance of a proper match between a Pthreads implementation and an embedded micro-kernel. It also shows the adequacy of both the simulation environment and a graphical user interface to aid program development and debugging. Furthermore, the separation of the simulation component from the user interface provides opportunities to utilize each component separately or even combine them with other components.

