Abstract:
Considerable research has produced a plethora of efficient methods for exploiting parallelism on dedicated machines. On typical real systems, however, some of the important assumptions that lead to efficiency on a dedicated machine either do not hold or cause other problems on a machine that is time-shared or space-shared. Foremost among these assumptions is that the number of processors available to a parallel job is constant throughout the execution of the job. Maintaining such consistency in a real multiprogramming system can lead to poor utilization of the machine. This thesis addresses issues involving the efficient exploitation of parallelism in a multiprogramming environment, including OS support for user-level scheduling and dynamic granularity control. It discusses the implementation of some of these techniques in the nanoThreads thread library, as well as other details of the implementation of nanoThreads.

For my brother, Duane Schouten, who in his way encouraged my scientific curiosity by causing me to take nothing for granted, but to question everything.