© N. J. Davies

In this thesis we develop an analytical performance model for parallel computer systems. The model is built on three abstract performance elements: loading intensity, contention, and delay. These elements correspond to performance measures that result from features of both the software and hardware components of a computing system. The profile of these components can in turn be derived from an analysis of the performance-related behaviour of the individual processes that constitute a complete system.

We show how such models of particular systems can be used for performance prediction: to predict the performance of a specified number of processors, to derive the maximum performance that can be expected as the number of processors increases, and to predict the amount of latency hiding required to achieve a particular performance profile. We illustrate the use of the model with a concrete application.

The interaction between the three performance elements is examined, with particular attention paid to the relationship between loading intensity and delay in certain classes of parallel system. We examine the consequences of this relationship for the ability to measure certain types of performance data. In examining the relationship between delay and performance elements we also look at the effect of allocating many processes to each processor.

Applying this modelling technique to a particular parallel system may require individual behaviour characteristics of the real system to be approximated. We examine the effects of this approximation, looking at the particular circumstances under which our model may not give appropriate quantitative results for the individual system.