An Analytic Performance Model Of Parallel Systems That Perform N tasks Using P Processors That Can Fail (2001) [5 citations — 4 self]
Abstract:
We present a family of Markov models for analyzing the performance of parallel/distributed systems that execute a job consisting of N independent tasks in parallel using P processors. The model is a Markov chain whose states represent the service and failure rates with k (0 ≤ k ≤ P) active processors. The task-times and the times to processor failure are both exponentially distributed. We derive a number of expressions for the mean execution time, probability of success, work, and other measurable quantities, all conditioned on the job finishing successfully. A prototype, implemented using an extended version of ACMPI, is used for experiments based on simulated task-times and processor failures. We present results comparing the analytic model with the prototype for a range of processor failure rates. We then discuss extensions of the model and issues related to communication costs, approximations, and the effect of task-time distributions.
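The kind of chain described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it is one plausible instance, under assumptions not stated in the abstract. The assumptions are that failed processors are never repaired, that a task interrupted by a failure returns to the pool (no work is lost, since exponential times are memoryless), and that every active processor fails at the same rate whether busy or idle. The names `job_metrics`, `mu` (task completion rate per processor) and `lam` (failure rate per processor) are hypothetical.

```python
def job_metrics(n_tasks, n_procs, mu, lam):
    """Return (P(success), mean execution time given success).

    Assumed state: (n, k) = (tasks remaining, active processors).
    Assumed transitions (illustrative, not taken from the paper):
      (n, k) -> (n-1, k) at rate min(n, k) * mu   (a task completes)
      (n, k) -> (n, k-1) at rate k * lam          (a processor fails)
    The job succeeds on reaching n == 0 and fails at k == 0 with n > 0.
    Requires mu > 0.
    """
    # S[n][k]: probability of eventual success starting from (n, k).
    # T[n][k]: E[time to absorption * 1{success}] starting from (n, k).
    S = [[0.0] * (n_procs + 1) for _ in range(n_tasks + 1)]
    T = [[0.0] * (n_procs + 1) for _ in range(n_tasks + 1)]
    for k in range(n_procs + 1):
        S[0][k] = 1.0  # no tasks left: success, zero remaining time

    for n in range(1, n_tasks + 1):
        for k in range(1, n_procs + 1):
            a = min(n, k) * mu   # total task completion rate
            b = k * lam          # total failure rate
            total = a + b
            S[n][k] = (a * S[n - 1][k] + b * S[n][k - 1]) / total
            # The holding time (mean 1/total) is independent of which
            # transition occurs, so it contributes S[n][k] / total.
            T[n][k] = (S[n][k] + a * T[n - 1][k] + b * T[n][k - 1]) / total

    p_success = S[n_tasks][n_procs]
    mean_given_success = T[n_tasks][n_procs] / p_success if p_success > 0 else float("nan")
    return p_success, mean_given_success
```

With `lam = 0` the sketch reduces to the failure-free case: `job_metrics(2, 2, 1.0, 0.0)` gives a success probability of 1 and a mean time of 1.5, which is the expected maximum of two independent exponential task-times with rate 1.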
Citations
| 649 | An Introduction to Probability Theory and Its – Feller - 1968 |
| 269 | Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities – Amdahl - 1967 |
| 107 | Probability & Statistics with Reliability – Trivedi - 1982 |
| 73 | LogP: A practical model of parallel computation – Culler, Karp, et al. - 1996 |
| 65 | Queueing Theory: A Linear Algebraic Approach – Lipsky - 1992 |
| 44 | Fault-Tolerant Parallel Computation – Kanellakis, Shvartsman - 1997 |
| 27 | The importance of power-tail distributions for modeling queueing systems – Greiner, Jobmann, et al. - 1999 |
| 8 | On The Performance of Parallel Computers: Order Statistics and Amdahl's Law – Lipsky, Zhang, et al. - 1996 |
| 4 | An Asynchronous Model of Communication and Computation for MPI – Weerasinghe, Greenshields - 2000 |
| 4 | A Distributed Fault-Tolerant Asynchronous Algorithm for Performing – Weerasinghe, Lipsky - 2001 |

