20 citations found.
E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.

This paper is cited in the following contexts:
How to Achieve Worst-Case Performance - Greenstreet, de Alwis (2001)

....we can view the resulting task graph as a percolation network. If this percolation network has an infinite, connected component, then the original self timed system operates with worst case performance. 2.1 Task Graphs. A task graph is a way of visualizing the operations of a parallel process [3]. The graph has one vertex for each operation of each processor. In particular, let vertex $v_{i,j}$ correspond to the $j$-th operation of processor $i$. Edges in the graph are directed, with an edge from vertex $v_{i_1,j_1}$ to vertex $v_{i_2,j_2}$ if the $j_2$-th of processor $i_2$ depends directly on the $j_1$ ....

E. Gelenbe. Multiprocessor Performance. John Wiley and Sons, 1989.


Systematic Approach for Workload Characterization of Parallel.. - Ferscha, al. (1994)

.... the application programmer and the target architecture in the following way: parallel program code produced by the programmer is translated or abstracted to a higher level behavior description and usually represented by a directed, mostly acyclic and sometimes stochastic program graph (DAGs [Gele 89]) integrating an execution precedence relation, a computation cost model, and/or a communication cost model and/or a contention model. On the other hand, an abstracted hardware model, generally also a graph model reflecting topological aspects of the underlying hardware, serves as input to a ....

....not only the parallel and serial fraction, but also other aspects of the program. In [Sun 91] new metrics for speedup are presented, relating the parallel and the sequential work (sizeup) and the parallel and sequential speed (generalized speedup) instead of the time fractions. Gelenbe [Gele 89] proposes to consider the probability that a parallel program cannot effectively use all processors. A distribution function is given for the probability $\epsilon_i$ that a program uses only $i$ processors. In addition, an imbalance factor $\delta$ and a communication time $c(i)$ are introduced. Using these ....

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.


Performance Metrics for Embedded Parallel Pipelines - Fleury, Downton, Clark

.... been employed in directed acyclic graph (DAG) models of parallelism, stemming from [50]. The general properties of series parallel graphs, SPG, of the DAG variety with unconstrained numbers of nodes and probabilistic branching have been studied from the standpoint of queueing theory in [51]. [15] is a practically oriented study of the SPG model for parallel pipelines, though not using queueing theory or order statistics. Queueing theory is not normally helpful for the performance of individual applications as it gives rise to means not maxima. The linear form of pipelines in PPF ....

E. Gelenbe. Multiprocessor Performance. Wiley, Chichester, UK, 1989.


Task Graph Performance Bounds Through Comparison Methods - Salamon (2001)

....The direction of the precedence constraints is implicitly top to bottom, so that every successor of a task is below it in the graph. Task graphs in this work are regarded as having static precedence structure, as opposed to task graphs with probabilistic structure such as task graph families [25, 83]. A workload is an assignment of a duration to each task in the task graph. This dissertation follows the significant body of published work that considers static task durations [29, 34-36, 56, 67]. This is in contrast to the use of stochastic task durations in other work [12, 13, 41, 46, 57, ....

....components: a sufficient number of identical processors, task durations that include overheads, and nonpreemptive work conserving scheduling. This is an idealized model but has found wide acceptance in the literature, forming the basis of significant work on task graph models of parallel programs [1,12,18,25,49,50,56,60,61,63,65,70,83]. This execution model is first defined and its usual assumptions are discussed. Section 3.2 lays out the basic definitions and notation that are needed in this work. All these concepts are common in the literature, but standardization is applied, since some concepts are graph theoretic, while ....

[Article contains additional citation context not shown here]

E. Gelenbe. Multiprocessor Performance. Wiley, 1989.


An Analytical Method for Predicting the Performance of Parallel.. - Juhasz (1998)   (6 citations)

....execution mechanism in order to study the behaviour of the target system. Since any level of detail can be incorporated into the simulation model, these methods can provide very accurate results. They can predict performance and the effects of virtually all parameter combinations can be examined [7]. Setting up simulation models, however, is a demanding task for programmers, and executing simulations is very time consuming. For large problems simulation models usually become too complex to be feasible. Analytical methods use mathematical models of the architecture and the execution of the ....

E. Gelenbe, Multiprocessor Performance, John Wiley & Sons, 1989.


Comparative Performance Evaluation of Hot Spot Contention.. - Zhang, Yan, Castaneda (1995)   (2 citations)

....up their path. It then selects an alternative route, and after a random delay, tries again. Zhang and Qin [16] present a remote access delay model for a non blocking MIN architecture where the behavior of a remote memory access is described by a state transition diagram called the drop approach [6]. Here a processor makes a remote memory access by formulating requests for access to the set of switches along that path. If it cannot obtain a switch, it abandons its request at that point and will try again at some later time. In Figure 1, state 0 represents some processor in quiescent state, ....

E. Gelenbe, Multiprocessor Performance, John Wiley and Sons, 1989.


Deriving Parallelism Profiles from Structural Parallelism.. - Braun, Haring, Kotsis (1996)

.... [Fer92] and the approach by Mitchele Thiel [Her92, MT93]. In most of these approaches, graph models have been used to characterize a parallel application (see for example [CS91]). Task graphs are a convenient model for representing the structure and the dependencies within a parallel application [Gel89]. It is well known that restrictions of the task graph either with respect to their structure (series parallel graphs) [ST85] or with respect to the type of distribution allowed for the task execution time (stochastic task graphs) will make the model easier to analyze but at the costs of ....

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.


Performance Prediction of Parallel Low-level Image.. - Zoltan Juhasz Dept

....not provide insight into the details of the implementation and it requires the use of the full scale parallel system. Simulation methods are sometimes used to predict the performance of parallel programs but they are very time consuming and for large problems they become too complex to be feasible [2]. Analytical methods use models of the architecture and the algorithm, and express the execution time in a closed form expression [1] 3 5] They allow examining the effect of system parameter changes and predicting performance using only a few processor system. Accurate prediction is especially ....

E. Gelenbe, Multiprocessor Performance, John Wiley & Sons, 1989.


Performance Prediction for Parallel Reconfigurable Low-Level .. - Martin Fleury And (1994)

....the vacation by the link S is the service length of the links. N is the mean number of packets in a queue, which is related to the mean waiting time by Little's theorem ($W = N/\lambda$), $\rho = \lambda E[S]$ is the availability, and $\lambda$ the arrival rate. This result, which has been used for other multiprocessor systems [1], requires the vacation time distribution S to be known and limits service to one packet each time. It is possible to find this (see Section 4) but we have found it more convenient to use a result previously applied to token ring networks inter alia. The revised model is [3] W = ME[S^2] ....

E. Gelenbe. Multiprocessor Performance. John Wiley and Sons, 1989.
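
The relations quoted in the excerpt above can be checked numerically. The following is a minimal sketch, assuming the symbols lost in extraction are the arrival rate λ and the utilisation ρ = λE[S]; the function names and the sample numbers are illustrative and do not come from the cited paper.

```python
def mean_waiting_time(mean_queue_length: float, arrival_rate: float) -> float:
    """Little's theorem rearranged: W = N / lambda."""
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return mean_queue_length / arrival_rate


def utilisation(arrival_rate: float, mean_service_time: float) -> float:
    """rho = lambda * E[S]; the queue is stable only when rho < 1."""
    return arrival_rate * mean_service_time


# Hypothetical numbers: 6 packets queued on average, 2 packets arriving per
# time unit, and a mean service time of 0.25 time units.
w = mean_waiting_time(6.0, 2.0)
rho = utilisation(2.0, 0.25)
```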


Queueing-Theoretic Solution Methods for Models of Parallel.. - Boxma, Koole, Liu (1996)

....As far as references is concerned, we have restricted ourselves in the text mainly to key references that make a methodological contribution, and to surveys that give the reader further access to the literature; we apologize for any inadvertent omissions. The reader is referred to Gelenbe's book [65] for a general introduction to the area of multiprocessor performance modeling and analysis. Stochastic Petri nets provide another formalism for modeling and performance analysis of discrete event systems. The reader is referred to the survey paper of Murata [127] for results of their qualitative ....

E. Gelenbe. Multiprocessor Performance. Wiley, New York, 1989.


Performance Prediction and Evaluation of Parallel Processing on .. - Zhang, Qin (1991)   (4 citations)

....a target NUMA system, the BBN GP1000. Their system experiments conclude that the placement and movement of code and data are crucial to NUMA performance. The performance of a general multistage interconnection network, such as the Omega network has been evaluated analytically (see e.g. [6], [7], [15]). The analysis work is independent on the NUMA architecture although the multistage interconnection network is commonly used for a NUMA system. The performance factors of NUMA multiprocessors (for example, the BBN GP1000) such as the data access time, scheduling overhead and others have been ....

....all but the first message retreat back to its source and free up their path. It then selects an alternative route, and after a random delay, tries again. Our model is based on the nonblocking network architecture, which is used for most NUMA systems. 4.2 A model of remote memory access. Gelenbe [15] describes the behavior of a remote memory access in a nonblocking multi stage interconnection network by a state transition diagram called drop approach (see Figure 5). Here a processor makes a remote memory access by formulating requests for access to the set of switches Figure 5: State ....

E. Gelenbe, Multiprocessor Performance, John Wiley and Sons, 1989.


Workload Models for Parallel Computer Systems - Kotsis (1994)

....defined as $E(p) = S(p)/p$, the ratio of speedup to the number of processors. The efficacy $\eta(p)$ of a parallel program is defined as $\eta(p)$ S(p) E(p) the ratio of speedup to efficiency and quantifies the benefits (increase in speedup) versus costs (decrease in efficiency). The processor working set [Gele 89] pws of a parallel program is defined as $pws = \{p \mid \max_i \eta(i) = \eta(p)\}$, the optimum number of processors with respect to efficacy. These characteristics may be applied in system design studies (estimating the appropriate size of the system) or in scheduling and in mapping problems to determine ....

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.
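
As an illustration of the metrics named in the excerpt above, here is a minimal sketch in Python. Speedup S(p) = T(1)/T(p) and efficiency E(p) = S(p)/p follow the excerpt; the efficacy formula is garbled in the extract, so the sketch assumes the commonly used product form S(p) * E(p), which the excerpt does not confirm. The function names and timing data are hypothetical.

```python
from typing import Dict


def speedup(times: Dict[int, float], p: int) -> float:
    """S(p) = T(1) / T(p), where times maps processor count to run time."""
    return times[1] / times[p]


def efficiency(times: Dict[int, float], p: int) -> float:
    """E(p) = S(p) / p, the ratio of speedup to the number of processors."""
    return speedup(times, p) / p


def efficacy(times: Dict[int, float], p: int) -> float:
    """ASSUMPTION: efficacy taken as S(p) * E(p); the excerpt's formula is garbled."""
    return speedup(times, p) * efficiency(times, p)


def processor_working_set(times: Dict[int, float]) -> int:
    """Processor count p that attains the maximum efficacy over the measured counts."""
    return max(times, key=lambda p: efficacy(times, p))


# Hypothetical measured run times (seconds) for 1, 2, 4, 8 and 16 processors.
times = {1: 100.0, 2: 55.0, 4: 30.0, 8: 20.0, 16: 18.0}
pws = processor_working_set(times)
```

With these made-up timings the efficacy rises up to 8 processors and drops at 16, so the sketch reports 8 as the processor working set.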


The Performance and Scalability of Parallel Systems - Davies (1994)

....is that the technique for modelling the amount of potential concurrency is probabilistic. Typical examples of problems that lie in this area are those of macro dataflow execution and parallel execution models of declarative languages. The influence of such program structures has also been studied [43] and predictive models of concurrency have been developed for the last two classes above for certain sets of applications; these models use stochastic techniques to derive averages and distributions for the potential concurrency [62, 93, 94, 104]. $T_{sc}$: The approaches to this component exhibit the ....

Erol Gelenbe. Multiprocessor Performance. John Wiley & Sons, 1989.


OPPSI: A Performance Prediction Tool For Parallel Systems - Cubaud, Pekergin, Taleb (1994)   (1 citation)

....a graphical user interface, a simulator and a range of analytical solvers. INTRODUCTION. The study of quantitative models for workload-architecture (in)adequacy is of central importance in the context of parallel and distributed programming. It is a very active field in the modeling community (Gelenbe 1989). New models, that generalize the classical Queuing Network approach, are developed in order to include both concurrency and synchronization primitives. These models exhibit a very rich dynamic and scarcely lead to an easy analysis, such as Product Form results. The OPPSI (Outil de Prédiction de ....

Gelenbe, E. 1989. Multiprocessor performance. Wiley.


Workload Modeling for Parallel Processing Systems - Kotsis (1995)

.... and parallel time fraction for scaled workload size
Generalized Speedup [Sun 91a]: sequential and parallel execution speed, total overhead
Sizeup [Sun 91a]: sequential and parallel work for fixed time, total overhead
[Klei 92]: sequential and parallel work (time) characterized as stochastic values
[Gele 89]: sequential and parallel work, overhead due to inhomogeneous processor usage, load imbalance, and communication
[Chri 91]: sequential and parallel time, communication overhead, delays due to interconnection topology
[Mari 93]: sequential and parallel work (time), redundant work, blocking delays, ....

....of architecture scalability. Several other approaches can be found in literature, which are based on the evaluation of efficiency as proposed in this work, but they differ in the methods to characterize the amount of work or the execution times. A selection of approaches is discussed below. [Gele 89] One of the earlier approaches for analytical models of parallel execution time and speedup was proposed by Gelenbe. In this approach the ideal parallelism profile $P_\infty(t)$ (assuming an unlimited number of PEs) with the ideal execution time of $T_{opt}$ and the derived average DOP are used for a ....

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.


Task Assignment and Transaction Clustering Heuristics for.. - Aguilar, Gelenbe (1997)   (4 citations)  Self-citation (Gelenbe)

....of K processors with distributed memory, i.e. with sufficient memory at each processor so that any one task can be executed. The processors are fully interconnected via a reliable high speed network. A parallel program which will be executed in this environment is represented by a Task Graph [1] which is denoted by: $\Pi = (N, A, e, C)$ where: $N = \{1, \ldots, n\}$ is the set of n tasks that compose the program, $A = \{a_{ij}\}$ is the incidence matrix which describes the graph, and e, C are the amount of work related to task execution and to communication between tasks. Thus $e_i$ ....

E. Gelenbe, Multiprocessor Performance, John Wiley Ltd., New York and Chichester, 1989.
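
The task graph tuple (N, A, e, C) described in the excerpt above can be held in a small data structure. This is a minimal sketch only: the class name, the 0-based task indices (the excerpt numbers tasks from 1 to n), the matrix layout of C, and the `comm_cost` helper are assumptions made for illustration, not the representation used in the cited paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskGraph:
    n: int                 # number of tasks; tasks are indexed 0..n-1 here
    A: List[List[int]]     # A[i][j] == 1 if there is an edge from task i to task j
    e: List[float]         # e[i]: amount of work to execute task i
    C: List[List[float]]   # C[i][j]: communication work on edge i -> j (assumed layout)

    def successors(self, i: int) -> List[int]:
        """Tasks that directly depend on task i."""
        return [j for j in range(self.n) if self.A[i][j]]

    def comm_cost(self, assignment: List[int]) -> float:
        """Hypothetical helper: total communication work over edges whose
        endpoints are assigned to different processors."""
        return sum(
            self.C[i][j]
            for i in range(self.n)
            for j in self.successors(i)
            if assignment[i] != assignment[j]
        )


# Hypothetical three-task program: task 0 feeds tasks 1 and 2.
g = TaskGraph(
    n=3,
    A=[[0, 1, 1], [0, 0, 0], [0, 0, 0]],
    e=[2.0, 3.0, 4.0],
    C=[[0.0, 5.0, 7.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
)
```

Placing tasks 0 and 1 on one processor and task 2 on another, `g.comm_cost([0, 0, 1])`, charges only the edge from task 0 to task 2.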


Deriving Parallelism Profiles from Structural Parallelism.. - Braun, Haring, Kotsis (1996)

No context found.

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, Chichester, 1989.


Efficient Parallel Implementations of Multipole Based N-Body.. - Rankin (1997)   (1 citation)

No context found.

E. Gelenbe. Multiprocessor Performance. John Wiley & Sons, New York, 1989.


Optimal Load Balancing on Distributed Homogeneous Unreliable.. - Liu, Righter   (1 citation)

No context found.

E. Gelenbe, Multiprocessor Performance. Wiley, New York, 1989.


Parallel Reconfiguration in an Image Processing Context - Fleury, Hayat, Clark (1997)

No context found.

E. Gelenbe. Multiprocessor Performance. Wiley, Chichester, UK, 1989.

CiteSeer.IST - Copyright Penn State and NEC