39 citations found.
T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.

This paper is cited in the following contexts:

Performance Estimation for Embedded Systems - van Gemund, Gautama (2000)   (Correct)

.... max operators in the deterministic case. The low cost, symbolic property of SP task graph analysis is typically employed in static compile time cost prediction techniques in which either numeric or symbolic cost expressions are directly derived from the intermediate program representation [7, 9, 17, 18, 34, 44]. Despite the attractive low cost and symbolic properties of SP task graph analysis, however, their inherent inability to model mutual exclusion makes them generally unsuitable as the basis for a general-purpose performance modeling technique. 2.3 Approach Recently, a symbolic performance ....

....generate symbolic expressions or not, are essentially based on critical path analysis of SP graphs. Approaches based on deterministic SP graph analysis in the flavor of Eq. 3.1 include the work of Atapattu and Gannon [7], Balasundaram, Fox, Kennedy, and Kremer [9], Clement and Quinn [10], Fahringer [17], Mendes, Yang and Reed [34], and Wang [43, 44]. Approaches based on stochastic SP graphs include the work of Sahner and Trivedi [37] and Lester [31], which, similar to us, uses a modeling language, called PEL (Performance Evaluation Language). Although similar from SP graph analysis point of ....

[Article contains additional citation context not shown here]

T. Fahringer, "Estimating and optimizing performance for parallel programs," IEEE Computer, Nov. 1995, pp. 47--56.


Symbolic Cost Estimation of Parallel Applications - van Gemund   (Correct)

....reasons of convenience as explained later on. In the following we briefly describe the translation process. A more detailed background can be found in [8]. The analytic approach underlying the translation process is based on critical path analysis of the delays due to condition synchronization [5, 12, 16] (task synchronization) combined with a lower bound approximation of the delays due to mutual exclusion synchronization (queuing delay) as a result of resource contention [8]. In the following we assume a PAMELA model in which all expressions have already been substituted as the result of ....

T. Fahringer, "Estimating and optimizing performance for parallel programs," IEEE Computer, Nov. 1995, pp. 47--56.


Automatic Cost Estimation of High-Performance Applications - van Gemund   (Correct)

....by its argument (base 0). For instance, the expression 10 unitvec(3) will be compiled to [0,0,0,10]. 3.2 Symbolic Compilation The analytic approach underlying the translation process is based on a combination of compile time critical path analysis of the delays due to task synchronization [3, 5, 7, 15, 20], and a lower bound approximation of the delays due to queuing delay. A more detailed background can be found in [9]. A PAMELA model is translated to a time domain performance model by substituting every process equation by a numeric equation that models the execution time associated with the ....

T. Fahringer, "Estimating and optimizing performance for parallel programs," IEEE Computer, Nov. 1995, pp. 47--56.


Visual Assistance for Concurrent Processing - Erbacher (2000)   (Correct)

....approaches and not consider modeling during this research. Pancake et al. [Panca95b] discuss the problems of monitoring and modeling performance and the limitations of each. Tools are becoming available which model or estimate the performance of sections of parallel code, such as P³T [Fahri95] and SimOS [Rosen95]. For certain programs these tools are beneficial in that they can identify when and where bottlenecks are occurring and can aid in determining data distribution strategies. Unfortunately, these tools aid only in performance tuning. They cannot aid in application comprehension ....

Thomas Fahringer, Estimating and Optimizing Performance for Parallel Programs, IEEE Computer, Vol. 28, No. 11, November 1995, pp. 47-56.


Performance Analysis of Parallel Systems.. - Reed, Aydt.. (1998)   (2 citations)  (Correct)

....loop adaptive controls systems. 4 Related Work A large number of a priori performance prediction and a posteriori performance measurement and analysis tools have been developed, targeting both sequential and parallel systems, far more than can be summarized here. Notable examples include P³T [10] for performance prediction, together with Paradyn [25] and AIMS [36] for performance measurement. Each has exposed key research issues in performance measurement and analysis. Similarly, several systems have been built that support application behavior steering (i.e. guiding a computation toward ....

Fahringer, T. Estimating and Optimizing Performance for Parallel programs. IEEE Computer 28, 11 (November 1995), 47--56.


A Common Workload Interface for the Performance.. - Papaefstathiou.. (1998)   (Correct)

....[Figure 3: Model creation process with ACT] During the implementation stage, when the parallel source code is available, ACT can be employed as a static performance prediction tool [1], Figure 4. The performance of the application can then be analysed for several parallel platforms, provided they are available as hardware objects. PACE allows the development of models even when parts of the source code are not available. Performance prototyping is the terminology that is used ....

T. Fahringer, Estimating and Optimizing Performance for Parallel Programs, IEEE Computer, Vol. 28(11), November 1995.


Web-Based Performance Visualization Of Distributed.. - Elmaghraby, Elfayoumy (1999)   (Correct)

.... Paragraph is an animation tool used to trace the dynamic behavior of the program (Heath and Etheridge 1991) and Paradyn is a tool for measuring performance of a large scale parallel system (Miller et al. 1995). P³T is a performance estimator tool that achieves high estimation accuracy (Fahringer 1995). Avtar is a virtual data environment (Reed et al. 1995) that allows users to explore parallel performance data and modify application and system parameters to see how performance is affected. Lilith Lights (Evensky, Gentile, and Wyckoff 1998) is a visualization tool for monitoring and debugging ....

Fahringer, T. 1995. Estimating and Optimizing Performance for Parallel Programs. Computer 28(11):47-56.


Modeling the Communication Behavior of Distributed Memory.. - Foschia, Rauber, Rünger   (Correct)

....research effort to build modeling tools because such tools are imperative to derive efficient implementations. The significant work includes the work related to the Fortran D compiler [4], the Paradigm compiler [5], the Suif compiler [2], the Fx compiler [34, 33], and the Vienna Fortran Compiler [11, 12]. Other approaches include the use of Petri nets [13], queuing networks, and Markov chains [35]. The Fortran D compiler contains an interactive tool that allows the programmer to select regions of the sequential input program. The tool responds with a data decomposition scheme and diagnostic ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47--56, 1995.


Quantitative Modelling And Analysis Of Business Processes - Henk Jonkers And (1996)   (Correct)

....prediction methods exist, generally associated with different modelling formalisms. These methods mainly differ in the position they occupy on the trade-off between prediction accuracy and (computational) efficiency. Static techniques are used to quickly obtain first-order performance estimates (Fahringer 1995; Gemund 1996). Techniques based on (timed, stochastic) extensions of Petri nets (Ajmone Marsan et al. 1986) yield accurate results, but are time-consuming due to a state explosion (which results in a complexity which is exponential in the model size). Between these two extremes, several techniques ....

Fahringer, T. 1995. "Estimating and optimizing performance for parallel programs." IEEE Computer, Nov., 47-56.


The Application Of Hybrid Modelling Techniques For Business.. - Jonkers (1997)   (Correct)

....languages. These methods mainly differ in the position they occupy on the trade-off between prediction accuracy and computational efficiency. Static techniques, e.g. based on symbolic expressions or simple critical path algorithms, are used to efficiently obtain first-order performance estimates (Fahringer 1995; Gemund 1996). Techniques based on timed or stochastic extensions of Petri nets (Ajmone Marsan et al. 1986) yield accurate results, but are time-consuming due to a state space explosion (resulting in a complexity which is exponential in the model size). In the remainder of this paper, we will ....

Fahringer, T. 1995. "Estimating and optimizing performance for parallel programs", IEEE Computer, Nov., 47-56.


P³T+: A Performance Estimator for Distributed and.. - Fahringer, Pozgaj (1999)   Self-citation (Fahringer)   (Correct)

No context found.

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Efficient Symbolic Analysis for Parallelizing Compilers and.. - Fahringer (1998)   (5 citations)  Self-citation (Fahringer)   (Correct)

No context found.

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995. Postscript file available via http://www.par.univie.ac.at/~tf/papers/p3t/ieee-mag.ps.


Estimating Cache Performance for Sequential and Data Parallel.. - Fahringer (1997)   (2 citations)  Self-citation (Fahringer)   (Correct)

No context found.

Thomas Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995. Postscript file available via http://www.par.univie.ac.at/~tf/papers/p3t/ieee-mag.ps.


P³T+: A Performance Estimator for Distributed and.. - Fahringer, Pozgaj (2001)   Self-citation (Fahringer)   (Correct)

No context found.

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Compile-Time Estimation of Communication Costs for Data Parallel Programs - Fahringer (1997)   Self-citation (Fahringer)   (Correct)

No context found.

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Efficient Symbolic Analysis for Parallelizing Compilers and.. - Fahringer (1997)   (5 citations)  Self-citation (Fahringer)   (Correct)

....J 1 =1,N DO J 2 =N (2 J 1 ) N S1: IF (J 1 J 2 N) THEN A(J 1 ; J 2 ) ENDIF . ENDDO ENDDO Detecting zero trip loops [14] is a similar problem which tries to determine whether the loop body of a given loop nest is ever executed. Counting the number of loop iterations has been shown [11, 12, 14, 21] to be crucial for many performance analyses such as modeling work distribution [10], data locality [11], and communication overhead [9]. All of these problems can be formulated as queries based on a set of linear and non linear constraints I defined over loop variables and parameters (loop ....

....remarks are given in Section 8. 2 Preliminaries The following notations and definitions are used in the remainder of this paper: Our symbolic analysis has been implemented and is currently being integrated with VFCS [2], a High Performance Fortran style parallelizing compiler, and P³T [8, 12], and with a performance estimator for data parallel programs on distributed memory parallel architectures. The VFCS parallelization strategy is based on data decomposition in conjunction with the single program, multiple data programming model. With this method, each array is partitioned and ....

Thomas Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


P³T+: A Performance Estimator for Distributed and.. - Fahringer, Pozgaj   Self-citation (Fahringer)   (Correct)

....models are commonly used to assume a more or less virtual and often unrealistic application behavior. Moreover, very few performance estimators actually consider code transformations and optimizations applied by a compiler. In this paper we introduce P³T+, the successor tool of P³T [22, 15, 16], which models programs, code transformations, and parallel and distributed architectures. The input programs of P³T+ are written in High Performance Fortran [27, 1] which represents the de facto standard of high level data parallel programming. Moreover, P³T+ analyzes Fortran90 message ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Evaluation of P³T+: A Performance Estimator.. - Fahringer, Pozgaj, .. (2000)   Self-citation (Fahringer)   (Correct)

....on this architecture. Statistical models are often used to assume a more or less virtual and often unrealistic application behavior. Moreover, very few performance estimators actually consider code transformations and optimizations applied by a compiler. P³T+, the successor tool of P³T [5, 6], is a performance estimator for distributed and parallel programs which models programs, code transformations, and parallel and distributed architectures. The input programs of P³T+ are written in High Performance Fortran [10] which represents the de facto standard of high level data parallel ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


P³T+: A Performance Estimator for Distributed and.. - Fahringer, Pozgaj (1999)   Self-citation (Fahringer)   (Correct)

....models are commonly used to assume a more or less virtual and often unrealistic application behavior. Moreover, very few performance estimators actually consider code transformations and optimizations applied by a compiler. In this paper we introduce P³T+, the successor tool of P³T [20, 13, 14], which models programs, code transformations, and parallel and distributed architectures. The input programs of P³T+ are written in High Performance Fortran [25, 1] which represents the de facto standard of high level data parallel programming. Moreover, P³T+ analyzes Fortran90 message ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


A Unified Symbolic Evaluation Framework for Parallelizing.. - Fahringer, Scholz (1999)   (4 citations)  Self-citation (Fahringer)   (Correct)

....communication vectorization and elimination of redundant communication. We have implemented a prototype of our symbolic evaluation framework which is used as part of the Vienna Fortran Compilation System (VFCS) [5], a parallelizing compiler for distributed memory architectures, and P³T [21, 22], a performance estimator, to parallelize and optimize High Performance Fortran programs [34, 5] for distributed memory architectures. The organization of this paper is as follows. Preliminaries are presented in Section 2. In Section 3, we describe our symbolic evaluation framework. This ....

....our method to support symbolic dependence testing and various optimizations (including communication vectorization and elimination of redundant communication) which can result in significant performance improvements of parallel programs. Symbolic evaluation is also being used as part of P³T [21, 22], a state-of-the-art performance estimator, in order to estimate the work distribution [23] of parallel programs as a parameterized function defined over unknown problem sizes. Currently, we are extending several compiler optimizations for distributed memory architectures to exploit the prototype ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Evaluation of P³T+: A Performance Estimator.. - Fahringer, Pozgaj, .. (1999)   Self-citation (Fahringer)   (Correct)

....on this architecture. Statistical models are commonly used to assume a more or less virtual and often unrealistic application behavior. Moreover, very few performance estimators actually consider code transformations and optimizations applied by a compiler. P³T+, the successor tool of P³T [15, 11, 12], is a performance estimator for distributed and parallel programs which models programs, code transformations, and parallel and distributed architectures. The input programs of P³T+ are written in High Performance Fortran [17, 1] which represents the de facto standard of high level data ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Buffer-Safe and Cost-Driven Communication Optimization - Fahringer, Mehofer (1999)   Self-citation (Fahringer)   (Correct)

....the set of SENDs that cover u ∈ U. Uses(s) defines the set of non local uses that are associated with a specific SEND s ∈ S. 2.2 Performance prediction In order to support eliminating communication buffer conflicts and finding the best out of a variety of communication placements we use P³T [7, 8, 6, 9], an accurate and effective performance estimation tool for distributed memory parallel programs. P³T is a static performance estimator that analytically estimates the performance of data parallel programs (subset of Vienna Fortran [31], High Performance Fortran [20], Fortran90 and Fortran77) ....

....arrays. We have extended P³T to cover also general buffer communication where data is received into a buffer that is allocated dynamically, and the array reference that led to communication is replaced by a reference to the buffer. For detailed description of P³T, the reader may refer to [7, 8, 6, 9]. 3 Buffer Safe Communication Latency Hiding and Message Coalescing In this section we describe our communication optimization strategy. First, we hoist SENDs to the earliest possible program points without considering communication buffer constraints. Second, we aggressively coalesce SENDs ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


Symbolic Expression Evaluation to Support Parallelizing Compilers - Fahringer (1997)   Self-citation (Fahringer)   (Correct)

....(dead code elimination). Detecting zero trip loops [8] is a similar problem which tries to determine whether the loop body of a given loop nest is ever executed. Other related problems require loop iteration or statement execution counts which are key figures to estimate a program's performance [7, 4]. All of these problems can be formulated as a set of linear and non linear constraints I defined over loop variables and parameters (loop invariants) which are commonly derived from loop bounds and conditional statements. For instance, I is given by {1 ≤ I1 ≤ N, N/(2 I1) ≤ I2 ≤ N, I1 I2 ≤ N} ....

....this algorithm can be used to compare symbolic expressions for equality and inequality relationships, examine non linear array index functions for data dependences, and detect redundant inequalities in a set of constraints. We have implemented the algorithm and use it as part of P³T [5, 4], a performance estimator for parallel programs, and VFCS [1], a parallelizing compiler for data parallel programs on distributed memory parallel architectures. Experiments will be shown that demonstrate the usefulness of our approach. This paper is organized as follows: In Section 2 we present ....

T. Fahringer. Estimating and Optimizing Performance for Parallel Programs. IEEE Computer, 28(11):47 -- 56, November 1995.


SvPablo: A Multi-Language Architecture-Independent performance .. - De Rose, Reed (2000)   (6 citations)  (Correct)

No context found.

FAHRINGER, T. Estimating and Optimizing Performance for Parallel programs. IEEE


Is Predictive Tracing Too Late For HPC Users? - Kerbyson, Papaefstathiou.. (1998)   (Correct)

No context found.

T. Fahringer, Estimating and Optimizing Performance for Parallel Programs, IEEE Computer, Vol. 28(11), pp. 47-56 (1995).
