6 citations found.
J. Vetter. Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies. In Proc. of the 14th International Conference on Supercomputing, pages 245--254, Santa Fe, New Mexico, May 2000.


This paper is cited in the following contexts:
Automatic Search for Performance Problems in Parallel.. - Fahringer, Seragiotto, .. (2002)

....for performance properties in message passing trace files in combination with source code analysis. Expert also covers OpenMP and mixed parallel programs, and uses the concept of performance properties organized in a hierarchy. Performance properties are also used in the Peridot project [6]. In [10] an approach is described that uses machine learning to detect performance problems in message passing codes. A decision tree trained for different target architectures is employed to detect individual communication performance problems. All of the previously mentioned tools concentrate ....

Jeffrey Vetter. Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies. In Proceedings of the 14th International Conference on Supercomputing, pp. 245-254, Santa Fe, New Mexico, May, 2000.


Scalable Analysis Techniques for Microprocessor Performance.. - Ahn, Vetter (2002)   (2 citations)  Self-citation (Vetter)

....raw data immediately with a visualization tool like VGV [7] or Paraver [4], feature extraction tools and rule based recommender systems [10, 16] can support the visualization process. For example, at step #, a decision tree algorithm could identify those messages that are performing abnormally [19] and identify them in the visualization with a special glyph or color. Very simply, this paper addresses these concerns by evaluating several multivariate statistical techniques on these datasets. We find that techniques such as statistical clustering offer promise for automatically ....

J.S. Vetter, Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies, Proc. ACM Int'l Conf. Supercomputing (ICS), 2000, pp. 245 - 54.


Dynamic Statistical Profiling of Communication Activity in.. - Vetter (2002)   Self-citation (Vetter)

....systems, which will have thousands, if not millions, of processors [1] is quickly outstripping the capabilities of traditional performance analysis techniques. While traditional trace based techniques for analyzing communication performance of distributed applications have demonstrated advantages [6, 8, 12 14, 17, 18, 20, 21, 24, 26], their operation on terascale platforms presents several challenges. In particular, these techniques require post mortem analysis of potentially massive tracefiles, which, in turn, can lead to high instrumentation overhead and flawed performance observations. Put simply, this paper proposes a ....

....performance analysis, highlight its limitations, and argue for statistical profiling of communication activity via message sampling. Trace based performance analysis of distributed applications is very useful because it provides users with detailed chronology of their application's execution [6, 8, 13, 14, 17, 18, 20, 21, 24, 26]. As illustrated in Figure 1, the typical operation of a trace based tool for analyzing communication operations on a distributed application is a multi step process. To make this process more concrete, we applied a widely used MPI tracing tool to SMG2000 on 48 tasks. This application sets up and ....

J.S. Vetter, "Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies," Proc. ACM Int'l Conf. Supercomputing (ICS), 2000, pp. 245 - 54.


Statistical Scalability Analysis of Communication Operations .. - Vetter, McCracken   Self-citation (Vetter)

....Figure 12: Select call sites for sPPM (axis: Aggregate Time (millisecs); call sites: Allreduce:main.f:1392, Wait:bdrys.f:1247, Wait:bdrys.f:1248, Allreduce:main.f:1025, Wait:bdrys.f:2101). Our analysis of sPPM confirmed our earlier investigation [22] and reinforced the experience of others [15]: sPPM scales very well. As illustrated in Table 3, our rank correlation identifies Allreduce:main.f:1392 as the main culprit while a host of Waits dominate the other positively correlated operations. As Figure 12 illustrates, Allreduce:main.f:1392 rockets ....

J.S. Vetter, "Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies," Proc. ACM Int'l Conf. Supercomputing (ICS), 2000, pp. 245 - 54.


Automatic Performance Analysis on Parallel Computers with SMP Nodes - Wolf (2002)   (1 citation)

No context found.

J. Vetter. Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies. In Proc. of the 14th International Conference on Supercomputing, pages 245--254, Santa Fe, New Mexico, May 2000.


Modeling Application Performance by Convolving Machine.. - Snavely, Wolter.. (2001)   (6 citations)

No context found.

J.S. Vetter, "Performance Analysis of Distributed Applications Using Automatic Classification of Communication Inefficiencies", Proc. ACM Int'l Conf. Supercomputing (ICS), 2000.

