J. Hollingsworth and B. Miller. Dynamic Control of Performance Monitoring of Large Scale Parallel Systems. In Proceedings of Super Computing (SC), Tokyo, Japan, July 1993.
....have attempted to use program information to reduce the volume of information measured at runtime. Sequential profiling tools such as QPT [1] use control flow analysis to reduce the volume of profiling or tracing data. Dynamic parallel instrumentation in the W 3 Search Model [13] reduces the instrumentation data volume by using sampled performance information to selectively insert instrumentation in interesting parts of a program at runtime, in order to answer specific performance queries. In contrast to these general purpose approaches, we can exploit ....
J. K. HOLLINGSWORTH AND B. P. MILLER, Dynamic control of performance monitoring on large scale parallel systems, in International Conference on Supercomputing, Tokyo, Jul 1993.
....devoted to traditional hard fault management issues in cluster environments. There have also been efforts exploring performance issues in parallel program execution. These efforts include evaluation of network effects [2, 7], performance analysis using application and kernel code instrumentation [4, 9, 11], performance prediction [8, 10], and program steering [3, 13]. Additionally, adaptive techniques have been explored for predictive signaling and control in cluster environments for performance management [12] and in highly distributed networks for use in fault management [5]. Our focus, however, ....
J. Hollingsworth and B. Miller. Dynamic control of performance monitoring on large scale parallel systems. In International Conference on Supercomputing, July 1993.
....specific tools to help users instrument, analyze, predict, and tune the performance of their parallel and distributed applications using a broad range of approaches. Most of the developed tools tend to be used at runtime, using performance data gathered by instrumenting the application program [2,3,4,5,8]. A survey conducted by Pancake and Cook [6] has revealed the fact that tool use is still appallingly low among the high performance computing community. Cherri et al. [7] have cited the following three critical causes for this situation: 1) current tools are difficult to understand by scientific ....
J.K. Hollingsworth and B.P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems," Proc. Int'l Conf. on Supercomputing, Tokyo, Jul 1993.
....is encountered, allowing indirect callees to be included in an optimized group [5]. Another candidate for future work is removal of user involvement in the initial step of choosing the group's root function, thus allowing all steps to be performed automatically. Paradyn's Performance Consultant [5, 11] has shown that bottlenecks can be automatically located for non-threaded user programs, via a call graph traversal. Other than emitting all hot chunks before any cold chunks, the relative placement of functions within a group is arbitrary. With future work, basic block positioning can be ....
J.K. Hollingsworth and B.P. Miller. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. Seventh ACM International Conference on Supercomputing (ICS), Tokyo, July 1993.
....95 0868 problems of the application and shows them to the application programmer, together with source code references of the problem found, and indications on how to overcome the problem. The main difference between the KAPPA PI tool and the existing automatic performance analysis tools [2] [3] [4] is that the code of the analysed application is checked to propose alternatives for a new behaviour. Analysis first considers the study of the trace file in order to locate the most important performance problems occurring at the execution. Once those problematic execution intervals have been ....
Hollingsworth, J. K., Miller, B. P. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. International Conference on Supercomputing (Tokyo, July 19-23, 1993).
....scheme for processor utilization that adds processor idle times to the processor utilization of the procedures that are active on other processors at the same time. This helps a user in optimizing his program by showing him the procedures that are relevant for optimization. The W 3 search model [8] introduces three axes along which performance bottlenecks can be tracked down. The why axis gives a hierarchy of potential performance bottlenecks. Along this axis it can be examined whether a program execution shows the symptoms of any of the performance bottlenecks on the axis. The where ....
J.K. Hollingsworth and B.P. Miller. Dynamic control of performance monitoring on large scale parallel systems. In 7th ACM International Conference on Supercomputing, July 1993.
....deal with the presentation of large volumes of resulting data that may be produced by massively parallel applications. The trace collection issue is not a recent one. A number of user tools are heavily based on traces, including correctness debuggers [5, 17, 26, 16, 22, 15], performance debuggers [9, 10, 19, 18, 8], trace driven simulators [3, 28, 6, 24, 4], etc. A broad classification [25] of tracing techniques distinguishes four basic classes: hardware based methods which use a hardware monitor to record all requests on the address bus of a processor, interrupt based methods which cause an interrupt on ....
J. K. Hollingsworth and B. P. Miller. "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems". Technical Report 1133, University of Wisconsin-Madison, 1993.
....and visualization for evaluating and tuning the performance of a final product. Where they address model development at all, it is to determine resource demands with little attention to model structure. Examples include: Paragraph [19], SIMPLE [12], ChaosMON [30], AIMS [51], PABLO [2], and W 3 [23]. There are three main contributions presented by this paper. First, TLC extends the model construction technique of [25] to handle interactions by asynchronous messages. Secondly, TLC formalizes the earlier, ad hoc model builder by using a rule based graph grammar approach for manipulating the ....
J. Hollingsworth and B. Miller. "Dynamic control of performance monitoring on large scale parallel systems." Proceedings of International Conference on Supercomputing, pages 19--23, July 1993.
....imposes the problems described in the previous section. The simulation of an asynchronous architecture has two main objectives: 1. The testing and debugging of the architecture. 2. The evaluation of the architecture's performance. Different techniques such as event record filtering [Holl93], event record clustering [Mohr90] or event abstraction [Bast94] have been proposed to reduce the amount of monitoring information in distributed systems. 7.3.1.1 Debugging For the debugging of the architecture (as well as the simulation model) it is necessary to ....
Hollingsworth, J., Miller, B. P., "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems", report accessible by anonymous ftp in grilled.cs.wisc.edu: technical papers/w3search.ps.Z.
....application. Instead of capturing data for the duration of execution, their system provides the capability to capture data for the interesting intervals (those requiring optimization) while ignoring those portions that are not problematic or are well understood. Hollingsworth and Miller [6] also describe an expert system called Performance Consultant which guides parallel application developers toward those areas of interest. During intervals when performance data capture is enabled, data is gathered from all nodes in the system. In contrast, the ETRUSCA process as described in this ....
Hollingsworth, J. K., and Miller, B. P. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. In 7th ACM International Conference on Supercomputing (Tokyo, July 1993), Association for Computing Machinery, pp. 185--194.
....statements. Recently, Cray Research released a new tool, the MPP Apprentice [6], that recognises the performance problems in message passing programs on the Cray T3D. The MPP Apprentice's approach is similar to ATExpert, but it supports a more general programming model. The W 3 Search Model [7] tries to answer three questions: why is the application performance poor (identifying the type of bottleneck, e.g. synchronisation, I/O, and computation), where is the bottleneck (isolating a performance bottleneck to a specific resource, e.g. a synchronisation variable, a disk system, or a ....
J. K. Hollingsworth and B. P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems", Proceedings of the 7th ACM International Conference on Supercomputing, Tokyo, July 1993, pp. 185--194.
....diagnosis. Accurate performance diagnosis of parallel and distributed programs is a difficult and time consuming task. In a recent survey of scientists actively engaged in parallel performance tuning [33], 50% reported an average time per tuning task of several weeks or longer. Recent research [28, 32, 70, 77] has examined possible approaches for automating, and thereby simplifying, the process of diagnosing a single program run. We present a novel approach to automated diagnosis that uses application data gathered in previous executions to guide the search for performance bottlenecks. This method ....
....hierarchies, and specific parts of a program are identified using a focus. This is a simpler, single-execution version of our representational scheme as described in Chapter 3; in fact, it was the starting point we used in developing our multi-execution model. Paradyn's Performance Consultant (PC) [32] capitalizes on dynamic instrumentation to automate bottleneck detection during a program execution. The PC starts searching for bottlenecks by issuing instrumentation requests to collect data for a set of pre-defined performance hypotheses for the root focus or whole program. Each hypothesis is ....
J.K. Hollingsworth and B.P. Miller. Dynamic control of performance monitoring on large scale parallel systems. Proceedings of the International Conference on Supercomputing, Tokyo, July 1993.
....accuracy. 1 Introduction Automating any part of the performance tuning cycle is a valuable activity, especially where intrinsically complex and non-deterministic distributed programs are concerned. Our previous research has developed techniques to automate the location of performance bottlenecks [4,9], and other tools can even make suggestions as to how to fix the program to improve its performance [3,8,10]. The Performance Consultant (PC) in the Paradyn Parallel Performance Tools has been used for several years to help automate the location of bottlenecks. The basic interface is a one-button ....
....(see Figure 1b). This simpler instrumentation also generates less run time overhead. A start stop pair of timer calls takes 56.5 s on a SGI Origin. The savings become more significant in functions that contain many call sites. 2.2 The Performance Consultant Paradyn's Performance Consultant (PC) [4,7] dynamically instruments a program with timer start and stop primitives to automate bottleneck detection during program execution .... [Figure 1: Timing instrumentation for function foo. (a) Exclusive Time; (b) Inclusive Time]
J.K. Hollingsworth and B.P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems", 7th Int'l Conf. on Supercomputing, Tokyo, Japan, July 1993.
....inserted into and removed from the running process's code. The interface between the tool and the dynamic instrumentation library dyninst is clearly defined and published [3]. On the higher level, Paradyn uses the so-called W 3 model to automatically determine performance bottlenecks [4, 10]. The W 3 model in particular profits from dyninst because intrusion can be kept low by removing unnecessary instrumentation. The above tools are two of the very few that define an explicit interface between the monitoring system and the tool itself. By clearly defining this interface it ....
J. K. Hollingsworth and B. P. Miller. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. In Intl. Conf. on Supercomputing, Tokyo, July 1993.
....application and IS process behavior with respect to the Paradyn IS in the following subsections. Behavior of Application Processes A Paradyn daemon dynamically inserts instrumentation code in the binary image of an executing process as needed by the W 3 search algorithm executed by its mPp [11]. The instrumentation code is removed when the algorithm no longer needs to collect instrumentation data from that application process. Therefore, an application process alternates between the periods of instrumentation and no instrumentation during its execution. An alternating renewal process is ....
Hollingsworth, J. K. and B. P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems," Proc. of Int. Conf. on Supercomputing, Tokyo, Japan, July 19--23, 1993.
....to deal with the presentation of large volumes of result data that may be produced by massively parallel applications. The trace collection issue is not a recent one. A number of user tools are heavily based on traces, including correctness debuggers [4, 15, 24, 14, 20, 13], performance debuggers [8, 9, 17, 16, 7], trace driven simulators [2, 26, 5, 22, 3], etc. A broad classification [23] of tracing techniques distinguishes four basic classes: hardware based methods which use a hardware monitor to record all requests on the address bus of a processor, interrupt based methods which cause an interrupt on ....
J. K. Hollingsworth and B. P. Miller. "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems". Technical Report 1133, University of Wisconsin-Madison, 1993.
....requirements for several reasons: 1 This research is supported by DARPA under Rome Labs contract AF 30602 92 C 0135. R1) Providing a user (program level) view. Tool development has been dominated by efforts directed at the execution level (e.g. efficient implementation of monitoring [1]). In consequence, tool users are given little support for translating program level semantics to and from low level execution measurements and runtime data. R2) Support for high level, parallel programming languages. The development of advanced parallel languages (e.g. HPF [6] and pC [14] ....
J. Hollingsworth and B. Miller. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. Proceedings of the 1993 International Conference on Supercomputing, July 1993.
....removed from the running process code. The interface between the tool and the dynamic instrumentation library dyninst is clearly defined and published [1]. On the higher level, Paradyn uses the so-called W 3 model to automatically determine performance bottlenecks in the running application [2, 8]. The W 3 model in particular profits from dynamic instrumentation because intrusion can be kept low by removing instrumentation that is no longer necessary. In summary, Paradyn clearly represents the state of the art of performance analysis. In contrast to earlier developments, the above two ....
J. K. Hollingsworth and B. P. Miller. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. In International Conference on Supercomputing, Tokyo, July 1993.
....in reduced data volume. Conversely, if the data volume falls below a certain level, the system will start generating more detailed information. In either case, the nature of the performance data being collected will not change; only the level of detail will be affected. Hollingsworth and Miller [18] developed a system that incorporates on the fly decision making about what performance data to collect. This allows the volume of performance data collected to be dynamically controlled based on the current analysis task at hand. The closest analogs to our work within the virtual reality ....
Hollingsworth, J. K., and Miller, B. P. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. University of Wisconsin-Madison, 1993.
....visualizations, and search tools. Profile metrics [1, 6, 15, 22] associate a value with each component of a distributed or parallel application (frequently procedures) and are presented as sorted tables. Visualizations [8, 13, 14, 18, 23] explain application performance using pictures. Search tools [10, 17, 21] help users to manage performance data information overload by treating the problem of finding a performance bottleneck as a search problem. However, all of these tools focus on the measurement and analysis of a specific program for a single execution. One type of tool that permits programmers to ....
J. K. Hollingsworth and B. P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems," 7th ACM International Conf. on Supercomputing. July 1993, Tokyo, pp. 185-194.
....for a performance bottleneck, for example) to instrument only a few top-level routines at a time. If the results show a bottleneck in a function, then the instrumentation for it can be removed and applied to the function's callees. This strategy can be used to automate the search for bottlenecks [14, 42], with little run-time overhead, because at any given time only a few functions are instrumented. This algorithm is prohibitive with any static instrumentation system, because the program would have to be re-run every time the instrumentation changes. Static instrumentation systems typically ....
....because there can be many callees of any one indirect call instruction. Another candidate for future work is to remove user involvement in the first code positioning step (choosing the group's root function), thus allowing all steps to be performed automatically. Paradyn's Performance Consultant [14, 42] has already shown that bottlenecks can be automatically located for non-threaded user programs, through a call graph traversal. It should be possible to develop a multi-thread aware Performance Consultant, and then adapt it to kperfmon, which ....
[Article contains additional citation context not shown here]
J.K. Hollingsworth and B.P. Miller. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. Seventh ACM International Conference on Supercomputing (ICS), Tokyo, July 1993.
....in a program; a performance problem is a part of the program that contributes a significant amount of time to its execution. A single execution of a program may contain several problems. To assist in finding performance problems, Paradyn has a well-defined model, called the W 3 Search Model [3], that organizes information about a program's performance. Paradyn's Performance Consultant module employs the W 3 Search Model to automate the identification of performance problems by using data gathered via dynamic instrumentation. 4.1. THE W 3 SEARCH MODEL The W 3 ....
J. K. Hollingsworth and B. P. Miller, "Dynamic Control of Performance Monitoring on Large Scale Parallel Systems", 7th ACM International Conf. on Supercomputing, Tokyo, July 1993, pp. 185-194.
....to identify bottlenecks, decreases the amount of unhelpful instrumentation, and improves the usefulness of the information obtained from a diagnostic session. 1 INTRODUCTION Accurate performance diagnosis of parallel and distributed programs is a difficult and time consuming task. Recent research [1, 2, 14, 3, 4] examines possible approaches for automating, and thereby simplifying, the process of diagnosing a single program run. This paper describes how historical performance data, i.e. data gathered in one or more previous executions of an application, can be used to increase the effectiveness of ....
....that uses dynamic instrumentation to insert and delete measurement instrumentation as a program runs. This approach results in a relatively small amount of data, in contrast to most tracing methods that may result in (possibly unusably) large data files. Paradyn's Performance Consultant (PC) [2] capitalizes on this dynamic instrumentation to automate bottleneck detection during a program execution. The PC starts searching for bottlenecks by issuing instrumentation requests to collect data for a set of pre-defined performance hypotheses for the whole program. Each hypothesis is based on a ....
J. K. Hollingsworth and B. P. Miller. Dynamic control of performance monitoring on large scale parallel systems. In Proceedings of the International Conference on Supercomputing, Tokyo, July 1993.
....can be huge. However, in practice a small amount of information is often sufficient to reveal the key bottlenecks. Performance debuggers exist to help programmers find the gems of understanding among the large space of available performance data. In this section we review the W 3 Search Model[4], a system that provides a structured way for programmers to quickly and precisely isolate a performance problem without having to examine a large amount of extraneous information. It is based on answering three separate questions: why is the application performing poorly, where is the bottleneck, ....
J. K. Hollingsworth and B. P. Miller, Dynamic Control of Performance Monitoring on Large Scale Parallel Systems, in Proceedings of the 7th ACM International Conference on Supercomputing, Tokyo, July 1993, pp. 185--194.
No context found.
J. Hollingsworth and B. Miller. Dynamic Control of Performance Monitoring of Large Scale Parallel Systems. In Proceedings of Super Computing (SC), Tokyo, Japan, July 1993.