David J. Taylor. The Use of Process Clustering in Distributed-System Event Displays. In Proceedings of the 1993.
....this regard. Any user of such a monitoring debugging tool cannot be expected to extract meaningful information out of a few thousand traces even if they could all be displayed in one screen. Some forms of abstraction are needed to overcome this problem. While some work has been done in this area [5, 28, 13, 14, 15, 16, 17], it is still quite limited. This is, unfortunately, beyond the scope of this paper. There is a second problem, equally hard, in scaling with respect to the number of processes that is the focus of this paper. In order to build partial order displays efficiently it is necessary to be able to ....
David J. Taylor. The Use of Process Clustering in Distributed-System Event Displays. In Proceedings of the 1993.
....one reason why an automatic predicate detector is required. It would be difficult for a person to watch the display of a large event history and manually detect when particular patterns occur. To aid in the understanding of complex displays two forms of abstraction can be used. Process clustering [26] allows multiple processes or traces to be collapsed together, hiding communication and events internal to the cluster. Event abstraction [19] allows abstract events (described in Chapter 2) to be created from multiple primitive events and displayed more compactly than by showing each primitive ....
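The clustering idea in the snippet above (collapse several processes into one cluster and hide communication internal to it) can be illustrated with a minimal sketch. This is not Taylor's algorithm or Poet's implementation; the `Message` type, the `cluster_view` function, and the process names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One communication event pair: a send on `sender`, a receive on `receiver`."""
    sender: str
    receiver: str
    label: str


def cluster_view(messages, cluster, cluster_name="cluster"):
    """Return the messages still visible after collapsing `cluster` into one trace.

    Messages whose endpoints are both inside the cluster are hidden.
    Messages crossing the cluster boundary are kept, with the inside
    endpoint renamed to `cluster_name`.
    """
    visible = []
    for m in messages:
        sender_inside = m.sender in cluster
        receiver_inside = m.receiver in cluster
        if sender_inside and receiver_inside:
            continue  # purely internal communication: hidden from the display
        visible.append(
            Message(
                cluster_name if sender_inside else m.sender,
                cluster_name if receiver_inside else m.receiver,
                m.label,
            )
        )
    return visible


messages = [
    Message("p1", "p2", "a"),  # internal to the cluster {p1, p2}
    Message("p2", "p3", "b"),  # leaves the cluster
    Message("p3", "p1", "c"),  # enters the cluster
]
view = cluster_view(messages, {"p1", "p2"})
```

In this toy model only messages are tracked; a real event display would also have to preserve the partial order among the events that remain on the cluster boundary.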
David J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of the 1993.
....different loads. A particular logic error may only occur on one run out of ten. To isolate and correct such an error, it may be necessary to reproduce the error run many times. To help reproduce such non-deterministic errors, a parallel debugger can include an execution replay mechanism (such as [24]). Execution replay works as follows. The program is executed with event logging until the error occurs. The debugger then forces the program to re-execute, following the same event order as the event log for the error run. The program can be re-executed in this order as many times as are ....
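The record-then-replay cycle described above can be sketched in a few lines. This is a toy model under stated assumptions, not the mechanism of [24] or of any particular debugger: each process is a fixed list of events, non-determinism comes only from which ready process runs next, and the names `run`, `record`, and `replay` are hypothetical.

```python
import random


def run(processes, choose):
    """Interleave per-process event lists; `choose` picks the next ready process."""
    pending = {name: list(events) for name, events in processes.items()}
    order = []
    while any(pending.values()):
        ready = sorted(name for name, events in pending.items() if events)
        name = choose(ready)
        order.append((name, pending[name].pop(0)))
    return order


def record(processes, seed):
    """Run with a pseudo-random scheduler and return the event log."""
    rng = random.Random(seed)
    return run(processes, rng.choice)


def replay(processes, log):
    """Re-execute, forcing the scheduler to follow the recorded event order."""
    recorded = iter(log)

    def choose(ready):
        name, _event = next(recorded)
        assert name in ready, "log is inconsistent with the program"
        return name

    return run(processes, choose)


processes = {"a": ["a1", "a2"], "b": ["b1"]}
log = record(processes, seed=7)
replayed = replay(processes, log)
```

Because `replay` takes every scheduling decision from the log, it reproduces the recorded interleaving on every run, which is the property that makes an intermittent error repeatable.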
D. Taylor. The Use of Process Clustering in Distributed-System Event Displays. CASCON '92, vol. 1, IBM Toronto, pp. 29-42, 1992.
....debugger has an event engine consisting of an event collector that collects and stores events that are generated by the distributed application and an event monitor that displays the events collected by the event collector. The event engine is based on David Taylor's event monitoring tool [Tay93] from the University of Waterloo. The remote server of the p2d2 debugger is somewhat like the back end of the IBM distributed debugger. Each remote server of the p2d2 debugger is merely an instance of GDB. The back end of the IBM distributed debugger consists of a debug demon and a debug engine. ....
David J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of CASCON '93, pages 505--512, Toronto, Oct 25-28 1993.
....After a method for single stepping is defined for post-mortem debugging, it is extended to replay mode. Furthermore, as the amount of collected event information in distributed executions is potentially large, abstraction methods are common in event visualisation tools [24, 32, 36, 41]. When such methods are used during a debugging session, single stepping must remain available to the user. Therefore, single stepping is also extended to deal with such abstractions. Finally, two main types of displays can be used by event visualisation tools: partial-order displays and real time ....
....for distributed applications. It is defined not only for post-mortem debugging but also for replay mode. A discussion of the difficulties faced when dealing with normal execution mode is also given. Furthermore, single stepping is defined to work with two methods of abstraction: process clustering [36] and event abstraction [24]. Single stepping is also described for the case of a real time ordering. Most of the concepts described in this thesis are currently implemented in Poet, a Partial Order Event Tracer, developed at the University of Waterloo [35]. The thesis is ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of the 1993 CAS Conference, pages 505--512, October 1993.
....the source code and compiler output, and resolution of run-time type information occurs at debug time instead of run time. The second issue with an event based approach is the large quantity of events in a visualization. Techniques for simplifying the visualization include process abstraction [27] and event abstraction [15]. Process abstraction reduces multiple processes in a process-time diagram into a single trace on the display. Similarly, event abstraction is a technique for reducing multiple events into a single event on the display. An approach for simplifying the visualization ....
....not require any a priori knowledge of an application. The reverse engineering approach attempts to reconstruct the design of an application using only the software system. Problems with maintaining partial orders and keeping the user aware of concurrent events within a cluster are discussed in [27]. Process clustering has not been pursued in this thesis for a couple of reasons. First, Poet already has facilities for manually clustering processes. Second, although there has been previous work [15] on automatic process clustering of Hermes [21] applications, there is little to learn by ....
David Taylor. The use of process clustering in distributed--system event displays. In Proceedings of IBM CASCON, pages 505--512, Toronto, Ontario, October 1993.
....facilities for event capture or recording time differences. PVM [8] provides a special service layer (like PICL) and distributed control but lacks event collection. MIDAS [1] provides control and instrumentation but only runs in a simulated environment, rather than on a real network. POET [9, 10] provides event collection but not experiment control. In fact, none of the other tools provides such complete control of an experiment as DECALS, as listed above. The next section outlines the main features of DECALS, then its structure and operation are briefly described. The range of potential ....
David J. Taylor. The use of process clustering in Distributed-System event displays. In CASCON 93, pages 505--512, 1993.
....[Figure 4: Sensor class hierarchy. Figure 5: Management coordinator class hierarchy.] (POET, Partial Order Event Tracer, is a tool for collecting and visualising event traces from the execution of distributed applications [19]. • Registration sensors: These sensors allow processes and applications to be registered with the management system so that it is aware of their existence. • Fault detection sensors: These sensors encapsulate information about remote procedure call timeouts and response times. • Resource ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of CASCON '93, Vol. 1, Software Engineering, pages 505--512, Toronto, Canada, October 25--28 1993.
....[Figure 4: Sensor class hierarchy. Figure 5: Management coordinator class hierarchy.] (POET, Partial Order Event Tracer, is a tool for collecting and visualising event traces from the execution of distributed applications [19]. instrumentation, to route internal messages, and to manage sensors and actuators. A scheduler routine was added to the coordinator to handle time based events; this was implemented as a separate execution thread. In the current prototype, the instrumentation library includes the following ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proc. CASCON '93, Vol. 1, Software Engineering, pages 505--512, Toronto, Canada, October 25--28 1993.
....are discussed. The implementation of the facilitators is based on extending POET to include debugging functions. POET was developed by David Taylor at the University of Waterloo and can be best described as a tool for instrumenting a distributed application to collect event traces of its execution [Tayl92, Tayl93]. Using POET as the foundation, the implementation described in this thesis has added distributed breakpoint and execution based replay facilities which collectively make up the facilitator manager. 5.1 Replay Facilitator The behaviour of a distributed application is non-deterministic because of ....
D.J. Taylor, `The use of process clustering in distributed-system event displays', Proceedings of the 1993 CASCON, pp. 505-512, October 1993.
....and line style are target specific. For each individual event, the user can pop up an information box which includes the event's type and any descriptive text. To assist in the understanding of large event traces, processes can be clustered together (thus hiding events internal to the cluster) [39], and primitive events can be grouped together into abstract events, either manually or through the application of automatic pattern detection mechanisms [24, 34]. Cluster traces are drawn on a shaded background. Abstract events are denoted by a rectangular box, which contains a filled square ....
David J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of the 1993 CAS Conference, pages 505--512, 1993.
....task. This paper describes a well defined method for single stepping in such a tool, which allows the user to better understand the behavior of the execution. The concepts explained here are currently implemented in Poet, a Partial Order Event Tracer, developed at the University of Waterloo [12, 13]. An example of Poet during the visualization of a PVM (Parallel Virtual Machine) [4] application is given in Figure 1. Within Poet, at the simplest level, each entity exhibiting sequential behavior is represented by a horizontal trace line. These entities can be processes, tasks, threads, ....
....behavior in a distributed execution, such as the sending or receiving of a message, or the creation or termination of a process. On the display, pairwise related events are joined by arrows, such as sending a message from one process to another. At a higher level of complexity, process clustering [13] and event abstraction [10] are used to simplify the view of the execution. We will discuss these concepts further in Section 6. A fundamental difference between traditional source code debuggers and event-visualization tools is that, unlike code statements, the events of an execution are not ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of the 1993 CAS Conference, pages 505--512. IBM Canada Ltd. Laboratory and National Research Council of Canada, October 1993.
....cluster are displayed, while events purely internal to a cluster are ignored. Process clusters therefore not only reduce the display space in the process dimension, but also in the time dimension. The cluster is always represented by one or more solid lines: the display details are discussed in [17]. In this figure, all processes are shown, either by displaying them individually or as part of a process cluster. Using such a high level visualization, it becomes far easier to analyze and reason about the execution. Figure 5: Intermediate process cluster view of the makehermes execution. Figure ....
David J. Taylor. The Use of Process Clustering in Distributed--System Event Displays. In Proceedings of the 1993 CAS Conference, pages 505--512, Toronto, Ont., Canada, October 1993. IBM Canada Ltd. Laboratory, Centre for Advanced Studies.
....in general, not possible. Instead, a set of totally ordered traces is needed, with the number of traces being equal to the dimension of the partial order of the events in the cluster interface. Some of the problems and their solutions in constructing cluster interface traces are discussed in [19, 60, 62]. A second problem is the identification of the appropriate processes and subclusters to be combined. The tool we developed clusters processes using semantic information that characterizes application processes and information about the actual interprocess communication at runtime. A first ....
....compiler is invoked. This execution creates a total of 175 processes, and the event trace contains 2534 primitive events. Figure 8: Low level Visualization of the makehermes Execution Figure 8 depicts a segment of the execution history at the lowest abstraction level, using the tool described in [61, 62]. This tool draws a set of horizontal lines, one for each process, placing a symbol on the appropriate line for each event. Time flows from left to right, and a scrollbar allows for scrolling in the vertical (process) dimension. Scrolling in the time dimension is more complex since it depends on ....
David J. Taylor. The Use of Process Clustering in Distributed--System Event Displays. In Proceedings of the 1993 CAS Conference, pages 505--512, Toronto, Ont., Canada, October 1993. IBM Canada Ltd. Laboratory, Centre for Advanced Studies.
....is allowed. The communication channels may or may not have the FIFO property. Processes can be created and terminated dynamically. The distributed application behaviour is frequently depicted using process-time diagrams [9, 25]. Figure 1 shows the visualization provided by the tool described in [27, 28]. This tool draws a set of horizontal lines, one for each process, placing a symbol on the appropriate line for each event. Time flows from left to right, and a scrollbar allows for scrolling in the vertical (process) dimension. Scrolling in the time dimension is more complex since it depends on ....
....cluster are displayed, while events purely internal to a cluster are ignored. Process clusters therefore not only reduce the display space in the process dimension, but also in the time dimension. The cluster is always represented by one or more solid lines: the display details are discussed in [28]. In this figure, all processes are shown, either by displaying them individually or as part of a process cluster. Figure 6 shows the following sequence of actions. After scanning the source of the definition module, process make knows all sources this module imports. It repeatedly invokes GETDEP ....
David J. Taylor. The Use of Process Clustering in Distributed--System Event Displays. In Proceedings of the 1993 CAS Conference, pages 505--512, Toronto, Ont., Canada, October 1993. IBM Canada Ltd. Laboratory, Centre for Advanced Studies.
D.J. Taylor. The use of process clustering in distributed-system event displays. In Proc. of the 1993 CAS Conf., pages 505--512, Toronto, Canada, Oct. 1993.
....its components. 3.1 Management Applications To validate the management services as well as to investigate aspects of management applications, it was important to have, as part of the prototype system, several management applications. 3.1.1 Event Visualisation POET, Partial Order Event Tracer [49], is a tool for collecting and visualising event traces from the execution of distributed applications. Although POET was originally intended for use as a debugging tool [50], its event displays are also useful for visualising the operation of an application in production use. POET's notion of an ....
....into abstract events and traces into clusters. These abstractions can, in turn, be grouped again into higher level abstractions, leading to tree structured abstraction hierarchies. The user can navigate the abstraction hierarchies to visualise the execution at an appropriate abstraction level [29, 49]. The abstraction hierarchies can be built manually or by using automatic tools [28, 30]. Various forms of pattern specification and matching to automatically scan the incoming event trace are also supported. Some examples of these facilities are the flagging of potential race situations, the ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of CASCON '93, Vol. 1, Software Engineering, pages 505--512, Toronto, Canada, October 25--28 1993.
....has been used in two slightly different ways, to handle what might be described as known and unknown target specific problems. An example of a known target specific problem is default clustering. Process clustering can be used to group traces and hide activity internal to sets of traces [8]. In some target environments, parts of the run time system are exposed in the event display, but are normally of little or no value to an application developer. Thus, a default clustering facility is provided that puts the exposed system traces into a cluster, removing them from view. The ....
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proceedings of CASCON '93, Volume I, pages 505--512, October 24--28 1993.
....representing some activity performed by a process and considered to take place at an instant in time. Typically, the lowest level of observed behaviour consists of events representing process interactions, such as sending and receiving messages and process creation and termination. Our tool, Poet [11, 12], displays processes and events using two-dimensional process-time diagrams. The placement of events along the time axis is based on either their occurrence in real time or their relationship to other events in the partial order introduced by Lamport [9]. Each display mode has value: for example, ....
....primitive events. To assist in understanding such executions, abstract visualizations are provided in which processes are grouped into process clusters and primitive events are grouped into abstract events. Such abstractions are either derived automatically [4, 7, 8] or created manually by a user [5, 12]. Poet allows a user to navigate the resulting hierarchy of abstract views, to collect increasingly detailed information for smaller parts of the execution, for example. Currently, Poet runs in a variety of target environments, such as OSF DCE, Hermes, ABC , C , and SR. This paper describes ....
David J. Taylor. The use of process clustering in distributed--system event displays. In Proceedings of the 1993 CAS Conference, pages 505--512, Toronto, Ont., Canada, October 1993. IBM Canada Ltd. Laboratory, Centre for Advanced Studies.
....ordered traces is needed, with the number of traces being equal to the dimension of the partial order of the events in the cluster interface. Discussions of some of the problems in constructing cluster interface traces can be found in [9] and [19]. Our prototype takes an even simpler approach [20]: as concurrency is encountered during display drawing, the number of trace lines associated with a cluster is increased as necessary. Automatic clustering. A second problem is the identification of the appropriate processes and subclusters to be combined. For large distributed applications, the ....
....across the cluster hierarchy tree, including exactly one node on each root-to-leaf path. The prototype allows the user to modify the focus by simply selecting a node that should be placed in the focus. A minimal set of other changes is then made to create a legitimate focus containing that node [20]. [Figure 3: The Architecture of the Shoshin Debugger] 3.3 Architecture of the Prototype The architecture adopted for the prototype is shown in Figure 3. The debugger consists of three ....
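The "add trace lines as concurrency is encountered" approach mentioned above might look roughly like the following greedy sketch. It is an illustration under assumptions, not the prototype's actual code: events are assumed to carry vector clocks and to arrive in an order consistent with the partial order, and the names `happened_before` and `assign_trace_lines` are hypothetical. A greedy pass of this kind does not guarantee the minimum number of lines.

```python
def happened_before(a, b):
    """Vector-clock test: a precedes b if a <= b componentwise and a != b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def assign_trace_lines(events):
    """Place each (name, clock) event on a totally ordered trace line.

    An event joins the first line whose latest event precedes it.
    If it is concurrent with the latest event of every existing line,
    a new line is opened for it.
    """
    lines = []
    for name, clock in events:
        for line in lines:
            if happened_before(line[-1][1], clock):
                line.append((name, clock))
                break
        else:
            lines.append([(name, clock)])  # concurrency encountered: add a line
    return lines


events = [
    ("e1", (1, 0)),
    ("e2", (0, 1)),  # concurrent with e1, so a second line is needed
    ("e3", (2, 1)),  # follows both e1 and e2; joins the first line
]
lines = assign_trace_lines(events)
```

Each resulting line is totally ordered, so it can be drawn as a single horizontal trace for the cluster.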
D. J. Taylor. The use of process clustering in distributed-system event displays. In Proc. CASCON '93, Vol. 1, Software Engineering, pages 505--512, Toronto, Ontario, October 25--28 1993.
....research. 2 Overview of the Replay Facilitator This paper is based on work done to add debugging functions to POET. POET can best be described as a tool for instrumenting a distributed application to collect and display event traces showing the execution of its component processes or threads [5, 6, 7]. Because POET can be used with a variety of application environments, the monitored entities with sequential behaviour are referred to as traces and may represent processes, threads, monitors, semaphores, etc. depending on the application environment. In OSF DCE, each trace represents a thread ....
D. Taylor. "The use of process clustering in distributed-system event displays". Proceedings of the 1993 CAS Conference, pages 505--512, October 1993.