Robert H. B. Netzer, Sairam Subramanian and Jian Xu, "Critical-Path-Based Message Logging for Incremental Replay of Message-Passing Programs," International Conference on Distributed Computing Systems, 1994.

This paper is cited in the following contexts:
Debugging Parallel Systems: A State of the Art Report - Huselius (2002)

....It is not necessarily so that the length of that timespan is the same for all types of entries, wherefore other structures could be preferable (see Section 6.5). 4.3.1 Checkpointing The reason for making checkpoints of a system is to be able to start over with the execution at some later point [37, 65]. There are to our knowledge three main applications for this ability: The first case applies to systems that can sense an error in their execution, and as a response to this can decide to roll back and try again. The second case applies to systems that have some source of non-determinism in them; ....

....messages in the reproduction of the system execution. Zambonelli and Netzer et al. discuss the situation in Critical Path Based Message Logging for Incremental Replay of Message Passing Programs and An Efficient Logging Algorithm for Incremental Replay of Message Passing Applications [37, 65]. The authors state that logging all messages is resource demanding during the reference execution, but recreating all messages during the reproduction can be very demanding during that process. This is therefore a trade-off situation. They discuss whether it would be possible to make a compromise: ....

[Article contains additional citation context not shown here]

Robert Netzer et al. Critical-Path-Based Message Logging for Incremental Replay of Message-Passing Programs. In Proceedings of the 14th International Conference on Distributed Computing Systems, pages 404 -- 413. IEEE, June 1994.


Deterministic Replay of Distributed Java Applications - Konuru, Srinivasan, Choi (2000)   (7 citations)

....global counter. Our scheme is, thereby, much simpler and more efficient than theirs on a uniprocessor system. Neither of these addresses replaying distributed applications. Netzer et al. address the issue of how to balance the overhead of logging during the record phase with the replay time [7]. Even for a closed world system, they store contents of messages selectively to avoid executing the program from the start. Combined with checkpointing [10] storing contents of messages allows for bounded time replay to arbitrary program points. 8. Conclusions We have developed a record replay ....

R. Netzer, S. Subramanian, and J. Xu. Critical-path-based message logging for incremental replay of message-passing programs. In Proceedings of the 14th IEEE International Conference on Distributed Computing Systems, June 1994.


Checkpointing with Multicast Communication - Lumpp, Jr., Dieter (1998)

....by non-deterministic events. During normal execution all non-deterministic events such as joining and leaving groups and sending and receiving messages are logged. Methods that reduce the message logging overhead by reexecuting previous checkpoint intervals to generate lost messages apply [12] [11]. Each process independently saves its state in local checkpoints periodically. When a process, P_i, fails, it rolls back to its most recent local checkpoint and restarts its execution from the checkpoint. Note that the state of the program when P_i rolls back to its most recent checkpoint may ....

Robert H. B. Netzer, Sairam Subramanian, and Jian Xu. Critical-path-based message logging for incremental replay of message-passing programs. In Proceedings of the International Conference on Distributed Computing Systems, June 1994.


Fault Recovery for Distributed Shared Memory Systems - Dieter, Lumpp, Jr.

.... trying to prevent rollbacks is to reduce the amount of time required to recover from a failure, while adding as little overhead as possible during failure-free execution. Techniques have been developed that bound the restart time and reduce the number of messages that need to be logged [30] [29] [28]. It is not necessary to log every message. It is sufficient to log only those messages that would cause rollback propagation because unlogged messages can be computed on the fly by executing parts of other processes. Some checkpointing systems use this information to reduce the number of messages ....

....to log every message. It is sufficient to log only those messages that would cause rollback propagation because unlogged messages can be computed on the fly by executing parts of other processes. Some checkpointing systems use this information to reduce the number of messages logged [30] [28] or to put a practical upper bound on the amount of time required to restart a process [29]. The advantage of this method is its low failure-free overhead and its near-bounded playback time. An alternative to allowing processes to checkpoint independently is to force them to coordinate when they ....

Robert H. B. Netzer, Sairam Subramanian, and Jian Xu. Critical-path-based message logging for incremental replay of message-passing programs. In Proceedings of the International Conference on Distributed Computing Systems, June 1994.


Optimal Run-Time Tracing of Message-Passing Programs - Karmarkar, Vaidya, Netzer   Self-citation (Netzer)

No context found.

Robert H. B. Netzer, Sairam Subramanian and Jian Xu, "Critical-Path-Based Message Logging for Incremental Replay of Message-Passing Programs," International Conference on Distributed Computing Systems, 1994.


An Efficient Logging Algorithm for Incremental Replay of.. - Zambonelli, Netzer (1999)   (2 citations)  Self-citation (Netzer)

....largest one, divided by the total number of processes) measure the average and the worst case off-line replay costs, respectively. One could criticize different metrics need to be introduced toward the effectiveness of the replay, such as the length of the longest sequential path needed to replay [6]. However, most of today's parallel and distributed architectures are not widely available at a cheap cost, and the replay activity cannot assume the availability of parallel executing resources. That makes it preferable to limit the total amount of computation rather than the parallel execution ....

....of each interval. With reference to figure 6 and the replay of I_{1,1}, one can resume (in addition to P_1 from C_{1,1}) P_2 from C_{2,0} and let its execution proceed until needed, i.e. over C_{2,1} and C_{2,2}, until the replay of I_{1,1} completes. This scheme, which is the one proposed in [7] and in [6], can also be applied in the case of the full informed algorithm. Though simple and elegant, Replay Scheme 2 tends to waste execution resources. Firstly, the execution of some process could also proceed over the right frontier of the replay set. In addition, the replay cannot skip the execution of ....

[Article contains additional citation context not shown here]

R. Netzer, S. Subramanian, and J. Xu. Critical-path-based message logging for incremental replay of message passing programs. In International Conference on Distributed Computing Systems, pages 404--413, June 1994.


Compressed Differences: An Algorithm for Fast Incremental.. - Plank, Xu, Netzer (1995)   (6 citations)  Self-citation (Netzer Xu)

....differences should improve the overhead of incremental checkpointing whenever F comp 0.16. We tested the performance of checkpointing on a variety of long running programs: LOG is a simulation program that computes the messages to be logged according to a critical path based logging algorithm [24]. CELL is a program that executes a 2048 × 2048 grid of cellular automata for fifteen generations. CONTOUR calculates altitude contours on a 2816 × 2179 byte map. SOLVE uses the dgesv subroutine from LAPACK [25] to solve a linear system with 1000 equations, 1000 unknowns, and 500 ....

R. H. B. Netzer, S. Subramanian, and J. Xu, "Critical-path-based message logging for incremental replay of message-passing programs," in 14th International Conference on Distributed Computing Systems, (Poznan, Poland), June 1994.
