Results 1 - 10 of 744
A Survey of Rollback-Recovery Protocols in Message-Passing Systems
, 1996
Cited by 716 (22 self)
this paper, we use the terms event logging and message logging interchangeably
Lightweight causal and atomic group multicast
- ACM TRANSACTIONS ON COMPUTER SYSTEMS
, 1991
Evaluation of Release Consistent Software Distributed Shared Memory on Emerging Network Technology
Cited by 467 (43 self)
We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidate, and a new protocol called lazy hybrid. This lazy hybrid protocol combines the benefits of both lazy update and lazy invalidate. Our simulations indicate that with the processors and networks that are becoming available, coarse-grained applications such as Jacobi and TSP perform well, more or less independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications such as Cholesky achieve little speedup regardless of the protocol used because of the frequency of synchronization operations and the high latency involved. While the use of relaxed memory models, lazy implementations, and multiple-writer protocols has reduced the impact of false sharing, synchronization latency remains a serious problem for software distributed shared memory systems. These results suggest that future work on software DSMs should concentrate on reducing the amount of synchronization or its effect.
A Framework for Comparing Models of Computation
- IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
, 1998
Cited by 322 (67 self)
We give a denotational framework (a “meta model”) within which certain properties of models of computation can be compared. It describes concurrent processes in general terms as sets of possible behaviors. A process is determinate if, given the constraints imposed by the inputs, there are exactly one or exactly zero behaviors. Compositions of processes are processes with behaviors in the intersection of the behaviors of the component processes. The interaction between processes is through signals, which are collections of events. Each event is a value-tag pair, where the tags can come from a partially ordered or totally ordered set. Timed models are where the set of tags is totally ordered. Synchronous events share the same tag, and synchronous signals contain events with the same set of tags. Synchronous processes have only synchronous signals as behaviors. Strict causality (in timed tag systems) and continuity (in untimed tag systems) ensure determinacy under certain technical conditions. The framework is used to compare certain essential features of various models of computation, including Kahn process networks, dataflow, sequential processes, concurrent sequential processes with rendezvous, Petri nets, and discrete-event systems.
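The tagged-event vocabulary in this abstract can be made concrete with a small sketch. It assumes integer tags (a totally ordered tag set, i.e. a timed model) and models a signal as a set of value-tag pairs; the names `Event`, `tags`, and `are_synchronous` are hypothetical and chosen only for illustration, not taken from the paper.

```python
from typing import Any, FrozenSet, NamedTuple


class Event(NamedTuple):
    """An event is a value-tag pair; integer tags are an assumption here."""
    value: Any
    tag: int


Signal = FrozenSet[Event]


def tags(signal: Signal) -> FrozenSet[int]:
    """Return the set of tags that occur in a signal."""
    return frozenset(event.tag for event in signal)


def are_synchronous(a: Signal, b: Signal) -> bool:
    """Two signals are synchronous if they contain events with the same set of tags."""
    return tags(a) == tags(b)


# Hypothetical example signals.
s1: Signal = frozenset({Event("a", 0), Event("b", 1)})
s2: Signal = frozenset({Event(10, 0), Event(20, 1)})
s3: Signal = frozenset({Event(10, 0), Event(20, 2)})
```

Here `s1` and `s2` share the tag set {0, 1} and so count as synchronous under this reading, while `s3` does not.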
Optimistic replication
- ACM COMPUTING SURVEYS
, 2005
Cited by 290 (19 self)
Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems — ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence — and provides a comprehensive survey of techniques developed for addressing these challenges.
Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail
- DISTRIBUTED COMPUTING
, 1994
Cited by 230 (3 self)
The paper shows that characterizing the causal relationship between significant events is an important but non-trivial aspect for understanding the behavior of distributed programs. An introduction to the notion of causality and its relation to logical time is given; some fundamental results concerning the characterization of causality are presented. Recent work on the detection of causal relationships in distributed computations is surveyed. The issue of observing distributed computations in a causally consistent way and the basic problems of detecting global predicates are discussed. To illustrate the major difficulties, some typical monitoring and debugging approaches are assessed, and it is demonstrated how their feasibility is severely limited by the fundamental problem to master the complexity of causal relationships.
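As a rough illustration of how logical time can characterize causality, the sketch below uses vector clocks, one common logical-time mechanism. The abstract does not name a specific mechanism, so this choice is an assumption; the function names and the example timestamps are hypothetical.

```python
from typing import Tuple

# A vector timestamp with one counter per process (fixed number of processes assumed).
Clock = Tuple[int, ...]


def happened_before(a: Clock, b: Clock) -> bool:
    """a causally precedes b: a <= b in every component, and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: Clock, b: Clock) -> bool:
    """Neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


# Hypothetical timestamps in a three-process system.
send = (1, 0, 0)   # process 0 sends a message
recv = (1, 1, 0)   # process 1 receives it: merge sender's clock, then tick own entry
local = (0, 0, 1)  # process 2 performs an unrelated local event
```

With these values, `send` precedes `recv`, while `local` is concurrent with both.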
Building Secure and Reliable Network Applications
, 1996
Cited by 230 (16 self)
Abstractly, the remote procedure call problem, which an RPC protocol undertakes to solve, consists of emulating LPC using message passing. LPC has a number of "properties" -- a single procedure invocation results in exactly one execution of the procedure body, the result returned is reliably delivered to the invoker, and exceptions are raised if (and only if) an error occurs. Given a completely reliable communication environment, which never loses, duplicates, or reorders messages, and given client and server processes that never fail, RPC would be trivial to solve. The sender would merely package the invocation into one or more messages, and transmit these to the server. The server would unpack the data into local variables, perform the desired operation, and send back the result (or an indication of any exception that occurred) in a reply message. The challenge, then, is created by failures. Were it not for the possibility of process and machine crashes, an RPC protocol capable of overcomi...
Time Synchronization in Ad Hoc Networks
- IN ACM SYMPOSIUM ON MOBILE AD HOC NETWORKING AND COMPUTING (MOBIHOC 01)
, 2001
Cited by 210 (13 self)
Ubiquitous computing environments are typically based upon ad hoc networks of mobile computing devices. These devices may be equipped with sensor hardware to sense the physical environment and may be attached to real world artifacts to form so-called smart things. The data sensed by various smart things can then be combined to derive knowledge about the environment, which in turn enables the smart things to "react" intelligently to their environment. For this so-called sensor fusion, temporal relationships (X happened before Y) and real-time issues (X and Y happened within a certain time interval) play an important role. Thus physical time and clock synchronization are crucial in such environments. However, due to the characteristics of sparse ad hoc networks, classical clock synchronization algorithms are not applicable in this setting. We present a time synchronization scheme that is appropriate for sparse ad hoc networks.
Wireless Sensor Networks: A New Regime for Time Synchronization
- IN PROCEEDINGS OF THE FIRST WORKSHOP ON HOT TOPICS IN NETWORKS (HOTNETS-I)
, 2002
Cited by 198 (9 self)
Wireless sensor networks (WSNs) consist of large populations of wirelessly connected nodes, capable of computation, communication, and sensing. Sensor nodes cooperate in order to merge individual sensor readings into a high-level sensing result, such as integrating a time series of position measurements into a velocity estimate. The physical time of sensor readings is a key element in this process called data fusion. Hence, time synchronization is a crucial component of WSNs. We argue that time synchronization schemes developed for traditional networks such as NTP [21] are ill-suited for WSNs and suggest more appropriate approaches.
FastTrack: Efficient and Precise Dynamic Race Detection
Cited by 172 (8 self)
Multithreaded programs are notoriously prone to race conditions. Prior work on dynamic race detectors includes fast but imprecise race detectors that report false alarms, as well as slow but precise race detectors that never report false alarms. The latter typically use expensive vector clock operations that require time linear in the number of program threads. This paper exploits the insight that the full generality of vector clocks is unnecessary in most cases. That is, we can replace heavyweight vector clocks with an adaptive lightweight representation that, for almost all operations of the target program, requires only constant space and supports constant-time operations. This representation change significantly improves time and space performance, with no loss in precision. Experimental results on Java benchmarks including the Eclipse development environment show that our FASTTRACK race detector is an order of magnitude faster than a traditional vector-clock race detector, and roughly twice as fast as the high-performance DJIT+ algorithm. FASTTRACK is even comparable in speed to ERASER on our Java benchmarks, while never reporting false alarms.
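The cost difference this abstract describes can be sketched by contrasting a full vector-clock comparison, which is linear in the number of threads, with a check against a single (clock, thread) pair, which takes constant time. This is only an illustration under assumptions: the names `Epoch`, `vc_leq`, and `epoch_leq` and the example values are hypothetical, and the sketch omits the adaptive switching between representations that the abstract mentions.

```python
from typing import List, NamedTuple


class Epoch(NamedTuple):
    """A lightweight timestamp: one clock value plus the id of the thread that owns it."""
    clock: int
    tid: int


def vc_leq(a: List[int], b: List[int]) -> bool:
    """Full vector-clock comparison: O(n) in the number of threads."""
    return all(x <= y for x, y in zip(a, b))


def epoch_leq(e: Epoch, vc: List[int]) -> bool:
    """Constant-time check: the epoch is ordered before vc if its clock
    does not exceed vc's entry for the epoch's thread."""
    return e.clock <= vc[e.tid]


# Hypothetical state: the last write to a variable was by thread 1 at its clock value 4.
last_write = Epoch(clock=4, tid=1)

ordered_access = [2, 5, 1]  # accessing thread has already observed thread 1 at time 5
racy_access = [2, 3, 1]     # accessing thread has only observed thread 1 up to time 3
```

In this example `epoch_leq(last_write, ordered_access)` holds, so the access is ordered after the write, whereas `epoch_leq(last_write, racy_access)` fails, which a detector built on this check would treat as a potential race.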