Elements of distributed computing (2002)

by V K Garg

Results 1 - 10 of 67

Computation Slicing: Techniques and Theory

by Neeraj Mittal, Vijay K. Garg - In Proceedings of the Symposium on Distributed Computing (DISC , 2001
Cited by 23 (10 self)
We generalize the notion of slice introduced in our earlier paper [6]. A slice of a distributed computation with respect to a global predicate is the smallest computation that contains all consistent cuts of the original computation that satisfy the predicate. We prove that a slice exists for all global predicates. We also establish that it is, in general, NP-complete to compute the slice. An optimal algorithm to compute slices for special cases of predicates is provided. Further, we present an efficient algorithm to graft two slices, that is, given two slices, either compute the smallest slice that contains all consistent cuts that are common to both slices or compute the smallest slice that contains all consistent cuts that belong to at least one of the slices. We give applications of slicing in general and grafting in particular to global property evaluation of distributed programs. Finally, we show that the results pertaining to consistent global checkpoints [14, 18] can be derived as special cases of computation slicing.

Citation Context

...er Engineering The University of Texas at Austin, Austin, TX 78712, USA garg@ece.utexas.edu http://www.ece.utexas.edu/~garg Abstract. We generalize the notion of slice introduced in our earlier paper [6]. A slice of a distributed computation with respect to a global predicate is the smallest computation that contains all consistent cuts of the original computation that satisfy the predicate. We prove...
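
The slice definition in the abstract above can be illustrated with a small brute-force sketch. This is not the authors' algorithm (the abstract notes that computing a slice is NP-complete in general); it only enumerates cuts on a toy example. It assumes a cut is represented as a tuple of per-process event counts, that the consistent cuts of the slice form the smallest set containing the satisfying cuts and closed under componentwise minimum and maximum, and the names `consistent_cuts` and `slice_cuts` are hypothetical.

```python
from itertools import product


def consistent_cuts(lengths, messages):
    """Enumerate all consistent cuts of a toy computation.

    lengths[i] is the number of events on process i. Each message is
    ((p, s), (q, r)): the s-th event on process p is a send that is
    received as the r-th event on process q (events are 1-based).
    A cut is a tuple giving how many events of each process it includes.
    """
    cuts = []
    for cut in product(*(range(n + 1) for n in lengths)):
        # Consistent: a receive may be included only if its send is included.
        if all(cut[q] < r or cut[p] >= s for (p, s), (q, r) in messages):
            cuts.append(cut)
    return cuts


def slice_cuts(cuts, predicate):
    """Close the satisfying cuts under componentwise min and max.

    Assumption (labeled in the lead-in): the result is the set of
    consistent cuts of the slice for this predicate.
    """
    sat = {c for c in cuts if predicate(c)}
    changed = True
    while changed:
        changed = False
        for a in list(sat):
            for b in list(sat):
                for c in (tuple(map(min, a, b)), tuple(map(max, a, b))):
                    if c not in sat:
                        sat.add(c)
                        changed = True
    return sat


# Two processes with two events each; event 1 on process 0 is a send
# that is received as event 2 on process 1.
cuts = consistent_cuts((2, 2), [((0, 1), (1, 2))])
print(len(cuts))  # 8 (only the cut (0, 2) is inconsistent)
print(sorted(slice_cuts(cuts, lambda c: c[0] + c[1] == 2)))
# [(1, 0), (1, 1), (2, 0), (2, 1)]
```

In this example only two cuts satisfy the predicate, but the slice contains four: the two extra cuts are the price of representing the satisfying cuts as a computation.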

Partial Order Trace Analyzer (POTA) for Distributed Programs

by Alper Sen, Vijay K. Garg - In Proceedings of the Third Workshop on Runtime Verification (RV , 2003
Cited by 22 (6 self)
Checking the correctness of software is a growing challenge. In this paper, we present a prototype implementation of Partial Order Trace Analyzer (POTA), a tool for checking execution traces of both message passing and shared memory programs using temporal logic. So far, runtime verification tools have used the total order model of an execution trace, whereas POTA uses a partial order model. The partial order model enables us to capture a possibly exponential number of interleavings and, in turn, this allows us to find bugs that are not found using a total order model. However, verification in the partial order model suffers from the state explosion problem: the number of possible global states in a program increases exponentially with the number of processes. POTA employs an effective abstraction technique called computation slicing. A slice of a computation (execution trace) with respect to a predicate is the computation with the least number of global states that contains all global states of the original computation for which the predicate evaluates to true. The advantage of this technique is that it mitigates the state explosion problem by reasoning only on the part of the global state space that is of interest. In POTA, we implement computation slicing algorithms for temporal logic predicates from a subset of CTL. The overall complexity of evaluating a predicate in this logic upon using computation slicing becomes polynomial in the number of processes compared to exponential without slicing. We illustrate the effectiveness of our techniques in POTA on test cases such as the General Inter-ORB Protocol (GIOP) [18]. POTA also contains a module that translates execution traces to Promela [16] (the input language of SPIN). This module enables us to compare our results on execution traces with SPIN. In some cases, we were able to verify traces with 250 processes compared to only 10 processes using SPIN.

Citation Context

... those global states for which the predicate evaluates to true. Regular predicates widely occur in practice during verification. Some examples of regular predicates are conjunction of local predicates [10,17] such as "all processes are in red state", certain channel predicates [10] such as "at most k messages are in transit from process P_i to P_j", and some relational predicates [10]. To illustrate pred...

Detecting temporal logic predicates in distributed programs using computation slicing

by Vinit A. Ogale, Vijay K. Garg - IN: 7TH INTERNATIONAL CONFERENCE ON PRINCIPLES OF DISTRIBUTED SYSTEMS , 2003
Cited by 17 (5 self)
We examine the problem of detecting nested temporal predicates given the execution trace of a distributed program. We present a technique that allows efficient detection of a reasonably large class of predicates which we call the Basic Temporal Logic or BTL. Examples of valid BTL predicates are nested temporal predicates based on local variables with arbitrary negations, disjunctions, conjunctions and the possibly (EF or ◇) and invariant (AG or □) temporal operators. We introduce the concept of a basis, a compact representation of all global cuts which satisfy the predicate. We present an algorithm to compute a basis of a computation given any BTL predicate and prove that its time complexity is polynomial with respect to the number of processes and events in the trace although it is not polynomial in the size of the formula. We do not know of any other technique which detects a similar class of predicates with a time complexity that is polynomial in the number of processes and events in the system. We have implemented a predicate detection toolkit based on our algorithm that accepts offline traces from any distributed program.

Detecting Temporal Logic Predicates on the Happened-Before Model

by Alper Sen, Vijay K. Garg - In Proc. of the International Parallel and Distributed Processing Symposium (IPDPS), Fort , 2001
Cited by 14 (6 self)
... in distributed computing. In this paper we describe new predicate detection algorithms for certain temporal logic predicates. We use a temporal logic, CTL, for specifying properties of a distributed computation and interpret it on a finite lattice of global states. We present solutions to the predicate detection of linear and observer-independent predicates under EG and AG operators of CTL. For linear predicates we develop polynomial-time predicate detection algorithms which exploit the structure of finite distributive lattices. For observer-independent predicates we prove that predicate detection is NP-complete under EG operator and co-NP-complete under AG operator. We also present polynomial-time algorithms for a CTL operator called until, for which such algorithms did not exist. Finally, our work unifies many earlier results in predicate detection in a single framework.

Citation Context

...me examples of the predicates for which the predicate detection can be solved efficiently are: conjunctive [10, 13], disjunctive [10], stable [2], observer-independent [3, 4], linear [4], and regular [9, 18] predicates. Our work is different from model checking [8, 14], which checks that a predicate is satisfied for all computations of a program. We check that a predicate is satisfied for a single comput...

Formal Verification of Simulation Traces Using Computation Slicing

by Alper Sen, Vijay K. Garg - IEEE Trans. Computers , 2007
Cited by 12 (5 self)
Concurrent and distributed systems, such as System-on-Chips (SoCs), present an immense challenge for verification due to their complexity and inherent concurrency. Traditional approaches for eliminating errors in concurrent and distributed systems include formal methods and simulation. We present an approach toward combining formal methods and simulation in a technique called Predicate Detection (aka Runtime Verification), while avoiding the complexity of formal methods and the pitfalls of ad hoc simulation. Our technique enables efficient formal verification on execution traces of actual scalable systems. Traditional simulation methodologies are woefully inadequate in the presence of concurrency and subtle synchronization. The bug in the system may appear only when the ordering of concurrent events is different from the ordering in the simulation trace. We use a Partial Order Trace Model rather than the traditional total order trace model and we get the benefit of properly dealing with concurrent events and especially of detecting errors from analyzing successful total order traces. Surprisingly, checking properties, even on a finite partial order trace, is NP-complete in the size of the trace description (the state-explosion problem). Our approach to ameliorating state explosion in the partial order trace model uses two techniques: 1) slicing and 2) exploiting the structure of the property itself, by imposing restrictions, to evaluate its value efficiently for a given execution trace. Intuitively, the slice of a trace with respect to a property is a subtrace that contains all of the global states of the trace that satisfy the property such that it is computed efficiently (without traversing the state space) and represented concisely (without explicit representation of individual states). We present temporal slicing algorithms with respect to properties in temporal logic RCTL+. We show how to use the slicing algorithms for efficient predicate detection of design ...

Citation Context

...iques: 1) slicing and 2) exploiting the structure of the predicate itself, by imposing restrictions, to evaluate its value efficiently for a given execution trace. Computation slicing was introduced in [9], [10], [11], [12] as an abstraction technique for analyzing traces of distributed programs. Intuitively, a slice of a trace with respect to a specification p is a subtrace that contains all the state...

Optimal early stopping uniform consensus in synchronous systems with process omission failures

by Michel Raynal - In Proceedings of the Sixteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures , 2004
Cited by 8 (1 self)
Consensus is a central problem of fault-tolerant distributed computing that, in the context of synchronous distributed systems, has received a lot of attention in the crash failure model and in the Byzantine failure model. This paper considers synchronous distributed systems made up of n processes, where up to t can commit failures by crashing or omitting to send or receive messages when they should (“process omission” failure model). It presents a protocol solving uniform consensus in such a context. This protocol has several noteworthy features. First, it is particularly simple. Then, it is optimal both in (1) the number of communication steps needed for processes to decide and stop, namely, min(f + 2, t + 1) where f is the actual number of faulty processes, and (2) the number of processes that can be faulty, namely t < n/2. Moreover, (3) it ensures that no process (be it correct or faulty) executes more than min(f + 2, t + 1) rounds, thereby extending the decision lower bound to the full completion time. The design of a uniform consensus protocol with such optimality requirements was an open problem. Interestingly, as min(f + 2, t + 1) is a lower bound to solve uniform consensus in the synchronous crash failure model, the proposed protocol shows that uniform consensus is not “harder” in the omission failure model than in the crash failure model. The protocol is also message size efficient as, in addition to values, a message has to piggyback only n bits of control information.

Citation Context

...When a process p_i attains r such that r > |suspected_i| (line 15), then it learns that (1) it knows all the values that can be known at r, and (2) consequently no more values can be learnt in the future [10]. We then say that p_i becomes locked. When this occurs, p_i adds its name to the set locked_i. In addition to the new values it learnt during the round r − 1, a process p_i propagates its set locked_i dur...
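
The two bounds quoted in the abstract above, min(f + 2, t + 1) rounds and t < n/2 faulty processes, can be checked with a short worked example. This is only arithmetic on the stated formulas, not the protocol itself; the function names `max_rounds` and `resilience_ok` are hypothetical.

```python
def max_rounds(f: int, t: int) -> int:
    """Round bound quoted in the abstract: min(f + 2, t + 1).

    f is the actual number of faulty processes and t is the maximum
    number of faults the system is designed to tolerate.
    """
    if not 0 <= f <= t:
        raise ValueError("expected 0 <= f <= t")
    return min(f + 2, t + 1)


def resilience_ok(n: int, t: int) -> bool:
    """Resilience condition quoted in the abstract: t < n/2."""
    return 2 * t < n


# With n = 7 processes, t = 3 satisfies t < n/2.
print(resilience_ok(7, 3))                     # True
print([max_rounds(f, 3) for f in range(4)])    # [2, 3, 4, 4]
```

The list shows the early-stopping behaviour: with no actual failures the bound is 2 rounds, and it grows with f until it is capped at t + 1.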

Composable Reliability for Asynchronous Systems

by Sunghwan Yoo, Steven Plite
Cited by 8 (3 self)
Distributed systems often employ replication to solve two different kinds of availability problems. First, to prevent the loss of data through the permanent destruction or disconnection of a distributed node, and second, to allow prompt retrieval of data when some distributed nodes respond slowly. For simplicity, many systems further handle crash-restart failures and timeouts by treating them as a permanent disconnection followed by the birth of a new node, relying on peer replication rather than persistent storage to preserve data. We posit that for applications deployed in modern managed infrastructures, delays are typically transient and failed processes and machines are likely to be restarted promptly, so it is often desirable to resume crashed processes from persistent checkpoints. In this paper we present MaceKen, a synthesis of complementary techniques including Ken, a lightweight and decentralized rollback-recovery protocol that transparently masks crash-restart failures by careful handling of messages and state checkpoints; and Mace, a programming toolkit supporting development of distributed applications and application-specific availability via replication. MaceKen requires near-zero additional developer effort—systems implemented in Mace can immediately benefit from the Ken protocol by virtue of following the Mace execution model. Moreover, Ken allows multiple, independently developed application components to be seamlessly composed, preserving strong global reliability guarantees. Our implementation is available as open source software.

Citation Context

...explain how Ken provides distributed consistency, output validity, and composable reliability. To understand these concepts, consider Figure 1, illustrating standard concepts of distributed computing [12]. In the figure, time advances from left to right. Distributed computing processes p1, p2, and p3 are represented by horizontal lines. Processes can exchange messages with each other, represented by d...

Techniques and Applications of Computation Slicing

by Neeraj Mittal, Vijay K. Garg - DISTRIBUTED COMPUTING
Cited by 7 (2 self)
Writing correct distributed programs is hard. In spite of extensive testing and debugging, software faults persist even in commercial grade software. Many distributed systems should be able to operate properly even in the presence of software faults. Monitoring the execution of a distributed system, and, on detecting a fault, initiating the appropriate corrective action is an important way to tolerate such faults. This gives rise to the predicate detection problem which requires finding whether there exists a consistent cut of a given computation that satisfies a given global predicate. Detecting a predicate in a computation is, however, an NP-complete problem in general. In order to ameliorate the associated combinatorial explosion problem, we introduce the notion of computation slice. Formally,

Citation Context

... (of the lattice of consistent cuts). Some examples of regular predicates are: conjunctive predicates, which can be expressed as conjunction of local predicates, like “all processes are in red state” [Gar02b], and monotonic channel predicates such as “all control messages have been received” [Gar02b]. We prove that the class of regular predicates is closed under conjunction, that is, the conjunction of tw...

Software Fault Tolerance of Distributed Programs using Computation Slicing

by Neeraj Mittal - In Proceedings of the 23rd IEEE International Conference on Distributed Computing Systems (ICDCS , 2003
Cited by 7 (5 self)
Writing correct distributed programs is hard. In spite of extensive testing and debugging, software faults persist even in commercial grade software. Many distributed systems, especially those employed in safety-critical environments, should be able to operate properly even in the presence of software faults. Monitoring the execution of a distributed system, and, on detecting a fault, initiating the appropriate corrective action is an important way to tolerate such faults. This gives rise to the predicate detection problem which involves finding a consistent cut of a distributed computation, if it exists, that satisfies the given global predicate. Detecting a predicate in a computation is, however, an NP-complete problem. To ameliorate the associated combinatorial explosion problem, we introduced the notion of a computation slice in our earlier papers [5, 10]. Intuitively, a slice is a concise representation of those consistent cuts that satisfy a certain condition. To detect a predicate, rather than searching the state-space of the computation, it is much more efficient to search the state-space of the slice. In this paper, we provide efficient algorithms to compute the slice for several classes of predicates. Our experimental results demonstrate that slicing can lead to an exponential improvement over existing techniques in terms of time and space.

Algorithmic Combinatorics based on Slicing Posets

by Vijay K. Garg
Cited by 6 (5 self)
Abstract not found