Results 1 - 10 of 1,208
Virtual time
ACM Transactions on Programming Languages and Systems, 1985
Cited by 980 (7 self)
Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way that virtual memory provides an abstraction of real memory. It is implemented using the Time Warp mechanism, a synchronization protocol distinguished by its reliance on lookahead-rollback, and by its implementation of rollback via antimessages.
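The rollback-via-antimessages idea in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the class and function names (`Message`, `TimeWarpProcess`, `annihilate`) are hypothetical, the process state is just a running total, and real Time Warp machinery such as global virtual time and fossil collection is left out. It shows only the core move: when a straggler arrives with a timestamp earlier than work already done, the process restores saved state, sends antimessages for the outputs it had produced, and re-executes in timestamp order.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Message:
    recv_time: int  # virtual time at which the receiver should process it
    payload: int
    sign: int = 1   # +1 for a normal message, -1 for an antimessage

    def anti(self) -> "Message":
        """Return the message that cancels this one."""
        return Message(self.recv_time, self.payload, -self.sign)


class TimeWarpProcess:
    """Toy optimistic process (hypothetical): state is a running total."""

    def __init__(self) -> None:
        self.state = 0
        self.lvt = 0  # local virtual time
        # Each entry: (input message, state before it ran, output it caused).
        self.history: List[Tuple[Message, int, Message]] = []
        self.outbox: List[Message] = []  # everything sent, antimessages included

    def _execute(self, msg: Message) -> None:
        saved = self.state
        self.state += msg.payload
        out = Message(msg.recv_time + 1, self.state)
        self.history.append((msg, saved, out))
        self.outbox.append(out)
        self.lvt = msg.recv_time

    def receive(self, msg: Message) -> None:
        redo: List[Message] = []
        # Straggler handling: undo every event with a later timestamp.
        while self.history and self.history[-1][0].recv_time > msg.recv_time:
            old_in, saved, old_out = self.history.pop()
            self.state = saved                  # restore the saved state
            self.outbox.append(old_out.anti())  # cancel the stale output
            redo.append(old_in)
        # Re-execute the straggler and the undone inputs in timestamp order.
        for m in sorted([msg] + redo, key=lambda m: m.recv_time):
            self._execute(m)


def annihilate(messages: List[Message]) -> List[Message]:
    """Cancel each message against its antimessage, as a receiver would."""
    remaining: List[Message] = []
    for m in messages:
        if m.anti() in remaining:
            remaining.remove(m.anti())
        else:
            remaining.append(m)
    return remaining


if __name__ == "__main__":
    p = TimeWarpProcess()
    for m in (Message(10, 5), Message(30, 7), Message(20, 1)):  # last one is a straggler
        p.receive(m)
    print(p.state, annihilate(p.outbox))
```

In the demo, the message stamped 20 arrives after the one stamped 30 has already been processed, so the process rolls back one event, emits one antimessage, and ends in the same state it would have reached had the inputs arrived in order.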
Virtual time and global states of distributed systems.
Proc. Workshop on Parallel and Distributed Algorithms, 1989
Cited by 744 (5 self)
A distributed system can be characterized by the fact that the global state is distributed and that a common time base does not exist. However, the notion of time is an important concept in everyday life of our decentralized "real world" and helps to solve problems like getting a consistent population census or determining the potential causality between events. We argue that a linearly ordered structure of time is not (always) adequate for distributed systems and propose a generalized non-standard model of time which consists of vectors of clocks. These clock-vectors are partially ordered and form a lattice. By using timestamps and a simple clock update mechanism the structure of causality is represented in an isomorphic way. The new model of time has a close analogy to Minkowski's relativistic spacetime and leads among others to an interesting characterization of the global state problem. Finally, we present a new algorithm to compute a consistent global snapshot of a distributed system where messages may be received out of order.
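A minimal sketch of the clock-vector idea may help here. The abstract does not spell out the update rule, so this assumes the commonly described one: a process increments its own component on every local or send event, and on receipt takes the component-wise maximum with the incoming timestamp before incrementing. The names `VectorClock`, `happened_before`, `concurrent`, and `join` are illustrative, not taken from the paper.

```python
from typing import List


class VectorClock:
    """One process's clock vector in a system of n processes (illustrative)."""

    def __init__(self, n: int, pid: int) -> None:
        self.pid = pid
        self.v = [0] * n

    def tick(self) -> List[int]:
        """Local or send event: advance own component, return a timestamp copy."""
        self.v[self.pid] += 1
        return list(self.v)

    def receive(self, ts: List[int]) -> List[int]:
        """Receive event: merge by component-wise max, then count the event."""
        self.v = [max(a, b) for a, b in zip(self.v, ts)]
        return self.tick()


def happened_before(a: List[int], b: List[int]) -> bool:
    """True if a <= b in every component and a != b (the partial order)."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: List[int], b: List[int]) -> bool:
    """True if neither timestamp causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


def join(a: List[int], b: List[int]) -> List[int]:
    """Least upper bound of two timestamps (component-wise max)."""
    return [max(x, y) for x, y in zip(a, b)]


if __name__ == "__main__":
    p0, p1, p2 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
    send = p0.tick()          # p0 sends a message
    recv = p1.receive(send)   # p1 receives it
    other = p2.tick()         # unrelated event on p2
    print(happened_before(send, recv), concurrent(recv, other))
```

The send event precedes the receive event in the partial order, while the independent event on the third process is ordered with respect to neither, which is the distinction a single linear clock cannot express.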
A Survey of Rollback-Recovery Protocols in Message-Passing Systems
1996
Cited by 716 (22 self)
this paper, we use the terms event logging and message logging interchangeably
Grid Information Services for Distributed Resource Sharing
2001
Cited by 712 (52 self)
Grid technologies enable large-scale sharing of resources within formal or informal consortia of individuals and/or institutions: what are sometimes called virtual organizations. In these settings, the discovery, characterization, and monitoring of resources, services, and computations are challenging problems due to the considerable diversity, large numbers, dynamic behavior, and geographical distribution of the entities in which a user might be interested. Consequently, information services are a vital part of any Grid software infrastructure, providing fundamental mechanisms for discovery and monitoring, and hence for planning and adapting application behavior. We present here an information services architecture that addresses performance, security, scalability, and robustness requirements. Our architecture defines simple low-level enquiry and registration protocols that make it easy to incorporate individual entities into various information structures, such as aggregate directories that support a variety of different query languages and discovery strategies. These protocols can also be combined with other Grid protocols to construct additional higher-level services and capabilities such as brokering, monitoring, fault detection, and troubleshooting. Our architecture has been implemented as MDS-2, which forms part of the Globus Grid toolkit and has been widely deployed and applied.
Knowledge and Common Knowledge in a Distributed Environment
Journal of the ACM, 1984
Cited by 578 (55 self)
Reasoning about knowledge seems to play a fundamental role in distributed systems. Indeed, such reasoning is a central part of the informal intuitive arguments used in the design of distributed protocols. Communication in a distributed system can be viewed as the act of transforming the system's state of knowledge. This paper presents a general framework for formalizing and reasoning about knowledge in distributed systems. We argue that states of knowledge of groups of processors are useful concepts for the design and analysis of distributed protocols. In particular, distributed knowledge corresponds to knowledge that is "distributed" among the members of the group, while common knowledge corresponds to a fact being "publicly known". The relationship between common knowledge and a variety of desirable actions in a distributed system is illustrated. Furthermore, it is shown that, formally speaking, in practical systems common knowledge cannot be attained. A number of weaker variants...
Distributed Computing in Practice: The Condor Experience
2005
Cited by 551 (8 self)
Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational Grid. In this paper, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course travelled by research ideas as they grow into production
Optimistic recovery in distributed systems
ACM Transactions on Computer Systems, 1985
Cited by 355 (6 self)
Optimistic Recovery is a new technique supporting application-independent transparent recovery from processor failures in distributed systems. In optimistic recovery, communication, computation, and checkpointing proceed asynchronously. Synchronization is replaced by causal dependency tracking, which enables a posteriori reconstruction of a consistent distributed system state following a failure using process rollback and message replay. Because there is no synchronization among computation, communication, and checkpointing, optimistic recovery can tolerate the failure of an arbitrary number of processors and yields better throughput and response time than other general recovery techniques whenever failures are infrequent.
The Distributed Constraint Satisfaction Problem: Formalization and Algorithms
IEEE Transactions on Knowledge and Data Engineering, 1998
Cited by 326 (27 self)
In this paper, we develop a formalism called a distributed constraint satisfaction problem (distributed CSP) and algorithms for solving distributed CSPs. A distributed CSP is a constraint satisfaction problem in which variables and constraints are distributed among multiple agents. Various application problems in Distributed Artificial Intelligence can be formalized as distributed CSPs. We present our newly developed technique called asynchronous backtracking that allows agents to act asynchronously and concurrently without any global control, while guaranteeing the completeness of the algorithm. Furthermore, we describe how the asynchronous backtracking algorithm can be modified into a more efficient algorithm called an asynchronous weak-commitment search, which can revise a bad decision without exhaustive search by changing the priority order of agents dynamically. The experimental results on various example problems show that the asynchronous weak-commitment search algorithm ...
Distributed Constraint Satisfaction for Formalizing Distributed Problem Solving
1992
Cited by 295 (23 self)
Viewing cooperative distributed problem solving (CDPS) as distributed constraint satisfaction provides a useful formalism for characterizing CDPS techniques. In this paper, we describe this formalism and compare algorithms for solving distributed constraint satisfaction problems (DCSPs). In particular, we present our newly developed technique called asynchronous backtracking that allows agents to act asynchronously and concurrently, in contrast to the traditional sequential backtracking techniques employed in constraint satisfaction problems. Our experimental results show that solving DCSPs in a distributed fashion is worthwhile when the problems solved by individual agents are loosely coupled.
1 Introduction
Cooperative distributed problem solving (CDPS) is a subfield of AI that is concerned with how a set of artificially intelligent agents can work together to solve problems. Recently, [9] has presented the idea of viewing CDPS as a distributed state space search in order to develop...