Results 1 - 10 of 14
An Architecture for A Wide Area Distributed System
- Distributed Systems: Principles and Paradigms
, 1996
"... this paper is part of the Globe Project (Globe stands for GLobal Object Based Environment) . The goal of this project is the design and implementation of a wide area distributed system that provides a convenient programming abstraction and full transparency. The main contribution of this paper is th ..."
Cited by 165 (45 self)
this paper is part of the Globe Project (Globe stands for GLobal Object Based Environment). The goal of this project is the design and implementation of a wide area distributed system that provides a convenient programming abstraction and full transparency. The main contribution of this paper is the description of a new system for distributed shared objects. In contrast to other systems, the implementation of distribution, consistency, and replication of state is completely encapsulated in a distributed shared object. This allows for object-specific solutions, and provides the right mechanism for building efficient and truly scalable systems.
CUMULVS: Providing Fault-Tolerance, Visualization and Steering of Parallel Applications
- International Journal of High Performance Computing Applications
, 1996
"... The use of visualization and computational steering can often assist scientists in analyzing large-scale scientific applications. Fault-tolerance to failures is of great importance when running on a distributed system. However, the details of implementing these features are complex and tedious, l ..."
Cited by 103 (5 self)
The use of visualization and computational steering can often assist scientists in analyzing large-scale scientific applications. Fault-tolerance to failures is of great importance when running on a distributed system. However, the details of implementing these features are complex and tedious, leaving many scientists with inadequate development tools. CUMULVS is a library that enables programmers to easily incorporate interactive visualization and computational steering into existing parallel programs. The library is divided into two pieces: one for the application program and one for the, possibly commercial, visualization and steering front-end. Together these two libraries encompass all the connection and data protocols needed to dynamically attach multiple independent viewer front-ends to a running parallel application. Viewer programs can also steer one or more user-defined parameters to "close the loop" for computational experiments and analyses. CUMULVS allows the pr...
Enforcing Resource Sharing Agreements among Distributed Server Clusters
- In Proceedings of the Sixteenth International Parallel and Distributed Processing Symposium (IPDPS)
, 2002
"... Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements between the owners and users of these servers. As an example, application service prov ..."
Cited by 13 (1 self)
Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements between the owners and users of these servers. As an example, application service providers (ASPs) can pool their resources together according to pre-specified sharing agreements to provide better services to their customers. Such systems raise several new resource management challenges, chief amongst which is the enforcement of agreements to ensure that, despite the distributed nature of both requests and resources, user requests only receive a predetermined share of the aggregate resource and that the resources of a participant are not misused. Current solutions only enforce such agreements at a coarse granularity and in a centralized fashion, limiting their applicability for general workloads. This paper presents an architecture for the distributed enforcement of resource sharing agreements. Our approach exploits a uniform application-independent representation of agreements, and combines it with efficient time-window based coordinated queuing algorithms running on multiple nodes. We have successfully implemented this general strategy in two different network layers: a layer-7 HTTP redirector and a layer-4 packet redirector, which redirect connection requests from distributed clients to a cluster of distributed servers. Our measurements of both implementations verify that our approach is general and effective: different client groups receive service commensurate with their agreements.
A Construction of Distributed Reference Counting
, 1999
"... Distributed reference counting is a general purpose technique, which may be used, e.g., to detect termination of distributed programs or to implement distributed garbage collection. We present a distributed reference counting algorithm and a mechanical proof of correctness carried out using the p ..."
Cited by 12 (8 self)
Distributed reference counting is a general purpose technique, which may be used, e.g., to detect termination of distributed programs or to implement distributed garbage collection. We present a distributed reference counting algorithm and a mechanical proof of correctness carried out using the proof assistant Coq. The algorithm is formalised by an abstract machine, and its correctness has two different facets. The safety property ensures that if there exists a reference to a resource, then its reference counter will be strictly positive. Liveness guarantees that if all references to a resource are deleted, its reference counter will eventually become null. 1 Introduction Reference counting is a general purpose technique that is able to count the number of references to a given resource. Collins [5] was the first to use it in order to determine when list cells were no longer needed. Operating systems rely on this technique in order to decide when files may be deleted or when file...
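The two properties named in this abstract can be illustrated with a minimal sketch of plain, single-process reference counting. This is not the distributed algorithm of the paper and not its Coq formalisation; the class `Resource` and its methods `acquire` and `release` are hypothetical names chosen for illustration only.

```python
class Resource:
    """Toy resource guarded by a reference counter (hypothetical example)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0       # number of live references
        self.reclaimed = False  # set once the counter drops back to zero

    def acquire(self):
        """Create a new reference to the resource."""
        if self.reclaimed:
            raise RuntimeError("resource already reclaimed")
        self.refcount += 1
        return self

    def release(self):
        """Delete one reference; reclaim the resource when none remain."""
        if self.refcount <= 0:
            raise RuntimeError("release without a matching acquire")
        self.refcount -= 1
        if self.refcount == 0:
            self.reclaimed = True


cell = Resource("list-cell")
cell.acquire()
cell.acquire()
cell.release()
# Safety: a reference still exists, so the counter is strictly positive.
still_referenced = cell.refcount > 0 and not cell.reclaimed
cell.release()
# Liveness (trivial here): all references are gone, so the counter is zero.
fully_released = cell.refcount == 0 and cell.reclaimed
```

In a distributed setting the counter updates travel as messages that may be delayed or reordered, which is what makes both properties non-trivial to establish there.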
A Two-Level Communication Protocol for a Web Operating System (WOS)
- In IEEE 24th Euromicro Workshop on Network Computing
, 1998
"... The World-Wide Web consists not only of informational, but also computational resources. However, these resources, especially computational ones are underutilized. One characteristic of the Web is its ever changing structure; for instance, nodes are dynamically added and removed. This makes it diffi ..."
Cited by 8 (5 self)
The World-Wide Web consists not only of informational, but also of computational resources. However, these resources, especially the computational ones, are underutilized. One characteristic of the Web is its ever-changing structure; for instance, nodes are dynamically added and removed. This makes it difficult, if not impossible, to draw a complete and accurate picture of available resources. We consider the Web as a versioned system: resources, services and protocols are versioned. This paper presents a two-level protocol within this framework. The first protocol, the WOS Request Protocol (WOSRP), allows an appropriate version of a server to be selected. The second protocol, the WOS Protocol (WOSP), allows for locating and using these distributed (informational and computational) resources. We show how the latter protocol provides an efficient fault-tolerant resource search mechanism.
CUMULVS: Extending a Generic Steering and Visualization Middleware for Application Fault-Tolerance
"... CUMULVS is a middleware library that provides application programmers with a simple API for describing viewable and steerable fields in large-scale distributed simulations. These descriptions provide the data type, a logical name of the field-parameter, and the mapping of global indices to local ind ..."
Cited by 6 (1 self)
CUMULVS is a middleware library that provides application programmers with a simple API for describing viewable and steerable fields in large-scale distributed simulations. These descriptions provide the data type, a logical name of the field-parameter, and the mapping of global indices to local indices (processor and physical storage) for distributed data fields. The CUMULVS infrastructure uses these descriptions to allow an arbitrary number of front-end "viewer" programs to dynamically attach to a running simulation, select one or more fields for visualization, and update steerable variables. (Viewer programs can be built using commercial visualization software such as AVS or custom software based on GUI interface builders like Tcl/Tk.) Although these data field descriptions require a small effort on the part of the application programmer, the payoff is a high degree of flexibility for the infrastructure and end-user. This flexibility has allowed us to extend the infrastructure to include "application-directed" checkpointing, where the application determines the essential state that must be saved for a restart. This has the advantage that checkpoints can be smaller and made portable across heterogeneous architectures using the semantic description information that can be included in the checkpoint file. Because many technical difficulties, such as efficient I/O handling and time-coherency of data, are shared between visualization and checkpointing, it is advantageous to leverage a checkpoint/restart system against a visualization/steering infrastructure. Also, because CU-
Parallel Application Software on High Performance Computers - Parallel Diagonalisation Routines.
, 1996
"... In this report we list diagonalisation routines available for parallel computers. The methodology of each routine is outlined together with benchmark results on a typical matrix where available. Storage requirements and advantages and disadvantages of the method are also compared. The vast majority ..."
Cited by 6 (1 self)
In this report we list diagonalisation routines available for parallel computers. The methodology of each routine is outlined together with benchmark results on a typical matrix where available. Storage requirements and advantages and disadvantages of the method are also compared. The vast majority of these routines are available for real dense symmetric matrices only, although there is a known requirement for other data types, such as Hermitian or structured sparse matrices. We will report on new codes as they become available. This report is available from http://www.dl.ac.uk/TCSC/HPCI/ © 1996, Daresbury Laboratory. We do not accept any responsibility for loss or damage arising from the use of information contained in any of our reports or in any communication about our tests or investigations.
Computing Twin Primes and Brun's Constant: A Distributed Approach
- In Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing
, 1998
"... This paper describes an implementation of a large heterogeneous distributed parallel computation that counts the distribution of twin primes and calculates Brun's constant and maximal distances between pairs of twin primes. Two primes are twins if they differ by two. It is not known if there are inf ..."
Cited by 4 (3 self)
This paper describes an implementation of a large heterogeneous distributed parallel computation that counts the distribution of twin primes and calculates Brun's constant and maximal distances between pairs of twin primes. Two primes are twins if they differ by two. It is not known if there are infinitely many twin primes but it was proven that the sum of their inverses converges to the value defined as Brun's constant [2]. Prior to this work, the number of twins and their contribution to Brun's constant was known for all twins up to . We have advanced this calculation and are planning to continue to .
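The quantities named in this abstract can be made concrete with a small single-machine sketch. It is not the distributed implementation the paper describes; the function names `primes_up_to` and `twin_prime_stats` and the choice of a sieve of Eratosthenes are assumptions made for illustration. The sketch counts twin prime pairs (p, p + 2) up to a limit and accumulates the corresponding partial sum of 1/p + 1/(p + 2), whose limit over all twin pairs is Brun's constant.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Mark every multiple of p from p*p onwards as composite.
            multiples = len(range(p * p, n + 1, p))
            is_prime[p * p :: p] = bytearray(multiples)
    return [i for i, flag in enumerate(is_prime) if flag]


def twin_prime_stats(limit):
    """Return (number of twin pairs, partial Brun sum) for pairs with p + 2 <= limit."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    count = 0
    brun_partial = 0.0
    for p in primes:
        if p + 2 in prime_set:
            count += 1
            brun_partial += 1.0 / p + 1.0 / (p + 2)
    return count, brun_partial


pairs, partial_sum = twin_prime_stats(100)
print(pairs, partial_sum)
```

Because each pair contributes independently to both the count and the sum, the range can be split into disjoint intervals and the per-interval results added, which is what makes the problem a natural fit for a distributed computation; pairs straddling an interval boundary need care in such a split.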
A survey of distributed computing, computational grid, meta-computing and network information tools
- Daresbury, Warrington WA4 4AD
, 2001
"... A software environment of unprecedented quality and functionality is emerging in which coupled computing resources are accessed via client-server and Web-based tools. This development is being driven by a combination of the computer industry, which is rapidly developing software for e-commerce and l ..."
Cited by 4 (0 self)
A software environment of unprecedented quality and functionality is emerging in which coupled computing resources are accessed via client-server and Web-based tools. This development is being driven by a combination of the computer industry, which is rapidly developing software for e-commerce and leisure use, and the loose collection of world-wide “freeware” programmers. Geoffrey Fox has referred to it as the “Distributed Commodity Computing and Information System”. In this survey we examine a number of tools and projects for science and engineering applications on wide-area network based systems. This includes distributed computing, computational steering and meta-computing techniques. We have also included a few “collaborative working” and “distance education” projects which share a number of the same goals and difficulties. Keywords: distributed computing, computational steering, meta-computing, network solvers, collaborative working, distance education, programming tools, networks of workstations, cluster computing, e-Services.
Adaptive Utilization of Communication and Computational Resources in High-Performance Distributed Systems: The EMOP Approach
- The 7th International Symposium on High Performance Distributed Computing
, 1998
"... Development of high-performance distributed applications can be extremely challenging because of their complex runtime environment coupled with their requirement of high-performance. Such applications typically run on a set of heterogeneous machines with dynamically varying loads, connected by heter ..."
Cited by 2 (1 self)
Development of high-performance distributed applications can be extremely challenging because of their complex runtime environment coupled with their requirement for high performance. Such applications typically run on a set of heterogeneous machines with dynamically varying loads, connected by heterogeneous networks possibly supporting a wide variety of communication protocols. In spite of the size and complexity of such applications, they must provide the high performance mandated by their users. In order to achieve this goal, they need to adaptively utilize their computational and communication resources. This paper describes EMOP, a programming environment for building high-performance distributed systems. EMOP is designed along the lines of CORBA and uses an Object Request Broker (ORB) to support seamless communication between distributed application components. In order to provide adaptive utilization of communication resources, it uses the principle of Open Implementation t...

