A Resource Management Architecture for Metacomputing Systems
, 1997
Cited by 353 (36 self)
Metacomputing systems are intended to support remote and/or concurrent use of geographically distributed computational resources. Resource management in such systems is complicated by five concerns that do not typically arise in other situations: site autonomy and heterogeneous substrates at the resources, and application requirements for policy extensibility, co-allocation, and online control. We describe a resource management architecture that addresses these concerns. This architecture distributes the resource management problem among distinct local manager, resource broker, and resource co-allocator components and defines an extensible resource specification language to exchange information about requirements. We describe how these techniques have been implemented in the context of the Globus metacomputing toolkit and used to implement a variety of different resource management strategies. We report on our experiences applying our techniques in a large testbed, GUSTO, incorporating 15 sites, 330 computers, and 3600 processors.
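The co-allocation requirement named in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical `LocalManager` class and `co_allocate` function (neither is part of the Globus toolkit, and no resource specification language is modeled); it shows only the all-or-nothing idea of reserving resources at several autonomous sites and rolling back if any site refuses.

```python
from dataclasses import dataclass, field


@dataclass
class LocalManager:
    """Hypothetical stand-in for one site's local resource manager."""
    site: str
    free_processors: int
    reserved: dict = field(default_factory=dict)

    def reserve(self, job_id: str, count: int) -> bool:
        # Site autonomy: each site decides for itself whether to accept.
        if count > self.free_processors:
            return False
        self.free_processors -= count
        self.reserved[job_id] = count
        return True

    def release(self, job_id: str) -> None:
        # Return any processors held for this job to the free pool.
        self.free_processors += self.reserved.pop(job_id, 0)


def co_allocate(job_id, request, managers):
    """Reserve every part of a multi-site request, or none of it.

    request maps site name -> processor count; managers maps
    site name -> LocalManager. Returns True only if all sites accepted.
    """
    granted = []
    for site, count in request.items():
        manager = managers.get(site)
        if manager is None or not manager.reserve(job_id, count):
            # Roll back partial reservations so no site keeps resources.
            for done in granted:
                done.release(job_id)
            return False
        granted.append(manager)
    return True


managers = {
    "site_a": LocalManager("site_a", free_processors=64),
    "site_b": LocalManager("site_b", free_processors=16),
}
ok = co_allocate("job1", {"site_a": 32, "site_b": 8}, managers)
too_big = co_allocate("job2", {"site_a": 16, "site_b": 32}, managers)
```

In this sketch the second request fails at `site_b`, so the processors it briefly held at `site_a` are released again.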
A Directory Service for Configuring High-Performance Distributed Computations
, 1997
Cited by 221 (45 self)
High-performance execution in distributed computing environments often requires careful selection and configuration not only of computers, networks, and other resources but also of the protocols and algorithms used by applications. Selection and configuration in turn require access to accurate, up-to-date information on the structure and state of available resources. Unfortunately, no standard mechanism exists for organizing or accessing such information. Consequently, different tools and applications adopt ad hoc mechanisms, or they compromise their portability and performance by using default configurations. We propose a solution to this problem: a Metacomputing Directory Service that provides efficient and scalable access to diverse, dynamic, and distributed information about resource structure and state. We define an extensible data model to represent the information required for distributed computing, and we present a scalable, high-performance, distributed implementation. The dat...
Policy driven heterogeneous resource co-allocation with Gangmatching
- in Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing (HPDC
, 2003
Cited by 48 (0 self)
Federated distributed systems present new challenges to resource management. Conventional resource managers are based on a relatively static resource model and a centralized allocator that assigns resources to customers. This model does not adapt well to highly dynamic environments characterized by distributed management and distributed ownership. Distributed management introduces resource heterogeneity: Not only the set of available resources, but even the set of resource types is constantly changing [3]. Distributed ownership introduces policy heterogeneity: Each resource may have its own idiosyncratic allocation policy. We previously argued that Matchmaking provides an elegant and robust solution to the problem of heterogeneous resource management in dynamic, distributed environments [13]. Matchmaking provides a powerful language for a consumer
Software infrastructure for the I-WAY high-performance distributed computing experiment
- In Proc. 5th IEEE Symp. on High Performance Distributed Computing
, 1996
Cited by 46 (8 self)
High-speed wide area networks are expected to enable innovative applications that integrate geographically distributed, high-performance computing, database, graphics, and networking resources. However, there is as yet little understanding of the higher-level services required to support these applications, or of the techniques required to implement these services in a scalable, secure manner. We report on a large-scale prototyping effort that has yielded some insights into these issues. Building on the hardware base provided by the I-WAY, a national-scale Asynchronous Transfer Mode (ATM) network, we developed an integrated management and application programming system, called I-Soft. This system was deployed at most of the 17 I-WAY sites and used by many of the 60 applications demonstrated on the I-WAY network. In this article, we describe the I-Soft design and report on lessons learned from application experiments.
Resource Management through Multilateral Matchmaking
- In Proc. 9th IEEE Symp. on High Performance Distributed Computing
, 1998
Cited by 37 (4 self)
Federated distributed systems present new challenges to resource management, which cannot be met by conventional systems that employ relatively static resource models and centralized allocators. We previously argued that Matchmaking provides an elegant and robust resource management solution for these highly dynamic environments [5]. Although powerful and flexible, Matchmaking cannot accommodate multiparty policies (e.g., co-allocation). In this paper we present Gang-Matching, a multilateral matchmaking formalism to address this deficiency.
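The difference between bilateral and multilateral matching can be made concrete with a small sketch. It is not the paper's formalism or the ClassAd language: offers are modeled as plain dictionaries with a hypothetical `policy` callable, and `gang_match` is an illustrative backtracking search that binds each "port" of a request to a distinct offer, provided both sides accept.

```python
def gang_match(ports, offers, binding=None):
    """Bind each port of a request to a distinct offer via backtracking.

    ports: list of (label, accepts); accepts(offer, binding) -> bool may
    inspect offers already bound to earlier ports.
    Returns a dict label -> offer, or None when no consistent gang exists.
    """
    binding = {} if binding is None else binding
    if len(binding) == len(ports):
        return dict(binding)
    label, accepts = ports[len(binding)]
    for offer in offers:
        # Each offer may fill at most one port.
        if any(offer is used for used in binding.values()):
            continue
        # Multilateral check: the request must accept the offer, and the
        # offer's own policy must accept the partners bound so far.
        if accepts(offer, binding) and offer["policy"](binding):
            binding[label] = offer
            result = gang_match(ports, offers, binding)
            if result is not None:
                return result
            del binding[label]  # Undo and try the next candidate.
    return None


offers = [
    {"type": "machine", "host": "alpha", "memory_mb": 512,
     "policy": lambda gang: True},
    {"type": "machine", "host": "beta", "memory_mb": 2048,
     "policy": lambda gang: True},
    # A license whose owner only allows use on host "beta".
    {"type": "license", "app": "solver",
     "policy": lambda gang: gang["cpu"]["host"] == "beta"},
]

ports = [
    ("cpu", lambda o, gang: o["type"] == "machine" and o["memory_mb"] >= 256),
    ("license", lambda o, gang: o["type"] == "license" and o["app"] == "solver"),
]

match = gang_match(ports, offers)
```

Here the first machine satisfies the job on its own, but the license refuses it, so the search backtracks and settles on the second machine. A purely bilateral matcher would have no way to express the license's constraint on the machine.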
The relative performance of various mapping algorithms is independent of sizable variances in run-time predictions
- in 7th IEEE Heterogeneous Computing Workshop (HCW ’98
, 1998
Cited by 36 (8 self)
In this paper we study the performance of four mapping algorithms. The four algorithms include two naive...
MIST: PVM with Transparent Migration and Checkpointing
- In 3rd Annual PVM Users' Group Meeting
, 1995
Cited by 36 (0 self)
We are currently involved in research to enable PVM to take advantage of shared networks of workstations (NOWs) more effectively. In such a computing environment, it is important to utilize workstations unobtrusively and recover from machine failures. Towards this goal, we have enhanced PVM with transparent task migration, checkpointing, and global scheduling. These enhancements are part of the MIST project which takes an open systems approach in developing a cohesive, distributed parallel computing environment. This open systems approach promotes plug-and-play integration of independently developed modules, such as Condor, DQS, AVS, Prospero, XPVM, PIOUS, Ptools, etc. Transparent task migration, in conjunction with a global scheduler, facilitates the use of shared NOWs by allowing parallel jobs to unobtrusively utilize nodes that are currently unused. PVM tasks can be moved onto nodes that are otherwise idle, and moved off when the node is no longer free. Experiments show that migrati...
Scheduling Independent Tasks on Metacomputing Systems
- in Proceedings of Parallel and Distributed Computing Systems
, 1999
Cited by 24 (0 self)
Metacomputing is a convenient and powerful abstraction for dealing with the complexities that arise when managing and using a large collection of heterogeneous computational resources. One of the most fundamental characteristics of a metacomputing system is the algorithm it uses for the scheduling placement of jobs on processing nodes. We describe five schedule placement algorithms, and report on their success and failure modes when used to schedule job distributions. We investigate five different distributions of job execution time and the effects of predictability on the algorithms' performance. Our objective in this work is to develop a hierarchical scheduling model for large scale job management in a metacomputing system. We investigate the use of a gateway model for controlling job placement on sub-clusters of a larger cluster of resources.
Keywords: metacomputing; scheduling; cluster computing; adaptive scheduling.
1 Introduction
The term metacomputer was coined by Fox [4] to d...
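The abstract above does not name its five placement algorithms, so the following is only a generic illustration of what placing independent jobs on processing nodes involves. It assumes a simple greedy rule (send each job to the node predicted to become free first) and a hypothetical function name, `greedy_place`; it should not be read as one of the algorithms evaluated in the paper.

```python
import heapq


def greedy_place(predicted_times, node_count):
    """Assign each job to the node predicted to become free first.

    Returns (placement, makespan): placement[i] is the node index chosen
    for job i, and makespan is the predicted finish time of the last node.
    """
    # Min-heap of (time at which the node becomes free, node index).
    nodes = [(0.0, n) for n in range(node_count)]
    heapq.heapify(nodes)
    placement = []
    for duration in predicted_times:
        free_at, node = heapq.heappop(nodes)
        placement.append(node)
        heapq.heappush(nodes, (free_at + duration, node))
    makespan = max(free_at for free_at, _ in nodes)
    return placement, makespan


placement, makespan = greedy_place([4.0, 2.0, 3.0, 1.0], node_count=2)
```

Because the rule relies on predicted execution times, its schedule quality depends on how accurate those predictions are, which is the kind of sensitivity the abstract says it investigates.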

