Automatic methods for predicting machine availability in desktop Grid and peer-to-peer systems (2004)

by J Brevik, D Nurmi, R Wolski
Venue:In: Proc. of CCGrid’04

Fault-aware scheduling for bag-of-tasks applications on desktop grids, in GRID ’06

by Cosimo Anglano, John Brevik, Massimo Canonico, Dan Nurmi - IEEE Computer Society
Abstract - Cited by 25 (2 self)
Abstract. Desktop Grids have proved to be a suitable platform for the execution of Bag-of-Tasks applications but, being characterized by a high resource volatility, require the availability of scheduling techniques able to effectively deal with resource failures and/or unplanned periods of unavailability. In this paper we present a set of fault-aware scheduling policies that, rather than just tolerating faults as done by traditional fault-tolerant schedulers, exploit the information concerning resource availability to improve application performance. The performance of these strategies has been compared via simulation with that attained by traditional fault-tolerant schedulers. Our results, obtained by considering a set of realistic scenarios modeled after real Desktop Grids, show that our approach results in better application performance and resource utilization.

Citation Context

...as much as possible by jointly exploiting the fault-handling mechanisms, and the knowledge of the effective computing power delivered by resources [13], [14] and the distributions of their fault times [15] (i.e. the time elapsing between two consecutive faults), to improve scheduling performance. We show how this information can be exploited to improve both task selection (the choice of the next task t...

Fault-Tolerant Scheduling for Bag-of-Tasks Grid Applications

by Cosimo Anglano, Massimo Canonico - Proc. of the 2005 European Grid Conference (EuroGrid 2005). Lecture Notes in Computer Science, 2005
Abstract - Cited by 17 (4 self)
Abstract. In this paper we propose a fault-tolerant scheduler for Bag-of-Tasks

Citation Context

...rted in [7]), while the Repair Time (i.e., the time elapsing from the fault to when the machine is operational again) is assumed to be uniformly distributed between 120 and 600 seconds (for a reboot) [8], or exponentially distributed with mean 2 days (for a hardware crash) [14]. The parameters of the Weibull distribution characterizing Fault Time were set according to the following procedure. We cons...
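
The repair-time assumptions quoted in the citation context above can be sketched in Python as follows. This is a minimal illustration, not the cited authors' simulator: the function names, the `kind` labels, and the Weibull `scale` and `shape` defaults are hypothetical, since the excerpt does not give the fitted Weibull parameters.

```python
import random

SECONDS_PER_DAY = 24 * 60 * 60


def sample_repair_time(kind, rng):
    """Return a repair time in seconds for a 'reboot' or a 'crash'."""
    if kind == "reboot":
        # Uniformly distributed between 120 and 600 seconds, as quoted above.
        return rng.uniform(120.0, 600.0)
    if kind == "crash":
        # Exponentially distributed with a mean of 2 days, as quoted above.
        mean_seconds = 2 * SECONDS_PER_DAY
        return rng.expovariate(1.0 / mean_seconds)
    raise ValueError("unknown fault kind: %r" % (kind,))


def sample_fault_time(rng, scale=3600.0, shape=0.7):
    """Return a Weibull-distributed time between faults, in seconds.

    The scale and shape defaults are placeholders chosen for this sketch;
    the excerpt does not state the values used in the cited paper.
    """
    return rng.weibullvariate(scale, shape)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
reboot_samples = [sample_repair_time("reboot", rng) for _ in range(1000)]
crash_samples = [sample_repair_time("crash", rng) for _ in range(20000)]
mean_crash_days = sum(crash_samples) / len(crash_samples) / SECONDS_PER_DAY
print("mean crash repair time (days): %.2f" % mean_crash_days)
```

With enough samples, the printed mean should sit close to the 2-day figure from the excerpt, which is a quick sanity check that the rate parameter was set as the reciprocal of the mean.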

RIDGE: Combining Reliability and Performance in Open Grid Platforms

by Krishnaveni Budati, Jason Sonnek, Abhishek Chandra, Jon Weissman, 2007
Abstract - Cited by 9 (0 self)
Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely-used work scheduling technique in such environments is to redundantly schedule the outsourced computations to a number of nodes. We present the design and implementation of RIDGE, a reliability-aware system which uses a node’s prior performance and behavior to make more effective scheduling decisions. We have implemented RIDGE on top of the BOINC distributed computing infrastructure and have evaluated its performance on a live testbed consisting of 120 PlanetLab nodes. Our experimental results show that RIDGE is able to match or surpass the throughput of the best vanilla BOINC configuration under different reliability environments, by automatically adapting to the characteristics of the underlying environment. In addition, RIDGE is able to provide much lower workunit makespans compared to BOINC, which indicates its desirability in service-oriented environments with time constraints.

The effectiveness of threshold-based scheduling policies in boinc projects

by Trilce Estrada, David A. Flores, Michela Taufer, Patricia J. Teller, Andre Kerstens, David P. Anderson - In Proceedings of the 2nd IEEE International Conference on e-Science and Grid Technologies (eScience, 2006
Abstract - Cited by 8 (7 self)
Several scientific projects use BOINC (Berkeley Open Infrastructure for Network Computing) to perform large-scale simulations using volunteers’ computers (workers) across the Internet. In general, the scheduling of tasks in BOINC uses a First-Come-First-Serve policy and no attention is paid to workers’ past performance, such as whether or not they have tended to perform tasks promptly and correctly. In this paper we use SimBA, a discrete-event Simulator of BOINC Applications, to study new threshold-based scheduling strategies for BOINC projects that use availability and reliability metrics to classify workers and distribute tasks according to this classification. We show that if availability and reliability thresholds are selected properly, then the workers’ throughput of valid results increases significantly in BOINC projects.

Citation Context

...ptive grid computing and, in particular, have introduced adaptation on GrADS systems. GrADS systems, though, are grid environments that have different features than VC environments. Wolski and others [20, 21] studied the effectiveness of statistical models for predicting machine failure/availability distributions. Statistical models such as exponential, hyperexponential, Pareto, and Weibull distributions...
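
As a rough illustration of the threshold-based idea in the abstract of this entry, the sketch below keeps only the workers whose availability and reliability both meet a threshold. It is not SimBA or BOINC code: the `Worker` fields, the metric scale (fractions in [0, 1]), the function name, and the threshold values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    name: str
    availability: float  # assumed: fraction of time the worker is reachable
    reliability: float   # assumed: fraction of returned results that are valid


def select_workers(workers: List[Worker],
                   min_availability: float,
                   min_reliability: float) -> List[Worker]:
    """Return the workers that meet both thresholds, in their original order."""
    return [
        w for w in workers
        if w.availability >= min_availability
        and w.reliability >= min_reliability
    ]


# Hypothetical worker pool used only to exercise the filter.
pool = [
    Worker("a", availability=0.95, reliability=0.99),
    Worker("b", availability=0.50, reliability=0.99),
    Worker("c", availability=0.95, reliability=0.40),
]
selected = select_workers(pool, min_availability=0.9, min_reliability=0.9)
print([w.name for w in selected])
```

Raising either threshold shrinks the eligible pool, so a scheduler built this way trades the number of usable workers against the expected rate of valid results, which is the selection problem the abstract refers to.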

Metrics for Effective Resource Management in Global Computing Environments

by Michela Taufer, Patricia J. Teller, David P. Anderson, Charles L. Brooks - IEEE International Conference on e-Science and Grid Technologies (eScience 2005), 2005
Abstract - Cited by 7 (1 self)
Global computing uses Internet-connected PCs volunteered by their owners. These PCs are diverse, volatile, and error-prone. Sophisticated scheduling methods commonly applied in Grid computing may not be sufficiently scalable and flexible for global computing environments. This paper shows that it is possible to classify global computing hosts based on simple metrics such as availability and reliability, and that it is efficient to assign tasks to such hosts accordingly. The proposed classification of workers is applied to

Resource failure prediction for fine-grained cycle sharing

by Xiaojuan Ren, Seyong Lee, Rudolf Eigenmann, Saurabh Bagchi, 2005
Abstract - Cited by 5 (1 self)
Fine-Grained Cycle Sharing (FGCS) systems aim at utilizing the large amount of computational resources available on the Internet. In FGCS, host computers allow guest jobs to utilize the CPU cycles if the jobs do not significantly impact the local users of a host. A characteristic of such resources is that they are generally provided voluntarily and their availability fluctuates highly. Guest jobs may incur resource failures because of unexpected resource unavailability. To provide fault tolerance to guest jobs without adding significant computational overhead, failure prediction is required. This paper presents a method to predict resource failures in FGCS systems. It applies a semi-Markov Process and is based on a novel failure model, combining generic hardware-software failures with domain-specific failures in FGCS. We describe the failure prediction framework and its implementation in a production FGCS system named iShare. Through the experiments on an iShare testbed, we demonstrate that the prediction achieves accuracy above 86% on average and outperforms linear time series models, while the computational cost is negligible. Our experimental results also show that the prediction is robust in the presence of irregular resource failures.

Citation Context

...diction in large-scale distributed systems, especially in FGCS systems. Although several previous contributions have measured the distribution of general machine availability in networked environments [4, 21, 16], or the temporal structure of CPU availability in Grids [29, 19, 15], no work targets predicting failures caused by both resource contention and resource revocation in FGCS systems. The main contribu...

Resource availability prediction in fine-grained cycle sharing systems

by Xiaojuan Ren, Seyong Lee, Rudolf Eigenmann, Saurabh Bagchi - In Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006
Abstract - Cited by 5 (1 self)
Fine-Grained Cycle Sharing (FGCS) systems aim at utilizing the large amount of computational resources available on the Internet. In FGCS, host computers allow guest jobs to utilize the CPU cycles if the jobs do not significantly impact the local users of a host. A characteristic of such resources is that they are generally provided voluntarily and their availability fluctuates highly. Guest jobs may fail because of unexpected resource unavailability. To provide fault tolerance to guest jobs without adding significant computational overhead, future resource availability must be predicted. This paper presents a method for resource availability prediction in FGCS systems. It applies a semi-Markov Process and is based on a novel resource availability model, combining generic hardware-software failures with domain-specific resource behavior in FGCS. We describe the prediction framework and its implementation in a production FGCS system named iShare. Through the experiments on an iShare testbed, we demonstrate that the prediction achieves accuracy above 86% on average and outperforms linear time series models, while the computational cost is negligible. Our experimental results also show that the prediction is robust in the presence of irregular resource unavailability.

Citation Context

...diction in large-scale distributed systems, especially in FGCS systems. Although several previous contributions have measured the distribution of general machine availability in networked environments [4, 21, 16], or the temporal structure of CPU availability in Grids [29, 19, 15], no work targets predicting availability with regard to both resource contention and resource revocation in FGCS systems. The main...

Scheduling on the grid via multi-state resource availability prediction

by Brent Rood, Michael J. Lewis - In: 9th IEEE/ACM International Conference on Grid Computing, 2008
Abstract - Cited by 4 (2 self)
To make the most effective application placement decisions on volatile large-scale heterogeneous Grids, schedulers must consider factors such as resource speed, load, and reliability. Including reliability requires availability predictors, which consider different periods of resource history, and use various strategies to make predictions about resource behavior. Prediction accuracy significantly affects the quality of the schedule, as does the method by which schedulers combine various factors, including the weight given to predicted availability, speed, load, and more. This paper explores the question of how to consider predicted availability to improve scheduling, concentrating on multi-state availability predictors. We propose and study several classes of schedulers, and a method for combining factors. We characterize the inherent tradeoff between application makespan and the number of evictions due to failure, and demonstrate how our schedulers can navigate this tradeoff under various scenarios. We vary application load and length, and the percentage of jobs that are checkpointable. Our results show that the only other multi-state prediction-based scheduler causes up to 51% more evicted jobs while simultaneously increasing average job makespan by 18% when compared with our scheduler.

Managing Opportunistic and Dedicated Resources in a Bi-modal Service Deployment Architecture

by Shah Asaduzzaman, 2007
Abstract - Cited by 4 (2 self)
This document is dedicated to my father who inspired the never-ending quest for knowledge in my life. ii Acknowledgement

Citation Context

...ng to the user behavior. Several studies have attempted to record the user idleness and machine availability characteristics in different settings [73, 32] and statistically modeled the distributions [73, 28, 94, 32]. The effect of availability on job execution time distribution has been studied in [53]. In addition to user activity, availability is also affected by system software failure, hardware failure and ...

Strategies to Create Platforms for Differentiated Services from Dedicated and Opportunistic Resources

by Shah Asaduzzaman, Muthucumaru Maheswaran - J. PARALLEL AND DISTRIBUTED COMPUTING, 2007
Abstract - Cited by 3 (3 self)
This paper proposes a new platform for implementing services in future service-oriented architectures. The basic premise of our proposal is that by combining a large volume of uncontracted resources with small clusters of dedicated resources, we can dramatically reduce the amount of dedicated resources while the goodput provided by the overall system remains at a high level. This paper presents particular strategies for implementing this idea for a particular class of applications. We performed very detailed simulations on synthetic and real traces to evaluate the performance of the proposed strategies. Our findings on compute-intensive applications show that preemptive reallocation of resources is necessary for assured services. The proposed preemption-based scheduling heuristic can significantly improve utilization of the dedicated resources by opportunistically offloading the peak loads onto uncontracted resources, while keeping the service quality virtually unaffected.