The Eucalyptus open-source cloud-computing system
- In Proceedings of Cloud Computing and Its Applications [Online
Cited by 98 (3 self)
Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar in spirit to existing grid and HPC resource management and programming systems. These types of systems offer a new programming target for scalable application developers and have gained popularity over the past few years. However, most cloud computing systems in operation today are proprietary, rely upon infrastructure that is invisible to the research community, or are not explicitly designed to be instrumented and modified by systems researchers. In this work, we present EUCALYPTUS, an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources. We outline the basic principles of the EUCALYPTUS design, detail important operational aspects of the system, and discuss architectural trade-offs that we have made in order to allow EUCALYPTUS to be portable, modular and simple to use on infrastructure commonly found within academic settings. Finally, we provide evidence that EUCALYPTUS enables users familiar with existing Grid and HPC systems to explore new cloud computing functionality while maintaining access to existing, familiar application development software and Grid middleware.
Modeling Machine Availability in Enterprise and Wide-area Distributed Computing Environments
- In Euro-Par’05
, 2003
Cited by 51 (7 self)
In this paper, we consider the problem of modeling machine availability in enterprise-area and wide-area distributed computing settings. Using availability data gathered from three different environments, we detail the suitability of four potential statistical distributions for each data set: exponential, Pareto, Weibull, and hyperexponential. In each case, we use software we have developed to determine the necessary parameters automatically from each data collection.
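As a rough illustration of what fitting such distributions to availability data can involve, the sketch below estimates exponential and Weibull parameters by maximum likelihood and compares their log-likelihoods. It is not the authors' software: the function names, the bisection bounds, and the synthetic data are all assumptions made for this example, and the Pareto and hyperexponential fits are left out.

```python
import math
import random


def fit_exponential(durations):
    """Maximum-likelihood rate of an exponential: n / sum(x)."""
    return len(durations) / sum(durations)


def fit_weibull(durations, lo=0.01, hi=50.0, iters=200):
    """Maximum-likelihood (shape, scale) of a Weibull.

    The shape is found by bisection (bounds lo/hi are assumptions).
    All durations must be strictly positive.
    """
    m = max(durations)
    z = [x / m for x in durations]  # rescale to (0, 1] to avoid overflow
    mean_log = sum(math.log(v) for v in z) / len(z)

    def score(k):
        # Likelihood equation for the shape; it increases with k
        # and its root is the maximum-likelihood estimate.
        zk = [v ** k for v in z]
        weighted = sum(p * math.log(v) for p, v in zip(zk, z))
        return weighted / sum(zk) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = (lo + hi) / 2.0
    scale = m * (sum(v ** shape for v in z) / len(z)) ** (1.0 / shape)
    return shape, scale


def exponential_loglik(durations, rate):
    """Log-likelihood of the data under an exponential with this rate."""
    return len(durations) * math.log(rate) - rate * sum(durations)


def weibull_loglik(durations, shape, scale):
    """Log-likelihood of the data under a Weibull(shape, scale)."""
    return sum(
        math.log(shape / scale)
        + (shape - 1.0) * math.log(x / scale)
        - (x / scale) ** shape
        for x in durations
    )


if __name__ == "__main__":
    # Synthetic availability durations (hypothetical, not from the paper).
    rng = random.Random(1)
    data = [rng.weibullvariate(100.0, 0.6) for _ in range(2000)]
    shape, scale = fit_weibull(data)
    rate = fit_exponential(data)
    print("Weibull shape/scale:", shape, scale)
    print("Weibull log-likelihood:", weibull_loglik(data, shape, scale))
    print("Exponential log-likelihood:", exponential_loglik(data, rate))
```

A higher log-likelihood indicates a better fit on the same data, although a comparison across families with different parameter counts would normally also use a goodness-of-fit test or an information criterion.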
GRENCHMARK: A Framework for Analyzing, Testing, and Comparing Grids
- In Proc. of the sixth IEEE/ACM International Symposium on Cluster Computing and the GRID (CCGrid’06)
, 2006
Cited by 33 (21 self)
Grid computing is becoming the natural way to aggregate and share large sets of heterogeneous resources. With the infrastructure becoming ready for the challenge, current grid development and acceptance hinge on proving that grids reliably support real applications, and on creating adequate benchmarks to quantify this support. However, grid applications are just beginning to emerge, and traditional benchmarks have yet to prove representative in grid environments. To address this chicken-and-egg problem, we propose a middle-way approach: create and run synthetic grid workloads comprising applications representative of today’s grids. For this purpose, we have designed and implemented GRENCHMARK, a framework for synthetic workload generation and submission. The framework greatly facilitates synthetic workload modeling, comes with over 35 synthetic and real applications, and is extensible and flexible. We show how the framework can be used for grid system analysis, functionality testing in grid environments, and for comparing different grid settings, and present the results obtained with GRENCHMARK in our multi-cluster grid, the DAS.
Predicting bounds on queuing delay for batch-scheduled parallel machines
- In Proceedings of PPoPP 2006
, 2006
Cited by 28 (8 self)
Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources have accounts at multiple sites and have the option of choosing at which site or sites to submit a parallel job. In such a situation, the amount of time a user’s job will wait in any one batch queue can significantly impact the overall time a user waits from job submission to job completion. In this work, we explore a new method for providing end-users with predictions for the bounds on the queuing delay individual jobs will experience. We evaluate this method using batch scheduler logs for distributed-memory parallel machines that cover a 9-year period at 7 large HPC centers. Our results show that it is possible to predict delay bounds reliably for jobs in different queues, and for jobs requesting different ranges of processor counts. Using this information, scientific application developers can intelligently decide where to submit their parallel codes in order to minimize overall turnaround time.
Inter-operating Grids through delegated matchmaking
- In 2007 ACM/IEEE Conference on Supercomputing (SC 2007)
, 2007
Cited by 23 (9 self)
The grid vision of a single computing utility has yet to materialize: while many grids with thousands of processors each exist, most work in isolation. An important obstacle for the effective and efficient inter-operation of grids is the problem of resource selection. In this paper we propose a solution to this problem that combines the hierarchical and decentralized approaches for interconnecting grids. In our solution, a hierarchy of grid sites is augmented with peer-to-peer connections between sites under the same administrative control. To operate this architecture, we employ the key concept of delegated matchmaking, which temporarily binds resources from remote sites to the local environment. With trace-based simulations we evaluate our solution under various infrastructural and load conditions, and we show that it outperforms other approaches to inter-operating grids. Specifically, we show that delegated matchmaking achieves up to 60% more goodput and completes 26% more jobs than its best alternative, daily.
Automatic Methods for Predicting Machine Availability in Desktop Grid and Peer-to-peer Systems
- In Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid’04)
, 2004
Cited by 20 (1 self)
In this paper, we examine the problem of predicting machine availability in desktop and enterprise computing environments. Predicting the duration that a machine will run until it restarts (availability duration) is critically useful to application scheduling and resource characterization in federated systems. We describe one parametric model fitting technique and two non-parametric prediction techniques, comparing their accuracy in predicting the quantiles of empirically observed machine availability distributions. We describe each method analytically and evaluate its precision using a synthetic trace of machine availability constructed from a known distribution. To detail their practical efficacy, we apply them to machine availability traces from three separate desktop and enterprise computing environments, and evaluate each method in terms of the accuracy with which it predicts availability in a trace-driven simulation. Our results indicate that availability duration can be predicted with quantifiable confidence bounds and that these bounds can be used as conservative bounds on lifetime predictions. Moreover, a non-parametric method based on a binomial approach generates the most accurate estimates.
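One common way to obtain a non-parametric, binomial-based bound on a quantile is through order statistics; the sketch below shows that textbook construction, which may differ in detail from the method evaluated in the paper. It assumes independent, identically distributed durations from a continuous distribution, and the function names are hypothetical.

```python
import math


def binom_pmf(i, n, p):
    """P(Binomial(n, p) == i) for 0 < p < 1, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def lower_quantile_bound(samples, q, confidence):
    """Largest observed value that lies at or below the true q-quantile
    with at least the requested confidence, or None if the sample is
    too small to support any such bound.

    The k-th smallest sample is <= the q-quantile exactly when at least
    k samples fall at or below that quantile, which has probability
    P(Binomial(n, q) >= k).
    """
    xs = sorted(samples)
    n = len(xs)
    cdf = 0.0  # running P(Binomial(n, q) <= k - 1)
    best = None
    for k in range(1, n + 1):
        cdf += binom_pmf(k - 1, n, q)
        if 1.0 - cdf >= confidence:
            best = xs[k - 1]
        else:
            break  # the probability only decreases as k grows
    return best
```

Read for availability: with q = 0.05, the returned value is a duration that a machine is expected to exceed about 95% of the time, and the confidence level states how sure one can be that this bound is conservative.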
An M-Net
, 2003
Cited by 17 (5 self)
GGF DOCUMENT SUBMISSION CHECKLIST (include as front page of submission) 1. Author name(s), institution(s), and contact information 2. Date (original and, where applicable, latest revision date) 3. Title, table of contents, clearly numbered sections 4. Security Considerations section 5. GGF Copyright statement inserted (See below) 6. GGF Intellectual Property statement inserted. (See below) NOTE that authors should read the statement. 7. Document format-The GGF document format to be used for both GWD's and GFD's is available in MSWord, RTF, and PDF formats. (note that font type is not part of the requirement, however authors should avoid font sizes smaller than 10pt).
NWSLite: A Light-Weight Prediction Utility for Mobile Devices
, 2004
Cited by 16 (4 self)
Computation off-loading, i.e., remote execution, has been shown to be effective for extending the computational power and battery life of resource-restricted devices, e.g., hand-held, wearable, and pervasive computers. Remote execution systems must predict the cost of executing both locally and remotely to determine when off-loading will be most beneficial. These costs, however, depend upon the execution behavior of the task being considered and the highly variable performance of the underlying resources, e.g., CPU (local and remote), bandwidth, and network latency. As such, remote execution systems must employ sophisticated prediction techniques that accurately guide computation off-loading. Moreover, these techniques must be efficient, i.e., they cannot consume significant resources, e.g., energy, execution time, etc., since they are performed on the mobile device.
Security-Driven Heuristics and a Fast Genetic Algorithm for Trusted Grid Job Scheduling
- In Proc. of 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05)
, 2005
Cited by 15 (3 self)
In this paper, our contributions are two-fold: First, we enhance the Min-Min and Sufferage heuristics under three risk modes driven by security concerns. Second, we propose a new Space-Time Genetic Algorithm (STGA) for trusted job scheduling, which is very fast and easy to implement. Under our new model, a job can possibly fail if the site security level is lower than the job security demand. We consider three security-driven heuristic modes: secure, risky, and f-risky. The secure mode always dispatches jobs to secure sites meeting the job security demands. The risky mode allocates jobs to any available resource site, accepting whatever risk it may face. The f-risky mode tries to limit the risk to at most a certain probability f. Our extensive simulation results indicated that the proposed STGA is highly effective in scheduling two types of practical workloads: NAS (Numerical Aerodynamic Simulation) and PSA (parameter-sweep application). The STGA outperforms the Min-Min and Sufferage heuristics under three risk modes, in terms of a wide range of performance metrics including makespan, average response time, site utilization, slowdown ratio, and job failure rate.
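To make the secure and risky modes concrete, here is a minimal sketch of the classic Min-Min heuristic with a security filter. The data layout, the function name `min_min`, and the numeric security levels are assumptions for illustration; the f-risky mode is omitted because the abstract does not give the failure-probability model it relies on, and the STGA is not shown.

```python
def min_min(jobs, sites, exec_time, mode="secure"):
    """Schedule jobs with Min-Min under a security-driven mode.

    jobs:      {job_name: security_demand}
    sites:     {site_name: security_level}
    exec_time: {(job_name, site_name): run time on that site}
    mode:      "secure" uses only sites whose level meets the demand;
               "risky" allows any site.
    Returns a list of (job, site, finish_time) in dispatch order.
    """
    if mode not in ("secure", "risky"):
        raise ValueError("mode must be 'secure' or 'risky'")
    ready = {site: 0.0 for site in sites}  # time each site becomes free
    pending = dict(jobs)
    schedule = []
    while pending:
        best = None
        # Min-Min: take the (job, site) pair with the smallest
        # completion time among all pending jobs.
        for job, demand in pending.items():
            for site, level in sites.items():
                if mode == "secure" and level < demand:
                    continue  # site is not secure enough for this job
                finish = ready[site] + exec_time[(job, site)]
                if best is None or finish < best[2]:
                    best = (job, site, finish)
        if best is None:
            raise ValueError("no eligible site for the remaining jobs")
        job, site, finish = best
        ready[site] = finish
        del pending[job]
        schedule.append(best)
    return schedule
```

In this sketch the secure mode can lengthen the makespan because demanding jobs are confined to fewer sites, while the risky mode trades that delay for the chance of a job failing on an under-secured site.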
The Grid: past, present, future
, 2002
Cited by 14 (0 self)
The Grid is the computing and data management infrastructure that will provide the electronic underpinning for a global society in business, government, research, science and entertainment [1–5]. Grids, illustrated in Figure 1.1, integrate networking, communication, computation and information to provide a virtual platform for computation and data management in the same way that the Internet integrates resources to form a virtual platform for information. The Grid is transforming science, business, health and society. In this book we consider the Grid in depth, describing its immense promise, potential and complexity from the perspective of the community of individuals working hard to make the Grid vision a reality. Grid infrastructure will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications.

