| A. Bricker, M. Litzkow, and M. Livny, "Condor technical summary," Tech. Rep. 1069, Computer Sciences Department, University of Wisconsin-- Madison, January 1992. |
....on systems that make use of idle time on workstations through load balancing and process migration. The systems described in this section cater to sequential jobs and independent tasks and not to parallel computations with synchronization. We summarize some of the systems here. Condor ([44, 45, 10, 42]) is a very successful remote program execution facility developed at the University of Wisconsin. The system schedules long-running background jobs on idle workstations. When the owner of a workstation resumes activity at a workstation, Condor checkpoints the remote job running on that ....
....expressed in terms of the number of local computation steps. The cost of communicating the computation state of a component process is gSa p and it represents the cost of data replication of a single component process. 4.2.2.3 Process Migration We use the migration scheme of Condor [10] to migrate processes across machines. The process to be migrated writes its data segment at a checkpoint onto the disk and exits by invoking an exception. The program is restarted on a new host, by loading the checkpointed data from the disk. This scheme assumes the availability of the ....
[Article contains additional citation context not shown here]
A. Bricker, M. Litzkow, and M. Livny. Condor Technical Summary. Technical Report CS-TR-92-1069, Comp. Sc. Dept, Univ of Wisconsin, Madison, Jan 1992.
....support for remote execution of interactive sequential jobs and load balancing but does not provide process migration. Parallel processing libraries such as PVM [7] provided means to perform parallel processing on a cluster. Extensions of these systems such as dynamicPVM [9] (using Condor [2]) and tmPVM [8] provided for dynamic load balancing in a PVM environment. GLUnix [5] supports both interactive and batch style remote execution of both parallel and sequential jobs. All the above systems lack transparency as either special commands are introduced, or the user is required to ....
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary." University of Wisconsin-Madison Technical Report 1069. Oct 1991.
....posed by the frustrated users, namely to provide convenient access to unutilized workstations while preserving the rights of their owners. Condor has been developed by the Computer Science department of the University of Wisconsin-Madison. This chapter gives a summary of the documentation [3] [4] [13] [14] [15] [16] [18] on Condor. 4.1 Design features Several principles have driven the design of Condor. Workstation owners should always have the resources of the workstation they own at their disposal. Workstation owners are generally happy to let somebody else compute on their ....
....allocating the idle machines to other machines which have Condor jobs to run. To illustrate how the daemons work together we will follow a Condor job from the moment of submitting to the moment that it finishes or the owner of the hosting workstation returns (as in the Condor technical summary [3]). Figure 4.3 illustrates the situation when there are no Condor jobs running. When a machine becomes idle, the startd of this machine will report this to the central manager. The central manager will then decide which of the machines should execute one of its jobs remotely on the idle machine. The ....
A. Bricker, M.J. Litzkow and M. Livny, "Condor technical summary," Version 4.1b, University of Wisconsin - Madison, 1991.
....network of workstations (NOW) that is available for serving large (parallel) jobs. Such a system requires an effective policy for recruiting idle nodes as well as efficient mechanisms for migrating the processes of parallel jobs away from nodes that are preempted by a higher priority user [2, 30, 1]. Although we do not consider the impact of node interruptions nor particular policy customizations that might be needed, we consider synthetic workloads and repartitioning overheads that are relevant to such environments. Repartitioning overheads are discussed further in section 4. 3.2 Synthetic ....
A. Bricker, M. Litzkow, M. Livny, Condor Technical Summary. Technical Report TR 1069, Computer Sciences Dept., University of Wisconsin, Madison, WI, January 1992.
....resources of shared computing networks, a system layer is needed to efficiently allocate resources to competing users. Condor, a distributed batch system for pools of UNIX workstations, is such a system layer that supports the transfer, execution and control of jobs on remote workstations [3, 4, 7, 10, 17]. Users can submit UNIX jobs to Condor, which will attempt to find an idle machine that matches a job's requirements and move this job to that machine for remote execution. As soon as the owner of an idle workstation returns and starts using it, any Condor job running on this workstation is ....
A. Bricker, M.J. Litzkow, M. Livny, "Condor technical summary," Version 4.1b, University of Wisconsin - Madison, 1991
....then PVM starts up a default debugger in a new X window for each requested task. This scheme is not scalable, but it is effective in the initial stages of coding. Two additional internal interfaces have been integrated into PVM. These interfaces allow third party resource managers such as Condor [2] and LSF to work with PVM applications. The function pvm_reg_rm registers the calling task as responsible for all new task placement into the virtual machine. Once registered, PVM defers all requests to this task. Similarly the function pvm_reg_hoster registers the calling task as responsible for ....
A. Bricker, M. Litzkow, M. Livny "Condor Technical Summary", University of Wisconsin-Madison, Computer Sciences Technical Report 1069, 1992
....applications, there has been insufficient discussion on how to predict these effects. Machine workload has been used to parameterize the allocation of tasks to workstations in a network, however, many allocation strategies do not consider load characteristics in the measurement of workload (e.g. [4, 10, 14, 27]). Load characteristics have been included in performance prediction models for networks of workstations (e.g. [30, 55]); however, such models assume that each workstation is shared by at most one compute intensive task and one or more local tasks that alternate idle with busy cycles. We believe ....
....on each machine, depending on the architecture and on the local scheduling policy. Different groups have proposed systems that focus on scheduling of parallel applications on distributed systems in different ways (e.g. AppLeS 9 [5, 6], Legion [24], Globus [23], Prophet [51], MARS [22], Condor [10], SmartNet [27] and Nile [34, 35]). In general, current scheduling systems assign tasks (or entire applications) to machines according to computation time and communication costs using several different performance measures (e.g. [25, 31]) but few consider contention effects. Among ....
[Article contains additional citation context not shown here]
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary", Technical Report #1069, University of Wisconsin, Computer Science Department, May 1992.
....of all workstations are usually idle in the Sprite system. In order to exploit that unused computing power, several software tools have been developed which offer remote executions of processes. Moreover, the user community of such a distributed system is usually not homogeneous: A. Bricker et al. [2] for instance, have observed three types of users: Type 1 users mostly use their workstations for sending and receiving mail or preparing papers, whereas type 2 users are frequently involved in the debugging cycle where they alternately edit and compile software. Such users have phases where their ....
....a high throughput and better performance. There are many resource management systems, both research and commercial. They differ in their implementations and in the mechanisms they use to address the lack of resource management in distributed systems. Examples of such systems are: Condor: Condor [2][3][4] is a distributed batch queuing system for sharing the workload within a pool of UNIX workstations connected by a network. Codine: Another resource management system targeted to optimize the utilization of software and hardware resources in a heterogeneous networked environment, similar ....
[Article contains additional citation context not shown here]
A. Bricker; M. Litzkow; M. Livny: "Condor Technical Summary", Technical Report 1096, Computer Science Department, University of Wisconsin-Madison, January 1992.
....applications, there has been insufficient discussion on how to predict these effects. Machine workload has been used to parameterize the allocation of tasks to workstations in a network, however, many allocation strategies do not consider load characteristics in the measurement of workload (e.g. [4, 10, 14, 27]). Load characteristics have been included in performance prediction models for networks of workstations (e.g. [30, 55]); however, such models assume that each workstation is shared by at most one compute intensive task and one or more local tasks that alternate idle with busy cycles. We believe ....
....on each machine, depending on the architecture and on the local scheduling policy. Different groups have proposed systems that focus on scheduling of parallel applications on distributed systems in different ways (e.g. AppLeS 9 [5, 6], Legion [24], Globus [23], Prophet [51], MARS [22], Condor [10], SmartNet [27] and Nile [34, 35]). In general, current scheduling systems assign tasks (or entire applications) to machines according to computation time and communication costs using several different performance measures (e.g. [25, 31]) but few consider contention effects. Among ....
[Article contains additional citation context not shown here]
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary", Technical Report #1069, University of Wisconsin, Computer Science Department, May 1992.
....in a way that promotes execution performance in the system. In current work, machine workload has been used to parameterize the allocation of tasks to workstations in a network. However, many allocation strategies do not consider load characteristics in the measurement of workload (e.g. [1, 2, 4, 6]). Load characteristics have been included in performance prediction models for networks of workstations (e.g. [10, 15]) but such models 1. Supported in part by CAPES and UFRJ (Brazil) and by NSF contract number ASC 9301788. 2. Supported in part by NSF contract number ASC 9301788. This ....
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary", Technical Report #1069, University of Wisconsin, Computer Science Department, May 1992.
....recently, systems such as xFS [Anderson et al. 1995b] and Petal [Lee Thekkath 1996] use client side techniques to improve overall file system performance. Many distributed clusters perform load balancing on the level of jobs (interactive or otherwise) submitted to the system [Nichols 1987, Bricker et al. 1991, Douglis Ousterhout 1991, Zhou et al. 1992]. Once again, all these systems implement server side solutions for load balancing and require client intervention to spread jobs among cluster machines. Perhaps most closely related to our systems are ISIS [Birman 1993] and so called gossip ....
A. Bricker, M. Litzkow, and M. Livny. "Condor Technical Summary". Technical Report 1069, University of Wisconsin---Madison, Computer Science Department, October 1991.
....Machine are assigned before the jobs of a normal machine when the latter has a lower priority than the World Machine. • It is possible that jobs are run on the World Machine while there are idle machines in the initiating pool. This is caused by the first fit allocation algorithm of Condor [1]. • Because the W Startd and W Schedd use the same port numbers as the normal Startd and Schedd, a complete machine has to be set aside as the World Machine. • Because Condor allows at most one Condor job on a machine, the Negotiator will give permission to at most one machine per schedule ....
A. Bricker, M.J. Litzkow and M. Livny, "Condor technical summary," Version 4.1b, University of Wisconsin - Madison, 1991.
....computation and communication. The rationale for this restriction is that OS processes are general purpose and carry a lot of state that is not necessary for most parallel applications. Further, OS processes allow certain operations that are location dependent, which complicates process migration [BLL91, DO91]. Thus, special purpose interfaces can result in reducing the size of a ULP's state and simplify the task of ULP migration. 2.3.2 Programming interface The programming interface to a ULP system can be an existing interface such as TCGMSG, P4, PVM, or NX so that existing applications can be ....
....page fault, swaps in the required page into physical memory, updates the memory management mappings, and reschedules the process for execution. This integration problem between OS and abstractions implemented at user level has been observed and solutions for better integration have been proposed [ABLL91, MSLM91]. These solutions essentially provide a way to communicate events in the OS such as a page fault to the user level library. Availability of the support described in these solutions in the OS will allow a ULP library to switch to another ULP on a page fault event. In the absence of such ....
A. Bricker, M. Litzkow, and M. Livny. Condor technical summary. Technical report, University of Wisconsin at Madison, October 1991.
....with the resource's attributes automatically determine how the user sees the world. 7.2 Checkpoint Restart and Migration Systems Presentation Manager provides limited support for application checkpointing. UNIX for Nomads [Bender, 1993] is a system wide checkpointing approach, while Condor [Bricker] and the ARCADE microkernel support transparent process migration [Tracey, 1991]. Presentation Manager for OS/2 provides a user interface similar to X. Its major drawback, comparatively, is its tight integration of keyboard and display with the application. All applications can only ever write ....
Bricker, A. and Litzkow, M.; Condor Technical Summary
....around the world have been working on adapting existing distributed batch schedulers to be able to handle parallel PVM applications. The DQS package developed at Florida State University was the first to be able to use PVM, followed by the Condor package developed at the University of Wisconsin [1,2]. PVM 3.3 defines a clean interface between PVM and scheduler packages. This allows organizations that wish to set up a single scheduler that controls an organization-wide virtual machine to do so. Their users could submit serial or PVM jobs to the scheduler with a suggested number of machines to ....
Bricker, A., Litzkow, M., Miron, L.: Condor Technical Summary. CS technical report No. 1069 University of Wisconsin, (Jan. 1992)
....allocated in a way that promotes execution performance in the system. In current work, machine workload has been used to parameterize the allocation of tasks to workstations in a network. However, many allocation strategies do not consider load characteristics in the measurement of workload (e.g. [1, 2, 4, 6]). Load characteristics have been included in performance prediction models for networks of workstations (e.g. [10, 15]) but such models 1. Supported in part by CAPES and UFRJ (Brazil) and by NSF contract number ASC 9301788. 2. Supported in part by NSF contract number ASC 9301788. This ....
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary", Technical Report #1069, University of Wisconsin, Computer Science Department, May 1992.
....amount of time [Theimer et al. 1985, Nichols 1987, Douglis Ousterhout 1991, Arpaci et al. 1995]. In response to this, many workstation cluster systems have been built which allow users to run jobs on other idle workstations in the cluster. These systems include PVM [Sunderam 1990], Condor [Bricker et al. 1991], LSF [Zhou 1992], Locus [Walker et al. 1983], Butler [Nichols 1987] and Sprite [Douglis Ousterhout 1991]. However, using an idle workstation to run foreign jobs can negatively impact the user of the workstation when that user resumes work on the workstation. For example, if foreign processes ....
....prediction algorithm and an evaluation of it are presented in Section 4. When the user leaves the workstation, the Prediction Engine first signals the State Restoration Libraries to snapshot the virtual memory and file cache state and then notifies the cluster resource sharing system (e.g. Condor [Bricker et al. 1991], LSF [Zhou 1992]) that the machine is available to foreign jobs. When the Prediction Engine predicts that the user is likely to return soon, or if the user returns unexpectedly, the Prediction Engine notifies the cluster resource sharing system that the machine is unavailable to foreign jobs. If ....
[Article contains additional citation context not shown here]
A. Bricker, M. Litzkow, and M. Livny. "Condor Technical Summary". Technical Report 1069, University of Wisconsin---Madison, Computer Science Department, October 1991.
....to machines, there has been little discussion on how to predict these effects. Machine workload has been used to parameterize the allocation of tasks to workstations in a network, however, many allocation strategies do not consider load characteristics in the measurement of workload (e.g. [3][4][5][6]). Load characteristics have been included in performance prediction models for networks of workstations (e.g. [10][18]); however, such models assume that each workstation is shared by at most one compute intensive task and one local task which alternates idle with compute intensive cycles. We ....
A. Bricker, M. Litzkow, and M. Livny, "Condor Technical Summary", Technical Report #1069, University of Wisconsin, Computer Science Department, May 1992.
No context found.
A. Bricker, M. Litzkow, and M. Livny, "Condor technical summary," Tech. Rep. TR 1069, Department of Computer Science, University of Wisconsin, Oct. 1991.
No context found.
A. Bricker, M. Litzkow, and M. Livny, "Condor technical summary," Tech. Rep. 1069, Computer Sciences Department, University of Wisconsin-- Madison, January 1992.
No context found.
A. Bricker, M. Litzkow, and M. Livny. Condor Technical Summary. Technical Report 1069, University of Wisconsin--Madison, CS Department, 1991.
No context found.
Bricker, A. et al., "Condor Technical Summary", Computer Sciences Dept., Univ. of Wisconsin, 1989.
No context found.
A. Bricker, M. Litzkow, and M. Livny. Condor Technical Summary, 1992.
No context found.
Bricker, A., M. Litzkow, M. Livny, "CONDOR Technical Summary," CS Department Technical Report, University of Wisconsin-Madison, 10/9/91.
No context found.
Bricker, A., M.Litzkow, M.Livny, "Condor Technical Summary ", University of Wisconsin - Madison, Technical Report 1069, October 9, 1991.