Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: A Global Layer Unix for a Network of Workstations. In Software Practice and Experience, 1989. 234
....scalability and performance limitations of launching a parallel application. Even fewer provide performance data for direct comparison. Two such projects that do provide some of this data are discussed below. The Berkeley NOW project implemented a software layer called GLUNIX (Global Layer UNIX)[7] on top of the Solaris operating system that implemented parallel application launch as well as several other cluster management and runtime features. GLUNIX was able to start a 100 node parallel job in 1.3 seconds. In [7] a detailed breakdown of the time needed for various tasks in job startup ....
....NOW project implemented a software layer called GLUNIX (Global Layer UNIX) [7] on top of the Solaris operating system that implemented parallel application launch as well as several other cluster management and runtime features. GLUNIX was able to start a 100 node parallel job in 1.3 seconds. In [7], a detailed breakdown of the time needed for various tasks in job startup is given as well as a graph of startup times from 1 to 100 nodes. The size of the executable used to gather this data is not ....
D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, and T. E. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software: Practice and Experience, 28(9):929--961, July 1998.
....about the system state when assigning a new arriving job. For example, PVM, a popular parallel programming environment, uses a round robin method to determine the job's assignment [9]. A dynamic strategy assigns a new job based on the current system state. For example, the GLUNIX operating system [10] supports a remote execution service, assigning a new arriving job to the least loaded machine. An adaptive strategy reassigns jobs transparently during their execution in order to improve their performance. For example, the MOSIX operating system [7] provides a set of kernel enhancements in order ....
D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, and T. E. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software Practice and Experience, 28(9):929--961, 1998.
....the first approach, the advantage is the higher portability. This is a crucial factor considering the rapidly changing nature of an operating system such as Linux. The reason is the reduction of the need to frequently release new patches. The examples of this approach are PVM [8], MPI [9], Glunix [10], and Score-D [11]. In a larger scope is the recent interest in Grid technology. Some MPI implementations that support the MPI-2 standard already have dynamic process management features, including LAM MPI [12], WMPI [13], and Fujitsu's MPI [14]. Finally, there are many efforts that focus on ....
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson, "GLUnix: a Global Layer Unix for a Network of Workstations", Software-Practice and Experience, Volume 28, 1998
....the set of destinations, and each node in the destination set sends an acknowledgment to the management node after the successful delivery of the broadcast datagram. By using RDGM, a job can be launched in a few tens of seconds on a cluster with 16 nodes, with relatively good scalability. GLUnix [16] is an operating system middleware for clusters of workstations, designed to provide transparent remote execution, load balancing, coscheduling of parallel jobs and fault detection. In [16] the authors of GLUnix note that the overhead in the master node, when forking a parallel job, increases by ....
....be launched in a few tens of seconds on a cluster with 16 nodes, with relatively good scalability. GLUnix [16] is an operating system middleware for clusters of workstations, designed to provide transparent remote execution, load balancing, coscheduling of parallel jobs and fault detection. In [16] the authors of GLUnix note that the overhead in the master node, when forking a parallel job, increases by a small amount (an average of 220 µsec per node). Also, one-to-many communication patterns scale relatively well, at only 230 µsec per node. When GLUnix launches a job, remote execution ....
[Article contains additional citation context not shown here]
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, and Amin M. Vahdat. GLUnix: a GLobal Layer Unix for a Network of Workstations. Software - Practice and Experience, 28(9), 1998.
....does not provide process migration. Parallel processing libraries such as PVM [7] provided means to perform parallel processing on a cluster. Extensions of these systems such as dynamicPVM [9] using Condor [2] and tmPVM [8] provided for dynamic load balancing in a PVM environment. GLUnix [5] supports both interactive and batch style remote execution of both parallel and sequential jobs. All the above systems lack transparency as either special commands are introduced, or the user is required to rewrite, recompile and relink existing code with special libraries. Another approach ....
D.P. Ghormley, D. Petrou and S.H. Rodrigues, "GLUnix: a Global Layer Unix for a Network of Workstations", Software Practice and Experience, Vol 28(9), 1998:929-961, http://now.cs.berkeley.edu/Glunix/glunix.html.
....reduces contention on the file server it still has severe performance and scalability limitations on large scale clusters. In contrast, by implementing the STORM mechanisms in terms of a tree based multicast, STORM overhead grows logarithmically, not linearly, in the number of nodes. GLUnix [17] is a piece of operating system middleware for clusters of workstations, designed to provide transparent remote execution, load balancing, coscheduling of parallel jobs, and fault detection. While GLUnix launches jobs quickly on small clusters, a substantial performance degradation emerges on ....
....load balancing, coscheduling of parallel jobs, and fault detection. While GLUnix launches jobs quickly on small clusters, a substantial performance degradation emerges on larger clusters ( 32 nodes) because reply messages from the slaves collide with subsequent request messages from the master [17]. STORM, however, can benefit from QsNET's network conditionals [31], which utilize a combining tree to reduce network contention and improve performance and scalability. The Computational Plant (Cplant) project [34] at Sandia National Laboratories is the closest project in spirit to ours in that ....
[Article contains additional citation context not shown here]
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: a global layer Unix for a network of workstations. Software---Practice and Experience, 28(9):929--961, July 25, 1998. Available from http://www-2.cs.cmu.edu/~dpetrou/papers/glunix98.ps.gz.
....is identified and terminated. Once a faulty manager replica is identified, a new manager replica is restarted using the states of the remaining two manager replicas. 3. The Communication Infrastructure The idea of using a communication layer to facilitate porting is not new. For example, Glunix [3] provides such a layer. However, the Glunix portability layer is tuned for an implementation over TCP/IP and leads to inefficiencies when implemented over state of the art communication platforms. High performance communication requires avoiding system calls and eliminating local message copies ....
D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, and T. E. Anderson, "GLUnix: A Global Layer Unix for a Network of Workstations," Software - Practice and Experience, vol.28, no.9, pp. 929-961, July 1998.
....of resources in clusters. In particular, we develop systems that can dynamically turn cluster nodes on to be able to handle the load imposed on the system efficiently and off to save power under lighter load. This research is inspired by previous work in cluster wide load balancing (e.g. [2, 15, 24, 26, 9, 4]). When performing load balancing, the goal is to evenly spread the work over the available cluster resources in such a way that idle nodes can be used and performance can be promoted. The inverse of the load balancing operation concentrates work in fewer nodes, idling other nodes that can be ....
....evaluate a resource allocation policy for a clustered WWW server that is similar to the cluster configuration algorithm we study here. The paper exploits ideas from a previous paper [8]. As aforementioned, load concentration is inspired by previous work in cluster wide load balancing (e.g. [2, 15, 24, 12, 26, 9, 4]). Some systems do use some form of load concentration, but only as a remedial technique like in systems that harvest idle workstations (e.g. [2, 24]) or as a management technique for manually excluding a cluster node. We use load concentration as a first class technique for conserving power and ....
D. Ghormley, D. Petrou, S. Rodrigues, A. Vahdat, and T. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software: Practice and Experience, February 1998.
....system reliability since if the cluster management middleware fails, the system is lost. Virtually all cluster management middleware is based on centralized decision making. While distributed decision making is possible [2, 13], centralized managers are simpler to design, implement, and debug [12, 18]. A key problem with centralized management is that the failure of the manager leads to the failure of the entire system. Surprisingly, many cluster management systems fail to deal with this problem. Other systems, such as Sun's Grid Engine Software [23], use a cold spare approach, where a backup ....
....primary replica is detected, the time to identify and advertise the identity of the former backup replica that is now the primary replica is approximately 3.7 milliseconds. 10. Related Work Over the past decade, a number of resource management systems for cluster computing have been implemented [12, 21, 23, 25]. A survey of 20 research and commercial cluster management systems can be found in [1]. Excluding the management fault tolerance features, the basic functionality of the FTCT is currently similar to the functionality of the GLUnix system [12]. While various projects mention the possibility of ....
[Article contains additional citation context not shown here]
D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, and T. E. Anderson, "GLUnix: A Global Layer Unix for a Network of Workstations," Software - Practice and Experience 28(9), pp. 929-961 (July 1998).
No context found.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software---Practice and Experience, 28(8), July 1998.
No context found.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: A Global Layer Unix for a Network of Workstations. In Software Practice and Experience, 1989. 234
No context found.
GHORMLEY, D. P., PETROU, D., RODRIGUES, S. H., VAHDAT, A. M., AND ANDERSON, T. E. GLUnix: a Global Layer Unix for a Network of Workstations. Software Practice and Experience 28, 9 (July 1998), 929--61.
No context found.
D. Ghormley, D. Petrou, S. Rodrigues, A. Vahdat, and T. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software-Practice and Experience, 28(9), July 1998.
No context found.
GHORMLEY, D. P., PETROU, D., RODRIGUES, S. H., VAHDAT, A. M., AND ANDERSON, T. E. GLUnix: a Global Layer Unix for a Network of Workstations. Software Practice and Experience 28, 9 (July 1998), 929--61.
No context found.
PETROU, D., RODRIGUES, S. H., VAHDAT, A., AND ANDERSON, T. E. Glunix: A global layer unix for a network of workstations. Software - Practice and Experience 28 (1998), 929--961.
No context found.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. Glunix: a global layer unix for a network of workstations. IEEE Transactions on Software Engineering, 1998.
No context found.
GHORMLEY, D. P., PETROU, D., RODRIGUES, S. H., VAHDAT, A. M., AND ANDERSON, T. E. Glunix: a global layer unix for a network of workstations. Software---Practice and Experience (Apr. 1998).
No context found.
D. Ghormley, D. Petrou, S. Rodrigues, A. Vahdat, and T. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software-Practice and Experience, 28(9), July 1998.
No context found.
D. Ghormley, D. Petrou, S. Rodrigues, A. Vahdat, and T. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software-Practice and Experience, 28(9), July 1998.
No context found.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: A Global Layer Unix for a Network of Workstations. Technical report, Computer Science Division, University of California, Berkeley, August 1997.
No context found.
D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, T. E. Anderson. "Glunix: A global layer unix for a network of workstations." Technical report, Computer Science Division, University of California, Berkeley, August 1997.
No context found.
D.P. Ghormley, D. Petrou, S.H. Rodrigues, A.M. Vahdat, et al. Glunix: a global layer unix for a network of workstations. Software-Practice and Experience, 28(9):929--961, July 1998.
No context found.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: A global layer Unix for a network of workstations. Software---Practice and Experience, 28(9):929--961, July 1998.
No context found.
Douglas Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, and Thomas E. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. To appear in Software---Practice and Experience.
No context found.
Ghormley DP, Petrou D, Rodrigues SH, Vahdat AM, Anderson TE. GLUnix: a global layer Unix for a network of workstations. Software---Practice and Experience 1998; 28(9):929--961.