  User-Level Communication in a System with Gang Scheduling (2001) [5 citations — 1 self]

by Yoav Etsion, Dror G. Feitelson
In Proceedings of the International Parallel and Distributed Processing Symposium 2001, IPDPS2001
http://www.cs.huji.ac.il/~feit/gang_comm.ps.gz

Abstract:

One of the main limitations on multiprogramming of parallel jobs on clusters is the allocation of communication bandwidth. When using an off-the-shelf network card there are memory limitations: the size of the memory resident on the network card is fixed. Because of this, any use of this memory as a fast send or receive buffer prohibits us from using a multiprogramming scheme, because of the obvious need to divide the buffer among the different jobs, which reduces the achievable bandwidth substantially. In this paper we propose a different scheme, which combines gang scheduling (coordinated context switching of the processes on different nodes) with a switch of the communication buffers. In this scheme, we associate a communication buffer with each context. Whenever the scheduling algorithm initiates a context switch, we also replace the communication buffers: the send queue and the receive queue. This allows each job to use the buffers as if it were the only job in the system, resulting in improved bandwidth.
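
The buffer-switching idea in the abstract can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the authors' implementation: the names `CommContext`, `NetworkCard`, `switch_to`, and `partitioned_slots` are hypothetical, buffer capacity is counted in abstract "slots", and saving or restoring a context is modeled as simply changing which queues the card points at.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CommContext:
    """Saved communication state of one job (hypothetical structure)."""
    job_id: int
    send_queue: deque = field(default_factory=deque)
    recv_queue: deque = field(default_factory=deque)


class NetworkCard:
    """Toy model of a network card with a fixed number of buffer slots."""

    def __init__(self, slots: int):
        self.slots = slots
        self.active = None  # context whose queues currently occupy the card

    def switch_to(self, ctx: CommContext) -> None:
        # Performed together with the gang-scheduled context switch: the
        # outgoing job's queues stay in its CommContext and the incoming
        # job's queues become the card's queues.
        self.active = ctx

    def send(self, packet) -> bool:
        # The active job may fill every slot, since no other job shares the card.
        queue = self.active.send_queue
        if len(queue) >= self.slots:
            return False  # buffer full; the sender has to wait
        queue.append(packet)
        return True


def partitioned_slots(total_slots: int, jobs: int) -> int:
    """Slots per job if the card's memory were statically divided instead."""
    return total_slots // jobs


# Usage: two jobs time-share a card with 8 slots.
card = NetworkCard(slots=8)
job_a, job_b = CommContext(0), CommContext(1)
card.switch_to(job_a)
accepted_a = sum(card.send(i) for i in range(10))
card.switch_to(job_b)
accepted_b = sum(card.send(i) for i in range(10))
print(accepted_a, accepted_b, partitioned_slots(8, 2))
```

In this model each job sees the full buffer during its time slice, whereas a static division would leave each of the two jobs with half the slots. The cost of actually copying the queues at each switch is deliberately left out.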

Citations

807 Myrinet: A Gigabit-per-second Local Area Network – Boden, Cohen, et al. - 1995
261 UNIX Network Programming – Stevens - 1990
83 Fast Messages: Efficient, portable communication for workstation clusters and MPPs – Pakin, Karamcheti, et al. - 1997
79 Distributed hierarchical control for parallel processing – Feitelson, Rudolph - 1990
73 GLUnix: A Global Layer Unix for a Network of Workstations – Ghormley, Petrou, et al. - 1998
67 Virtual network transport protocols for Myrinet – Chun, Mainwaring, et al. - 1997
61 The Network Architecture of the Connection Machine CM-5 – Pierre, Wong, et al.
56 The Prospero resource manager: A scalable framework for processor allocation in distributed systems. Concurrency: Practice and Experience – Neumann, Rao - 1994
49 Dynamic Coscheduling on Workstation Clusters – Sobalvarro, Pakin, et al. - 1998
43 Interfacing Condor and PVM to Harness the Cycles of Workstation Clusters. Future Generation Computer Systems – Pruyne, Livny - 1996
26 Gang Scheduling for Highly Efficient Distributed Multiprocessor Systems – Franke, Pattnaik, et al. - 1996
25 Myrinet: A gigabit-per-second local area network – Boden, Cohen, et al. - 1995
21 The Network Architecture of the Connection Machine CM-5 – Leiserson, et al. - 1992
13 Overhead Analysis of Preemptive Gang Scheduling. Job Scheduling – Hori, Tezuka, et al. - 1998
11 Fast messages: Efficient, portable communication for workstation clusters and MPPs – Pakin, Karamcheti, et al. - 1997
7 The ParPar system: a software MPP – Feitelson, Batat, et al. - 1999
4 Using multicast to pre-load jobs on the ParPar cluster – Kavas, Er-El, et al.
2 Using multicast to preload jobs on the ParPar cluster. Parallel Comput – Kavas, Er-El, et al. - 2001
2 Gang scheduling for highly efficient distributed multiprocessor systems – Franke, Pattnaik, et al. - 1996
1 Write Combining Memory Implementation Guidelines. Order number – Corp - 1998
1 an agent-based architecture for dynamic resource management – Tan - 1999