  Distributed Computing in Practice: The Condor Experience. Concurrency and Computation: Practice and Experience (2005) [112 citations — 3 self]

by Douglas Thain, Todd Tannenbaum, Miron Livny
http://www.cs.wisc.edu/~thain/library/condor-practice.pdf

Abstract:

Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational grid. In this chapter, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course traveled by research ideas as they grow into production systems.

Citations

1757 Time, clocks, and the ordering of events in a distributed system – Lamport - 1978
815 The Byzantine generals problem – Lamport, Shostak, et al. - 1982
804 Distributed Snapshots: Determining Global States of Distributed Systems – Chandy, Lamport - 1985
562 Kerberos: An Authentication Service for Open Network Systems – Steiner, Neuman, et al. - 1988
391 Extensibility, safety, and performance in the SPIN operating system – Bershad, Savage, et al. - 1995
341 The protection of information in computer systems – Saltzer, Schroeder - 1975
319 A Security Architecture for Computational Grids – Foster, Kesselman, et al. - 1998
297 A resource management architecture for metacomputing systems – Czajkowski, Karonis, et al. - 1998
274 Condor-G: A Computation Management Agent for Multi-Institutional Grids – Frey - 2002
243 Dealing With Disaster: Surviving Misbehaved Kernel Extensions – Seltzer, Endo, et al. - 1996
235 BEOWULF: A Parallel Workstation for Scientific Computation – Sterling, Savarese, et al. - 1995
234 Matchmaking: Distributed resource management for high throughput computing – Raman, Livny, et al. - 1998
173 Grapevine: An exercise in distributed computing – Birrell, Levin, et al. - 1982
167 The LOCUS distributed operating system – Walker, Popek, et al. - 1983
132 The MULTICS System: An Examination of its Structure – Organick - 1972
124 Supporting Checkpointing and Process Migration Outside the UNIX Kernel – Litzkow, Solomon - 1992
124 Multiprocessor scheduling with the aid of network flow algorithms – Stone - 1977
88 Condor Technical Summary – Bricker, Litzkow, et al. - 1991
72 Checkpoint and migration of UNIX processes in the Condor distributed processing system – Litzkow, Tannenbaum, et al. - 1997
70 LSF: Load sharing in large-scale heterogeneous distributed systems – Zhou - 1992
69 A Worldwide Flock of Condors: Load Sharing among Workstation Clusters. Future Generation Computer Systems – Epema, Livny, et al. - 1996
68 Experience with the Condor Distributed Batch System – Litzkow, Livny - 1990
66 Remote Unix - turning idle workstations into cycle servers – Litzkow - 1987
63 Replica Selection in the Globus Data Grid – Vazhkudai, Tuecke, et al. - 2001
62 Core Algorithms of the Maui Scheduler – Jackson, Snell, et al. - 2001
53 Portable Batch System: External reference specification – Bayucan, Henderson, et al. - 1999
48 Deploying a High Throughput Computing Cluster – Basney, Livny
42 Interfacing Condor and PVM to harness the cycles of workstation clusters – Pruyne, Livny - 1996
39 Stork: Making data placement a first class citizen in the grid – Kosar, Livny - 2004
38 Solving large quadratic assignment problems on computational grids – Anstreicher, Brixius, et al. - 2002
36 Providing resource management services to parallel applications – Pruyne, Livny - 1994
35 High-Throughput Resource Management – Livny - 1999
34 A stable distributed scheduling algorithm – Bryant, Finkel - 1981
33 Protocols and services for distributed data-intensive science – Allcock, Chervenak, et al. - 2000
30 Condor - A Distributed Job Scheduler – Tannenbaum, Wright, et al. - 2002
29 Parrot: Transparent user-level middleware for data-intensive computing – Thain, Livny - 2003
28 Resource management through multilateral matchmaking – Raman, Livny, et al. - 2000
24 Gathering at the Well: Creating Communities for Grid I/O – Thain, Bent, et al. - 2001
22 Cheap cycles from the desktop to the dedicated cluster: combining opportunistic and dedicated scheduling with Condor – Wright - 2001
21 Matchmaking Frameworks for Distributed Resource Management – Raman - 2000
18 Utilizing Widely Distributed Computational Resources Efficiently with Execution Domains – Basney, Livny, et al. - 2001
16 Bypass: A tool for building split execution systems – Thain, Livny - 2000
13 Globus: A metacomputing infrastructure toolkit – Foster, Kesselman - 1997
12 Error Scope on a Computational Grid: Theory and Practice – Thain, Livny - 2002
11 Design and evaluation of a resource selection framework for grid applications – Angulo, Foster, et al. - 2002
10 An enabling framework for master-worker applications on the computational grid – Linderoth, Kulkarni, et al. - 2000
9 The Crystal Multicomputer: Design and Implementation Experience – DeWitt, Finkel, et al. - 1987
8 Distributed scheduling for a changing environment – Krueger - 1988
8 RFC 2222: Simple Authentication and Security Layer (SASL) – Myers - 1997
7 The Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems – Livny - 1983