Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational grid. In this chapter, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course traveled by research ideas as they grow into production systems.