Error management in the pluggable file system (2002)

by Douglas Thain, Miron Livny
http://www.cs.wisc.edu/condor/doc/pfs-tr.ps

Abstract:

Distributed computing continues to be an alphabet-soup of services and protocols. No single system for managing CPUs or I/O devices has emerged (or is likely to emerge) as a universal solution. Therefore, distributed applications require adapters in order to plug themselves into existing systems. The difficulty of building such adapters lies not in normal operations, but in the complications of failures and other unusual situations. We demonstrate this with the Pluggable File System, an adapter for connecting POSIX applications to remote I/O services. We offer a detailed discussion of the construction of the system while dealing with failures and other events that are not trivially mapped into the application's expectations. The key insight is that correct I/O management requires coordination with CPU management. We conclude with some practical advice for others constructing similar software.
