Fault tolerance for a workstation cluster (1993) [3 citations — 0 self]
by Elmootazbellah N. Elnozahy, Willy Zwaenepoel
In Proc. of the Workshop on Hardware and Software Architectures for Fault Tolerance
ftp://ftp.cs.cmu.edu/user/mootaz/papers/fta.ps
Abstract:
Recent technological trends are making it feasible to build loosely coupled multicomputers with workstation clusters. A typical configuration would consist of a number of high performance workstations connected via a high speed network. The workstations offer the compute cycles necessary to run sequential or parallel user
Citations
| 1746 | Time, clocks, and the ordering of events in a distributed system – Lamport - 1978 |
| 341 | Reliable Broadcast Protocols – Chang, Maxemchuk - 1984 |
| 253 | Optimistic recovery in distributed systems – Strom, Yemini - 1985 |
| 247 | Fail-Stop Processors: An Approach to Designing Fault-Tolerant Computing Systems – Schlichting, Schneider - 1982 |
| 162 | Manetho: Transparent rollback-recovery with low overhead, limited rollback, and fast output commit – Elnozahy, Zwaenepoel - 1992 |
| 151 | Group Communication in the Amoeba Distributed Operating System – Kaashoek, Tanenbaum - 1991 |
| 141 | Broadcast protocols for distributed systems – Melliar-Smith, Moser, et al. - 1990 |
| 92 | Replication and fault-tolerance in the ISIS system – Birman - 1985 |
| 79 | Replicated distributed programs – Cooper - 1985 |
| 42 | Distributed System Fault Tolerance Using Message Logging and Checkpointing – Johnson - 1989 |
| 28 | Replicated distributed processes in Manetho – Elnozahy, Zwaenepoel - 1992 |

