Dan Teodosiu. End-to-end fault containment in scalable shared-memory multiprocessors. Ph.D. Thesis, Stanford University, 2000.
....these problems is to use the Virtual Clusters approach, which effectively turns a large-scale shared-memory machine into a virtual cluster by combining the scalability and fault containment benefits of clusters with the resource management flexibility of shared-memory systems. Previous research [9, 56] on Virtual Clusters has shown that a virtual machine monitor can leverage existing operating system technology to avoid scalability bottlenecks and provide fault containment at a low development cost. However, these systems still lacked scalable resource management. Resource management raises new ....
....an independent probability of failure. Therefore, large multiprocessors are at a higher risk of experiencing faults. On a fault-unaware system, any single fault can crash the entire machine, thus affecting every task running on the system, even those that were not using the resource that failed [10, 56]. These designs have the undesirable property that the probability that a task is affected by a fault is proportional to the size of the system, not the number of resources being used by the task. Fault tolerance is a well-known technique for designing systems that can withstand faults without ....
[Article contains additional citation context not shown here]