Results 1 - 10
of
54
Fast Byte-Granularity Software Fault Isolation
"... Bugs in kernel extensions remain one of the main causes of poor operating system reliability despite proposed techniques that isolate extensions in separate protection domains to contain faults. We believe that previous fault isolation techniques are not widely used because they cannot isolate exist ..."
Abstract
-
Cited by 59 (1 self)
- Add to MetaCart
(Show Context)
Bugs in kernel extensions remain one of the main causes of poor operating system reliability despite proposed techniques that isolate extensions in separate protection domains to contain faults. We believe that previous fault isolation techniques are not widely used because they cannot isolate existing kernel extensions with low overhead on standard hardware. This is a hard problem because these extensions communicate with the kernel using a complex interface and they communicate frequently. We present BGI (Byte-Granularity Isolation), a new software fault isolation technique that addresses this problem. BGI uses efficient byte-granularity memory protection to isolate kernel extensions in separate protection domains that share the same address space. BGI ensures type safety for kernel objects and it can detect common types of errors inside domains. Our results show that BGI is practical: it can isolate Windows drivers without requiring changes to the source code and it introduces a CPU overhead between 0 and 16%. BGI can also find bugs during driver testing. We found 28 new bugs in widely used Windows drivers.
Device driver safety through a reference validation mechanism
- In OSDI’08
"... Device drivers typically execute in supervisor mode and thus must be fully trusted. This paper describes how to move them out of the trusted computing base, by running them without supervisor privileges and constraining their interactions with hardware devices. An implementation of this approach in ..."
Abstract
-
Cited by 47 (4 self)
- Add to MetaCart
(Show Context)
Device drivers typically execute in supervisor mode and thus must be fully trusted. This paper describes how to move them out of the trusted computing base, by running them without supervisor privileges and constraining their interactions with hardware devices. An implementation of this approach in the Nexus operating system executes drivers in user space, leveraging hardware isolation and checking their behavior against a safety specification. These Nexus drivers have performance comparable to inkernel, trusted drivers, with a level of CPU overhead acceptable for most applications. For example, the monitored driver for an Intel e1000 Ethernet card has throughput comparable to a trusted driver for the same hardware under Linux. And a monitored driver for the Intel i810 sound card provides continuous playback. Drivers for a disk and a USB mouse have also been moved successfully to operate in user space with safety specifications. 1
The Design and Implementation of Microdrivers
, 2008
"... Device drivers commonly execute in the kernel to achieve high performance and easy access to kernel services. However, this comes at the price of decreased reliability and increased programming difficulty. Driver programmers are unable to use user-mode development tools and must instead use cumberso ..."
Abstract
-
Cited by 46 (5 self)
- Add to MetaCart
Device drivers commonly execute in the kernel to achieve high performance and easy access to kernel services. However, this comes at the price of decreased reliability and increased programming difficulty. Driver programmers are unable to use user-mode development tools and must instead use cumbersome kernel tools. Faults in kernel drivers can cause the entire operating system to crash. Usermode drivers have long been seen as a solution to this problem, but suffer from either poor performance or new interfaces that require a rewrite of existing drivers. This paper introduces the Microdrivers architecture that achieves high performance and compatibility by leaving critical path code in the kernel and moving the rest of the driver code to a user-mode process. This allows data-handling operations critical to I/O performance to run at full speed, while management operations such as initialization and configuration run at reduced speed in userlevel. To achieve compatibility, we present DriverSlicer, a tool that splits existing kernel drivers into a kernel-level component and a user-level component using a small number of programmer annotations. Experiments show that as much as 65 % of driver code can be removed from the kernel without affecting common-case performance, and that only 1-6 percent of the code requires annotations.
Dingo: Taming Device Drivers
, 2009
"... Device drivers are notorious for being a major source of failure in operating systems. In analysing a sample of real defects in Linux drivers, we found that a large proportion (39%) of bugs are due to two key shortcomings in the device-driver architecture enforced by current operating systems: poorl ..."
Abstract
-
Cited by 44 (11 self)
- Add to MetaCart
Device drivers are notorious for being a major source of failure in operating systems. In analysing a sample of real defects in Linux drivers, we found that a large proportion (39%) of bugs are due to two key shortcomings in the device-driver architecture enforced by current operating systems: poorly-defined communication protocols between drivers and the OS, which confuse developers and lead to protocol violations, and a multithreaded model of computation that leads to numerous race conditions and deadlocks. We claim that a better device driver architecture can help reduce the occurrence of these faults, and present our Dingo framework as constructive proof. Dingo provides a formal, state-machine based, language for describing driver protocols, which avoids confusion and ambiguity, and helps driver writers implement correct behaviour. It also enforces an event-driven model of computation, which eliminates most concurrency-related faults. Our implementation of the Dingo architecture in Linux offers these improvements, while introducing negligible performance overhead. It allows Dingo and native Linux drivers to coexist, providing a gradual migration path to more reliable device drivers.
The Role of Virtualization in Embedded Systems
"... System virtualization, which enjoys immense popularity in the enterprise and personal computing spaces, is recently gaining significant interest in the embedded domain. Starting from a comparison of key characteristics of enterprise systems and embedded systems, we will examine the difference in mot ..."
Abstract
-
Cited by 40 (4 self)
- Add to MetaCart
(Show Context)
System virtualization, which enjoys immense popularity in the enterprise and personal computing spaces, is recently gaining significant interest in the embedded domain. Starting from a comparison of key characteristics of enterprise systems and embedded systems, we will examine the difference in motivation for the use of system virtual machines, and the resulting differences in the requirements for the technology. We find that these differences are quite substantial, and that virtualization is unable to meet the special requirements of embedded systems. Instead, more general operatingsystems technologies are required, which support virtualization as a special case. We argue that high-performance microkernels, specifically L4, are a technology that provides a good match for the requirements of next-generation embedded systems. 1.
Failure Resilience for Device Drivers
"... Studies have shown that device drivers and extensions contain 3–7 times more bugs than other operating system code and thus are more likely to fail. Therefore, we present a failure-resilient operating system design that can recover from dead drivers and other critical components—primarily through mo ..."
Abstract
-
Cited by 30 (11 self)
- Add to MetaCart
Studies have shown that device drivers and extensions contain 3–7 times more bugs than other operating system code and thus are more likely to fail. Therefore, we present a failure-resilient operating system design that can recover from dead drivers and other critical components—primarily through monitoring and replacing malfunctioning components on the fly—transparent to applications and without user intervention. This paper focuses on the post-mortem recovery procedure. We explain the working of our defect detection mechanism, the policy-driven recovery procedure, and post-restart reintegration of the components. Furthermore, we discuss the concrete steps taken to recover from network, block device, and character device driver failures. Finally, we evaluate our design using performance measurements, software fault-injection experiments, and an analysis of the reengineering effort.
Tolerating Hardware Device Failures in Software
"... Hardware devices can fail, but many drivers assume they do not. When confronted with real devices that misbehave, these assumptions can lead to driver or system failures. While major operating system and device vendors recommend that drivers detect and recover from hardware failures, we find that th ..."
Abstract
-
Cited by 27 (3 self)
- Add to MetaCart
(Show Context)
Hardware devices can fail, but many drivers assume they do not. When confronted with real devices that misbehave, these assumptions can lead to driver or system failures. While major operating system and device vendors recommend that drivers detect and recover from hardware failures, we find that there are many drivers that will crash or hang when a device fails. Such bugs cannot easily be detected by regular stress testing because the failures are induced by the device and not the software load. This paper describes Carburizer, a code-manipulation tool and associated runtime that improves system reliability in the presence of faulty devices. Carburizer analyzes driver source code to find locations where the driver incorrectly trusts the hardware to behave. Carburizer identified almost 1000 such bugs in Linux drivers with a false positive rate of less than 8 percent. With the aid of shadow drivers for recovery, Carburizer can automatically repair 840 of these bugs with no programmer involvement. To facilitate proactive management of device failures, Carburizer can also locate existing driver code that detects device failures and inserts missing failure-reporting code. Finally, the Carburizer runtime can detect and tolerate interrupt-related bugs, such as stuck or missing interrupts. 1
Tolerating malicious device drivers in linux
- In USENIX ATC
, 2010
"... This paper presents SUD, a system for running existing Linux device drivers as untrusted user-space processes. Even if the device driver is controlled by a malicious adversary, it cannot compromise the rest of the system. One significant challenge of fully isolating a driver is to confine the action ..."
Abstract
-
Cited by 27 (3 self)
- Add to MetaCart
(Show Context)
This paper presents SUD, a system for running existing Linux device drivers as untrusted user-space processes. Even if the device driver is controlled by a malicious adversary, it cannot compromise the rest of the system. One significant challenge of fully isolating a driver is to confine the actions of its hardware device. SUD relies on IOMMU hardware, PCI express bridges, and message-signaled interrupts to confine hardware devices. SUD runs unmodified Linux device drivers, by emulating a Linux kernel environment in user-space. A prototype of SUD runs drivers for Gigabit Ethernet, 802.11 wireless, sound cards, USB host controllers, and USB devices, and it is easy to add a new device class. SUD achieves the same performance as an in-kernel driver on networking benchmarks, and can saturate a Gigabit Ethernet link. SUD incurs a CPU overhead comparable to existing run-time driver isolation techniques, while providing much stronger isolation guarantees for untrusted drivers. Finally, SUD requires minimal changes to the kernel—just two kernel modules comprising 4,000 lines of code—which may at last allow the adoption of these ideas in practice. 1
Construction of a Highly Dependable Operating System
- In Proc. 6th European Dependable Computing Conf
, 2006
"... It has been well established that most operating system crashes are due to bugs in device drivers. Because drivers are normally linked into the kernel address space, a buggy driver can wipe out kernel tables and bring the system crashing to a grinding halt. We have greatly mitigated this problem by ..."
Abstract
-
Cited by 26 (11 self)
- Add to MetaCart
(Show Context)
It has been well established that most operating system crashes are due to bugs in device drivers. Because drivers are normally linked into the kernel address space, a buggy driver can wipe out kernel tables and bring the system crashing to a grinding halt. We have greatly mitigated this problem by reducing the kernel to an absolute minimum and running each driver as a separate, unprivileged user-mode process. In addition, we implemented a POSIX-conformant operating system, MINIX 3, as multiple user-mode servers. In this design, a server or driver failure no longer is fatal and does not require rebooting the computer. This paper discusses how we designed and implemented the system, which problems we encountered, and how we solved these problems. We also discuss the performance effects of our changes and evaluate how our multiserver design improves operating system dependability over monolithic designs. ’Perfection is not achieved when there is nothing left to add, but when there is nothing left to take away.’ 1.
The OKL4 Microvisor: Convergence Point of Microkernels and Hypervisors ABSTRACT
"... We argue that recent hypervisor-vs-microkernel discussions completely miss the point. Fundamentally, the two classes of systems have much in common, and provide similar abstractions. We assert that the requirements for both types of systems can be met with a single set of abstractions, a single desi ..."
Abstract
-
Cited by 24 (6 self)
- Add to MetaCart
We argue that recent hypervisor-vs-microkernel discussions completely miss the point. Fundamentally, the two classes of systems have much in common, and provide similar abstractions. We assert that the requirements for both types of systems can be met with a single set of abstractions, a single design, and a single implementation. We present partial proof of the existence of this convergence point, in the guise of the OKL4 microvisor, an industrial-strength system designed as a highly-efficient hypervisor for use in embedded systems. It is also a third-generation microkernel that aims to support the construction of similarly componentised systems as classical microkernels. Benchmarks show that the microvisor’s virtualization performance is highly competitive.