Results 1 - 10 of 53
Semantically-Smart Disk Systems
2003
"... We propose and evaluate the concept of a semantically-smart disk system (SDS). As opposed to a traditional "smart" disk, an SDS has detailed knowledge of how the file system above is using the disk system, including information about the on-disk data structures of the file system. An SDS e ..."
Abstract - Cited by 100 (13 self)
We propose and evaluate the concept of a semantically-smart disk system (SDS). As opposed to a traditional "smart" disk, an SDS has detailed knowledge of how the file system above is using the disk system, including information about the on-disk data structures of the file system. An SDS exploits this knowledge to transparently improve performance or enhance functionality beneath a standard block read/write interface. To automatically acquire this knowledge, we introduce a tool (EOF) that can discover file-system structure for certain types of file systems, and then show how an SDS can exploit this knowledge on-line to understand file-system behavior. We quantify the space and time overheads that are common in an SDS, showing that they are not excessive. We then study the issues surrounding SDS construction by designing and implementing a number of prototypes as case studies; each case study exploits knowledge of some aspect of the file system to implement powerful functionality beneath the standard SCSI interface. Overall, we find that a surprising amount of functionality can be embedded within an SDS, hinting at a future where disk manufacturers can compete on enhanced functionality and not simply cost-per-byte and performance.
A nine year study of file system and storage benchmarking
ACM Transactions on Storage, 2008
"... Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features ..."
Abstract - Cited by 55 (8 self)
Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features and optimizations, so no single benchmark is always suitable. The large variety of workloads that these systems experience in the real world also adds to this difficulty. In this article we survey 415 file system and storage benchmarks from 106 recent papers. We found that most popular benchmarks are flawed and many research papers do not provide a clear indication of true performance. We provide guidelines that we hope will improve future performance evaluations. To show how some widely used benchmarks can conceal or overemphasize overheads, we conducted a set of experiments. As a specific example, slowing down read operations on ext2 by a factor of 32 resulted in only a 2–5% wall-clock slowdown in a popular compile benchmark. Finally, we discuss future work to improve file system and storage benchmarking.
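The ext2 example above is an instance of Amdahl's Law: a large slowdown in one component barely registers when that component occupies a small fraction of wall-clock time. A minimal sketch of the arithmetic (the read fractions below are illustrative values chosen to match the reported range, not figures from the paper):

```python
def overall_slowdown(read_fraction, read_slowdown):
    """Amdahl-style estimate: slow only the read portion of the run."""
    return (1 - read_fraction) + read_fraction * read_slowdown

# With reads at roughly 0.1% of wall-clock time, a 32x read slowdown
# yields only about a 3% overall slowdown, consistent with the 2-5%
# range observed for the compile benchmark.
print(round(overall_slowdown(0.001, 32), 3))
```

This is why compile benchmarks, which are largely CPU-bound, make poor file-system benchmarks: they can hide an order-of-magnitude I/O regression.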
Operating system profiling via latency analysis
In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), 2006
"... Operating systems are complex and their behavior depends on many factors. Source code, if available, does not directly help one to understand the OS’s behavior, as the behavior depends on actual workloads and external inputs. Runtime profiling is a key technique to prove new concepts, debug problems ..."
Abstract - Cited by 37 (13 self)
Operating systems are complex and their behavior depends on many factors. Source code, if available, does not directly help one to understand the OS’s behavior, as the behavior depends on actual workloads and external inputs. Runtime profiling is a key technique to prove new concepts, debug problems, and optimize performance. Unfortunately, existing profiling methods are lacking in important areas—they do not provide enough information about the OS’s behavior, they require OS modification and therefore are not portable, or they incur high overheads thus perturbing the profiled OS. We developed OSprof: a versatile, portable, and efficient OS profiling method based on latency distribution analysis. OSprof automatically selects important profiles for subsequent visual analysis. We have demonstrated that a suitable workload can be used to profile virtually any OS component. OSprof is portable because it can intercept operations and measure OS behavior from user-level or from inside the kernel without requiring source code. OSprof has typical CPU time overheads below 4%. In this paper we describe our techniques and demonstrate their usefulness through a series of profiles conducted on Linux, FreeBSD, and Windows, including client/server scenarios. We discovered and investigated a number of interesting interactions, including scheduler behavior, multi-modal I/O distributions, and a previously unknown lock contention, which we fixed.
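The core of latency-distribution profiling is a histogram with power-of-two buckets: separate peaks in the histogram correspond to distinct internal code paths (for example, cache hits versus disk accesses). A minimal user-level sketch of that idea, with function names of our own choosing rather than OSprof's:

```python
import math
import time

def profile_latencies(op, runs=1000):
    """Run op repeatedly and bin each latency (in ns) into log2 buckets."""
    hist = {}
    for _ in range(runs):
        start = time.perf_counter_ns()
        op()
        elapsed = time.perf_counter_ns() - start
        # Bucket k holds latencies in [2**k, 2**(k+1)) nanoseconds.
        bucket = int(math.log2(elapsed)) if elapsed > 0 else 0
        hist[bucket] = hist.get(bucket, 0) + 1
    return hist  # separate peaks suggest distinct code paths

# Example: profile a no-op; real use would wrap a file read, stat, etc.
histogram = profile_latencies(lambda: None, runs=200)
```

Logarithmic buckets keep the profile compact (a few dozen counters span nanoseconds to seconds) while still separating modes that differ by orders of magnitude.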
A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications
"... We analyze the I/O behavior of iBench, a new collection of productivity and multimedia application workloads. Our analysis reveals a number of differences between iBench and typical file-system workload studies, including the complex organization of modern files, the lack of pure sequential access, ..."
Abstract - Cited by 34 (9 self)
We analyze the I/O behavior of iBench, a new collection of productivity and multimedia application workloads. Our analysis reveals a number of differences between iBench and typical file-system workload studies, including the complex organization of modern files, the lack of pure sequential access, the influence of underlying frameworks on I/O patterns, the widespread use of file synchronization and atomic operations, and the prevalence of threads. Our results have strong ramifications for the design of next generation local and cloud-based storage systems.
Model-Based Failure Analysis of Journaling File Systems
In Proceedings of the International Conference on Dependable Systems and Networks (DSN-2005), 2005
"... We propose a novel method to measure the dependability of journaling file systems. In our approach, we build models of how journaling file systems must behave under different journaling modes and use these models to analyze file system behavior under disk failures. Using our techniques, we measure t ..."
Abstract - Cited by 20 (5 self)
We propose a novel method to measure the dependability of journaling file systems. In our approach, we build models of how journaling file systems must behave under different journaling modes and use these models to analyze file system behavior under disk failures. Using our techniques, we measure the robustness of three important Linux journaling file systems: ext3, Reiserfs and IBM JFS. From our analysis, we identify several design flaws and correctness bugs present in these file systems, which can cause serious file system errors ranging from data corruption to unmountable file systems.
Redline: First class support for interactivity in commodity operating systems
In Proceedings of OSDI, 2008
"... While modern workloads are increasingly interactive and resource-intensive (e.g., graphical user interfaces, browsers, and multimedia players), current operating systems have not kept up. These operating systems, which evolved from core designs that date to the 1970s and 1980s, provide good support ..."
Abstract - Cited by 19 (0 self)
While modern workloads are increasingly interactive and resource-intensive (e.g., graphical user interfaces, browsers, and multimedia players), current operating systems have not kept up. These operating systems, which evolved from core designs that date to the 1970s and 1980s, provide good support for batch and command-line applications, but their ad hoc attempts to handle interactive workloads are poor. Their best-effort, priority-based schedulers provide no bounds on delays, and their resource managers (e.g., memory managers and disk I/O schedulers) are mostly oblivious to response time requirements. Pressure on any one of these resources can significantly degrade application responsiveness. We present Redline, a system that brings first-class support for interactive applications to commodity operating systems. Redline works with unaltered applications and standard APIs. It uses lightweight specifications to orchestrate memory and disk I/O management so that they serve the needs of interactive applications. Unlike realtime systems that treat specifications as strict requirements and thus pessimistically limit system utilization, Redline dynamically adapts to recent load, maximizing responsiveness and system utilization. We show that Redline delivers responsiveness to interactive applications even in the face of extreme workloads including fork bombs, memory bombs and bursty, large disk I/O requests, reducing application pauses by up to two orders of magnitude.
Accurate and efficient replaying of file system traces
In Proceedings of the USENIX Conference on File and Storage Technologies (FAST ’05), 2005
"... Replaying traces is a time-honored method for benchmarking, stress-testing, and debugging systems—and more recently—forensic analysis. One benefit to replaying traces is the reproducibility of the exact set of operations that were captured during a specific workload. Existing trace capture and repla ..."
Abstract - Cited by 16 (5 self)
Replaying traces is a time-honored method for benchmarking, stress-testing, and debugging systems, and, more recently, for forensic analysis. One benefit of replaying traces is the reproducibility of the exact set of operations that were captured during a specific workload. Existing trace capture and replay systems operate at different levels: network packets, disk device drivers, network file systems, or system calls. System call replayers miss memory-mapped operations and cannot replay I/O-intensive workloads at original speeds. Traces captured at other levels miss vital information that is available only at the file system level. We designed and implemented Replayfs, the first system for replaying file system traces at the VFS level. The VFS is the most appropriate level for replaying file system traces because all operations are reproduced in a manner that is most relevant to file-system developers. Thanks to the uniform VFS API, traces can be replayed transparently onto any existing file system, even a different one than the one originally traced, without modifying existing file systems. Replayfs’s user-level compiler prepares a trace to be replayed efficiently in the kernel, where multiple kernel threads prefetch and schedule the replay of file system operations precisely and efficiently. These techniques allow us to replay I/O-intensive traces at different speeds, and even accelerate them on the same hardware that the trace was captured on originally.
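Replayfs itself runs in the kernel at the VFS level, but the core timing logic, replaying each operation at its recorded offset with an optional speedup factor, can be illustrated at user level. A toy sketch (the trace tuple format, file path, and speedup knob are ours, purely illustrative):

```python
import os
import tempfile
import time

# Hypothetical trace: (seconds since trace start, operation, path, data).
DEMO_PATH = os.path.join(tempfile.gettempdir(), "replay_demo.bin")
TRACE = [
    (0.00, "write", DEMO_PATH, b"hello"),
    (0.05, "read", DEMO_PATH, None),
]

def replay(trace, speedup=1.0):
    """Replay trace operations, scaling inter-operation delays by speedup."""
    start = time.monotonic()
    results = []
    for offset, op, path, data in trace:
        # Sleep until this operation's (scaled) scheduled time.
        remaining = offset / speedup - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
        if op == "write":
            with open(path, "wb") as f:
                results.append(f.write(data))
        elif op == "read":
            with open(path, "rb") as f:
                results.append(f.read())
    return results
```

Setting `speedup` above 1 compresses idle gaps, which is the user-level analogue of the accelerated replay the paper describes; a real replayer must also preserve ordering dependencies between operations, which this sketch ignores.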
A Logic of File Systems
In Proceedings of the 4th USENIX Symposium on File and Storage Technologies (FAST ’05), 2005
"... Years of innovation in file systems have been highly successful in improving their performance and functionality, but at the cost of complicating their interaction with the disk. A variety of techniques exist to ensure consistency and integrity of file system data, but the precise set of correctness ..."
Abstract - Cited by 16 (2 self)
Years of innovation in file systems have been highly successful in improving their performance and functionality, but at the cost of complicating their interaction with the disk. A variety of techniques exist to ensure consistency and integrity of file system data, but the precise set of correctness guarantees provided by each technique is often unclear, making them hard to compare and reason about. The absence of a formal framework has hampered detailed verification of file system correctness. We present a logical framework for modeling the interaction of a file system with the storage system, and show how to apply the logic to represent and prove correctness properties. We demonstrate that the logic provides three main benefits. First, it enables reasoning about existing file system mechanisms, allowing developers to employ aggressive performance optimizations without fear of compromising correctness. Second, the logic simplifies the introduction and adoption of new file system functionality by facilitating rigorous proof of their correctness. Finally, the logic helps reason about smart storage systems that track semantic information about the file system. A key aspect of the logic is that it enables incremental modeling, significantly reducing the barrier to entry in terms of its actual use by file system designers. In general, we believe that our framework transforms the hitherto esoteric and error-prone “art” of file system design into a readily understandable and formally verifiable process.
EXCES: EXternal Caching in Energy Saving Storage Systems
2007
"... Power consumption within the disk-based storage subsystem forms a substantial portion of the overall energy footprint in commodity systems. Researchers have proposed external caching on a persistent, low-power storage device which we term external caching device (ECD), to minimize disk activity and ..."
Abstract - Cited by 15 (7 self)
Power consumption within the disk-based storage subsystem forms a substantial portion of the overall energy footprint in commodity systems. Researchers have proposed external caching on a persistent, low-power storage device, which we term an external caching device (ECD), to minimize disk activity and conserve energy. While recent simulation-based studies have argued in favor of this approach, the lack of an actual system implementation has precluded answering several key questions about external caching systems. We present the design and implementation of EXCES, an external caching system that employs prefetching, caching, and buffering of disk data for reducing disk activity. EXCES addresses important questions relating to external caching, including the estimation of future data popularity, I/O indirection, continuous reconfiguration of the ECD contents, and data consistency. We evaluated EXCES with both micro- and macro-benchmarks that address idle, I/O intensive, and real-world workloads. While earlier studies had focused on disk energy savings alone and had predicted as much as 90% savings, our experiments with EXCES revealed that the overall system energy savings, which accounts for the additional energy consumed by the ECD, is a more modest 2-14%, depending on the workload. Further, while the CPU and memory overheads of EXCES were well within acceptable limits, we found that flash-based external caching can substantially degrade I/O performance. We believe that external caching systems hold promise. However, substantial improvements in ECD technology, both in terms of power consumption and performance, must ensue before the full potential of such systems is realized.
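The gap between 90% disk-only savings and 2-14% system-wide savings follows from whole-system accounting: the ECD's own draw and the disk's share of total power both matter. A back-of-the-envelope sketch (all wattages below are illustrative, not measured values from the paper):

```python
def system_savings(disk_w, other_w, disk_saved_frac, ecd_w):
    """Net system-level energy savings once the ECD's own draw counts."""
    baseline = disk_w + other_w
    with_ecd = disk_w * (1 - disk_saved_frac) + other_w + ecd_w
    return (baseline - with_ecd) / baseline

# A 90% reduction of a 10 W disk in a 60 W system, minus a 1 W ECD,
# nets only about 13% system-wide savings.
print(round(system_savings(10, 50, 0.9, 1), 3))
```

The same arithmetic explains why simulation studies that model only the disk overstate the benefit: the denominator they use is the disk's power, not the system's.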
Semantically-Smart Disk Systems: Past, Present, and Future
In the Eighth Workshop on Hot Topics in Operating Systems (HotOS VIII), 2001
"... In this paper we describe research that has been on-going within our group for the past four years on semantically-smart disk systems. A semantically-smart system goes beyond typical blockbased storage systems by extracting higher-level information from the stream of traffic to disk; doing so enable ..."
Abstract - Cited by 11 (1 self)
In this paper we describe research that has been ongoing within our group for the past four years on semantically-smart disk systems. A semantically-smart system goes beyond typical block-based storage systems by extracting higher-level information from the stream of traffic to disk; doing so enables new and interesting pieces of functionality to be implemented within low-level storage systems. We first describe the development of our efforts over the past four years, highlighting the key technologies needed to build semantically-smart systems as well as the main weaknesses of our approach. We then discuss future directions in the design and implementation of smarter storage systems.