Workload-Specific File System Benchmarks
, 2001
Cited by 9 (0 self)
A fundamental problem with the current generation of file system benchmarks is that they fail to take into account the fact that a file system’s performance can vary depending on the workload running on it. Many benchmarks attempt to reduce file system performance to a single number, producing a simplistic one-dimensional ordering of the systems being tested. Although this may be useful for marketing literature, the performance of file systems in the real world is more complicated. Different workloads place different demands on the file system, and can result in different behavior from the underlying system. A file system that provides superior performance for a web server may have inferior performance when running a software development workload. In this dissertation I demonstrate that the “one size fits all” approach of current file system benchmarks does not accurately predict the performance of different workloads on different file systems. I then present a new benchmarking methodology
A case for increased operating system support in chip multiprocessors
- In Proc. of 2nd IBM Watson P=ac2
, 2005
Cited by 7 (3 self)
We identify the operating system as one area where a novel architecture could significantly improve on current chip multi-processor designs, allowing increased performance and improved power efficiency. We first show that the operating system contributes a non-trivial overhead to even the most computationally intense workloads and that this OS contribution grows to a significant fraction of total instructions when executing interactive applications. We then show that architectural improvements have had little to no effect on the performance of the operating system over the last 15 years. Based on these observations, we propose the need for increased operating system support in chip multiprocessors. Specifically, we consider the potential of a separate Operating System Processor (OSP) operating concurrently with General Purpose Processors (GPP) in a Chip Multi-Processor (CMP) organization.
Using Context to Assist in Personal File Retrieval
, 2006
Cited by 6 (2 self)
Personal data is growing at ever-increasing rates, fueled by a growing market for personal computing solutions and dramatic growth of available storage space on these platforms. Users, no longer limited in what they can store, are now faced with the problem of organizing their data such that they can find it again later. Unfortunately, as data sets grow, the complexity of organizing them also grows. This problem has driven a sudden growth in search tools aimed at the personal computing space, designed to assist users in locating data within their disorganized file space.
Simple and General Statistical Profiling with PCT
- In Proc. USENIX Technical Conference
, 2002
Cited by 5 (0 self)
The Profile Collection Toolkit (PCT) provides a novel generalized CPU profiling facility. PCT enables arbitrarily late profiling activation and arbitrarily early report generation. PCT usually requires no re-compilation, re-linking, or even restarting of programs. Profiling reports gracefully degrade with available debugging data. PCT uses
Providing a Linux API on the Scalable K42 Kernel
Cited by 4 (4 self)
K42 is an open-source research kernel targeted for 64-bit cache-coherent multiprocessor systems. It was designed to scale up to multiprocessor systems containing hundreds or thousands of processors and to scale down to perform well on 2- to 4-way multiprocessors. K42's goal was to re-design the core of an operating system, but not an entire application environment. We wanted to use a commonly available interface with a large established code base. Because Linux is open source and widely available, we chose to support its application environment by supporting the Linux API and ABI. There were some interesting complications as well as advantages that arose from K42's structure because our implementation of the Linux application environment was done primarily in user space, had to interface with K42's object-oriented technology, and used fine-grained locking. Other research systems efforts directed at achieving a high degree of scalability and maintainability exhibit similar structural characteristics. In this
Trace-Based Analyses and Optimizations for Network Storage Servers
, 2004
Cited by 3 (0 self)
In this thesis, I show how network storage servers can infer useful information about the requests they are likely to see in the future by analyzing the history of requests they have observed in the past. I also show that this information can be used to improve future decisions about disk block allocation and read-ahead and thereby increase network storage server performance without any change to its clients or the applications running on its clients.
Optimizing Bandwidth of Call Traces for Wireless Embedded Systems
Cited by 3 (2 self)
Call traces expose runtime behaviors that greatly aid system developers in profiling performance and diagnosing problems within wireless embedded applications. Strict resource constraints limit the volume of trace data that can be handled on embedded devices, especially bandwidth-limited wireless embedded systems. We propose two new call trace gathering techniques, local identifier logging and control flow logging, which provide significant reductions in bandwidth consumption compared to the current standard practice of global identifier logging. An analytical comparison of the bandwidth required by various call tracing approaches provides intuition into the savings made possible by the proposed techniques. Experiments confirm this intuition, revealing log bandwidth savings of approximately 85% compared to global identifier logging using flat name spaces, and 35% compared to global identifier logging using optimal Huffman coding. Index Terms: Wireless embedded systems, logging, bandwidth compression, dataflow.
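As a rough illustration of the kind of analytical comparison the abstract mentions, the following minimal sketch (not taken from the paper) counts identifier bits per call event under two schemes: a fixed-width identifier drawn from a flat global name space, and an identifier that only has to distinguish among the callees of the current caller. The call graph, the event list, and the helper names `bits_for`, `global_id_bits`, `local_id_bits`, and `trace_cost` are all hypothetical; the paper's actual encodings may differ, and the sketch ignores returns and any framing a decoder would need.

```python
# Hypothetical call graph: each function maps to the functions it may call.
CALL_GRAPH = {
    "main": ["init", "loop"],
    "init": ["radio_on", "timer_start"],
    "loop": ["sense", "send", "sleep"],
    "radio_on": [],
    "timer_start": [],
    "sense": [],
    "send": [],
    "sleep": [],
}

# Hypothetical trace: (caller, callee) pairs in the order they occur.
CALLS = [
    ("main", "init"),
    ("init", "radio_on"),
    ("init", "timer_start"),
    ("main", "loop"),
    ("loop", "sense"),
    ("loop", "send"),
    ("loop", "sleep"),
]


def bits_for(n):
    """Bits needed to tell n alternatives apart (0 if there is at most one)."""
    return max(n - 1, 0).bit_length()


def global_id_bits(graph):
    """Width of a fixed-size ID over the flat name space of all functions."""
    return bits_for(len(graph))


def local_id_bits(graph, caller):
    """Width of an ID that only distinguishes the caller's own callees."""
    return bits_for(len(graph[caller]))


def trace_cost(graph, calls):
    """Return (global_bits, local_bits) for logging every call in `calls`."""
    global_bits = global_id_bits(graph) * len(calls)
    local_bits = sum(local_id_bits(graph, caller) for caller, _ in calls)
    return global_bits, local_bits


print(trace_cost(CALL_GRAPH, CALLS))  # (21, 10)
```

In this toy graph the local scheme needs fewer than half the bits of the flat global one; the ratio depends entirely on the assumed graph and trace and is not meant to reproduce the figures reported in the abstract.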
Scoped Identifiers for Efficient Bit Aligned Logging
- In DATE. ACM, 2010
Cited by 3 (1 self)
Detailed diagnostic data is a prerequisite for debugging problems and understanding runtime performance in distributed wireless embedded systems. Severe bandwidth limitations, tight timing constraints, and limited program text space hinder the application of standard diagnostic tools within this domain. This work introduces the Log Instrumentation Specification (LIS), which provides a high-level logging interface to developers and is able to create extremely compact diagnostic logs. LIS uses a token scoping technique to aggressively compact identifiers that are packed into bit-aligned log buffers. LIS is evaluated in the context of recording call traces within a network of wireless sensor nodes. Our evaluation shows that logs generated using LIS require less than 50% of the bandwidth utilized by alternate logging mechanisms. Through microbenchmarking of a complete LIS implementation for the TinyOS operating system, we demonstrate that LIS can comfortably fit onto low-end embedded systems. By significantly reducing log bandwidth, LIS enables extraction of a more complete picture of runtime behavior from distributed wireless embedded systems.
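The idea of packing variable-width identifiers into a bit-aligned buffer can be sketched as follows. This is a generic illustration, not the LIS log format, which the abstract does not describe; the `BitWriter` class, the `read_fields` helper, and the example field widths are hypothetical.

```python
class BitWriter:
    """Append unsigned fields back to back, ignoring byte boundaries."""

    def __init__(self):
        self._acc = 0    # every bit written so far, held as one integer
        self._nbits = 0  # number of valid bits in _acc

    def write(self, value, width):
        """Append `value` using exactly `width` bits."""
        if width < 0 or value < 0 or value >= (1 << width):
            raise ValueError("value does not fit in the given width")
        self._acc = (self._acc << width) | value
        self._nbits += width

    def to_bytes(self):
        """Return the buffer, zero-padded on the right to a whole byte."""
        pad = (-self._nbits) % 8
        return (self._acc << pad).to_bytes((self._nbits + pad) // 8, "big")


def read_fields(data, widths):
    """Decode fields of the given bit widths from the front of `data`."""
    acc = int.from_bytes(data, "big")
    total = len(data) * 8
    fields = []
    pos = 0
    for width in widths:
        pos += width
        fields.append((acc >> (total - pos)) & ((1 << width) - 1))
    return fields


# Four identifiers whose widths (1, 1, 2, and 3 bits) depend on their scope.
writer = BitWriter()
for value, width in [(1, 1), (0, 1), (2, 2), (5, 3)]:
    writer.write(value, width)

packed = writer.to_bytes()
print(packed)                             # b'\xaa'
print(read_fields(packed, [1, 1, 2, 3]))  # [1, 0, 2, 5]
```

Here four identifiers occupy one byte, where a byte-aligned log would spend at least one byte on each. The decoder must already know each field's width, which in a scoped scheme would presumably follow from the scope in which the identifier was logged.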
Towards runtime monitoring in real-time systems
- In Proceedings of the Eighth Real-Time Linux Workshop
, 2006
Cited by 2 (0 self)
In this paper we present the state of our work on runtime monitoring for real-time systems: a way to observe system behavior online without unpredictably disturbing real-time properties. We discuss generic requirements to achieve these properties, from which we derive our monitoring framework architecture. We describe this architecture in detail and discuss several challenges for our implementation, called Ferret. We also explain why common operating system primitives, such as message passing or system calls, should not be used for monitoring in the general case and propose a very low-intrusion alternative. We also propose a way of measuring the intrusiveness caused by monitoring. We applied our technique in different scenarios ranging from simple temporal debugging, resource requirement estimation, and gaining behavioral information about peripheral hardware devices to build timing models for providing real-time capable service on top of them, up to whole-system views, such as the interaction between concurrently running system threads. Our research platform also contains a para-virtualized version of Linux that we use to run legacy applications. We discuss how to apply our framework to these components, with real-time requirements being only one of several important aspects. We also show how to compare the behavior of our para-virtualized Linux kernel with the behavior of the native variant. In this work, we demonstrate how to gain a continuous whole-system view by using only Ferret sensors in all layers of our system, starting from the underlying microkernel, basic microkernel programs, real-time applications, and the para-virtualized Linux kernel, as well as Linux user-space applications.
Towards Scalable Event Tracing for High End Systems
Cited by 2 (1 self)
Although event tracing of parallel applications offers highly detailed performance information, tracing on current leading-edge systems may lead to unacceptable perturbation of the target program and unmanageably large trace files. High-end systems of the near future promise even greater scalability challenges. Development of more scalable approaches requires a detailed understanding of the interactions between current approaches and high-end runtime environments. In this paper we present the results of studies that examine several sources of overhead related to tracing: instrumentation, differing trace buffer sizes, periodic buffer flushes to disk, system changes, and increasing numbers of processors in the target application. As expected, the overhead of instrumentation correlates strongly with the number of events; however, our results indicate that the contribution of writing the trace buffer increases with increasing numbers of processors. We include evidence that the total overhead of tracing is sensitive to the underlying file system.

