Measurements of a Distributed File System
, 1991
Cited by 399 (10 self)
We analyzed the user-level file access patterns and caching behavior of the Sprite distributed file system. The first part of our analysis repeated a study done in 1985 of the BSD UNIX file system. We found that file throughput has increased by a factor of 20 to an average of 8 Kbytes per second per active user over 10-minute intervals, and that the use of process migration for load sharing increased burst rates by another factor of six. Also, many more very large (multi-megabyte) files are in use today than in 1985. The second part of our analysis measured the behavior of Sprite's main-memory file caches. Client-level caches average about 7 Mbytes in size (about one-quarter to one-third of main memory) and filter out about 50% of the traffic between clients and servers. 35% of the remaining server traffic is caused by paging, even on workstations with large memories. We found that client cache consistency is needed to prevent stale data errors, but that it is not invoked often enough to degrade overall system performance.
A comparison of file system workloads
- In Proceedings of the 2000 USENIX Annual Technical Conference
, 2000
Cited by 197 (3 self)
In this report, we describe the collection of file system traces from three different environments. By using the auditing system to collect traces on client machines, we are able to get detailed traces with minimal kernel changes. We then present results of traffic analysis on the traces, contrasting them with those from previous studies. Based on these results, we argue that file systems must optimize disk layout for good read performance.
An Analytical Approach to File Prefetching
- In Proceedings of the USENIX 1997 Annual Technical Conference
, 1997
Cited by 115 (0 self)
File prefetching is an effective technique for improving file access performance. In this paper, we present a file prefetching mechanism that is based on on-line analytic modeling of interesting system events and is transparent to higher levels. The mechanism, incorporated into a client's file cache manager, seeks to build semantic structures that capture the intrinsic correlations between file accesses. It then heuristically uses these structures to represent distinct file usage patterns and exploits them to prefetch files from a file server. We show results of a simulation study and of a working implementation. Measurements suggest that our method can predict future file accesses with an accuracy around 90%, and that it can reduce the cache miss rate by up to 47% and application latency by up to 40%. Our method imposes little overhead, even under antagonistic circumstances.
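The idea of exploiting correlations between file accesses can be illustrated with a much simpler model than the one the abstract describes. The sketch below is not the paper's mechanism: it assumes a first-order "most frequent successor" predictor, and the names `SuccessorPredictor` and `prediction_accuracy` are hypothetical.

```python
from collections import Counter, defaultdict


class SuccessorPredictor:
    """Hypothetical first-order model: remembers which file tends to follow which."""

    def __init__(self):
        self.successors = defaultdict(Counter)
        self.last = None

    def record(self, path):
        # Count the transition from the previously accessed file to this one.
        if self.last is not None:
            self.successors[self.last][path] += 1
        self.last = path

    def predict(self, path):
        # Return the most frequently observed successor, or None if unseen.
        counts = self.successors.get(path)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


def prediction_accuracy(trace):
    """Fraction of attempted predictions that matched the next access."""
    predictor = SuccessorPredictor()
    hits = attempts = 0
    prev = None
    for path in trace:
        if prev is not None:
            guess = predictor.predict(prev)
            if guess is not None:
                attempts += 1
                hits += guess == path
        predictor.record(path)
        prev = path
    return hits / attempts if attempts else 0.0
```

A cache manager built on such a model would issue a prefetch for `predict(path)` whenever `path` is opened. Accuracy on a real trace depends heavily on the workload, so this sketch says nothing about the figures reported in the abstract.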
File System Usage in Windows NT 4.0
- ACM Symposium on Operating System Principles (Kiawah Island Resort
, 1999
Cited by 106 (0 self)
We have performed a study of the usage of the Windows NT File System through long-term kernel tracing. Our goal was to provide a new data point with respect to the 1985 and 1991 trace-based File System studies, to investigate the usage details of the Windows NT file system architecture, and to study the overall statistical behavior of the usage data. In this paper we report on these issues through a detailed comparison with the older traces, through details on the operational characteristics, and through a usage analysis of the file system and cache manager. In addition to architectural insights, we provide evidence for the pervasive presence of heavy-tail distribution characteristics in all aspects of file system usage. Extreme variances are found in session inter-arrival times, session holding times, read/write frequencies, read/write buffer sizes, etc., which is of importance to system engineering, tuning, and benchmarking.
Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files
- In Proceedings of the 1997 USENIX Technical Conference
, 1997
Cited by 92 (14 self)
Small file performance in most file systems is limited by slowly improving disk access times, even though current file systems improve on-disk locality by allocating related data objects in the same general region. The key insight for why current file systems perform poorly is that locality is insufficient: exploiting disk bandwidth for small data objects requires that they be placed adjacently. We describe C-FFS (Co-locating Fast File System), which introduces two techniques, embedded inodes and explicit grouping, for exploiting what disks do well (bulk data movement) to avoid what they do poorly (reposition to new locations). With embedded inodes, the inodes for most files are stored in the directory with the corresponding name, removing a physical level of indirection without sacrificing the logical level of indirection. With explicit grouping, the data blocks of multiple small files named by a given directory are allocated adjacently and moved to and from the disk as a unit in ...
Long Term Distributed File Reference Tracing: Implementation and Experience
, 1994
Cited by 82 (3 self)
DFSTrace is a system to collect and analyze long-term file reference data in a distributed UNIX workstation environment. The design of DFSTrace is unique in that it pays particular attention to efficiency, extensibility, and the logistics of long-term trace data collection in a distributed environment. The components of DFSTrace are a set of kernel hooks, a kernel buffer mechanism, a data extraction agent, a set of collection servers, and post-processing tools. Our experience with DFSTrace has been highly positive. Tracing has been virtually unnoticeable, degrading performance 3-7%, depending on the level of detail of tracing. We have collected file reference traces from approximately 30 workstations continuously for over two years. We have implemented a post-processing library to provide a convenient programmer interface to the traces, and have created an on-line database of results from a suite of analysis programs to aid trace selection. Our data has been used for a wide variety of purposes, including file system studies, performance measurement and tuning, and debugging. Extensions of DFSTrace have enabled its use in applications such as field reliability testing and determining disk geometry. This paper presents the design, implementation, and evaluation of DFSTrace and associated tools, and describes how they have been used.
Track-aligned Extents: Matching Access Patterns to Disk Drive Characteristics
- In Proceedings of the 1st USENIX Symposium on File and Storage Technologies (FAST '02)
, 2002
Cited by 72 (19 self)
Track-aligned extents (traxtents) utilize disk-specific knowledge to match access patterns to the strengths of modern disks. By allocating and accessing related data on disk track boundaries, a system can avoid most rotational latency and track crossing overheads. Avoiding these overheads can increase disk access efficiency by up to 50% for mid-sized requests (100-500 KB). This paper describes traxtents, algorithms for detecting track boundaries, and some uses of traxtents in file systems and video servers. For large-file workloads, a version of FreeBSD's FFS implementation that exploits traxtents reduces application run times by up to 20% compared to the original version. A video server using traxtent-based requests can support 56% more concurrent streams at the same startup latency and buffer space. For LFS, 44% lower overall write cost for track-sized segments can be achieved.
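The alignment arithmetic behind this idea can be sketched as follows. This is an illustration only: it assumes a constant number of blocks per track, whereas real disks vary track size across zones (which is why the paper needs boundary-detection algorithms). The function names are hypothetical.

```python
def tracks_touched(start_block, length, blocks_per_track):
    """Number of tracks a request of `length` blocks spans (assumes uniform tracks)."""
    first = start_block // blocks_per_track
    last = (start_block + length - 1) // blocks_per_track
    return last - first + 1


def align_to_track(start_block, blocks_per_track):
    """Round a starting block up to the next track boundary (ceiling division)."""
    return -(-start_block // blocks_per_track) * blocks_per_track
```

With 512 blocks per track, a 512-block request starting at block 100 spans two tracks and pays a track crossing, while the same request moved to `align_to_track(100, 512)` fits in exactly one track.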
The DiskSim Simulation Environment -- Version 2.0 Reference Manual
, 1999
Cited by 64 (3 self)
DiskSim is an efficient, accurate and highly-configurable disk system simulator developed at the University of Michigan to support research into various aspects of storage subsystem architecture. It includes modules that simulate disks, intermediate controllers, buses, device drivers, request schedulers, disk block caches, and disk array data organizations. In particular, the disk drive module simulates modern disk drives in great detail and has been carefully validated against several production disks (with accuracy that exceeds any previously reported simulator). This manual
Disk Cache Replacement Policies for Network Fileservers
, 1993
Cited by 55 (4 self)
Trace-driven simulations are used to study the performance of several disk cache replacement policies for network fileservers. It is shown that locality-based approaches, such as the common Least Recently Used (LRU) policy, which are known to work well on standalone disked workstations and at client workstations in distributed systems, are inappropriate at a fileserver. Quite simple frequency-based approaches do better.
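A toy trace-driven comparison shows how a frequency-based policy can outperform LRU when one hot block is interleaved with a stream of one-time accesses. The abstract does not say which frequency-based policies were studied, so plain LFU with persistent counts is an assumption here, and the trace in the note below is invented for illustration rather than taken from the paper.

```python
from collections import Counter, OrderedDict


def lru_hits(trace, capacity):
    """Count hits under Least Recently Used replacement."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[block] = True
    return hits


def lfu_hits(trace, capacity):
    """Count hits under a simple Least Frequently Used policy (assumed variant)."""
    cache = set()
    freq = Counter()                       # counts persist across evictions
    hits = 0
    for block in trace:
        freq[block] += 1
        if block in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                # Evict the least-referenced block; ties go to the lowest block id.
                victim = min(cache, key=lambda b: (freq[b], b))
                cache.remove(victim)
            cache.add(block)
    return hits
```

On the trace `[0, 0, 1, 2, 0, 3, 4, 0, 5, 6, 0]` with a two-block cache, LRU keeps evicting the hot block 0 and scores 1 hit, while LFU retains it and scores 4.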
Large Granularity Cache Coherence for Intermittent Connectivity
Cited by 41 (5 self)
To function in mobile computing environments, distributed file systems must cope with networks that are slow, intermittent, or both. Intermittence vitiates the effectiveness of callback-based cache coherence schemes in reducing client-server communication, because clients must validate files when connections are reestablished. In this paper we show how maintaining cache coherence at a large granularity alleviates this problem. We report on the implementation and performance of large granularity cache coherence for the Coda File System. Our measurements confirm the value of this technique. At 9.6 Kbps, this technique takes only 4-20% of the time required by two other strategies to validate the cache for a sample of Coda users. Even at this speed, the network is effectively eliminated as the bottleneck for cache validation.
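One way to picture validation at a large granularity is to compare a version stamp for a whole volume before falling back to per-file checks. The sketch below assumes such per-volume stamps and counts validation requests instead of performing network I/O. The data layout and function names are hypothetical and are not taken from the Coda implementation.

```python
def validate_per_file(cache, server_files):
    """Baseline: one validation request per cached file."""
    requests = 0
    stale = []
    for key, version in cache["files"].items():
        requests += 1
        if server_files[key] != version:
            stale.append(key)
    return requests, sorted(stale)


def validate_per_volume(cache, server_files, server_volumes):
    """Check one stamp per volume; fall back to per-file checks only on mismatch."""
    requests = 0
    stale = []
    for volume, stamp in cache["volumes"].items():
        requests += 1
        if server_volumes[volume] == stamp:
            continue  # nothing in this volume changed, so all its files are valid
        for (vol, name), version in cache["files"].items():
            if vol != volume:
                continue
            requests += 1
            if server_files[(vol, name)] != version:
                stale.append((vol, name))
    return requests, sorted(stale)
```

When most volumes are unchanged after a disconnection, the per-volume pass needs far fewer requests than the per-file pass, which matters most on a slow link.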

