Ousterhout, J. K., Da Costa, H., Harrison, D., Kunze, J. A., Kupfer, M., and Thompson, J. G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th Symposium on Operating System Principles, pp. 15-24, December 1985.
....queue length and disk speed. On these systems, users must resign themselves to the fact that their data may not be safe on disk when a write or close finishes. Many file systems improve performance further by delaying some writes to disk in the hopes of the new data being deleted or overwritten [54]. This delay is often set to 30 seconds, which risks the loss of data written within 30 seconds of a crash. Unfortunately, 1/3 to 2/3 of newly written data lives longer than 30 seconds [5, 27] and this data is written through to disk under this policy. File systems differ in how much data is ....
J.K. Ousterhout et al., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proc. 1985.
....traffic produced by today's Internet applications. Unfortunately, to the best of our knowledge, there is little recent published work on the exact characteristics of streaming media stored on the Internet. While there have been studies characterizing Web content [16, 17] and file system content [18, 19] measured at the client side, there have been no recent studies on the general characteristics of streaming media stored at the server on the Internet. In 1997, Acharya and Smith [20] characterized video content stored on the Web by analyzing every video available in the (then popular) Alta ....
J. Ousterhout, H.L. DaCosta, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the Unix 4.2 BSD File System," in Proceedings of the 10th Symposium on Operating System Principles (SOSP), Dec. 1985.
.... between CPU and disk I/O [1]. I/O parallelism, a means of increasing the aggregate I/O bandwidth and capacity, is an attractive approach to bridge the gap between CPU and disk I/O [19, 20, 21]. Challenges in parallel I/O systems include communication strategies, file system policies [6], and data storage layouts [7]. There are three kinds of data-intensive applications that require high-performance I/O, namely scientific simulations, multimedia applications, and database management systems [2]. In a data-intensive application, if the data is too large to fit into memory to ....
.... from 32 KB to 512 KB, and the access sizes were relatively constant [4]. Likewise, the regular patterns of large data access on a Cray C90 were studied in [5]. Studies of sequential Unix systems found that the sizes of many files were small, and many files were created and then removed shortly afterward [6]. However, most general file system and storage hierarchy designers have little experimental data on parallel I/O access patterns. Generally, tuning the I/O system for the specified access pattern is an efficient approach to improve the overall performance. Unfortunately, perfect foreknowledge of ....
J. K. Ousterhout, H.D. Costa, D. Harrison, J.A. Kunze, M. Kupfer, and J.G. Thompson, "A trace-driven analysis of the UNIX 4.2 BSD file system," In Proc. of the Tenth ACM Symposium on Operating System Principles, Dec. 1985.
....disk arrays because codewords which do not have up-to-date parity are susceptible to a disk failure. Instead of sacrificing dependability to achieve higher performance, it is possible to increase performance through more traditional means, such as caching. Recognizing that disk traffic is bursty [Ousterhout85, Ruemmler92], a write-back cache can be used to defer updates until the array is idle [Golding95, Menon93a, Symbios95a]. By making the cache nonvolatile, the semantic that completed write operations are durable is preserved. Additionally, deferring write operations allows small sequential ....
Ousterhout, J. K., Da Costa, H., Harrison, D., Kunze, J. A., Kupfer, M., and Thompson, J. G. "A trace-driven analysis of the Unix 4.2 BSD file system." Proceedings of the 10th Symposium on Operating Systems Principles. Orcas Island, WA (1985) 15-24.
....that prefetch unnecessary data are completely useless and are therefore classified as time noncritical. The ufs file system prefetches file blocks sequentially until non-sequential access is detected. Measurements of UNIX file systems indicate that most user-level file reads are sequential [Ousterhout85, Baker91]. This is true of the system-level workloads used in this dissertation. The ufs file system distinguishes between three types of file block writes: synchronous, asynchronous and delayed [Ritchie86]. A synchronous file write immediately generates the corresponding I/O request and causes the ....
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", ACM Symposium on Operating System Principles, 1985, pp. 15--24.
....long-term storage and migration simulator. 2. Previous Work Researchers have looked at disk or tape storage systems in the past. However, the nature of computing changes continually, so storage systems need periodic, if not constant, study. In the 1980s, both Smith [14] and Ousterhout, et al. [10] made detailed studies of file activity on computing systems. While their observations are still useful, some of the underlying structure has lost relevance. For example, Smith primarily observed text-based user files for thirteen months; the size and nature of today's multimedia files, unforeseen ....
....in the system. The first runs the file tracing program nightly as a cron routine and maintains an activity log; the second automates running the differencing program on trace files. Our statistics collection and analysis package for file systems has both strengths and weaknesses. Ousterhout [10] noted that 80% of all file creations have a lifetime of less than three minutes. Because the daemons, compilers and other programs that created these files during Ousterhout's work still exist, so do the temporary files they create. Our system File Activity Statistics Collected Access total ....
John K. Ousterhout, Herve Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System." Operating System Review 19(5), 1985, pp. 15-24.
....file system. These instructions were counted by Mendel Rosenblum. This instruction sequence is relevant because it affects the performance of RAID-I for large sequential operations. Traditional file system workloads are characterized by small transfers, performed 4 KBytes or 8 KBytes at a time. [Ouster] showed that files are generally small, and are read and written in their entirety. Sprite running on the RAID-I host works well for this traditional file system workload, but unfortunately, runs into problems for large operations. All I/O on Sprite is done through the kernel; when the disk read ....
John K. Ousterhout, Herve Da Costa, et al., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", Proceedings of the 10th SOSP, Operating Systems Review, Vol. 19, No. 5, December 1985, pp. 15-24.
....processing environment [9]. However, the cleaning cost in a benchmarking environment is an unrealistic indicator since the benchmark is constantly demanding use of the file system. Unlike benchmark environments, the real-world behavior of most workstation environments is observed to be bursty [1][5]. For example, consider an application that has two phases in which it executes. In phase 1, it creates and deletes many small files. In phase 2, it computes or uses the network, or terminates. Examples of such applications include Sendmail and NNTP servers. In FFS, the writes for the new data are ....
Ousterhout, J., Costa, H., Harrison, D., Kunze, J., Kupfer, M., Thompson, J., "A Trace-Driven Analysis of the UNIX 4.2BSD File System," Proceedings of the Tenth Symposium on Operating System Principles, December 1985, 15-24.
....its decisions. Also, with delayed allocation, short-lived files which can be buffered in memory are often never allocated any real disk blocks. The files are removed and purged from the file cache before they are pushed to disk. Such short-lived files appear to be relatively common in Unix systems [Ousterhout85, Baker91], and delayed allocation reduces both the number of metadata updates caused by such files and the impact of such files on file system fragmentation. Another benefit of delayed allocation is that files which are written randomly but have no holes can often be allocated contiguously. If all of the ....
Ousterhout, J., Da Costa, H., Harrison, D., Kunze, J., Kupfer, M., Thompson, J., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th Symposium on Operating System Principles, Orcas Island, WA, December 1985, 15-24.
....added to the system may take too long to be displayed. This delay will surface in the application as a failure to meet client specifications. A further issue is that most distributed file systems are built around data similar to those presented in studies by Satyanarayanan [19] and Ousterhout [15]. These studies show that most file system access patterns contain many more reads than writes. This may not be true in the case of dynamic data. Data in an emergency management application may be written with greater frequency. This has implications for the effectiveness of caching. If the number ....
J. K. Ousterhout et al., "A Trace-Driven Analysis of the Unix 4.2 BSD File System," Communications of the ACM, 1985.
....performance. Although larger main memory caches will doubtless reduce the need for synchronous disk READs, they do not eliminate it; indeed, [Braunstein89] and [Baker91] suggest that the benefits of large buffer caches are more difficult to achieve than was predicted by earlier studies such as [Ousterhout85a]. The third approach, and the subject of this paper, is to reorganize the layout of data on disk as a function of the observed workload to minimize the time spent moving the head between data blocks of interest. The paper is organized as follows. The next section introduces previous work in this ....
....can be done by moving only 6-24% of the data, less than half that required to move all the active quanta. 5.6 What are the best dependency algorithms? Assuming that disk accesses are independent is not a good model of real system behavior. For example, many files are accessed sequentially [Ousterhout85a], and this access pattern is often reflected in the low-level disk traffic despite the effects of file buffer caches. Taking account of these inter-quanta dependencies in access patterns can improve the overall effectiveness of disk shuffling. With the smallest shuffling quanta (blocks) the ....
John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson. "A trace-driven analysis of the UNIX 4.2 BSD file system." Proceedings of 10th ACM Symposium on Operating Systems Principles (Orcas Island, Washington). Published as Operating Systems Review 19(5):15--24, December 1985.
....exacerbate these difficulties, especially when the network is composed of a large number of heterogeneous machines. As a result of these difficulties, only a relatively small number of traces of Unix filesystem workloads have been conducted, primarily in computing research environments. [3], [4] and [5] are examples of such traces. Since distributed filesystems work by transmitting their activity over a network, it would seem reasonable to obtain traces of such systems by placing a tap on the network and collecting trace data based on the network traffic. Ethernet [6] based ....
Ousterhout J., et al. "A Trace-Driven Analysis of the Unix 4.2 BSD File System." Proc. 10th ACM Symp. on Operating Systems Principles, 1985.
....Several researchers have looked at file systems in the past; these systems include both disk-only and hierarchical storage systems. However, the nature of computing changes continually, therefore, storage systems need periodic, if not constant, study. In the 1980s, both Smith [21] and Ousterhout [16] made detailed studies of file activity on computing systems. While their observations are still useful, some of the underlying structure has lost relevance. For example, Smith primarily observed text-based user files for thirteen months; the size and nature of today's multimedia files, unforeseen ....
....in Table 1. An additional advantage of using file differences is that the file system simulator runs faster because every entry in a difference file corresponds to a required action by the simulator. Our statistics collection and analysis package for file systems has some weaknesses. Ousterhout [16] noted that 80% of all file creations have a lifetime of less than three minutes. Because the daemons, compilers and other programs that created these files during Ousterhout's work still exist, so do the temporary files they create. Our system traces File Activity Statistics Collected Accesses, ....
John K. Ousterhout, Herve Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System." Operating System Review 19(5), Proceedings of the 10th ACM Symposium on Operating Systems Principles, 1985, pp. 15-24
....can be applied to integrated file compression techniques. system will be used next. The key to this new method is exploiting the reference locality characteristic of an individual file's access patterns. This locality of reference attribute has been observed previously by many researchers [3, 6, 25, 26, 27, 28]. By exploiting reference locality, large numbers of files can be identified for movement to tertiary storage with minimal user impact. This paper is organized into an introduction and four other sections. In Section 2 we describe the environments we collected data from, how we analyzed the data, ....
....to be used again; files which have never been used are deleted more quickly than files that have been used; and the likelihood of a file being used decreases as the time since its last use increases. Our file system collection and analysis package has some weaknesses. Ousterhout and others [3, 25, 31] noted that 80% of all file creations have a lifetime of less than three minutes. However, our system is designed to gather and analyze long-term disk use and file system activity; temporary files that exist for less than three minutes will not be moved to long-term storage and do not contribute ....
John K. Ousterhout, Herve Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Operating System Review 19(5), Proceedings of the 10th ACM Symposium on Operating Systems Principles, 1985, pages 15--24.
....activity from either the file name view, or from the operating system's underlying numeric index. This comparison is done in Section 7. We summarize our findings in Section 8 and briefly discuss our future research in Section 9. 2 Related Research In the 1980s, both Smith [20] and Ousterhout [15] made detailed studies of file activity on computing systems. While their observations are still useful, some of the underlying structure has lost relevance. For example, Smith primarily observed text-based user files for thirteen months; the size and nature of today's multimedia files, unforeseen ....
....the analysis program that generates statistics on the items shown in Table 1. An additional advantage of using file differences is the analysis tool runs faster because of the reduced amount of I/O. Our statistics collection and analysis package for file systems has some weaknesses. Ousterhout [15] noted that 80% of all file creations have a lifetime of less than three minutes. Because the daemons, compilers and other programs that created these files during Ousterhout's work still exist, we miss many of the temporary files they create. However, our collection system is designed to gather ....
John K. Ousterhout, Herve Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System." Operating System Review 19(5), Proceedings of the 10th ACM Symposium on Operating Systems Principles, 1985, pp. 15-24
....process creates client requests using an exponential distribution for request interarrival times. The client requests are differentiated according to a read-to-write ratio. In each of the following figures, this ratio has been conservatively estimated to be 4:1, motivated by the Berkeley study [16] and by our belief that for continuous media read access will strongly dominate write access. In our simulation of Swift, to read, a small request packet is multicast to the storage agents. The client then waits for the data to be transmitted by the storage agents. A write request transmits the ....
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A trace-driven analysis of the UNIX 4.2 BSD file system," in Proceedings of the 10th Symposium on Operating System Principles, (Orcas Island, Washington), pp. 15--24, ACM, Dec. 1985.
....It also supports a file versioning mechanism similar to that found in the VAX/VMS operating system. 1.5.1.2 File access placement studies. Three comprehensive file system activity traces have been performed, one in a commercial mainframe setting [Smi81] and two in university Unix environments [OCH85, Flo86b]. These traces form the basis for a number of analyses [Smi81, OCH85, MB87, Kur88, Flo86b, Flo86a]. These studies generally showed that a few files receive a large portion of file accesses; most files accessed are small, and read sequentially in their entirety. A significant amount of working set ....
....in the VAX/VMS operating system. 1.5.1.2 File access placement studies. Three comprehensive file system activity traces have been performed, one in a commercial mainframe setting [Smi81] and two in university Unix environments [OCH85, Flo86b]. These traces form the basis for a number of analyses [Smi81, OCH85, MB87, Kur88, Flo86b, Flo86a]. These studies generally showed that a few files receive a large portion of file accesses; most files accessed are small, and read sequentially in their entirety. A significant amount of working-set-type locality was observed: a reference to a file was frequently followed by another reference to ....
John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson. "A Trace-Driven Analysis of the Unix 4.2 BSD File System." Technical Report UCB/CSD 85/230, UCB, 1985.
....This need was recognized long ago [20,1] and in several fields workload data was indeed collected, analyzed, and modeled. Well-known examples are address traces used to analyze processor cache performance [45,48] and records of file system activity used to motivate the use of file caching [40]. Recently we are witnessing a large increase in such activity, with data being collected relating to LAN traffic [36], Web server loads [3], and video streams [34]. This new wave of collecting and analyzing data for use in evaluations is also present in the field of job scheduling on ....
J. K. Ousterhout, H. Da Costa, D. Harrison, J. A. Kunze, M. Kupfer, and J. G. Thompson, "A trace-driven analysis of the UNIX 4.2 BSD file system". In 10th Symp. Operating Systems Principles, pp. 15--24, Dec 1985.
....NFS system [9] and she also treated read calls as not changing the state. The semantics of Unix could be changed so that the atime is only updated when the file is closed, although the benefits of this are uncertain due to the large percentage of whole file reads in typical Unix systems [10, 11]. Locus, a distributed operating system based on Unix, did not retain the atime field at all, so this seems to be an area where a semantic change of almost no impact could make Unix substantially more amenable to fault tolerance techniques. Data was collected on three machines: the first, CS, is ....
....flat curve for the Read predictor and an upward curve eventually tapering off due to full quantum usage for the Same Sender predictor. Read ratios in fact varied by a factor of two to four on each machine, but in general were consistent with the studies of the Unix file system by Ousterhout et al. [11] and more recently by Baker et al. [10], which measured the fraction of bytes read out of the total bytes transferred by the file system on time-shared systems. The expected curve for Same Sender was only observed on CS, presumably because of the additional data at higher load averages. We would ....
J. K. Ousterhout, H. D. Costa, D. Harrison, J. A. Kunze, M. Kupfer, and J. G. Thompson, "A trace-driven analysis of the UNIX 4.2 BSD file system," in Proceedings of the Tenth Symposium on Operating Systems Principles, Association for Computing Machinery, 1985.
....the distributed file system. Thus, one machine may act as both an intermediate server offering files out of its cache to other local clients and as a client. A delayed write policy offers the ability to avoid writing data that are subsequently overwritten or deleted (a frequent occurrence [19, 20]) which could substantially benefit low speed network users. Writes that cannot be avoided could be delayed until a period when traffic over the slow link is low. Delayed write caching may be useful as an option that can be manually turned on and off. For example, if a user wishes to build a ....
J. Ousterhout, H.L. DaCosta, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the Unix 4.2 BSD File System," in Proc. of the 10th ACM Symp. on Operating System Principles, Orcas Island (December, 1985).
....in this thesis is the use of synthetic programs. The main advantage of using trace data to generate workloads is an accurate representation of an actual workload. An example in which trace data of usage in a file system were used to perform experiments on file system cache sizes is found in [19]. The disadvantages of using trace data include large data files and difficulty in using the data on different machines or in modifying the data to alter the workload. These problems are not characteristic of benchmarks. Benchmarks are relatively short programs or scripts that are designed to ....
J. Ousterhout et al., "A trace-driven analysis of the unix 4.2 bsd file system," in Proceedings of the 10th ACM Symposium on Operating System Principles, pp. 15--24, 1985.
....has been placed at the edge of the disk. More recently, similar results have been shown for optical storage media [Ford 91]. In practice, data references are not drawn from a fixed distribution, nor are they independent. Although references are highly skewed [Floyd 89, Staelin 91, Vongsath 90, Ouster 85], request distributions change over time, and they are generally not known in advance. Nevertheless, variations of the organ-pipe heuristic seem to work well in practice. Recently, several papers have proposed adaptive applications of data clustering based on this idea. Vongsathorn and Carson ....
Ousterhout, John K., et al., "A Trace Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th ACM Symposium on Operating System Principles, 1985.
....long-term storage and migration simulator. 2. Previous Work Researchers have looked at disk or tape storage systems in the past. However, the nature of computing changes continually, so storage systems need periodic, if not constant, study. In the 1980s, both Smith [5] and Ousterhout, et al. [6] made detailed studies of file activity on computing systems. While their observations are still useful, some of the underlying structure has lost relevance. For example, Smith primarily observed text-based user files for thirteen months; the size and nature of today's multimedia files, unforeseen ....
....in the system. The first runs the file tracing program nightly as a cron routine and maintains an activity log; the second automates running the differencing program on trace files. Our statistics collection and analysis package for file systems has both strengths and weaknesses. Ousterhout [6] noted that 80% of all file creations have a lifetime of less than three minutes. Because the daemons, compilers and other programs that created these files during Ousterhout's work still exist, so do the temporary files they create. Our system traces are collected nightly at 10 PM when few users ....
John K. Ousterhout, Herve Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System." Operating System Review 19(5), Proceedings of the 10th ACM Symposium on Operating Systems Principles (1985) pp. 15-24.
....requests. Thus, with 10 disks and track-level striping, large requests have been exactly 400 KB and individual requests have been exactly 40 KB. However, in real-world systems, large requests are often much larger than 400 KB [Bucher 80] and small requests are often much smaller than 40 KB [Ousterhout 85, Anon 85]. Also, applications rarely issue requests that are all the same size. We therefore make two changes to the request size distribution: 1) We no longer restrict the workloads to one particular size. Rather, we use a distribution of request sizes. For large requests, we generate request ....
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," ACM Operating Systems Review, Vol. 19, No. 5, Proceedings of the 10th ACM Symposium on Operating System Principles, Dec. 1-4, 1985.
....short, there is ample motivation to propagate updates aggressively. On the other hand, delaying replay offers an opportunity for optimizing the log. Ousterhout reports that most UNIX files have a lifetime under three minutes and that 30-40% of modified file data is overwritten within three minutes [17]. Using our optimizer [7], we find it typical for 70% of the operations in a large log to be eliminated. In fact, the larger the log, the greater the fraction of operations eliminated by the optimizer. It is clear that delaying log replay can help reduce the ....
....does not immediately propagate changes, other users cannot see modified data. Furthermore, conflicts may occur if partially connected users modify the same file. In our experience, these conflicts are rare; a substantial body of research concurs by showing that this kind of file sharing is rare [2, 11, 17]. If stronger guarantees are needed, they might be provided by server enhancements. For example, an enhanced consistency protocol might inform servers that dirty data is cached at a client; when another client requests the data, the server can demand the dirty ....
J. Ousterhout, H.L. DaCosta, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the Unix 4.2 BSD File System," in Proc. of the 10th ACM SOSP, Orcas Island, WA (December 1985).
....not increase cache hit ratios as much as one might expect. For example, the Sprite group's 1985 caching study led them to predict higher hit ratios for larger caches. But, in 1991, when larger caches were installed, hit ratios were not much changed; files had grown just as fast as the caches [Ousterhout85, Baker91]. The problem is especially acute for the growing class of I/O-intensive applications. Examples include: text search, 3D scientific visualization, relational database queries, speech recognition, and computational chemistry. For most of these, the amount of data processed is large relative to file ....
Ousterhout, J.K., Da Costa, H., Harrison, D., Kunze, J.A., Kupfer, M., and Thompson, J.G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proc. of the 10th Symp. on Operating System Principles, Orcas Island, WA, December 1985, pp. 15-24.
....the choice for whole-file transfer. As a consequence, processors can only operate on files that fit in their physical memory. This affects the way in which we store data structures on files, and how we assign processors to applications. Since most files (about 75%) are accessed in entirety [4], whole-file transfer optimizes overall scaling and performance, as has also been reported in other systems that do whole-file transfer, such as in the Andrew ITC file system [5]. Another design choice, which is closely linked to keeping files contiguous, is to make all files immutable. That is, ....
Ousterhout, J. K., Costa, H. Da, Harrison, D., Kunze, J. A., Kupfer, M., and Thompson, J. G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proc. of the 10th Symp. on Operating Systems Principles, pp. 15-24, Orcas Island, WA (December 1985).
....This need was recognized long ago [25,1] and in several fields workload data was indeed collected, analyzed, and modeled. Well-known examples are address traces used to analyze processor cache performance [56,59] and records of file system activity used to motivate the use of file caching [48]. Recently we are witnessing a large increase in such activity, with data being collected relating to LAN traffic [44], web server loads [3], and video streams [42]. This new wave of collecting and analyzing data for use in evaluations is also present in the field of job scheduling on ....
J. K. Ousterhout, H. Da Costa, D. Harrison, J. A. Kunze, M. Kupfer, and J. G. Thompson, "A trace-driven analysis of the UNIX 4.2 BSD file system". In 10th Symp. Operating Systems Principles, pp. 15--24, Dec 1985.
....Relying on caching to satisfy the data throughput needs of such high-performance clients would require cache miss rates to decrease proportionately. Unfortunately, increasing computation sizes, file sizes, and workgroup sharing are all blocking the needed decrease in miss rates [Ousterhout85, Baker91], while increased client cache sizes are making those misses more bursty. Thus, if client performance is to improve, the performance of distributed file systems, while servicing client cache misses, must also improve. This is the argument that led storage subsystem designers to develop disk ....
Ousterhout, J.K. et al., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", 10th SOSP, Dec. 1985.
....workload studies as characterizing general-purpose workstations or workstation networks, scientific vector applications, or scientific parallel applications. General-purpose workstations. Uniprocessor file access patterns have been measured many times. Floyd and Ellis [3, 4] and Ousterhout et al. [5] measured isolated Unix workstations, and Baker et al. measured a distributed Unix system (Sprite) [6]. Ramakrishnan et al. [7] studied access patterns in a commercial computing environment on a VAX/VMS platform. These studies all cover general-purpose (engineering and office) workloads with ....
....parallel file systems must focus on providing low latency for small requests as well as high bandwidth for large requests. 4.4 Sequentiality. One common characteristic of previous file system workload studies, particularly of scientific workloads, is that files are typically accessed sequentially [5, 6, 10]. We define a sequential request to be one that begins at a higher file offset than the point where the previous request from that compute node ended. This is a looser definition of sequential than is used in the studies referred to above. What previous studies have called sequential, we call ....
John Ousterhout, Hervé Da Costa, David Harrison, John Kunze, Mike Kupfer, and James Thompson, "A trace driven analysis of the UNIX 4.2 BSD file system", in Proceedings of the Tenth ACM Symposium on Operating Systems Principles, Dec. 1985, pp. 15--24.
....order to maintain a running time. The accuracy of the synthesized clock is measured. It varies dramatically from one system to another. • measure file sizes in your system and propose an appropriate file block size. This is basically another go at the Satyanarayanan study, and sometimes one of [4, 5, 6] is assigned as related reading. • simulate multiprocessor scheduling. A comparison of scheduling n servers with either 1 queue or n queues. A range of loads is simulated; the students are surprised to discover how difficult simulation turns out to be. The simulation leads well into a lecture ....
J. K. Ousterhout, H. D. Costa, D. Harrison, J. A. Kunze, M. D. Kupfer, and J. G. Thompson, "A trace-driven analysis of the unix 4.2 BSD file system," in Proceedings of the 10th Symposium on Operating System Principles, December 1985.
....number in two successive snapshots, but different file sizes and change times, the file must have been modified between the two snapshots. It is impossible to infer, however, whether the file was truncated to zero length and rewritten, was appended, or was partially truncated. Previous studies [2][8] have demonstrated that files are almost always written in their entirety at the time the file is created. Thus, when a file has been modified, I assume that the file was completely truncated and rewritten at the time indicated by the inode's change time. Another shortcoming of the workload ....
....that the file was completely truncated and rewritten at the time indicated by the inode's change time. Another shortcoming of the workload generated from these snapshots is that it only captures operations on files that survive across at least one snapshot. Trace based file system studies [2][8] have shown that most files live for less than the twenty four hours between successive snapshots. To approximate the additional file creations and deletions generated by these short lived files, I used a two week trace of NFS requests to a Network Appliance FAServer (attic). This data, which was ....
Ousterhout, J., Costa, H., Harrison, D., Kunze, J., Kupfer M., Thompson, J., "A Trace-Driven Analysis of the UNIX 4.2BSD File System," Proceedings of the Tenth Symposium on Operating System Principles, Orcas Island, WA, December 1985, pp. 15--24.
....client caching to reduce this server load. For example, AFS clients use local disk to cache a subset of the global system's files. While client caching is essential for high performance, increasing file sizes, computation sizes, and workgroup sharing are all inducing more misses per cache block [Ousterhout85, Baker91]. At the same time, increased client cache sizes are making these misses more bursty. When the post client cache server load is still too large, it can either be distributed over multiple servers or satisfied by a custom-designed high-end file server. Multiple server distributed file systems ....
Ousterhout, J.K. et al., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", 10th SOSP, Dec. 1985.
....the programmer to reassemble logical records from multiple sources. The most appropriate distribution strategy for parallel files will ultimately depend on the role that files assume in parallel applications. Unfortunately, the information that is currently available about file usage patterns [10, 11, 12] in uniprocessor systems does not necessarily apply to the multiprocessor environment. Preliminary experience allows us to make some educated guesses about what to expect. The principal role of sequential file systems, 1 Blocks on the Connection Machine are called chunks and contain 64 bits of ....
J. Ousterhout, H. DaCosta, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A trace driven analysis of the UNIX 4.2 BSD file system," Proceedings of 10th Symposium on Operating Systems Principles, Operating Systems Review, vol. 19, pp. 15--24, December 1985.
....the distributed file system. Thus, one machine may act as both an intermediate server offering files out of its cache to other local clients and as a client. A delayed write policy offers the ability to avoid writing data that are subsequently overwritten or deleted (a frequent occurrence [23, 26]) which could substantially benefit low speed network users. Writes that cannot be avoided could be delayed until a time when traffic over the slow link is low. Delayed write caching may be useful as an option that can be turned on and off by hand. For ....
J. Ousterhout, H.L. DaCosta, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the Unix 4.2 BSD File System," in Proc. of the 10th ACM Symp. on Operating System Principles, Orcas Island (December, 1985).
....cache consistency is needed to prevent stale data errors, but that it is not invoked often enough to degrade overall system performance. 1. Introduction. In 1985 a group of researchers at the University of California at Berkeley performed a trace driven analysis of the UNIX 4.2 BSD file system [11]. That study, which we call the BSD study, showed that average file access rates were only a few hundred bytes per second per user for engineering and office applications, and that many files had lifetimes of only a few seconds. It also reinforced commonly held beliefs that file accesses tend to ....
.... to deduce the exact range of bytes accessed, but it introduced a small amount of uncertainty in times (actual reads and writes could have occurred at any time between the surrounding open close reposition events). Ousterhout et al. have a more complete discussion of the tracing approach [11]. One of the most difficult tasks in tracing a network of workstations is coordinating the traces from many machines. Our task was greatly simplified because most of the information we wished to trace was available on the Sprite file servers. Several key file system operations, such as file ....
[Article contains additional citation context not shown here]
Ousterhout, J. K., Da Costa, H., Harrison, D., Kunze, J. A., Kupfer, M. and Thompson, J. G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", Proceedings of the 10th Symposium on Operating System Principles, Orcas Island, WA, December 1985, 15-24.
No context found.
Ousterhout, J. K., Da Costa, H., Harrison, D., Kunze, J. A., Kupfer, M., and Thompson, J. G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th Symposium on Operating System Principles, pp. 15-24, December 1985.
No context found.
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," ACM 10th Symposium on Operating Systems Principles, pp. 15-24, 1985.
No context found.
J.K. Ousterhout, H. Da Costa, D. Harrison, J.A. Kunze, M. Kupfer, J.G. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th Symposium on Operating Systems Principles (SOSP), Orcas Island, WA, December, 1985, pp. 15-24.
No context found.
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A trace driven analysis of the UNIX 4.2 BSD file system", Proceedings of the 10th ACM Symposium on Operating System Principles, pp. 15-24, December 1985.
No context found.
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", ACM Symposium on Operating System Principles, 1985, pp. 15--24.
No context found.
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," in Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), pages 15--24, Dec. 1985.
No context found.
John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson, `A trace-driven analysis of the UNIX 4.2 BSD file system', Proceedings of the Tenth Symposium on Operating Systems Principles. ACM, December 1985, pp. 15--24.
No context found.
John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson, `A trace-driven analysis of the UNIX 4.2 BSD file system', Proceedings of the Tenth Symposium on Operating Systems Principles. ACM, December 1985, pp. 15--24.
No context found.
Ousterhout, J.K., H.D. Costa, D. Harrison, J.A. Kunze, M. Kupfer, and J.G. Thompson. (1985). "A Trace Driven Analysis of the Unix 4.2BSD File System." Technical Report, Department of Computer Science, University of California at Berkeley.
No context found.
Ousterhout, J. K., Da Costa, H., Harrison, D., Kunze, J. A., Kupfer, M. and Thompson, J. G., "A Trace-Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th ACM Symposium on Operating System Principles, December 1985, pp. 15-24.
No context found.
J. Ousterhout, H. Da Costa, D. Harrison, J. Kunze, M. Kupfer, J. Thompson, "A Trace-Driven Analysis of the UNIX 4.2 BSD File System", ACM Symposium on Operating System Principles, 1985, pp. 15--24.
No context found.
John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson. "A Trace-Driven Analysis of the Unix 4.2 BSD File System." Technical Report UCB/CSD 85/230, UCB, 1985.
No context found.
Ousterhout, John K., et al, "A Trace Driven Analysis of the UNIX 4.2 BSD File System," Proceedings of the 10th ACM Symposium on Operating System Principles, 1985.