Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O requirements of scientific applications: An evolutionary view. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, pages 576--594. IEEE Computer Society Press and Wiley, New York, NY, 2001.
.... to vary from zero bytes to 8 KB [CHKM96]. Detailed studies of three scalable, I/O-intensive scientific applications (electron scattering, terrain rendering, and quantum chemistry) show tremendous variations in I/O workload parameters such as I/O request sizes and the total I/O volume [CACR95, SACR96]. It is clear that more work remains to be done in understanding the I/O characteristics of applications. Some efforts, such as the CHARISMA project [KN95], studied production scientific application workloads. So far, however, there has been relatively little attention paid to the detailed I/O ....
Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59, Syracuse, NY, 1996. IEEE Computer Society Press.
....Parallel applications are in many ways the most similar to pipelined batch applications. The CPU, memory, communication, and I/O behavior of parallel and vector applications have been quantified in a number of studies [9, 37, 36], but the most relevant studies consider the impact of explicit I/O [30, 32, 23, 8, 1]. Our study embellishes these works by studying the sharing behavior of an important new class of workload. Many of these studies demonstrate the drastic differences in I/O behavior for parallel applications compared to general-purpose workloads. For example, parallel scientific workloads often ....
E. Smirni, R. A. Aydt, A. A. Chien, and D. A. Reed. I/O requirements of scientific applications: An evolutionary view. In H. Jin, T. Cortes, and R. Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, pages 576--594. IEEE Computer Society Press, New York, NY, 2001.
.... or even less). These small requests occur for the following reasons: • In many parallel applications (for example, those that access distributed arrays from files), each process needs to access a large number of relatively small pieces of data that are not contiguously located in the file [1, 3, 11, 19, 18, 24]. • Most parallel file systems have a Unix-like API (application programming interface) that allows a user to access only a single, contiguous chunk of data at a time from a file. Noncontiguous data sets must therefore be accessed by making separate function calls to access each individual ....
....can be defined by using any MPI basic or derived datatype; therefore, any general noncontiguous access pattern can be compactly represented. Several studies have shown that, in many parallel applications, each process needs to access a number of relatively small, noncontiguous portions of a file [1, 3, 11, 19, 24]. From a performance perspective, it is critical that the I/O interface can express such an access pattern, as it enables the implementation to optimize the I/O request. The optimizations typically allow the physical I/O to take place in large, contiguous chunks, even though the user's request may ....
[Article contains additional citation context not shown here]
Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....and represents a significant obstacle to achieving good performance. The problem is often not with the hardware; many parallel I/O subsystems offer excellent performance. Rather, the problem arises from other factors, primarily the I/O patterns exhibited by many parallel scientific applications [9, 18, 2, 3, 22, 24, 25, 28]. In particular, each processor tends to make a large number of small I/O requests, incurring the high cost of I/O on each such request. One reason for this access pattern is that parallel scientific codes frequently involve large arrays distributed across the processor's local memory. After a ....
Smirni, E., Aydt, R., Chien, A., and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59, IEEE Computer Society Press, 1996.
....and represents a significant obstacle to achieving good performance. The problem is often not with the hardware; many parallel I/O subsystems offer excellent performance. Rather, the problem arises from other factors, primarily the I/O patterns exhibited by many parallel scientific applications [9, 18, 2, 3, 22, 24, 25, 28]. In particular, each processor tends to make a large number of small I/O requests, incurring the high cost of I/O on each such request. One reason for this access pattern is that parallel scientific codes frequently involve large arrays distributed across the processor's local memory. After a ....
Smirni, E., Aydt, R., Chien, A., and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59, IEEE Computer Society Press, 1996.
....derived datatypes [19]. Noncontiguous locations in memory can be specified by using a derived datatype in the read/write call. The ability of users to specify noncontiguous accesses in a single function call is very important, because noncontiguous accesses are very common in parallel applications [1, 4, 21, 26, 27, 32]. Most file systems, however, do not provide functions for noncontiguous I/O. The Unix functions readv/writev are widely supported, but they allow noncontiguity only in memory and not in the file. Noncontiguous memory accesses are not as commonly needed in parallel applications as noncontiguous ....
E. Smirni, R. Aydt, A. Chien, and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....different parallel machines: HP Exemplar, IBM SP, Intel Paragon, NEC SX-4, and SGI Origin2000. 1 Introduction. Numerous studies of the I/O characteristics of parallel applications have shown that many applications need to access a large number of small, noncontiguous pieces of data from a file [1, 2, 7, 9, 10]. For good I/O performance, however, the size of an I/O request must be large (on the order of megabytes). The I/O performance suffers considerably if applications access data by making many small I/O requests. Such is the case when parallel applications perform I/O by using the Unix read and ....
E. Smirni, R. Aydt, A. Chien, and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....is not an issue in our experimental design. 5.2 Collective Input/Output Benchmark. Studies of application-level I/O access patterns reveal that iterative I/O-compute loops where processors collaborate to read or write and process multiple data sets over the course of execution are quite common [6, 17]. In our experiments, we consider a hypothetical application with a read-compute loop that can be restructured to read (and process) files of varying sizes, maintaining a fixed ratio of compute time to I/O volume, as illustrated in Figure 4. The compute time for ....
SMIRNI, E., AYDT, R. A., CHIEN, A. A., AND REED, D. A. I/O Requirements of Scientific Applications: An Evolutionary View. In Fifth International Symposium on High Performance Distributed Computing (1996), pp. 49--59.
....programs, it is not sufficient for the kinds of access patterns common in parallel programs. Many studies of the I/O access patterns in parallel programs have shown that each process of a parallel program may need to access several relatively small, noncontiguous pieces of data from a file [3, 12, 36, 50, 51, 56]. In addition, many/all processes may need to access the file at about the same time, and, although the accesses of each process may .... (Footnote 2: Unix does have functions readv and writev, but they allow noncontiguity only in memory and not in the file. POSIX has a function lio_listio that allows users to ....)
....only large, contiguous pieces of data, level 0 is equivalent to level 2, and level 1 is equivalent to level 3. Users need not create derived datatypes in such cases, as level 0 requests themselves will likely perform well. Many real parallel applications, however, do not fall into this category [3, 12, 36, 50, 51, 56]. We note that the MPI standard does not require an implementation to perform any of these optimizations. Nevertheless, even if an implementation does not perform any optimization and instead translates level 3 requests into several level 0 requests to the file system, the performance would be no ....
Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....Therefore, sequential supercomputer input/output systems are typically optimized for high throughput. Uniprocessor input/output, whether from scientific applications or engineering workstations, is quite regular. In contrast, multiprocessor input/output access patterns exhibit greater variation [18, 81], making it more difficult for file systems to optimize input/output. Furthermore, as processor speeds increase, it becomes computationally feasible to examine larger problems and new application areas, exacerbating the input/output bottleneck. This bottleneck seriously impacts an important class ....
....or simply unknown. Furthermore, input/output requirements are a complex function of the interaction between system software and executing applications and may change unpredictably during program execution. 1.2 Application Input/Output Requirements. Application-level characterization studies [18, 81] demonstrate that there is wide variability in input/output access patterns, and moreover, performance is extremely sensitive to these variations. Many of these characterization efforts are part of the Scalable Input/Output (SIO) Initiative, a multidisciplinary collaboration of the Caltech Center ....
[Article contains additional citation context not shown here]
Smirni, E., Aydt, R. A., Chien, A. A., and Reed, D. A. I/O Requirements of Scientific Applications: An Evolutionary View. In Fifth International Symposium on High Performance Distributed Computing (1996), pp. 49--59.
....Energy, the National Aeronautics and Space Administration, and the National Science Foundation. 1 Introduction. Numerous studies of the I/O characteristics of parallel applications have shown that many applications need to access a large number of small, noncontiguous pieces of data from a file [1, 3, 9, 11, 12, 16]. For good I/O performance, however, the size of an I/O request must be large (on the order of megabytes). The I/O performance suffers considerably, on the other hand, if applications access data by making many small I/O requests. Such is the case when applications perform I/O by using the Unix ....
E. Smirni, R. Aydt, A. Chien, and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....each process can make Unix I/O calls on its own, independent of other processes. However, as we explain below, using this interface often turns out to be very inefficient. The main reason is that the access patterns in parallel programs are quite different from those in uniprocess programs [21, 4, 2, 26]. In parallel programs, each process may need to access a noncontiguous data set. In many cases, the accesses of different processes may be interleaved in the file, and together they may span large, contiguous portions of the file. With the Unix I/O interface, the programmer has no means of ....
E. Smirni, R. Aydt, A. Chien, and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
....bit with the access patterns of data within each file. They found a huge variation in request sizes, amount of I/O, number of files, and so forth. Their primary conclusion is thus that file systems should be adaptable to different access patterns, preferably under control of the application. • [SACR96] They study two applications (electron scattering and computational fluid dynamics) over several versions, using Pablo to capture the I/O activity. They thus watch as application developers improve the applications' use of I/O modes and request sizes. Both applications move through three phases: ....
Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59, 1996.
....(on the order of kilobytes or even less). The small I/O requests made by parallel programs are a result of the combination of two factors: 1. In many parallel applications, each process needs to access a large number of relatively small pieces of data that are not contiguously located in the file [1, 6, 20, 29, 30, 35]. 2. Most parallel file systems have a Unix-like API (application programming interface) that allows a user to access only a single, contiguous chunk of data at a time from a file. Noncontiguous data sets must therefore be accessed by making separate function calls to access each individual ....
....[Figure 5: Detailed code for the distributed array example of Figure 2 using a level 3 request] cases, as level 0 requests themselves will likely perform well. Most real parallel applications, however, do not fall into this category. Several studies of I/O access patterns in parallel applications [1, 6, 20, 29, 30, 35] have shown that each process in a parallel program may need to access a number of relatively small, noncontiguous portions of a file. From a performance perspective, it is critical that the I/O interface can express such an access pattern, as it enables the implementation to optimize the I/O ....
E. Smirni, R. Aydt, A. Chien, and D. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, pages 49--59. IEEE Computer Society Press, 1996.
No context found.
Evgenia Smirni, Ruth A. Aydt, Andrew A. Chien, and Daniel A. Reed. I/O requirements of scientific applications: An evolutionary view. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, pages 576--594. IEEE Computer Society Press and Wiley, New York, NY, 2001.
No context found.
E. Smirni, R.A. Aydt, A.A. Chien, and D.A. Reed, I/O Requirements of Scientific Applications: An Evolutionary View, Proc. High Performance Distributed Computing, pp. 49--59, 1996.
....storage resources (i.e., disks) is critical for high performance. The potential benefits of such an integrated approach raise a number of issues that have not been addressed in traditional processor scheduling or I/O research. Currently we are engaged in research that builds on our prior work [35, 36, 37] in both processor and I/O scheduling. We propose coordinated resource management of both computational resources and secondary storage resources. We will exploit knowledge of the I/O characteristics of I/O-intensive parallel applications and propose new policies that incorporate this knowledge ....
E. Smirni, R.A. Aydt, A.A. Chien and D.A. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. Fifth IEEE International Symposium on High Performance Distributed Computing, Syracuse, New York, August 1996, pp. 49-59.
....under NASA Contract NAG 1 613. achieving high performance for applications with large input/output components. Moreover, most current parallel file systems were constructed as extensions of workstation file systems and optimized for large, sequential data transfers. Recent experimental studies [14, 2, 22, 23, 16] have shown that parallel applications have much more complex access patterns, with greater spatial and temporal variability, than first suspected. Although there is a large, complementary body of experimental data on disk behavior [19, 18] for sequential file systems, there is much less ....
....a small sample of the possibilities and attempting to extract more general patterns. Though the benchmarks and MESSKIT chemistry application we studied on the Intel Paragon XP/S are but a few samples from a large space of possible input/output patterns, earlier application characterization studies [2, 22, 23, 16, 14] suggest that our selections are representative of current practice. Although a wider range of experiments is desirable, the level of instrumentation and experiments we conducted required access to the operating system code and single-user time to load experimental operating system kernels. This ....
Smirni, E., and Reed, D. A. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High-Performance Distributed Computing (Aug. 1996), pp. 49--59.
....extraction, the Pablo toolkit supports both real-time reduction of input/output performance data and capture of detailed event traces. These two options trade computation perturbation for input/output perturbation. Extensive use of the Pablo toolkit for application input/output characterization [2, 21, 16] has shown that the instrumentation overhead is negligible for most application codes. 3.3 Physical I/O Instrumentation. Device drivers define the interface between file system services and input/output devices, isolating the idiosyncrasies of specific devices ....
....we noted earlier, one of the primary goals of the Scalable I/O (SIO) Initiative is analyzing the input/output patterns present in a large suite of scientific and engineering codes. These span a broad range of disciplines and have been the subject of several application characterization studies [2, 21, 16]. As an initial basis for integrated analysis of application and physical input/output analysis, we selected one code (MESSKIT) from the SIO suite. This code has been the subject of earlier application analysis [2] and is representative of the input/output patterns observed in parallel scientific ....
Smirni, E., Aydt, R. A., Chien, A. A., and Reed, D. A. I/O Requirements of Scientific Applications: An Evolutionary View. In High Performance Distributed Computing (1996), pp. 49--59.
....see [17]. 3.3 Classification and Policy Control. In earlier work we examined the utility of purely qualitative classifications using a neural-network-based classifier. Based on our ongoing characterization of scientific application input/output patterns as part of the Scalable I/O Initiative [3, 25, 23], we partitioned access patterns based on three broad features: read/write mix, sequentiality, and request size; see Table 1. At periodic intervals corresponding to some number of accesses or number of bytes accessed, the neural-network-based classifier produced qualitative classifications of ....
Smirni, E., Aydt, R. A., Chien, A. A., and Reed, D. A. I/O Requirements of Scientific Applications: An Evolutionary View. In Fifth International Symposium on High Performance Distributed Computing (1996), pp. 49--59.
....via global pointers from the sensor/actuator manager. By changing the mapping, one can apply the rule set using different sensors, choose different policies, or even control different systems. 8 Parallel Input/Output Example. Our recent characterization studies of parallel input/output patterns [3, 18, 24, 25, 23, 22] have shown that parallel applications exhibit a wide variety of input/output request patterns, with both very small and very large request sizes, sequential and nonsequential access, and a variety of temporal variations. Small input/output requests are best managed by aggregation, prefetching, ....
Evgenia Smirni and Daniel A. Reed. I/O Requirements of Scientific Applications: An Evolutionary View. In Proceedings of the Fifth IEEE International Symposium on High-Performance Distributed Computing, pages 49--59, August 1996.
No context found.
E. Smirni, R. A. Aydt, A. A. Chien, and D. A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing (HPDC), pages 49--59. IEEE, 1996.
No context found.
E. Smirni, R. A. Aydt, A. A. Chien, and D. A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing (HPDC), pages 49--59. IEEE, 1996.
No context found.
E. Smirni, R. A. Aydt, A. A. Chien, and D. A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing (HPDC), pages 49--59. IEEE, 1996.
No context found.
E. Smirni, R. A. Aydt, A. A. Chien, and D. A. Reed. I/O requirements of scientific applications: An evolutionary view. In Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing (HPDC), pages 49--59. IEEE, 1996.