Smirni, E. and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. In Performance Evaluation: An International Journal, Volume 33, Number 1, pages 27--44, June, 1998.
....and represents a significant obstacle to achieving good performance. The problem is often not with the hardware; many parallel I/O subsystems offer excellent performance. Rather, the problem arises from other factors, primarily the I/O patterns exhibited by many parallel scientific applications [9, 18, 2, 3, 22, 24, 25, 28]. In particular, each processor tends to make a large number of small I/O requests, incurring the high cost of I/O on each such request. One reason for this access pattern is that parallel scientific codes frequently involve large arrays distributed across the processor's local memory. After a ....
Smirni, E. and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. In Performance Evaluation: An International Journal, Volume 33, Number 1, pages 27--44, June, 1998.
....derived datatypes [19]. Noncontiguous locations in memory can be specified by using a derived datatype in the read/write call. The ability of users to specify noncontiguous accesses in a single function call is very important, because noncontiguous accesses are very common in parallel applications [1, 4, 21, 26, 27, 32]. Most file systems, however, do not provide functions for noncontiguous I/O. The Unix functions readv/writev are widely supported, but they allow noncontiguity only in memory and not in the file. Noncontiguous memory accesses are not as commonly needed in parallel applications as noncontiguous ....
....as the default and provide a facility for users to vary these parameters on a per-file basis. 9. Variable Caching/Prefetching Policies. Parallel applications exhibit such a wide variation in access patterns that any one caching/prefetching policy is unlikely to perform well for all applications [27]. The file system must therefore either detect and automatically adapt to changing access patterns [16, 17] or provide an interface for the user to specify the access pattern or caching/prefetching policy [2, 22]. 10. File Preallocation. It is easy and inexpensive for a file system to provide a ....
E. Smirni and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
....different parallel machines: HP Exemplar, IBM SP, Intel Paragon, NEC SX-4, and SGI Origin2000. 1 Introduction Numerous studies of the I/O characteristics of parallel applications have shown that many applications need to access a large number of small, noncontiguous pieces of data from a file [1, 2, 7, 9, 10]. For good I/O performance, however, the size of an I/O request must be large (on the order of megabytes). The I/O performance suffers considerably if applications access data by making many small I/O requests. Such is the case when parallel applications perform I/O by using the Unix read and ....
E. Smirni and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
....programs, it is not sufficient for the kinds of access patterns common in parallel programs. Many studies of the I/O access patterns in parallel programs have shown that each process of a parallel program may need to access several relatively small, noncontiguous pieces of data from a file [3, 12, 36, 50, 51, 56]. In addition, many/all processes may need to access the file at about the same time, and, although the accesses of each process may 2 Unix does have functions readv and writev, but they allow noncontiguity only in memory and not in the file. POSIX has a function lio_listio that allows users to ....
....only large, contiguous pieces of data, level 0 is equivalent to level 2, and level 1 is equivalent to level 3. Users need not create derived datatypes in such cases, as level 0 requests themselves will likely perform well. Many real parallel applications, however, do not fall into this category [3, 12, 36, 50, 51, 56]. We note that the MPI standard does not require an implementation to perform any of these optimizations. Nevertheless, even if an implementation does not perform any optimization and instead translates level 3 requests into several level 0 requests to the file system, the performance would be no ....
E. Smirni and D. A. Reed. Lessons from characterizing the input/output behavior of parallel scientific applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
....and a sequence of consecutive phases that are statistically identical are defined as a working set. The execution behavior of an I/O-bound program is therefore comprised as a sequence of I/O working sets. This general model of program behavior is consistent with results from measurement studies [26, 27]. The time duration of the I/O burst was equal to 100 ms in average. The ratio of the I/O working set used in simulations was 1:1, that is, for a burst of 100 ms of I/O there was a burst of 100 ms of computation in average. Observe that I/O requests from different jobs to the same disk are queued ....
E. Smirni and D. A. Reed. Lessons from characterizing the input/output behavior of parallel scientific applications. Performance Evaluation, 33:27--44, 1998.
....and a sequence of consecutive phases that are statistically identical are defined as a working set. The execution behavior of an I/O-bound program is therefore comprised as a sequence of I/O working sets. This general model of program behavior is consistent with results from measurement studies [27, 28]. The time duration of the I/O burst was equal to 100 ms in average. The ratio of the I/O working set used in simulations was 1:1, that is, for a burst of 100 ms of I/O there was a burst of 100 ms of computation in average. Observe that I/O requests from different jobs to the same disk are queued ....
E. Smirni and D. A. Reed. Lessons from characterizing the input/output behavior of parallel scientific applications. Performance Evaluation, 33:27--44, 1998.
....Energy, the National Aeronautics and Space Administration, and the National Science Foundation. 1 Introduction Numerous studies of the I/O characteristics of parallel applications have shown that many applications need to access a large number of small, noncontiguous pieces of data from a file [1, 3, 9, 11, 12, 16]. For good I/O performance, however, the size of an I/O request must be large (on the order of megabytes). The I/O performance suffers considerably, on the other hand, if applications access data by making many small I/O requests. Such is the case when applications perform I/O by using the Unix ....
E. Smirni and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
....(on the order of kilobytes or even less). The small I/O requests made by parallel programs are a result of the combination of two factors: 1. In many parallel applications, each process needs to access a large number of relatively small pieces of data that are not contiguously located in the file [1, 6, 20, 29, 30, 35]. 2. Most parallel file systems have a Unix-like API (application programming interface) that allows a user to access only a single, contiguous chunk of data at a time from a file. 1 Noncontiguous data sets must therefore be accessed by making separate function calls to access each individual ....
....Figure 5: Detailed code for the distributed array example of Figure 2 using a level 3 request cases, as level 0 requests themselves will likely perform well. Most real parallel applications, however, do not fall into this category. Several studies of I/O access patterns in parallel applications [1, 6, 20, 29, 30, 35] have shown that each process in a parallel program may need to access a number of relatively small, noncontiguous portions of a file. From a performance perspective, it is critical that the I/O interface can express such an access pattern, as it enables the implementation to optimize the I/O ....
E. Smirni and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
....derived datatypes [17]. Noncontiguous locations in memory can be specified by using a derived datatype in the read/write call. The ability of users to specify noncontiguous accesses in a single function call is very important, because noncontiguous accesses are very common in parallel applications [1, 4, 20, 25, 26, 31]. Most file systems, however, do not provide functions for noncontiguous I/O. The Unix functions readv/writev are widely supported, but they allow noncontiguity only in memory and not in the file. Noncontiguous memory accesses are not as commonly needed in parallel applications as noncontiguous ....
....as the default and provide a facility for users to vary these parameters on a per-file basis. 9. Variable Caching/Prefetching Policies. Parallel applications exhibit such a wide variation in access patterns that any one caching/prefetching policy is unlikely to perform well for all applications [26]. The file system must therefore either detect and automatically adapt to changing access patterns [15, 16] or provide an interface for the user to specify the access pattern or caching/prefetching policy [2, 21]. 10. File Preallocation. It is easy and inexpensive for a file system to provide a ....
E. Smirni and D. Reed. Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications. Performance Evaluation: An International Journal, 33(1):27--44, June 1998.
.... the underlying file system [1, 18]. Other studies concentrated on workload measurements at supercomputing centers without distinguishing among applications but provide information about how parallel I/O systems are used by the majority of applications [10]. Finally, a series of inductive studies [3, 19, 20] concentrated on the requirements of individual applications as well as their temporal and spatial access patterns. In this paper, we present a workload characterization study that applies a detailed model of the I/O requirements of parallel scientific applications and proposes a functional form ....
....and use this characterization to design and evaluate resource management policies for parallel file systems. The selected codes are representative of the I/O behavior and requirements that have been identified as characteristic of scientific applications in recent characterization studies [18, 20]. As an example of scientific application behavior, Figure 1 illustrates the I/O activity across time of four applications: an implementation of the Schwinger Multichannel method for calculating low-energy electron-molecule collisions (ESCAT) a) two different Fortran implementations of electronic ....
[Article contains additional citation context not shown here]
E. Smirni and D. A. Reed. Lessons from characterizing the input/output behavior of parallel scientific applications. Performance Evaluation, 33:27--44, 1998.