N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....work together to manage the parallel streams. The example in Figure 3 shows the familiarity of operations when parallel streams are used; the only difference from standard C operations is a shape pointer argument added to fopen, which lets an input file specify the shape at run time. shape [2][4]S; int scalarvar; FILE:S outfile, errfile; int:S someint; with (S) outfile = fopen("temp", "w", S) fwrite(someint, sizeof(someint), 1, outfile) fclose(outfile) where (someint 0) errfile = fopen("debug", "a", S) fprintf(errfile, "Error, someint was %d\n", someint) ....
....general parallel file operations. We say files relying on this generality use Independent Buffering Mode (IB) because each VP's stream is managed separately. Many debugged, production-quality, data-parallel applications do not require such generality; they perform very regular file operations [1, 2, 5, 7, 8, 10]. When regular operations are performed, all VPs read or write the same amount of data at the same time. In such instances, a single value can act as the file pointer for all the VP streams. During a write, each VP copies data from its parallel variable into its file buffer. Under the assumption ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....includes [GMG 88, LC91, Gup92, HKT92]. There has been a lot of interest in developing runtime libraries for improving I/O performance of I/O intensive (not just out-of-core) parallel applications. Chameleon was the first runtime system which provided extensive support for parallel I/O [GGL93]. del Rosario et al. proposed a two-phase access strategy for efficient access of distributed arrays [dRBC93, BdRC93]. This strategy was later extended by Kotz to optimize disk accesses [Kot94]. The PASSION runtime system builds on [BdRC93] and provides runtime routines to access distributed ....
N. Galbreath, W. Gropp, and D. Levine. Applications-Driven Parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....it off. 1 Introduction If you are already familiar with latex, we don't have anything new to tell you about latex; but you still might enjoy looking at the lists below. If you are new to latex, you might appreciate our examples of how to cite papers [Fox90] and of how to cite multiple papers [Bord93, Maier93, Galbreath93]. Or, you might like to examine some examples of lists. 1.1 Numbered Lists Here's a numbered list: 1. If you haven't explored the Internet yet, it's time to get started. 2. The best way to do it is to get your sysadmin to install Mosaic (ftp the appropriate binary from ftp.ncsa.uiuc.edu) the ....
N. Galbreath, W. Gropp, and D. Levine, `Applications-Driven Parallel I/O,' Proceedings of Supercomputing '93, pages 462-471, 1993.
....4: C data structures for array i/o library. The original checkpoint facility of our flow solver is implemented so that each processor opens its own set of files to store the portion (chunks) of distributed arrays assigned to that processor. This is a common approach for implementing checkpoints [Galbreath93]. The advantages of this approach are the simple coding that it requires, and its reasonable performance. The first disadvantage of this approach is the large number of files that are created for checkpoints when many processors and/or arrays are involved in the computation. For example, when ....
....and expect our library will be layered above file systems like Vesta. [Bordawekar93] describes run-time primitives to support a two-phase access strategy for conducting parallel i/o. Such a facility is useful, although it is not needed so far in the write-intensive applications we studied. [Galbreath93] reports on experiences with parallel applications at Argonne National Laboratory. Like ours, and in contrast to many of the efforts discussed above, their work emphasizes the value of abstractions. The interfaces we propose are at a more abstract level than those of [Galbreath93]. We decouple the ....
[Article contains additional citation context not shown here]
N. Galbreath, W. Gropp, and D. Levine, Applications-Driven Parallel I/O, Proceedings of Supercomputing '93, pages 462-471, 1993.
....programmer's interfaces (APIs) for describing user-level access patterns. Such interfaces allow users to directly read or write (subsections of) data structures like matrices. Examples of such libraries include PASSION [CBD 95], PANDA [SCJ 95], Jovian [BBS 94], and Chameleon [GGL93]. Though these APIs are sufficiently expressive, they are dependent on the high-level programming paradigm. For example, a majority of these libraries can only express regular distributed matrix computations. In their present form, these interfaces cannot express irregular matrix operations ....
N. Galbreath, W. Gropp, and D. Levine. Applications-Driven Parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....irregular distributions of data, and shows that disk-directed i/o is never slower in the test cases. [Kotz95b] uses this approach to implement an out-of-core LU decomposition problem and shows that it is much better than the traditional caching scheme. Among other approaches to collective i/o, [Galbreath93] buffers several i/o requests before issuing disk i/o requests. In this approach, each processor contributes its requests to a buffer, then the buffers are gathered together and written into or read from by a master processor. [Bennett94] gives a strategy to minimize the number of i/o requests ....
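The buffering scheme that the snippet above attributes to [Galbreath93] can be pictured with a small sketch. It is only an illustration under stated assumptions: the processors are simulated as plain Python lists, and the names `gather_and_write` and `requests_by_proc` are hypothetical, not taken from Chameleon or any other library.

```python
import io

def gather_and_write(requests_by_proc, out):
    """Gather every processor's buffered (offset, data) write requests and
    let a single 'master' issue them to the file in offset order.
    Hypothetical helper for illustration only."""
    gathered = []
    for proc_requests in requests_by_proc:   # one request buffer per processor
        gathered.extend(proc_requests)
    gathered.sort(key=lambda req: req[0])    # order the requests by file offset
    for offset, data in gathered:
        out.seek(offset)
        out.write(data)
    return len(gathered)

# Three simulated processors, each buffering two 4-byte writes.
requests_by_proc = [
    [(0, b"aaaa"), (12, b"dddd")],
    [(4, b"bbbb"), (16, b"eeee")],
    [(8, b"cccc"), (20, b"ffff")],
]
buf = io.BytesIO()                           # stands in for the output file
count = gather_and_write(requests_by_proc, buf)
result = buf.getvalue()
```

Sorting by offset before writing is a design choice of this sketch, made so that the master touches the file sequentially; the cited paper may order requests differently.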
N. Galbreath, W. Gropp, and D. Levine, Applications-Driven Parallel I/O, Proceedings of Supercomputing '93, pages 462-471, 1993.
....column, these sqrt(p) processes can each access n/sqrt(p) data elements, while the other processes access 0. (Footnote 2: We assume that p is a square.)
Table (system: advantages; disadvantages):
nCUBE [16]: simple partitioning based on bit permutations; all sizes must be powers of 2
array partitioning library [7, 6, 23]: supports common array partitioning patterns, high level of abstraction; must access full array in one operation
nested strided [50]: supports the common multidimensional access patterns; user needs to compute offsets and strides in all dimensions, must access a full multidimensional ....
....to store array data, the same partitioning scheme can be used. In effect, the distribution of the array data among the processes induces a partitioning of the file segment that stores the array. This has been suggested in a number of libraries, especially in the context of providing I/O for HPF [7, 6, 23]. Naturally, it allows all the common partitioning patterns to be expressed. The interface supported by these libraries is a high-level interface suitable for direct use by programmers, and using the same abstraction (i.e. partitioned arrays). An analogous low-level interface has also been ....
[Article contains additional citation context not shown here]
N. Galbreath, W. Gropp, and D. Levine, "Applications-driven parallel I/O". In Supercomputing '93, pp. 462--471, Nov 1993.
....I/O patterns [24]. In addition to the commercial offerings (IBM SP2 PIOFS [6], Intel iPSC CFS [25, 27] and Paragon PFS [11, 28], nCUBE [8], and Thinking Machines CM-5 sfs [2, 20]), there has been a recent flurry of activity in the research community. PIOUS [22, 23] and PETSc/Chameleon I/O [14] are both widely available nonproprietary portable parallel I/O interfaces. PIOUS is a PVM-based parallel file interface. Files can be declustered across disks in a round-robin fashion. Access modes support globally shared and independent file pointers, and file-per-node accesses. PETSc/Chameleon ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....forecasting and astronomy simulation. These applications often deal with large multidimensional arrays and are i/o intensive [Sio94], with the need to periodically output the current state of computation, read array data that cannot completely fit into memory, or do checkpoint/restart operations [Galbreath93]. Because of the slow rate of improvement in speed of access to disk storage, these applications are usually i/o bottlenecked. Disk striping has been used to alleviate the CPU/disk hardware mismatch by using multiple disks in parallel to increase i/o bandwidth. However, the aggregate bandwidth of ....
N. Galbreath, W. Gropp, and D. Levine, Applications-Driven Parallel I/O, Proceedings of Supercomputing '93, pages 462-471, 1993.
....The volume of metadata describing the mode switches and VP activity in every IB segment would be huge. As we see in Section 5.2, IB's cost relative to NB and CB is not prohibitive. Finally, files used by most data-parallel applications would remain in NB or CB modes during their lifetimes [5, 7, 14, 16, 21], and Stream's optimistic approach matches their needs without becoming overly complex. (Footnote 1: Note that operations allowed in NB mode form a subset of those allowed in CB.) Figure 4: The segmented nature of a Stream file. The first segment is written using NB mode, and VP blocks of size ....
.... built using a simple utility program in some situations (e.g. if a file has a different distribution than the program that will read it). Because most data-parallel applications rely on regular array-oriented I/O, the use of NB as an interface to the outside world should work in most situations [5, 7, 14, 16, 21]. Files with CB and IB segments require explicit conversions. A file containing a CB or IB segment can be converted to NB using a high-level C program. The program has each VP reading its existing stream and writing its data, or a fixed value upon reaching EOF on its input, in NB mode. ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....I/O nodes. Most are based on a fairly traditional Unix-like interface, in which individual processes make a request to the file system for each piece of the file they read or write. Increasingly common, however, are specialized interfaces to support multidimensional matrices [CFPB93, SW94, GL91, GGL93, BdC93, BBS 94, Mas92] and interfaces that support collective I/O [GGL93, BdC93, BBS 94, Mas92]. With a collective I/O interface, all processes make a single joint request to the file system, rather than numerous independent requests. Disk-directed I/O is a promising new technique that ....
....individual processes make a request to the file system for each piece of the file they read or write. Increasingly common, however, are specialized interfaces to support multidimensional matrices [CFPB93, SW94, GL91, GGL93, BdC93, BBS 94, Mas92] and interfaces that support collective I/O [GGL93, BdC93, BBS 94, Mas92]. With a collective I/O interface, all processes make a single joint request to the file system, rather than numerous independent requests. Disk-directed I/O is a promising new technique that takes advantage of a collective I/O interface, and leads to much better ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....certain optimizations. Another advantage is that if any processor needs to access some data which was previously modified by some other processor, it can be done using just a read call without any additional synchronization. The idea of collective I/O has also been used in other schemes such as in [1, 11, 10]. In the next section, we describe the Extended Two-Phase Method for reading sections of out-of-core arrays. The method for writing sections is analogous and is discussed in Section 6. 5 Reading Sections of Out-of-Core Array Let us assume that each processor needs to read some regular section of ....
N. Galbreath, W. Gropp, and D. Levine. Applications-Driven Parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, November 1993.
....what we need: the sample size is small; the programs are parallelized sequential programs, not parallel programs per se; and the I/O itself was not parallelized. Cypher et al. [17] studied individual parallel scientific applications, measuring temporal patterns in I/O rates. Galbreath et al. [18] present a useful high-level characterization based on anecdotal evidence. Bagrodia et al. [19] have proposed using Pablo to analyze and characterize specific applications, and Crandall et al. performed such an analysis on three scientific applications [20]. As part of the CHARISMA project, we ....
.... modes that specify whether and how parallel processes share a common file pointer [14, 21, 22, 23, 24, 25]. Some systems are based on a memory-mapped interface [26, 27], and two provide a way for the user to specify per-process logical views of the file [28, 29]. Some provide SIMD-style transfers [30, 31, 25, 18]. Finally, in addition to shared file pointers, MPI-IO allows applications to describe a mapping from a linear file to the compute nodes running the application in terms of higher-level data structures [32]. Clearly, the industrial and research communities have not yet settled on a single new ....
N. Galbreath, W. Gropp, and D. Levine, "Applications-driven parallel I/O", in Proceedings of Supercomputing '93, 1993, pp. 462--471.
....I/O algorithms. Reddy et al. [29] studied I/O from parallelized sequential applications, but their applications were handpicked and I/O was not parallel. Cypher et al. [8] studied self-selected parallel scientific applications, mainly to establish temporal patterns in I/O rates. Galbreath et al. [15] have used anecdotal evidence to provide a high-level picture of I/O from some parallel applications. Recently Bagrodia et al. have proposed using Pablo to analyze and characterize specific applications [2]. The only file system workload study of a production parallel scientific computation ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....records read or written by different processors, exposing the I/O model to the application writer. Units of I/O seem to be either (sub)matrices (1-5 dimensions) or items in a collection of objects (100-10000 bytes each). Data set sizes varied up to 1 TB; bandwidth needs varied up to 1 GB/s. • [GGL93] They give a useful overview of the I/O requirements of many applications codes, in terms of input, output, scratch files, debugging, and checkpointing. • [Moo95] They briefly describe the I/O requirements for four production oceanography programs running at Oregon State University. The ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....if one million particles are simulated in a particular run of the three-dimensional dsmc 3d code, then each snapshot is 24 MB (3 double-precision numbers per particle). Checkpointing. Checkpointing is often required for long-running applications which can get interrupted for a variety of reasons [12]. Checkpointing is also used for parametric studies, i.e. modifying some of the checkpointed values and restarting the computation [11]. Out-of-Core Computations. Several scientific and engineering computations operate on large data structures which do not fit into the main memory. Such codes ....
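The 24 MB snapshot figure quoted above can be checked with a line of arithmetic. The sketch below assumes 8-byte double-precision values and decimal megabytes (10^6 bytes); neither assumption is stated in the snippet, and the function name `snapshot_bytes` is hypothetical.

```python
BYTES_PER_DOUBLE = 8  # assumption: 8-byte IEEE 754 double-precision values

def snapshot_bytes(particles, doubles_per_particle=3):
    """Size in bytes of one snapshot storing a fixed number of doubles per particle."""
    return particles * doubles_per_particle * BYTES_PER_DOUBLE

size = snapshot_bytes(1_000_000)   # one million particles, 3 doubles each
size_mb = size / 1e6               # assumption: 1 MB = 10**6 bytes
print(f"{size_mb:.0f} MB per snapshot")
```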
N. Galbreath, W. Gropp, and D. Levine. Applications-driven Parallel I/O. In Proceedings Supercomputing '93, pages 462--471, November 1993.
....performance. Most current multiprocessor file systems are derivatives of Unix file systems. Typical Unix workloads [29, 11], however, differ significantly from scientific multiprocessor workloads [24, 31]. Scientific programs use files for checkpointing, application-controlled virtual memory [9, 12], and visualization output, which are not common in Unix workloads. Furthermore, parallel scientific programs exhibit patterns that are more complicated than simple sequential patterns observed in vector scientific or Unix workloads. For example, many exhibit forward jumping sequential patterns ....
....For example, many exhibit forward jumping sequential patterns [24, 31], many of which are actually complex strided patterns [26, 27]. Clearly, parallel file systems must be redesigned to fit these common access patterns. Several recent works have proposed changes to the file system interface [3, 7, 8, 10, 12, 18, 19, 26]. One such proposed interface is collective I/O. In a traditional file system interface, processes within a parallel job often have to express the transfer of a large object (e.g. a large matrix) as small, non-contiguous, per-processor requests, thereby losing valuable semantic information that a ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages
....I/O algorithms. Reddy et al. [29] studied I/O from parallelized sequential applications, but their applications were handpicked and I/O was not parallel. Cypher et al. [8] studied selected parallel scientific applications, mainly to establish temporal patterns in I/O rates. Galbreath et al. [15] used anecdotal evidence to provide a high-level picture of I/O from some parallel applications. Bagrodia et al. proposed using Pablo to analyze and characterize specific applications [2]. The only file system workload study of a production parallel scientific computation environment was that of ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....95] Most are based on a fairly traditional Unix-like interface, in which individual processes make a request to the file system for each piece of the file they read or write. Increasingly common, however, are specialized interfaces to support multidimensional matrices [GL91, Mas92, BdC93, CFPB93, GGL93, BBS 94, SW94] and interfaces that support collective I/O [Mas92, BdC93, GGL93, BBS 94, CFH 95]. With a collective I/O interface, all processes make a single joint request to the file system, rather than numerous independent requests. In this paper we assume that the multiprocessor ....
....processes make a request to the file system for each piece of the file they read or write. Increasingly common, however, are specialized interfaces to support multidimensional matrices [GL91, Mas92, BdC93, CFPB93, GGL93, BBS 94, SW94] and interfaces that support collective I/O [Mas92, BdC93, GGL93, BBS 94, CFH 95]. With a collective I/O interface, all processes make a single joint request to the file system, rather than numerous independent requests. In this paper we assume that the multiprocessor has an architecture like that in Figure 1, in which there are two types of processor ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, 1993.
....is extremely tedious and error-prone. It provides only limited insight, especially if the programs contain large data structures or execute for a long time. Some of the limitations of the manual approach can be avoided if parts of the program's state are saved on a file at regular intervals [10]. This approach requires modifications to the program source code to insert state-saving routines. The files, produced by the two programs, are compared using a file comparison utility, such as diff on Unix. One limitation of this technique is that the required disk space grows linearly with the ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings Supercomputing-93, Portland, Oregon, pages 462--471. IEEE, 1993.
....record in a commercial database is usually very small. 13.1 Introduction Any application, sequential or parallel, may need to access data stored in files for many reasons, such as reading the initial input, writing the results, checkpointing for later restart, data analysis, and visualization [18]. In this chapter we are concerned mainly with parallel applications consisting of multiple processes (or threads) that need to access data stored in files. We define parallel I/O as concurrent requests from multiple processes of a parallel program for data stored in files. Accordingly, at ....
N. Galbreath, W. Gropp, and D. Levine. Applications-driven parallel I/O. In Proceedings of Supercomputing '93, pages 462--471. IEEE Computer Society Press, November 1993.
....program is restarted from a checkpoint. In this case, distributed arrays are read from previously stored files. All I/O is performed using the Chameleon I/O library, which provides a portable set of routines for high-performance I/O that hide the details of the actual implementation from the user [13]. The user can select at run time whether a file is stored as a simple Unix file (for compatibility with other tools) or as a parallel file readable only with the Chameleon I/O library routines (for maximum performance). In the astrophysics application, all data is stored in simple Unix files ....
N. Galbreath, W. Gropp, and D. Levine. Applications-Driven Parallel I/O. In Proceedings of Supercomputing '93, pages 462--471, November 1993.
....shown that collective I/O can improve performance significantly [5, 27, 16, 24]. However, collective I/O cannot be done with the Unix API. Over the past few years, many research parallel file systems and I/O libraries have been developed that perform various optimizations, including collective I/O [28, 11, 20, 17, 3, 10, 25, 9, 19]. Each of these, however, has a different API with varying degrees of portability and generality. The only standard, portable API that has been available on all machines is the Unix API. Therefore, most users write applications for the Unix API and get bad performance for reasons explained above. ....
N. Galbreath, W. Gropp, and D. Levine. Applications-Driven Parallel I/O. In Proceedings of Supercomputing '93, pages 462--471. IEEE Computer Society Press, November 1993.
No context found.
N. Galbreath, W. Gropp, and D. Levine. Application driven parallel I/O. In Proceedings of Supercomputing 93, November 1993. Also Argonne technical report MCS-P381-0893.
No context found.
N. Galbreath, W. Gropp, and D. Levine, Applications-driven parallel i/o, Supercomputing, 1993.