Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....POSIX provides a model of a widely portable file system, but the portability and optimization needed for parallel I/O cannot be achieved with the POSIX interface. The significant optimizations required for efficiency (e.g., grouping [15], collective buffering [1, 2, 16, 19, 22], and disk-directed I/O [13]) can only be implemented if the parallel I/O system provides a high-level interface supporting partitioning of file data among processes and a collective interface supporting complete transfers of global data structures between process memories and files. In addition, ....
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....also been observed by a number of other authors [5, 13, 21, 26, 28, 31, 36]. Several runtime support libraries and file systems have been developed to support efficient I/O in a parallel environment [15, 34]; most noticeable among these is the PASSION library designed by Alok Choudhary's group [39, 40]. They usually provide a collective I/O interface, in which all processing nodes cooperate to make a single large I/O request. With these collective I/O interfaces, the I/O requests still need to be inserted by the programmers, and data processing usually cannot begin until the entire collective ....
Rajeev Thakur and Alok Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....the design of our middleware, has also been observed by Skillicorn [26, 25]. Several runtime support libraries and file systems have been developed to support efficient I/O in a parallel environment [11, 23]; most noticeable among these is the PASSION library designed by Alok Choudhary's group [27, 28]. They usually provide a collective I/O interface, in which all processing nodes cooperate to make a single large I/O request. With these collective I/O interfaces, the I/O requests still need to be inserted by the programmers, and data processing usually cannot begin until the entire collective ....
Rajeev Thakur and Alok Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....by runtime libraries, which are linked to the application programs. Thus, the application program performs the data accesses itself without the need for dedicated I/O server programs. Examples for this group are the Two-Phase method [3], the Jovian framework [1], and the Extended Two-Phase method [27]. 2.1.2 I/O Level Methods. The I/O level methods try to reorganize the disk access requests of the application programs to achieve better performance. This is done by independent I/O node servers, which collect the requests and perform the accesses. Therefore, the disk requests (of the ....
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....Chapter 9: I/O. 9.1 Introduction. POSIX provides a model of a widely portable file system, but the portability and optimization needed for parallel I/O cannot be achieved with the POSIX interface. The significant optimizations required for efficiency (e.g., grouping [15], collective buffering [1, 2, 16, 19, 22], and disk-directed I/O [13]) can only be implemented if the parallel I/O system provides a high-level interface supporting partitioning of file data among processes and a collective interface supporting complete transfers of global data structures between process memories and files. In addition, ....
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....across different applications and different number of processors, and look into approaches to estimate such changes to make our cost models more accurate. 5 Related Work. Several runtime support libraries and file systems have been developed to support efficient I/O in a parallel environment [2, 8, 12, 13, 19, 21, 24, 25]. These systems mainly focus on supporting regular strided access to uniformly distributed datasets, such as images, maps, and dense multidimensional arrays. ADR differs from these systems in several ways. First, ADR is able to carry out range queries directed at irregular spatially indexed ....
R. Thakur and A. Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....I/O, the application uses a particular interface to alert the file system to a collective operation. In a two-phase read, for example, data is read from disks sequentially and redistributed to the processors involved in the collective. A variation of two-phase I/O called extended two-phase I/O [20] uses collective I/O in conjunction with dynamic partitioning of the I/O workload among processors to balance load and improve performance. As we have described, disk-directed I/O is another method for optimizing collective requests, where the complete request is passed to the I/O processors to be ....
THAKUR, R., AND CHOUDHARY, A. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Tech. Rep. CACR-103, Scalable I/O Initiative, Center for Advanced Computing Research, Caltech, June 1995. Revised November 1995.
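The excerpt above mentions that extended two-phase I/O partitions the I/O workload among processors to balance load. The following is a minimal sketch of one simple way such a partition could be computed, assuming an even split of the aggregate byte range; the function name `partition_file_domain` and the sample requests are hypothetical illustrations and are not taken from the cited paper.

```python
def partition_file_domain(requests, nprocs):
    """Split the byte range spanned by all requests into nprocs contiguous,
    near-equal file domains (an assumed even-split policy for illustration).

    requests: list of (offset, length) pairs gathered from all processes.
    Returns a list of (start, end) half-open byte ranges, one per process.
    """
    lo = min(off for off, _ in requests)
    hi = max(off + length for off, length in requests)
    base, extra = divmod(hi - lo, nprocs)
    domains = []
    start = lo
    for rank in range(nprocs):
        # The first `extra` ranks take one additional byte each.
        size = base + (1 if rank < extra else 0)
        domains.append((start, start + size))
        start += size
    return domains


# Hypothetical aggregate request covering bytes 100..360, split over 4 processes.
domains = partition_file_domain([(100, 50), (300, 60)], 4)
print(domains)
```

Each process would then perform I/O only for its own domain, so no process handles much more of the file than any other.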
....imbalance incurred during the local reduction phase, while for FRA and SRA it is due to constant overheads in the initialization and global reduction phases. 5 Related Work. Several runtime support libraries and file systems have been developed to support efficient I/O in a parallel environment [4, 9, 16, 18, 24, 28, 34, 35]. These systems mainly focus on supporting regular strided access to uniformly distributed datasets, such as images, maps, and dense multidimensional arrays. They also usually provide a collective I/O interface, in which all processing nodes cooperate to make a single large I/O request. ADR ....
R. Thakur and A. Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....data dimensions can be spatial coordinates, time, or varying experimental conditions such as temperature, velocity or magnetic field. The increasing importance of such applications has been widely recognized. Runtime systems like the Active Data Repository [5, 6] and the Passion runtime library [18, 19] allow high performance on data-intensive applications, but do not address the need for programming with high-level abstractions. We target two high-level programming models for this important class of computations: 1. Object-Oriented (Java-Based). Object-oriented features like encapsulation and ....
Rajeev Thakur and Alok Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
No context found.
R. Thakur and A. Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....of platforms. The most popular implementation, ROMIO [17], is implemented portably on top of an abstract I/O device layer [14, 16] that enables portability to new underlying I/O systems. One of the most important features in ROMIO is collective I/O operations, which adopt a two-phase I/O strategy [11, 12, 13, 15] and improve the parallel I/O performance by significantly reducing the number of I/O requests that would otherwise result in many small, noncontiguous I/O requests. However, MPI-IO reads and writes data in a raw format without providing any functionality to effectively manage the associated ....
....Fortran function calls with nfmpi_. 4.1. Interface Design. Our parallel netCDF API is built on top of MPI-IO. The parallel netCDF built on MPI-IO can benefit from several well-known optimizations already used in existing MPI-IO implementations, such as data sieving and two-phase I/O strategies [11, 12, 13, 15] in ROMIO. Figure 3 describes the overall architecture for our design. In parallel netCDF, a file is opened, operated, and closed by the participating processes in a communication group. In order for these processes to operate on the same file space, especially upon the structural information ....
R. Thakur and A. Choudhary. "An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays," Scientific Programming, 5(4):301-317, Winter 1996.
....that shows the best I/O performance. [Figure 7. I/O bandwidth for RT.] 5. Related Work. Several efforts have sought to optimize I/O in parallel file systems and runtime libraries [3, 5, 6, 14, 16, 18, 22, 27, 31]. SRB (Storage Resource Broker) [2] provides a uniform interface to access various storage systems, such as file systems, Unitree, HPSS and database objects. However, it does not fully support the optimizations implemented in MPI-IO. Shoshani et al. [28, 29] describe an architecture for op- ....
R. Thakur and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....it is often the case that in the aggregate the whole array is being written to or read from the file system. The application can make use of this knowledge to significantly improve its I/O performance. The technique of collective I/O has been developed to better utilize the parallel I/O subsystem [6, 19, 20, 2, 15, 18, 3]. In this approach, the processors exchange information about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two ....
....I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two significant implementation techniques for collective I/O are two-phase I/O [6, 19, 20] and disk-directed I/O [13, 16]. In disk-directed I/O, the collective I/O request is sent to the I/O processors, which collectively determine and carry out the optimal I/O strategy. In the two-phase approach, the application processors collectively determine and carry out the optimized approach. In ....
[Article contains additional citation context not shown here]
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301-317, Winter 1996.
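The excerpt above describes processors exchanging information about their individual requests so that the requests can be combined and submitted in order. As a rough illustration only, the sketch below coalesces a set of (offset, length) requests into sorted contiguous extents; the helper `merge_requests` and the interleaved access pattern are assumptions made for this example, not code from any of the cited systems.

```python
def merge_requests(requests):
    """Coalesce (offset, length) byte ranges into sorted, non-overlapping extents."""
    merged = []
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Touches or overlaps the previous extent: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Hypothetical pattern: 4 processes, each wanting every 4th 10-byte block.
per_process = {
    rank: [(rank * 10 + stride * 40, 10) for stride in range(3)]
    for rank in range(4)
}
all_requests = [req for reqs in per_process.values() for req in reqs]
aggregate = merge_requests(all_requests)
print(len(all_requests), "small requests ->", aggregate)
```

In this made-up pattern the twelve small noncontiguous requests collapse into a single contiguous extent, which is the situation in which collective I/O is expected to pay off.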
....it is often the case that in the aggregate the whole array is being written to or read from the file system. The application can use this knowledge to significantly improve its I/O performance. The technique of collective I/O has been developed to better utilize the parallel I/O subsystem [6, 19, 20, 2, 15, 18, 3]. In this approach, the processors exchange information about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two ....
....I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two significant implementation techniques for collective I/O are two-phase I/O [6, 19, 20] and disk-directed I/O [13, 16]. In disk-directed I/O, the collective I/O request is sent to the I/O processors, which collectively determine and carry out the optimal I/O strategy. In the two-phase approach, the application processors collectively determine and carry out the optimized approach. In ....
[Article contains additional citation context not shown here]
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301-317, Winter 1996.
....that is, by performing collective I/O. Collective I/O can be performed in different ways and has been studied by many researchers in recent years. It can be done at the disk level (disk-directed I/O [8]), at the server level (server-directed I/O [17, 16]), or at the client level (two-phase I/O [4, 21] or collective buffering [12]). Each method has its advantages and disadvantages. Since ROMIO is a portable, user-level library with no separate I/O servers, it performs collective I/O at the client level by using a generalized version of two-phase I/O. ROMIO performs collective I/O when the user ....
....The advantage of this method is that by making all file accesses large and contiguous, the I/O time is reduced significantly. The added cost of interprocess communication for redistribution is (almost always) small compared with the savings in I/O time. The basic two-phase method was extended in [21] to access sections of out-of-core arrays. Since MPI-IO is a general parallel I/O interface, I/O requests in MPI-IO can represent any access pattern, not just sections of arrays. The two-phase method in [21] must therefore be generalized to handle any noncontiguous I/O request. We have implemented ....
[Article contains additional citation context not shown here]
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....requests, it is often the case that in the aggregate the whole array is being written to or read from the file. The application can use this knowledge to significantly improve its I/O performance. The technique of collective I/O has been developed to better utilize the parallel I/O subsystem [10, 26, 27, 4, 17, 23, 5, 8]. In this approach, the processors exchange information about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. ....
....about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. There are three approaches to collective I/O: two-phase I/O [10, 26, 27], disk-directed I/O [17, 19], and server-directed I/O [7, 23]. The primary distinction between these approaches is the level at which the optimal I/O strategy is derived and carried out. In disk-directed I/O, the collective I/O request is sent to the disk controllers, which collectively determine ....
[Article contains additional citation context not shown here]
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301--317, Winter 1996.
....requests, it is often the case that in the aggregate the whole array is being written to or read from the file. The application can use this knowledge to significantly improve its I/O performance. The technique of collective I/O has been developed to better utilize the parallel I/O subsystem [10, 26, 27, 4, 17, 23, 5, 8]. In this approach, the processors exchange information about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. ....
....about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. There are three approaches to collective I/O: two-phase I/O [10, 26, 27], disk-directed I/O [17, 19], and server-directed I/O [7, 23]. The primary distinction between these approaches is the level at which the optimal I/O strategy is derived and carried out. In disk-directed I/O, the collective I/O request is sent to the disk controllers, which collectively determine ....
[Article contains additional citation context not shown here]
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301--317, Winter 1996.
....individual requests of each process are noncontiguous. The merged request can therefore be serviced efficiently. Such optimization is broadly referred to as collective I/O. Collective I/O has been shown to be a very important optimization in parallel I/O and can improve performance significantly [5, 14, 25, 30, 33]. Since none of the file systems on which ROMIO is implemented perform collective I/O, ROMIO performs two-phase collective I/O on top of the file system. In the communication phase, interprocess communication is used to rearrange data into large chunks. In the I/O phase, processes perform parallel ....
R. Thakur and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....level (server-directed I/O [8]), or at the client level (two-phase I/O [3]). Since ROMIO is a portable, user-level library with no separate I/O servers, it performs collective I/O at the client level. For this purpose, it uses a generalized version of the extended two-phase method described in [11]. 4.1 Two-Phase I/O. Two-phase I/O was first proposed in [3] in the context of accessing distributed arrays from files. Consider the example of reading a two-dimensional array from a file into a (block,block) distribution in memory, as shown in Figure 2. Assume that the array is stored in the ....
....distribution. The advantage of this method is that by making all file accesses large and contiguous, the I/O time is reduced significantly. The added cost of interprocess communication for redistribution is small compared with the savings in I/O time. The basic two-phase method was extended in [11] to access sections of out-of-core arrays. In ROMIO we use a generalized version of this extended two-phase method that can handle any noncontiguous I/O request as described by an MPI derived datatype, not just sections of arrays. 4.2 Generalized Two-Phase I/O in ROMIO. ROMIO uses two ....
[Article contains additional citation context not shown here]
R. Thakur and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
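The excerpt above considers reading a two-dimensional array into a (block,block) distribution with two-phase I/O. The sketch below simulates that idea in plain Python, with no MPI and no real file, under several assumptions that are not stated in the truncated excerpt: a row-major file layout, a 4 x 4 array, and a 2 x 2 process grid. All names (`owner`, `count_runs`, `staged`, `local`) are hypothetical.

```python
def count_runs(offsets):
    """Count contiguous runs in a sorted list of element offsets."""
    return sum(
        1 for i, off in enumerate(offsets) if i == 0 or off != offsets[i - 1] + 1
    )


N, P = 4, 2                      # assumed: N x N array on a P x P process grid
nprocs = P * P
block = N // P
file_data = list(range(N * N))   # assumed row-major file; value == file offset


def owner(row, col):
    """Rank that owns element (row, col) under a (block,block) distribution."""
    return (row // block) * P + (col // block)


# Direct access: each process reads its own block, one request per contiguous run.
direct_requests = 0
for rank in range(nprocs):
    wanted = [r * N + c for r in range(N) for c in range(N) if owner(r, c) == rank]
    direct_requests += count_runs(wanted)

# Phase 1 (I/O): each process reads one large contiguous chunk of the file.
chunk = (N * N) // nprocs
staged = {rank: file_data[rank * chunk:(rank + 1) * chunk] for rank in range(nprocs)}
two_phase_requests = nprocs

# Phase 2 (communication): redistribute staged elements to their final owners.
local = {rank: [] for rank in range(nprocs)}
for src in range(nprocs):
    for i, value in enumerate(staged[src]):
        off = src * chunk + i
        local[owner(off // N, off % N)].append(value)

print(direct_requests, two_phase_requests, local)
```

With these assumed sizes the direct approach issues two small requests per process, while the two-phase approach issues one contiguous request per process and pays for the rearrangement in the communication phase instead.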
....[Figure 13.7: Reading a distributed array by using two-phase I/O.] ....compared with the savings in I/O time. The overall performance, therefore, is close to what can be obtained by making large I/O requests in parallel. The basic two-phase method was extended in [54] to access sections of out-of-core arrays. An even more general version of two-phase I/O is implemented in ROMIO [58]. It supports any access pattern, and the user can also control via hints the amount of temporary memory ROMIO uses as well as the number of processes that actually perform I/O in ....
Rajeev Thakur and Alok Choudhary. An extended two-phase method for accessing sections of out-of-core arrays. Scientific Programming, 5(4):301--317, Winter 1996.
....that can improve performance significantly. These extensions allow users to perform bulk (array) I/O operations with a single method call. We have implemented these extensions and validated their performance benefits. 1.3 Related Work. Other than the large body of work related to parallel I/O [4, 8, 9, 13, 23, 27, 28, 32, 33], the work most closely related to ours is the Jaguar project [36, 37], which aims to improve Java I/O performance as one of its goals. Jaguar allows the Java runtime system to be extended with new primitive operations that enable efficient access to hardware resources. These primitives are ....
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301-317, Winter 1996.
....patterns exhibited by many parallel scientific applications [1, 5]. In particular, each processor tends to make a large number of small I/O requests, incurring the high cost of I/O on each such request. The technique of collective I/O has been developed to better utilize the parallel I/O subsystem [2, 7, 8]. In this approach, the processors exchange information about their individual I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two ....
....I/O requests to develop a picture of the aggregate I/O request. Based on this global knowledge, I/O requests are combined and submitted in their proper order, making a much more efficient use of the I/O subsystem. Two significant implementation techniques for collective I/O are two-phase I/O [2, 7] and disk-directed I/O [4, 6]. In the two-phase approach, the application processors collectively determine and carry out the optimized approach. In this paper, we deal only with the two-phase approach. Consider a collective read operation. If the data is distributed across the processors in a ....
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301-317, Winter 1996.
....undoubtedly benefit from the proposed methods. Finally, we note that the proposed methods are not just useful for I/O, but also for interprocess communication, and would therefore benefit networking applications as well. 7. RELATED WORK. Other than the large body of work related to parallel I/O [1, 4, 5, 8, 14, 16, 17, 20, 21], the work most closely related to ours is the Jaguar project [23, 24], which aims to improve Java I/O performance as one of its goals. Jaguar allows the Java runtime system to be extended with new primitive operations that enable efficient access to hardware resources. These primitives are ....
Thakur, R. and A. Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming 5(4):301-317, Winter 1996.
....storage, retrieval and processing of very large multidimensional datasets. An initial discussion of a framework for scientific data management similar to the one described in this paper is given in [6]. Several efforts have involved optimizing I/O in parallel file systems and runtime libraries [3, 4, 7, 13, 16, 18, 22, 27, 31]. However, file systems and libraries have a lower-level interface than SDM, requiring more work from the user. 6 Conclusions and Future Work. We have presented the design and implementation of an environment for high-performance scientific data management, called Scientific Data Manager (SDM) ....
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.
No context found.
Rajeev Thakur and Alok Choudhary. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 5(4):301--317, Winter 1996.