32 citations found.
Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proceedings of the IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, Newport Beach, CA, 1993.


This paper is cited in the following contexts:

Optimizing Noncontiguous Accesses in MPI-IO - Thakur, Gropp, Lusk (2002)   (4 citations)  (Correct)

....that is, by performing collective I/O. Collective I/O can be performed in different ways and has been studied by many researchers in recent years. It can be done at the disk level (disk-directed I/O [8]), at the server level (server-directed I/O [17, 16]), or at the client level (two-phase I/O [4, 21] or collective buffering [12]). Each method has its advantages and disadvantages. Since ROMIO is a portable, user-level library with no separate I/O servers, it performs collective I/O at the client level by using a generalized version of two-phase I/O. ROMIO performs collective I/O when the user ....

....ROMIO to perform collective optimizations, and ROMIO therefore implements them internally as level 0 requests. Some level 1 requests, such as those that represent a read-broadcast type of access pattern, are optimized collectively, however. 6.1 Two-Phase I/O. Two-phase I/O was first proposed in [4] in the context of accessing distributed arrays from files. Consider the example of reading a two-dimensional array from a file into a (block,block) distribution in memory, as shown in Figure 7. Assume that the array is stored in the file in row-major order. As a result of the distribution in ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved Parallel I/O via a Two-Phase Run-time Access Strategy. In Proceedings of the Workshop on I/O in Parallel Computer Systems at IPPS '93, pages 56--70, April 1993. Also published in Computer Architecture News, 21(5):31--38, December 1993.
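
The (block,block) read described in the excerpt above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code of ROMIO or of the cited paper: the "file" is a Python list holding a row-major array, processes are simulated by integer ranks, and the names `two_phase_read` and `direct_request_count` are hypothetical. It also assumes the array dimensions divide evenly by the process grid.

```python
def two_phase_read(file_data, n_rows, n_cols, p_rows, p_cols):
    """Simulate a two-phase read of a row-major array into a (block, block) layout.

    Returns (local, requests): local[rank] is that rank's 2-D block, and
    requests is the number of contiguous file reads issued in phase 1.
    Assumes n_rows % p_rows == 0 and n_cols % p_cols == 0.
    """
    n_procs = p_rows * p_cols
    total = n_rows * n_cols
    chunk = -(-total // n_procs)  # ceiling division: elements per reader

    # Phase 1: every rank reads ONE contiguous chunk that conforms to the
    # file layout, regardless of which elements it finally needs.
    staged = {r: (r * chunk, file_data[r * chunk:(r + 1) * chunk])
              for r in range(n_procs)}
    requests = n_procs

    # Phase 2: in-memory redistribution to the distribution the application wants.
    br, bc = n_rows // p_rows, n_cols // p_cols  # block height and width
    local = {r: [[None] * bc for _ in range(br)] for r in range(n_procs)}
    for start, data in staged.values():
        for k, value in enumerate(data):
            i, j = divmod(start + k, n_cols)        # global row, column
            owner = (i // br) * p_cols + (j // bc)  # rank owning element (i, j)
            local[owner][i % br][j % bc] = value
    return local, requests


def direct_request_count(n_rows, p_rows, p_cols):
    """Contiguous reads needed if each rank fetched its own block directly:
    one read per block row, for every rank."""
    return p_rows * p_cols * (n_rows // p_rows)


blocks, reqs = two_phase_read(list(range(16)), 4, 4, 2, 2)
print(blocks[0], reqs, direct_request_count(4, 2, 2))
```

In this toy case the two-phase variant issues 4 contiguous reads where direct access would issue 8; the price is the extra in-memory exchange in phase 2, which a real implementation would carry out with interprocess communication.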


DPFS: A Data Parallel File System environment - Sueur, Dekeyser, Marquet   (Correct)

....are good when arrays in memory correspond to contiguous storage locations on disk. The file system performs poorly if the elements are scattered across the file [9]. To avoid this we used the two-phase access strategy developed by Juan Miguel Rosario, Rajesh Bordawekar and Alok Choudhary [3]. This strategy involves a division of the parallel I/O functions into two separate phases. First, data accesses are performed on the I/O nodes in respect of the data distribution over the disks. In a second phase, the data are redistributed to match the distribution required by the application. ....

Juan Miguel del Rosario and Rajesh Bordawekar and Alok Choudhary, Improved Parallel I/O via a Two-Phase Run-time Access Strategy, IPPS'93 Workshop on Input/Output in Parallel Computer Systems, April 1993.


Overview Of The MPI-IO Parallel I/O Interface - Corbett, Feitelson, Fineberg, .. (1995)   (16 citations)  (Correct)

....from a single (SPMD or MIMD) parallel application run on many nodes. The application data is distributed among the nodes, and is read/written to a single logical file, itself spread across nodes and disks. The significant optimizations required for efficiency (e.g., grouping [25], two-phase I/O [9], and disk-directed I/O [18]) can only be implemented as part of a parallel I/O environment if it supports a high-level interface to describe the partitioning of file data among processes and a collective interface describing complete transfers of global data structures between process memories ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), pages 31--38, December 1993.


Airdisks and AirRAID: Modeling and scheduling periodic wireless .. - Jain, Werth (1995)   (7 citations)  (Correct)

....of the data on the disk can be changed easily in response to changes in the data access patterns. For magnetic disks such operations are sometimes carried out in order to overcome the I/O performance bottleneck presented by magnetic disks in parallel computing. For example, del Rosario et al. [6] describe compiler techniques to detect different data access patterns in different phases of a parallel program; they then show how changing the data layout between two phases of a parallel program can significantly improve overall performance. However, such operations are expensive for magnetic ....

Juan Miguel del Rosario, R. Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proc. Workshop on I/O in Parallel Computer Systems, pages 56--70, 1993. Also in ACM SIGARCH Comp. Arch. News, Dec. 1993.


Scheduling on airdisks: Efficient access to personalized.. - Gondhalekar, Jain (1997)   (4 citations)  (Correct)

....of the data on the disk can be changed easily in response to changes in the data access patterns. For magnetic disks such operations are sometimes carried out in order to overcome the I/O performance bottleneck presented by magnetic disks in parallel computing. For example, del Rosario et al. [9] describe compiler techniques to detect different data access patterns in different phases of a parallel program; they then show how changing the data layout between two phases of a parallel program can significantly improve overall performance. However, such operations are expensive for magnetic ....

Juan Miguel del Rosario, R. Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proc. Workshop on I/O in Parallel Computer Systems, pages 56--70, 1993. Also in ACM SIGARCH Comp. Arch. News, Dec. 1993.


Expanding the Potential for Disk-Directed I/O - Kotz (1995)   (2 citations)  (Correct)

....distribution, each processor independently computes the locations of the records it requires from the file, and reads those records. With a data-dependent distribution, however, there is no way for processors to request their own set of data. A reasonable solution is similar to two-phase I/O [dBC93]: each processor reads some convenient subset of data from the file, examines each record to compute the distribution function, and then sends the data to the appropriate processor. In both cases we assume that the distribution function can only decide to which processor each record belongs, and ....

....records. Consider the 64-byte records, which present an interesting picture. Despite nearly doubling the amount of message traffic, traditional caching with data-dependent redistribution was faster than the direct access version. Here we were essentially using a pipelined form of two-phase I/O [dBC93]. Compute processors requested whole blocks from the I/O nodes, rather than small records. The larger requests reduced the overhead bottleneck experienced at I/O nodes, which in this case more than offset the extra work at the compute nodes and in the network. Disk-directed I/O was always better, ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.
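
The data-dependent redistribution described in the first excerpt above (read a convenient subset, evaluate the distribution function on each record, forward the record to its owner) can be sketched as follows. This is an illustrative simulation, not Kotz's implementation: `redistribute_by_record` and `owner_of` are hypothetical names, processors are simulated by list indices, and "sending" is modelled as appending to the destination's inbox.

```python
def redistribute_by_record(records, n_procs, owner_of):
    """Simulate data-dependent distribution in the style of two-phase I/O.

    records  : the file contents, as a list of records
    n_procs  : number of simulated processors
    owner_of : distribution function mapping a record to a processor index
    Returns inbox, where inbox[p] holds the records delivered to processor p.
    """
    chunk = -(-len(records) // n_procs)  # ceiling division: records per reader
    inbox = [[] for _ in range(n_procs)]
    for reader in range(n_procs):
        # I/O phase: each reader takes a contiguous, convenient slice of the file.
        my_slice = records[reader * chunk:(reader + 1) * chunk]
        # Redistribution phase: inspect each record and forward it to its owner.
        for rec in my_slice:
            inbox[owner_of(rec)].append(rec)
    return inbox


inbox = redistribute_by_record(list(range(10)), 3, lambda rec: rec % 3)
print(inbox)
```

The reader of a record is chosen only for I/O convenience; ownership is decided afterwards from the record's contents, which is why no processor can request its own data in advance.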


Disk-directed I/O for MIMD Multiprocessors - Kotz (1994)   (32 citations)  (Correct)

.... that a large, contiguous, parallel file transfer is in progress is lost through this low-level interface. A collective I/O interface, in which all CPs cooperate to make a single, large request, retains this semantic information, making it easier to coordinate I/O for better performance [dBC93, Nit92, PGK88]. Collective I/O need not involve matrices. Many out-of-core parallel algorithms do I/O in memoryloads, that is, they repeatedly load some subset of the file into memory, process it, and write it out [CK93]. Each transfer is a large, but not necessarily contiguous, set of data. ....

....application process must call ReadCP once for each contiguous chunk of the file, no matter how small. Each IOP attempts to dynamically optimize the use of the disk, cache, and network interface. Two-phase I/O. Figure 1b sketches an alternative proposed by del Rosario, Bordawekar, and Choudhary [dBC93, BdC93], which permutes the data among the CP memories before writing or after reading. Thus, there are two phases, one for I/O and one for an in-memory permutation. The permutation is chosen so that requests to the IOPs conform to the layout of the file, that is, the requests are for large ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1996)   (32 citations)  (Correct)

....how small. Figure 2a shows the function called by the application on the CP to read its part of a file, and the corresponding function executed at the IOP to service each incoming CP request. Two-phase I/O. Figure 2b sketches an alternative proposed by del Rosario, Bordawekar, and Choudhary [dBC93, TCB 96], which permutes the data among the CP memories before writing or after reading. Thus, there are two phases, one for I/O and one for an in-memory permutation. [Figure 2a: Traditional parallel file system. ReadCP(file, read parameters, destination address): for each file block needed to ....]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1994)   (32 citations)  (Correct)

.... that a large, contiguous, parallel file transfer is in progress is lost through this low-level interface. A collective I/O interface, in which all CPs cooperate to make a single, large request, retains this semantic information, making it easier to coordinate I/O for better performance [dBC93]. Collective I/O need not involve matrices. Many out-of-core parallel algorithms do I/O in memoryloads, that is, they repeatedly load some subset of the file into memory, process it, and write it out [CK93]. Each transfer is a large, but not necessarily contiguous, set of data. Traditional ....

....application process must call ReadCP once for each contiguous chunk of the file, no matter how small. Each IOP attempts to dynamically optimize the use of the disk, cache, and network interface. Two-phase I/O. Figure 2b sketches an alternative proposed by del Rosario, Bordawekar, and Choudhary [dBC93, BdC93], which permutes the data among the CP memories before writing or after reading. Thus, there are two phases, one for I/O and one for an in-memory permutation. The permutation is chosen so that requests to the IOPs conform to the layout of the file, that is, the requests are for large ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Disk-directed I/O for MIMD Multiprocessors - David Kotz (1994)   (32 citations)  (Correct)

.... that a large, contiguous, parallel file transfer is in progress is lost through this low-level interface. A collective I/O interface, in which all CPs cooperate to make a single, large request, retains this semantic information, making it easier to coordinate I/O for better performance [dBC93, Nit92, PGK88]. Collective I/O need not involve matrices. Many out-of-core parallel algorithms do I/O in memoryloads, that is, they repeatedly load some subset of the file into memory, process it, and write it out [CK93]. Each transfer is a large, but not necessarily contiguous, set of data. ....

....application process must call ReadCP once for each contiguous chunk of the file, no matter how small. Each IOP attempts to dynamically optimize the use of the disk, cache, and network interface. Two-phase I/O. Figure 2b sketches an alternative proposed by del Rosario, Bordawekar, and Choudhary [dBC93, BdC93], which permutes the data among the CP memories before writing or after reading. Thus, there are two phases, one for I/O and one for an in-memory permutation. The permutation is chosen so that requests to the IOPs conform to the layout of the file, that is, the requests are for large ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Flexibility and Performance of Parallel File Systems - Kotz, Nieuwejaar (1996)   (6 citations)  (Correct)

.... [CFP 95], HFS [KS96], PIOUS [MS94], RAMA [MK95], PPFS [HER 95], Scotch [GSC 95], and Galley [NK96a, NK96b]. Many more techniques for improving the performance of parallel file systems have been described, including caching and prefetching [KE93b, KE93a, PGG 95], two-phase I/O [dBC93], disk-directed I/O [Kot94], compute-node caching [PEK96], chunking [SW94], compression [SW95], filtering [Kot95, BP88], and so forth. The diversity of current systems and techniques indicates that there is clearly no consensus about the structure of, interface to, or even functionality of ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


HFS: A flexible file system for shared-memory multiprocessors - Krieger (1994)   (17 citations)  (Correct)

....the choice of policies used by the file system can be greatly simplified if the application can identify its expected demands in advance. Most researchers studying file system issues for parallel supercomputers have recognized the need for cooperation between the application and the file system [26, 34, 49, 68, 83, 103]. Our approach differs from others in that HFS gives the application the ability to explicitly customize the implementation of a file to conform to its specific requirements. We believe that this extra level of control is important, given the current lack of understanding of the requirements of ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Parallel I/O - Thakur, Gropp   Self-citation (Parallel)   (Correct)

....as well as the fact that all processes need to access the file simultaneously, the implementation (of the API) can read the entire file contiguously and simply send the right pieces of data to the right processes. This optimization, known as collective I/O, can improve performance significantly [13, 28, 48, 58]. The I/O API thus plays a critical role in enabling the user to express I/O operations conveniently and also in conveying sufficient information about access patterns to the I/O system so that the system can perform I/O efficiently. Another problem with commercial parallel file system APIs is ....

....access even further. Instead of reading large chunks and discarding the unwanted data as in data sieving, the unwanted data can be communicated to other processes that need it. Such optimization is broadly referred to as collective I/O, and it has been shown to improve performance significantly [13, 28, 48, 58, 66]. Collective I/O can be performed in different ways and has been studied by many researchers in recent years. It can be done at the disk level (disk-directed I/O [28]), at the server level (server-directed I/O [48]), or at the client level (two-phase I/O [13] or collective buffering [37]). Each ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proceedings of the Workshop on I/O in Parallel Computer Systems at IPPS '93, pages 56--70, April 1993. Also published in Computer Architecture News, 21(5):31--38, December 1993.


Implementation and Evaluation of Collective I/O in the .. - Rajesh Bordawekar.. (1996)   (1 citation)  Self-citation (Rajesh Parallel)   (Correct)

....generate hints about the collective request which can be used optimizing I/O. The collective I/O can be supported by both runtime libraries and file systems. 2.2 Collective I/O Implementations. In this section, we describe two different implementations of collective I/O, namely, Two-phase I/O [dBC93, BdC93] and Disk-directed I/O [Kot94, Kot96]. [Page footer: November 1996, Scalable I/O Initiative, Tech. Report CACR 128] 2.2.1 Two-phase I/O. Two-phase I/O is an implementation of collective I/O in the file space, i.e., its conforming pattern matches the order in which data is stored in files. Two-phase ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved Parallel I/O via a Two-Phase Run-time Access Strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


VIP-FS: A VIrtual, Parallel System for High Performance.. - Harry, Rosario..   Self-citation (Del Alok Parallel)   (Correct)

....in the distributed application perform I/O access with some global pattern, then it is useful to employ a more efficient access strategy. The two-phase access strategy has been shown to provide more consistent performance across a wider variety of data distributions than direct access methods [7]. With two-phase access, all clients access data approximately simultaneously. The file system schedules access so that data storage or retrieval from the I/O devices follow a near-optimal pattern with a reduction in the total number of requests for the entire I/O operation. In a second stage, the ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In The 1993 IPPS Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993.


The Design of VIP-FS: A Virtual, Parallel File System.. - Rosario, Harry.. (1994)   (3 citations)  Self-citation (Del Alok Parallel)   (Correct)

....in the distributed application perform I/O access with some global pattern, then it is useful to employ a more efficient access strategy. The two-phase access strategy has been shown to provide more consistent performance across a wider variety of data distributions than direct access methods [7]. With two-phase access, all clients access data approximately simultaneously. The file system schedules access so that data storage or retrieval from the I/O devices follow a near-optimal pattern with a reduction in the total number of requests for the entire I/O operation. In a second stage, the ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In The 1993 IPPS Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993.


Jovian: A Framework for Optimizing Parallel I/O - Robert Bennett (1994)   (6 citations)  Self-citation (Parallel)   (Correct)

....use of a varying number of coalescing processes (fixed before program execution) to carry out this aggregation process. We will call this kind of optimization collective local I/O optimization. There are at least two other projects that aim to perform collective local I/O optimizations. Choudhary [2, 6] assumes that each processor has access to either a physical or logical I/O device. They partition parallel I/O into two phases, where processors first read data in a layout that corresponds to the logical disk layout, and then perform in-memory permutations to lay out the data as required. The ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. ACM Computer Architecture News, 21(5):31--38, December 1993.


The Impact of Spatial Layout of Jobs on Parallel I/O.. - Mache, Lo, Livingston..   Self-citation (Parallel)   (Correct)

....caching, and by ineffective use of the I/O network. [Figure 2: Parallel I/O causing network contention. Labels: I/O nodes (ION), compute node (CN), network, contention.] To improve these inefficiencies, a variety of approaches have been proposed, including collective I/O algorithms [7, 33] and disk-directed I/O [17, 39]. In this paper, we focus on the effective use of the I/O network. 2.2 Network Contention. Moving data over the interconnection network, e.g., between I/O nodes and compute nodes, is quite costly. This cost is exacerbated by network contention. Network ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proceedings of the IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, Newport Beach, CA, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


HPF with Parallel I/O Extensions - Bordawekar, Choudhary   Self-citation (Rajesh Alok Parallel)   (Correct)

....size is 16×16, each disk can be associated to maintain scratch files of a 2×2 processor sub-array. • FILEPROC: This directive is also similar to the PROCESSORS directive in HPF except that it specifies the processors which really participate in performing I/O. From our earlier studies [4, 7], we observed that the best performance need not necessarily be obtained when all processors performing computations also perform I/O. Thus, this provides the user the flexibility to specify a set of processors to perform I/O. This directive is optional, and if not specified, the default is the ....

....directive. Thus a set of files mapped to a file template will have a set of arrays (may have different sizes) associated with them. As a result, a compiler can optimize the parallel accesses (e.g., read/write) of distributed arrays from/to the associated files using various strategies (e.g., [7]). 3 Compiler and Runtime Support for Parallel I/O. Information provided by the compiler directives is used to extract parameters about array and file distributions, which in turn are used in the runtime primitives. In the following we briefly discuss the ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. The 1993 IPPS Workshop on Input/Output in Parallel Computer Systems, April 1993.


ENWRICH: A Compute-Processor Write Caching Scheme for.. - Apratim Purakayastha (1996)   (3 citations)  Self-citation (Parallel)   (Correct)

....For example, many exhibit forward-jumping sequential patterns [24, 31], many of which are actually complex strided patterns [26, 27]. Clearly, parallel file systems must be redesigned to fit these common access patterns. Several recent works have proposed changes to the file system interface [3, 7, 8, 10, 12, 18, 19, 26]. One such proposed interface is collective I/O. In a traditional file system interface, processes within a parallel job often have to express the transfer of a large object (e.g., a large matrix) as small, noncontiguous, per-processor requests, thereby losing valuable semantic information that a ....

.... complex access patterns [26]. Two-phase I/O, proposed by Del Rosario et al., is an efficient implementation of a large transfer operation where data is permuted in CP memory before a collective I/O operation so that the I/O operation conforms to the actual file layout for better I/O performance [10]. Disk-directed I/O (discussed in Section 1), proposed by Kotz, efficiently implements collective I/O, out-of-core computations, data-dependent distributions, and both regular and irregular requests [20, 21]. 3 Design and Operation of ENWRICH. Write caches are normally used to delay writes, ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Exploiting Mapped Files for Parallel I/O - Krieger, Reid, Stumm (1995)   (1 citation)  Self-citation (Parallel)   (Correct)

.... Some have developed complete parallel file systems [10, 19, 4, 8]. Others have developed servers or runtime libraries for optimizing I/O performance that run on multiple systems [25, 22, 27, 11, 3, 12]. Some research has concentrated on developing specific techniques to improve I/O performance [15, 7] that could be incorporated into larger systems. Research has also been carried out specifically on developing application interfaces [14, 3, 13] and compiler interfaces [26, 1]. All of these approaches to parallel I/O systems consider interface issues to a varying degree. Most existing parallel ....

.... problems of a simple independent read/write interface, some researchers have turned to much higher-level interfaces where the programmer specifies I/O requests in terms of entire arrays or large portions of arrays, for example, and the underlying system can optimize each type of high-level request [15, 7, 11, 25]. The performance of such array-based systems is impressive, and certainly interfaces tuned for arrays must be supported by any parallel I/O system that seeks to address the requirements of scientific applications. However, not all I/O-intensive parallel applications are array based [5, 29] and ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


ENWRICH: A Compute-Processor Write Caching Scheme for.. - Purakayastha, Ellis.. (1995)   (3 citations)  Self-citation (Parallel)   (Correct)

.... 95]. Files are much larger and file lifetimes much longer than in conventional Unix workloads [PEK 95]. Clearly, parallel file systems must be redesigned to fit these common access patterns. Several recent works have proposed changes to the file system interface [BGST93, CFF 95, CFPB93, dBC93, GGL93, Kot93, Kot94, NK95]. One such proposed interface is collective I/O. In a traditional file system interface, processes within a parallel job often have to express the transfer of a large object (e.g., a large matrix) as small, noncontiguous, per-processor requests, thereby losing valuable ....

.... [NK95]. Two-phase I/O, proposed by Del Rosario et al., is an efficient implementation of a large transfer operation where data is permuted in CP memory before a collective read or after a collective write, so that the I/O operation conforms to the actual file layout for better I/O performance [dBC93]. Disk-directed I/O (DDIO) is an efficient implementation technique for collective I/O [Kot94]. In DDIO, the CPs make a collective I/O request, but the IOP dictates all the data transfer. The IOPs are able to optimize performance by conforming I/O operations to both the logical and physical file ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Integrating Theory and Practice in Parallel File Systems - Cormen, Kotz (1993)   (40 citations)  Self-citation (Parallel)   (Correct)

....[PFDJ89] gives plenty of control to the user. In fact, the operating system treats each disk as a separate file system and does not decluster individual files across disks. Thus, the nCUBE provides the low level access one needs, but no higher level access. The current nCUBE file system [dBC93] supports declustering and does allow applications to manipulate the striping unit size and distribution pattern. The file system for the Kendall Square Research KSR 1 [KSR92] shared memory multiprocessor declusters file data across disk arrays attached to different processors. The memory mapped ....

....primitives to control file declustering, caching, and prefetching. The performance of Intel's CFS when reading or writing a two-dimensional matrix, for example, depends heavily on the layout of the matrix across disks and across memories of the multiprocessor, and also on the order of requests [dBC93, BCR92, Nit92, GP91, GL91]. del Rosario et al. [dBC93] find that the nCUBE exhibits similar inefficiencies: when reading columns from a two-dimensional matrix stored in row-major order, read times increase by factors of 30-50. One solution is to transfer data from disk into memory and then ....

[Article contains additional citation context not shown here]

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993.


Techniques and Optimizations for Developing Irregular.. - Peter Brezany (1996)   (1 citation)  Self-citation (Alok Parallel)   (Correct)

....each with a private memory, requires that data be partitioned into subsets and that these smaller pieces be transferred into or out of the local processor memories. Parallel file systems have been developed but their scalability has not been demonstrated for very large numbers of storage devices [5, 7, 17, 6, 4, 18, 20, 23]. Furthermore, with current languages and compilers, optimal use of scalable parallel file systems requires a substantial amount of tedious program restructuring. The need for high-performance I/O is so significant that almost all the present generation parallel computers such as the Paragon, ....

....in-core array (i.e., partition) is distributed in the BLOCK BLOCK fashion. Like GPM, PIM also performs the I/O in global name space and the computation in local name space. In both models, data read and write can be easily performed using the collective methods such as the two-phase access method [6, 17]. In this paper we only consider the LPM. A qualitative comparison of these models can be found in [2]. 3 Irregular Problems. Parallelizing irregular problems is a challenging problem and is of growing importance. In such problems, access patterns to major data arrays are only known at runtime. ....

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, 1993. Also published in Computer Architecture News 21(5), December 1993, pages 31--38.


Lightweight I/O for Scientific Applications - Oldfield, Maccabe, Arunagiri.. (2006)   (Correct)

No context found.

Juan Miguel del Rosario, Rajesh Bordawekar, and Alok Choudhary. Improved parallel I/O via a two-phase run-time access strategy. In Proceedings of the IPPS '93 Workshop on Input/Output in Parallel Computer Systems, pages 56--70, Newport Beach, CA, 1993.

CiteSeer.IST - Copyright Penn State and NEC