18 citations found.
David Kotz. Applications of parallel I/O. Technical Report PCS-TR96-297, Dept. of Computer Science, Dartmouth College, October 1996. Release 1.


This paper is cited in the following contexts:
RAMA: An easy-to-use, high-performance parallel file system - Miller, Katz (1997)   (2 citations)  (Correct)

....requests such as those required by compilations. A common method of coping with the difficulties in efficiently using parallel file systems is to provide additional primitives to control data placement and manage file reads and writes efficiently. Systems such as PASSION [2] and disk-directed I/O [12] use software libraries to ease the interface between applications and a massively parallel file system. These systems rely on the compiler to orchestrate the movement of data between disk and processors in a parallel application. Parallel I/O libraries can gather small requests into the large ....

....will not fit in memory. In such cases, the application has wrung all possible locality from the data by doing its own caching. Nonetheless, it is likely that RAMA will either cache blocks at the node on whose disk the data is stored or use more complex policies such as those discussed in Refs. [1,12], improving performance in situations where many nodes require access to the same few blocks. 3.3. Data integrity and availability As with most file systems, data integrity and availability are major issues in RAMA. Previous parallel file systems did not address this issue because they were ....

D. Kotz, Disk-directed I/O for MIMD multiprocessors, Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Optimizing Collective I/O Performance on Parallel.. - Chen, Foster..   (Correct)

....to parallel applications. PPFS [7] focuses on efficient caching and prefetching support for parallel applications. MPI-IO [3] provides a portable I/O interface to MPI programs; it also supports collective I/O interfaces. Parallel I/O techniques such as two-phase I/O [2] and disk-directed I/O [9] for collective I/O operations have been adopted in many of these libraries. However, no published studies examine the major performance factors of these systems for a wide range of I/O patterns, problem sizes, and execution environments. Little work has been done on automatic performance ....

David Kotz. Disk-directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994. Revised November 8, 1994.


A Model and Compilation Strategy for Out-of-Core Data.. - Rajesh Bordawekar Alok (1995)   (18 citations)  (Correct)

....file. Otherwise, a two-phase method [CBH 94] can be used where the owner reads the data and sends it to the requesting processor when needed. A final possibility applies if there is some processing capability at the I/O node itself, in which case disk-directed I/O can be used to send the data [Kot94]. The most important point to note here is that data needed by other processors is communicated while the slab is in memory when possible. Thus, this method reduces the number of disk accesses, produces smaller individual messages (although the total communication volume is the same) and is ....

D. Kotz. Disk-Directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Overview Of The MPI-IO Parallel I/O Interface - Corbett, Feitelson, Fineberg, .. (1995)   (16 citations)  (Correct)

....MIMD) parallel application run on many nodes. The application data is distributed among the nodes, and is read/written to a single logical file, itself spread across nodes and disks. The significant optimizations required for efficiency (e.g. grouping [25], two-phase I/O [9], and disk-directed I/O [18]) can only be implemented as part of a parallel I/O environment if it supports a high-level interface to describe the partitioning of file data among processes and a collective interface describing complete transfers of global data structures between process memories and the file. In addition, ....

David Kotz. Disk-directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994. Also in Proceedings of the First Symposium on Operating Systems Design and Implementation, USENIX, pages 61--74, November 1994.


A Model and Compilation Strategy for Out-of-Core Data Parallel.. - Bordawekar (1995)   (18 citations)  (Correct)

....file. Otherwise, a two-phase method [CBH 94] where the owner reads the data and sends it to the requesting processor when needed can be used. A final possibility applies if there is some processing capability at the I/O node itself, in which case disk-directed I/O can be used to send the data [Kot94]. The most important point to note here is that data needed by other processors is communicated while the slab is in memory when possible. Thus, this method reduces the number of disk accesses, produces smaller individual messages (although the total communication volume is the same) and is more ....

D. Kotz. Disk-Directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Runtime Support for In-Core and Out-of-Core Data-Parallel Programs - Thakur (1995)   (5 citations)  (Correct)

....have been developed recently. Techniques for improving I/O performance using collective I/O have also been proposed. Two-phase I/O is a technique for performing collective I/O using a runtime library [37, 12]. Disk-directed I/O is a technique for performing collective I/O at the file system level [69, 70, 71]. 1.7 Organization of this Thesis The rest of this thesis is organized as follows. Chapter 2 gives an overview of some of the issues in providing runtime support for in-core and out-of-core data-parallel programs. Runtime support for array redistribution is discussed in Chapter 3. Chapter 4 ....

D. Kotz. Disk-directed I/O for an Out-of-Core Computation. Technical Report PCS-TR95-251, Dept. of Computer Science, Dartmouth College, January 1995.


PASSION: Parallel And Scalable Software for Input-Output - Choudhary, Bordawekar.. (1994)   (37 citations)  (Correct)

....requests can be aggregated. Using local dataflow analysis, the compiler determines what data can be accessed using a single I/O request. The dataflow information can also be used to place I/O calls so that the overall I/O cost can be reduced. Strategies like two-phase access and disk-directed I/O [Kot94] can be used to optimize I/O from disks. Inter and Intra File Organizations: The final step in the local program optimization involves organization of data across files and within files. Depending on the underlying execution model, the compiler generates local array files. The data is either ....

D. Kotz. Disk-directed I/O for MIMD Multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


HFS: A flexible file system for shared-memory multiprocessors - Krieger (1994)   (17 citations)  (Correct)

....the choice of policies used by the file system can be greatly simplified if the application can identify its expected demands in advance. Most researchers studying file system issues for parallel supercomputers have recognized the need for cooperation between the application and the file system [26, 34, 49, 68, 83, 103]. Our approach differs from others in that HFS gives the application the ability to explicitly customize the implementation of a file to conform to its specific requirements. We believe that this extra level of control is important, given the current lack of understanding of the requirements of ....

David Kotz. Disk-directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Parallel Simulation of Parallel File Systems and I/O Programs - Bagrodia, Docy, Kahn   (4 citations)  (Correct)

....Three approaches to collective I/O are discussed in [Kot97]: traditional caching, two-phase I/O, and disk-directed I/O. Traditional caching does no collective I/O optimizations, since I/O requests are served as they arrive. These three methods were implemented and compared using the STARFISH [Kot96] simulator, which is based on Proteus [BDCW91], a parallel architecture simulation engine. In [BBB94], a hybrid methodology for evaluating the performance of parallel I/O subsystems was done. PIOS, a trace-driven I/O simulator, is used to calculate the performance of the I/O system for a subset ....

David Kotz. Tuning STARFISH. Technical Report PCS-TR96-296, Dept. of Computer Science, Dartmouth College, October 1996.


A Model and Compilation Strategy for Out-of-Core.. - Bordawekar.. (1995)   (18 citations)  (Correct)

....file. Otherwise, a two-phase method [CBH 94] where the owner reads the data and sends it to the requesting processor when needed can be used. A final possibility applies if there is some processing capability at the I/O node itself, in which case disk-directed I/O can be used to send the data [Kot94]. The most important point to note here is that data needed by other processors is communicated while the slab is in memory when possible. Thus, this method reduces the number of disk accesses. The method also produces smaller individual messages (although the total communication volume is the ....

D. Kotz. Disk-directed I/O for MIMD Multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


I/O in Parallel and Distributed Systems - Kotz, Jain (1998)   Self-citation (Kotz Parallel)   (Correct)

....proposed. 2.1 Applications There have been two major application domains where I/O in parallel computer systems has traditionally been found to be a bottleneck. One is scientific computing with massive datasets, such as those found in seismic processing, climate modeling, and so forth [dC94, Kot96a]. The second is databases [DG92, BDLJ85]. The I/O bottleneck continues to be a serious concern for scientific computing, particularly Grand Challenge problems, where it is now commonly recognized as an obstacle [Sha95]. Many scientific applications generate 1 GB of I/O per run [CHKM96, dC94, ....

David Kotz. Applications of parallel I/O. Technical Report PCS-TR96-297, Dept. of Computer Science, Dartmouth College, October 1996. Release 1.


Parallel I/O - Thakur, Gropp   Self-citation (Parallel)   (Correct)

....a few hundred Mbytes/sec. In fact, many applications achieve less than 10 Mbytes/sec [12]. As parallel computers get bigger and faster, scientists are increasingly using them to solve problems that not only need a large amount of computing power but also need to access large amounts of data. See [14, 26, 38] for a list of many such applications. Since I/O is slow, the I/O speed, and not the CPU or communication speed, is often the bottleneck in such applications. For parallel computers to be truly usable for solving real, large-scale problems, the I/O performance must be scalable and balanced with ....

David Kotz. Applications of parallel I/O. Technical Report PCS-TR96-297, Dept. of Computer Science, Dartmouth College, October 1996. Release 1. http://www.cs.dartmouth.edu/reports/abstracts/TR96-297.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1994)   (32 citations)  Self-citation (Kotz Mimd)   (Correct)

....system on a simulated MIMD multiprocessor (see below). We did not implement two-phase I/O because, as we discuss in Section 7.1, disk-directed I/O obtains all the benefits of two-phase I/O, and more. In this section, we describe our simulated implementation; more details can be found in [Kot94]. Files were striped across all disks, block by block. Each IOP served one or more disks, using one I/O bus. Each disk had a thread permanently running on its IOP, that controlled access to the disk. Disk-directed I/O. Each IOP received one request, creating one new thread. The new thread ....

....so we chose a subset that would push the limits of the system by using the contiguous layout, and exhibit most of the variety shown earlier, by using the patterns ra, rn, rb, and rc with 8 KB records. ra throughput was normalized as usual. For more details and other variations, see [Kot94]. We first varied the number of CPs (Figure 5), holding the number of IOPs and disks fixed, and maintaining the cache size for traditional caching at two buffers per disk per CP. Note that disk-directed I/O was unaffected. Multiple localities hurt rb as before, but the most interesting effect was ....

David Kotz. Disk-directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1996)   (32 citations)  Self-citation (Kotz)   (Correct)

....Memputs, or before writing, using Memgets. When the matrix element size was smaller than the maximum message size, we allowed the Memput and Memget requests to be batched into group requests. This decision nearly always led to better performance, although it was up to 5 slower in some cases [Kot96c]. As in a real two-phase I/O implementation, the code is layered above a traditional file system; we use the traditional parallel file system described below. Disk-directed I/O. Each IOP received one request, which was handled by a dedicated thread. The thread computed the list of disk blocks ....

....transfer next. When possible the buffer thread sent concurrent Memget or Memput messages to many CPs. When the matrix element size was smaller than the maximum message size, we allowed the Memput and Memget requests to be batched into group requests. This decision always led to better performance [Kot96c]. 1 We used a fairly naive approach, with good results [Kot96c]. There are more sophisticated techniques [DO96]. Traditional parallel file system. Our code followed the pseudo code of Figure 2a. CPs did not cache or prefetch data, so all requests involved communication with the IOP. The CP ....

[Article contains additional citation context not shown here]

David Kotz. Tuning STARFISH. Technical Report PCS-TR96-296, Dept. of Computer Science, Dartmouth College, October 1996.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1996)   (32 citations)  Self-citation (Kotz)   (Correct)

....as much as 18 times faster than the traditional technique. 1 Introduction Scientific applications like weather forecasting, aircraft simulation, molecular dynamics, remote sensing, seismic exploration, and climate modeling are increasingly being implemented on massively parallel supercomputers [Kot96a]. Each of these applications has intense I/O demands, as well as massive computational requirements. Recent multiprocessors have provided high-performance I/O hardware [Kot96b] in the form of disks or disk arrays attached to I/O processors connected to the multiprocessor's interconnection ....

David Kotz. Applications of parallel I/O. Technical Report PCS-TR96-297, Dept. of Computer Science, Dartmouth College, October 1996.


Disk-directed I/O for MIMD Multiprocessors - Kotz (1994)   (32 citations)  Self-citation (Kotz)   (Correct)

.... not contiguous in the file, but that corresponding sets of slabs for all processors collectively represent a contiguous set of bytes in the file (see Figure 13). In the traditional caching system, each process independently reads and writes the columns of its slab; in the disk-directed 10 See [Kot95] for more details. first slab second slab 16 columns, 4 processors 2 columns per slab per processor Figure 13: Example of column cyclic distribution of 16 columns across four processors. Each processor is represented here by a different shade of gray. SLAB COLS is 2 here, meaning each processor ....

....the application was compute-bound, and the I/O improvements had little effect on execution time. Figure 15 compares disk-directed I/O and traditional caching on 8 KB blocks instead of 4 KB blocks. Figure 15a shows that disk-directed I/O often had much less disk traffic than traditional 11 See [Kot95] for the raw data and more details. caching, despite the larger cache. In fact, the difference arises entirely from an increase in traffic caused by traditional caching due to installation reads needed when writing 4 KB columns to 8 KB blocks. The 16 column slabs were an exception, ....

David Kotz. Disk-directed I/O for an out-of-core computation. Technical Report PCS-TR95-251, Dept. of Computer Science, Dartmouth College, January 1995.


Dynamic File-Access Characteristics of a Production Parallel.. - Kotz (1994)   (44 citations)  Self-citation (Kotz)   (Correct)

....in our workload) effectively increasing the request size, lowering overhead, and perhaps eliminating the need for compute node buffers. Strided requests are available in some file system interfaces [5, 9, 17]. For some applications, collective I/O requests can lead to even better performance [18]. Dependence on Intel CFS. We caution that some of our results may be specific to workloads on Intel CFS file systems, or to NASA Ames's workload (computational fluid dynamics). Although the exact numbers are workload-specific, we believe that the conclusions above are applicable to scientific ....

D. Kotz. Disk-directed I/O for MIMD multiprocessors. Technical Report PCS-TR94-226, Dept. of Computer Science, Dartmouth College, July 1994.


Data Management Techniques To Handle Large Data Arrays In Hdf - Velamparampil (1998)   (1 citation)  (Correct)

No context found.

David Kotz. Applications of parallel I/O. Technical Report PCS-TR96-297, Dept. of Computer Science, Dartmouth College, October 1996. Release 1.

