Thinking Machines Corporation. CM-5 I/O System Programming Guide, September 1993.
....or the VPs' local view. The global view, implemented by Thinking Machines for their C I/O interface using CMFS_write_file(), CMFS_read_file(), and later fwrite and fread, transfers all VPs' values for a variable, regardless of the VPs' context (based on a where statement), to or from a single stream [12]. These operations can be performed in parallel at high speed, but they provide little flexibility and abandon the local VP view that makes programming easy in the first place. If we take the local view into account, VPs may or may not read or write during an operation; each may transfer ....
....developers can use a set of software libraries to perform various specialized operations. In this section, we describe the special libraries used in our benchmarks: the CM Fortran utility library and the CMMD communication library. 2.2.1 CM Fortran Utility Library. The CM Fortran utility library [33, 32] supports parallel input/output for data-parallel programs. It provides users with the capability to read and write arrays that are distributed across the CM-5 node memories. Because the array data distribution depends on the size of the CM-5 partition used, the library supports three file ....
....the CMMD input/output benchmarks, and the multiple, concurrent input/output benchmarks. Below, we first discuss the New Mexico Order intermediate form that is important for understanding input/output performance. Then, we describe each set of benchmarks in detail. [Figure 8.1: Array Data Distribution (Four Nodes)] [Figure 8.2: Array Data Distribution After Data Reordering (New Mexico Order)] 8.1 New ....
Thinking Machines Corporation. CM-5 I/O System Programming Guide, Version 7.2, September 1993.
....space. For larger machines, the SDA system of disk I/O scales up to 3.2 terabytes, which requires 384 I/O nodes [14]. An example of a system with two I/O nodes and a single parity drive is given in Figure 5. Data being written to (or read from) the SDA are striped across the data disks of the array [13]. A single stripe of data on the CM-5 consists of 16 bytes of data written onto each disk. A data block is the smallest unit on which the SDA can perform an I/O operation; it is 512 bytes per disk, or equivalently, 32 stripes. The portion of the fat tree which contains the parent nodes of ....
Thinking Machines Corporation, Cambridge, MA. CM-5 I/O System Programming Guide, Version 7.2, September 1993.
....without knowing implementation details. 2 Related Work. The first C file system was built by Thinking Machines. Its interface, similar to that for VP-oriented CM Fortran on both the CM-200 and CM-5, includes limited functionality in which all VPs transfer data to or from a single stream [24, 23]. Virtual processor streams for C were proposed by Hatcher [9], whose results with a general implementation are reported in [1]. Moore et al. [18] point out the shortcomings of the single-stream approach to VP files and suggest the use of high-performance modes for parallel streams. Here we extend ....
Thinking Machines Corporation. CM-5 I/O System Programming Guide, September 1993.
....of bytes. Some systems, including Vesta [3], PIOUS [19], and others [12], support multifiles, in which a parallel file is broken into multiple subfiles or segments, typically one per physical processor. The CM-200 supports parallel files, in which each physical processor accesses its own subfile [23]. The notion of parallel VP streams is a large-scale generalization of this idea, which can simplify I/O programming significantly. Unfortunately, one cannot simply implement VP streams as segments or subfiles on top of these systems. PIOUS would require the opening of many thousands of files, and ....
Thinking Machines Corporation. Connection Machine I/O System Programming Guide, October 1991.
....details, ensure high-performance modes are used. 2 Related Work. The first C file system was built by Thinking Machines. Its interface, similar to that for VP-oriented CM Fortran on both the CM-200 and CM-5, includes limited functionality in which all VPs transfer data to or from a single stream [25, 26]. Virtual processor streams for C were proposed by Hatcher [9], whose results with a general implementation are reported in [1]. Moore et al. [20] point out the shortcomings of the single-stream approach to VP files and suggest the use of high-performance modes for parallel streams. Here we extend ....
Thinking Machines Corporation. CM-5 I/O System Programming Guide, September 1993.
....of bytes. Some systems, including Vesta [3], PIOUS [21], and others [13], support multifiles, in which a parallel file is broken into multiple subfiles or segments, typically one per physical processor. The CM-200 supports parallel files, in which each physical processor accesses its own subfile [25]. The notion of parallel VP streams is a large-scale generalization of this idea, which can simplify I/O programming significantly. Unfortunately, one cannot simply implement VP streams as segments or subfiles on top of these systems. PIOUS would require the opening of many thousands of files, and ....
Thinking Machines Corporation. Connection Machine I/O System Programming Guide, October 1991.
....disk arrays attached to different processors. The memory-mapped interface uses virtual memory techniques to page data to and from the file, which does not provide sufficient control to an application trying to optimize disk I/O. Reads and writes in the Thinking Machines Corporation's DataVault [TMC91] are controlled directly by the user. Writes must be fully striped, however, thus limiting some algorithms. Neither the file system for the newer Scalable Disk Array [TMC92, LIN 93] nor the file system for the MasPar MP-1 and MP-2 [Mas91, Mas92] supports independent I/O as we have defined it. ....
Thinking Machines Corporation, Cambridge, Massachusetts. Connection Machine I/O System Programming Guide, October 1991.
....58 globally interfaced with the message-passing, matrix-filling algorithm. Therefore, in our implementation, the matrix fill and the matrix factorization and solution (factor solve) are two distinct program units. A high-speed device, such as the scalable data array (SDA) or DataVault (see [73]), is used to link these two stages. The message-passing MoM matrix-filling program fills the matrix and writes it to a file in the format required for the factor solve stage. The matrix is subsequently read in by the data-parallel matrix solver stage. There is very little performance penalty for ....
Thinking Machines Corporation, CM-5 I/O System Programming Guide, Version 7.2, September 1993.
....provide sufficient control to an application trying to optimize disk I/O. To bypass parity within a RAID 3, one can deliberately fail the RAID parity disk, but then parity is lost for all files, not just large temporary files. Reads and writes in the Thinking Machines Corporation's DataVault [TMC91] are controlled directly by the user. Writes must be fully striped, however, thus limiting some algorithms. Neither the file system for the newer Scalable Disk Array [TMC92, LIN 93, BGST93] nor the file system for the MasPar MP-1 and MP-2 [Mas92] supports independent I/O as we have defined ....
Thinking Machines Corporation, Cambridge, Massachusetts. Connection Machine I/O System Programming Guide, October 1991.
....appear to the coder as an operation identical (though obviously requiring an identifying pathname) to opening a file on a normal disk attached to the control processor. This is in contrast, for example, with the CM-2 I/O system, which had a unique space of commands for working on the DataVault [8]. 5 I/O Modes. We satisfy many of the stated goals by one major addition to Unix I/O: an I/O mode is now associated with each and every file descriptor. There are four available I/O modes: Local, Synchronous Sequential, Synchronous Broadcast, and Independent. Note that the latter three modes are ....
Thinking Machines Corporation. Connection Machine I/O System Programming Guide. 1991.