91 citations found.
Thinking Machines Corporation. CM Fortran Reference Manual. Cambridge, MA, Dec. 1992.


This paper is cited in the following contexts:


High Performance Fortran Support For The Paradigm Compiler - Hodges (1995)   (3 citations)

....Fortran language, rather than by using directives in comments. Other features of the language include the ability to specify very complex data mappings through the use of mapping arrays and the ability to explicitly map loops onto particular sets of processors. 3.1.4 CM Fortran. CM Fortran (CMF) [17] is a superset of Fortran 77 that includes the array operations and intrinsics from Fortran 90, alignment and data layout directives, and a single-statement FORALL. A set of simple alignment and layout directives allows the programmer to explicitly map data to processors. The single statement ....

Thinking Machines Corporation, Cambridge, MA, CM Fortran Reference Manual, version 2.1, 1994.


Quantitative Performance Modeling of Scientific Computations and.. - Toledo (1995)   (2 citations)

....also consists of sequential control and vector operations, but scalars are treated as vectors of length 1 and not as distinct data structures. The running time of many data parallel programs is dominated by the running time of vector operations. Many data parallel languages, including CM Fortran [109] and NESL, do not exploit control flow parallelism and parallelism in scalar operations. Since most programmers want their data parallel programs to run well on parallel computers with a large number of processors, they write programs in which most of the work is expressed in vector operations ....

....computation are of no interest. Moreover, it is just plain slow. PERFSIM is a benchmapping system that accelerates the profiling process by estimating the running time of most of the expensive operations in a program, while refraining from actually performing them. PERFSIM analyzes CM Fortran [109] programs running on the Connection Machine CM-5 [110]. By combining execution of the control structure and scalar operations in a program with analysis of vector operations, PERFSIM can execute a program on a workstation and in seconds and generate performance data that would take several minutes ....

[Article contains additional citation context not shown here]

Thinking Machines Corporation, Cambridge, MA. CM Fortran Reference Manual, December 1992.


Compilation Techniques for Optimizing Communication on.. - Gong, Gupta, Melhem (1993)   (15 citations)

.... this section is developed for the following language: S S; S IIf Ethen Selse S I L L Do i: l,h,m; i: ID [f(i, i, ID2 [g(i, i) This language is motivated by those data parallel programming languages in which array references are crucial for achieving parallelism [16], [8], [13]. For example, Fortran 90 array expressions can be expressed using the above language. Therefore, although the above language is simple, it expresses an important class of programs involving data level parallelism. The data flow algorithm computes four synthesized attributes ref(S) def(S) ....

Thinking Machines Corporation, "CM Fortran Reference Manual," 1991.


Fortran D Language Specification - Fox, Hiranandani, Kennedy, Koelbel.. (1991)   (61 citations)

....applications that are not supported by existing parallel languages. The extensions proposed in Fortran D are compatible with both Fortran 77 and Fortran 90, a version of Fortran with explicit manipulation of high level array structures. Fortran 90D can be viewed as a refinement of CM Fortran [TMC89] consistent with a parallel Fortran 77. We believe that Fortran D is powerful enough to express most fine grain parallel computations, but also simple enough that a sophisticated compiler can produce efficient programs for different parallel architectures. In particular, Fortran D is well suited ....

....semantics of the FORALL loop to be quite close to sequential Fortran. In particular, FORALL loops are deterministic. As a result we believe that it will be easy to understand and use for scientific programmers. The FORALL loop possesses similar semantics to the CM Fortran FORALL statement [TMC89, ALS90] and the Myrias PARDO loop. In fact, the FORALL statement in CM Fortran may be used as a special form of the FORALL loop, one that has only one statement in the loop body. 5.4 Reductions. A reduction is an operation on a collection of data that results in new data of lesser ....

[Article contains additional citation context not shown here]

Thinking Machines Corporation, Cambridge, MA. CM Fortran Reference Manual, version 5.2-0.6 edition, September 1989.


A Comparison Of Parallelization Techniques For A Finite.. - Kumaran, Miller (1995)   (1 citation)

....to the user community. 1. CM-5: CM-5 is a MIMD machine with processing nodes connected together using a fat tree. Each processing node has four vector units and a scalar unit. The vector units work in SIMD mode. We have used CMMD [TMC, 1993a], an explicit message passing language, and CM Fortran [TMC, 1993b] to implement the algorithms. 2. CM-200: CM-200 is a SIMD machine with the interconnection network forming a hypercube. Each node of the hypercube has a VLSI chip with 16 bit-serial processing elements and routing hardware. Each pair of nodes shares a floating point chip. The machine we used ....

Thinking Machines Corporation. 1993. CM Fortran Reference Manual. Cambridge, MA.


A Partitioning-Independent Paradigm for Nested Data.. - Engelhardt, Wendelborn (1995)   (5 citations)

....data parallel paradigm may be implemented by distributed memory models of computation, that account for data parallelism's promise of explicitly parallel high performance scalable computation. Several data parallel language implementations now exist for modern distributed memory multiprocessors [20, 22, 4], and numerous high level languages have embraced the data parallel principle (e.g., Fortran 90 [14], High Performance Fortran [12]). However, it has been noted [4, 3] that almost all discussion of data parallelism to date has limited itself to the simplest and least expressive form: flat (or ....

....contiguous blocks of indices, while the latter partitions each inner aggregate across a different (disjoint) set of 4 processing elements. For purposes of comparison, equivalent unstructured data parallel programs were constructed both in terms of the abstract machine and the release CMFortran [20] compiler. These programs were run across aggregates of equivalent size (number of data items) to their nested counterparts. Columns 4 and 5 of the Table record their performance. Total Execution Time (in seconds) factor nested nested flat CMF (s) block) best) block) flat) 1 0.006762 ....

Thinking Machines Corporation, Cambridge, Massachusetts. CM Fortran Reference Manual (version 2.1 Beta), December 1992.


Simulation of Ground Vibration on a Massively.. - Berglund And Erlingsson (1982)

....subroutine, pshift. The CMSSL subroutine pshift takes advantage of the hypercube net and its bandwidth [T1]. On CMs with 64 bit floating point units (FPU) and version 1.0 of the CM Fortran compiler a slicewise code organization should be used, which can use the FPU registers efficiently [T2], [T1]. To optimize a slicewise code one shall try to break up the code in communication and elemental computation blocks. This can be achieved by saving the shifted fields in temporaries and doing the computations after the communication. For example, the five point stencil operator u n 1 (i; ....
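The split into a communication block and an elemental computation block described in the excerpt above can be sketched in Python as follows. This is only an illustration under stated assumptions: the excerpt is truncated before the stencil is given, so the equal-weight average, the periodic boundaries, and the helper names `cshift_rows`, `cshift_cols`, and `five_point_average` are all hypothetical and are not taken from the cited paper or from CM Fortran.

```python
def cshift_rows(u, s):
    """Circular shift along the first axis: result[i][j] = u[(i + s) % n][j]."""
    n = len(u)
    return [u[(i + s) % n][:] for i in range(n)]


def cshift_cols(u, s):
    """Circular shift along the second axis: result[i][j] = u[i][(j + s) % m]."""
    m = len(u[0])
    return [[row[(j + s) % m] for j in range(m)] for row in u]


def five_point_average(u):
    """Five-point stencil with assumed equal weights and periodic boundaries."""
    # Communication block: fetch all four shifted neighbor fields into temporaries.
    north = cshift_rows(u, -1)
    south = cshift_rows(u, 1)
    west = cshift_cols(u, -1)
    east = cshift_cols(u, 1)
    # Elemental computation block: purely local arithmetic on the temporaries.
    n, m = len(u), len(u[0])
    return [
        [(north[i][j] + south[i][j] + west[i][j] + east[i][j] + u[i][j]) / 5.0
         for j in range(m)]
        for i in range(n)
    ]


# A single spike in the middle of a 3x3 grid spreads to its four neighbors.
grid = [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
result = five_point_average(grid)
```

Keeping every shift ahead of the arithmetic mirrors the ordering the excerpt recommends; whether this helps on a given machine depends on the compiler and hardware.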

Thinking Machines Corporation. CM Fortran Reference Manual. Version 1.0 and 1.1.


Protocol Compilation: High-Performance Communication for Parallel .. - Felten (1993)   (25 citations)

....of many data parallel languages. Early data parallel languages include Dino [Rosing et al. 90, Rosing 91], Kali [Koelbel et al. 90], and C* [Rose Steele 87]. The current generation of data parallel languages is more mature, and offers a richer set of directives to control data distribution [TMC 89, TMC 90, Fox et al. 91, Chapman et al. 92]. A consortium of researchers and vendors is now developing a standardized language called High Performance Fortran (HPF) with some data parallel features [HPFF 93]. HPF will be supported by all the major parallel system vendors, and will provide a common ....

....be built on top of these data movement systems. 8.1.2 Compiling to Message Passing Code. Many researchers have studied how to compile languages into message passing code. These languages include Dino [Rosing 91], Dataparallel C [Hatcher Quinn 91], C* [Rose Steele 87, TMC 90], CM Fortran [TMC 89], Fortran D [Fox et al. 91, Tseng 93], Vienna Fortran [Chapman et al. 92], NESL [Blelloch et al. 93], and many others. While several such compilers generate efficient message passing code, they all treat send and receive as indivisible primitives. The code generated by these systems could still ....

Thinking Machines Corporation, Cambridge, MA. CM Fortran Reference Manual, Version 5.2, 1989.


Language Constructs for Modular Parallel Programs - Foster (1993)

....a blocked distribution, each processor is allocated a contiguous block of array elements [18]. Elements of a distributed array are accessed in the same manner as ordinary array elements. More complex distributions and data structures, such as those supported in data parallel programming languages [30, 18], can be integrated in the same manner, but are not supported in the current PCN compiler. A keyword port is used to declare these distributed arrays. For example, the following procedure declares a distributed array P of single assignment variables with one element on each node of the current ....

....have been used to achieve portability by hiding information concerning the size and topology of a physical computer. Martin [26], Hudak [23], and Taylor [29] have investigated notations for specifying process mapping on a (potentially infinite) processing surface. In data parallel languages [30, 18], data distribution is specified with respect to a virtual computer, as proposed here; however, hierarchies cannot be defined. While these systems succeed in decoupling mapping or data distribution from other aspects of a parallel algorithm, they do not permit resource allocation decisions to be ....

Thinking Machines Corporation, CM Fortran Reference Manual, Thinking Machines, Cambridge, Mass., 1989.


Toward Automatic Partitioning of Arrays on Distributed Memory.. - Feautrier (1993)   (10 citations)

....between virtual processors which are implemented on the same real processor is simply a copy operation, proper choice of the folding function will offer some opportunities for further reduction of the traffic. This two tier mapping system is reminiscent of the Connection Machine software [CMF89], or of the templates in HPF [Lov92] the main difference being that templates or so called geometries are multidimensional objects. I will return to that point later. In this paper, I will be mainly concerned with the determination of the virtual mapping. Some indications on the choice of the ....

Thinking Machines Corp., Cambridge, MA. CM Fortran Reference Manual, Version 5.2, 1989.


POLYSHIFT Communications Software for the Connection.. - George, Brickner.. (1993)   (2 citations)

....hardware features which are used for the implementation of PSHIFT. Section 3 describes the software architecture of the PSHIFT routine, and Section 4 discusses its implementation in some more detail. Section 5 describes the interface of the PSHIFT library routine to Connection Machine Fortran [18, 19], a subset of Fortran 90 [14] with extensions. Calling sequences and supported functionality are reported. Section 6 presents some performance measurements and a performance model. We conclude with a section summarizing our experience from developing and using the PSHIFT library routine, and ....

Thinking Machines Corp. CM Fortran Reference Manual, Version 2.1, 1993.


A Vector Space Framework for Parallel Stable Permutations - Shalaby, Johnsson (1995)

....address space is often preferable for computations modeling phenomena in multidimensional physical spaces. In such cases, a block mapping often reduces communication by preserving locality of reference [11, 17]. Therefore, language directives have been introduced for data mapping [10, 41, 43], allowing for the mapping of multidimensional index spaces through block or cyclic allocation, or some combination thereof. Randomized data allocation [32, 33] can be very efficient for some computations and is used on the TERA computer [1] and was an option for some software library functions ....

....array location before the permutation. Then, Π(a) ∈ A^r points to the array location whose data will be in location a after the permutation, that is, location a gets the content of location Π(a). This get-oriented interpretation is consistent with Fortran 90 [31], HPF [10], and CM Fortran [41]. Figure 5 exemplifies this interpretation by illustrating an arithmetic add 1 (or CSHIFT(1)) permutation Π on A^2, such that {0, 1, 2, 3} → {1, 2, 3, 0} under Π. The boxes and their tags stand for the memory locations and their numeral tags respectively, with the binary numbers inside ....
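The get-oriented reading quoted above (location a receives the content of location Π(a)) can be illustrated with a minimal Python sketch. The function name `apply_get_permutation` and the sample data are assumptions made for illustration and do not come from the cited paper; the permutation sending 0, 1, 2, 3 to 1, 2, 3, 0 is the one given in the excerpt.

```python
def apply_get_permutation(data, pi):
    """Return a new list in which location a holds the old content of location pi[a]."""
    return [data[pi[a]] for a in range(len(data))]


# The "add 1" permutation from the excerpt: location a reads from (a + 1) mod 4.
pi = [(a + 1) % 4 for a in range(4)]
before = ["w", "x", "y", "z"]
after = apply_get_permutation(before, pi)
print(after)  # ['x', 'y', 'z', 'w']
```

This is the same result a circular left shift by one position produces, which is how Fortran 90 defines CSHIFT with a shift of 1.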

[Article contains additional citation context not shown here]

Thinking Machines Corp. CM Fortran Reference Manual, Version 2.1, 1993.


Compiler Support for Machine-Independent Parallelization of.. - von Hanxleden (1994)   (7 citations)

....any constructs that are specific to either architecture. F77 MIMD: A version of Fortran 77 designed to run on a MIMD machine, which assumes a separate name space for each processor. F90 SIMD: A version of Fortran 90 designed to run on a SIMD machine, similar to Connection Machine Fortran [Thi91] or MasPar Fortran [Mas91]. There are two important differences relative to the F77 variants: • By default, scalars of F77 will be replicated in F90 SIMD; i.e., they will be declared as vectors of size P, where processor p owns the p-th element. • In keeping with Fortran 90 convention, ....

Thinking Machines Corporation, Cambridge, MA. CM Fortran Reference Manual, 1991.


Toward Automatic Distribution - Feautrier (1994)   (46 citations)

....between virtual processors which are implemented on the same real processor is simply a copy operation, proper choice of the folding function will offer some opportunities for further reduction of the traffic. This two tier mapping system is reminiscent of the Connection Machine software [CMF89] or of the template in HPF [Lov92] the main difference being that templates or so called geometries are multidimensional objects. I will return to that point later. In this paper, I will be mainly concerned with the determination of the virtual mapping. Some indications on the choice of the ....

Thinking Machines Corp., Cambridge, MA. CM Fortran Reference Manual, Version 5.2, 1989.


High Performance, Scalable Scientific Software Libraries - Johnsson, Mathur (1994)   (1 citation)

....Table 1.1 Peak local and global performance per node and efficiencies achieved for a few different types of computations on the CM-5. 64 bit precision. 9. Functionality supporting traditional numerical methods used in scientific and engineering computation. Connection Machine Fortran, CMF [Thi93a], and C* [Thi92] are languages with an array syntax designed and implemented by Thinking Machines Corp. for the programming of the Connection Machine systems [Thi91a, Thi91b]. CMF was modeled after Fortran 90 while this language still was in a proposal stage. Many of the new features in High ....

....the memory associated with a functional unit. Axis weights are used to control the length of the segment of an axis assigned to a node. Weights do not affect the total number of elements assigned to a node. It is also possible to directly specify the shape of the subarray assigned to a node [Thi93a]. 1.6 Data allocation. When there are more data elements than processing nodes, several data elements are allocated to the same node, for an even data distribution. An even data distribution is a simple rule aimed at achieving load balance without sophisticated data dependence analysis. With ....

[Article contains additional citation context not shown here]

Thinking Machines Corp. CM Fortran Reference Manual, Version 2.1, 1993.


Performance Measurement Of Interpreted, Just-In-Time Compiled.. - Newhall (1999)

....correlate performance measures, like synchronization times, with source code fragments or data parallel arrays; the execution of application binary code can be mapped to the application developer's view of the program. Another example is the NV performance tool model [33] for measuring CM Fortran [66] programs. NV is designed to map performance data collected for the execution of low level code to its high level view in terms of parallel data structures and statements in data parallel Fortran programs. NV is not integrated with the Fortran compiler. Instead, it uses static information from the ....

Thinking Machines Corporation, Cambridge Massachusetts. CM Fortran Reference Manual. January 1991.


A Parallel Reduced Hessian SQP Method for Shape Optimization - Ghattas, Orozco   (4 citations)

....elapsed time reflects other processes running on the FE suggests that the true efficiency is higher than computed. Second, it is relatively straightforward to speed up the movement of mesh information from the FE to processors using CM Fortran Utility Library subroutines [37]. Third, we can do away with most of the FE to processor data movement, and the need to remesh m times per iteration, by introducing analytical derivatives. Finally, we present results of a scalability test to assess the behavior as m and n remain fixed and the number of processors is increased. ....

Thinking Machines Corporation, CM Fortran Reference Manual, Version 1.1, 1991.


Interprocedural Compilation of Fortran D - Hall, Hiranandani, Kennedy, Tseng (1996)   (3 citations)

....polyhedra instead of RSDs for greater flexibility [3, 4]. Few other distributed memory compilation systems have discussed interprocedural issues, especially interprocedural optimization. The CM Fortran compiler utilizes user defined interface blocks to specify a data partition for each procedure [36]. Array parameters are then copied to buffers of the expected form at run time if needed, eliminating the need for interprocedural analysis. C* [34] and Dataparallel C [21] specify parallelism through the use of parallel functions. Arguments to procedures in Id Nouveau [33] and Kali [29] are ....

Thinking Machines Corporation, Cambridge, MA. CM Fortran Reference Manual, version 1.0 edition, February 1991.


Comparing CM Fortran and Message-Passing Fortran.. - Ravikanth Ganesan Kannan

....many processing units [5]. Only one instruction can execute at a time and every processor executes the same instruction. Advantages of a SIMD machine include its simple architecture, which makes the machine potentially inexpensive, and its synchronous control structure, which makes programming easy [1, 9]. The MIMD architecture is based on the duplication of control units, a separate unit for each individual processor. Different processors can execute different instructions at the same time [7]. It is more flexible for different problem structures and can be applied to general applications. This ....

....within its own partition. Each partition is supervised by a partition manager who serves requests from programs for access outside the respective partition. The CM-5 supports both SIMD and MIMD programming modes. 2.1.1 The SIMD Programming Mode. The SIMD programs were written in CM Fortran [9]. CM Fortran, henceforth referred to as CMF, allows the user to map array elements or array sections onto different virtual processors by using appropriate compiler directives like cmf$ layout and cmf$ align. These directives allow the user to control the layout of arrays as :NEWS, :SERIAL, or :SEND ....

[Article contains additional citation context not shown here]

Thinking Machines Corporation, "CM Fortran Reference Manual," Version 5.2-0.6, Sep. 1989.


High Performance Fortran Migration (HPF - Aehpf Via Cm-Fortran

....of the HPF, so unsurprisingly many of the features of CMF are included. CM FORTRAN was developed from the FORTRAN 90 draft standard, with the addition of some array operations (such as forall) from earlier versions that were removed in the final draft. More details of its evolution can be found in [4]. A Connection Machine specific utility library [5] along with a scientific library [6] are also provided. 1.3 Migration Strategy. Figure 1 shows how various FORTRAN variants overlap. Table 1 at the end of this document summarises some of the important features ....

Thinking Machines Corporation. CM FORTRAN Reference Manual. Cambridge, Massachusetts, July 1991. Version 1.1.


The Performance Realities Of Massively Parallel.. - Lubeck, Simmons.. (1992)   (6 citations)

....Therefore, much of the time in the code is spent simply updating particle positions in the geometry. Not all particle transport problems have this characteristic. The cross section data for neutrons are accessed using the CM Fortran utility routines that store the data on each floating point node [19]. Neutrons in the simulation are gathered into batches and processed when the batch size becomes large. The CM Fortran code was developed primarily by Eldon Linnebur of the LANL Computational Physics Group and was translated for the Cray by us. PUEBLO is a three dimensional, finite-difference, ....

Thinking Machines Corporation, "CM Fortran Reference Manual," Version 1.0, 1991.


Communication Efficient Multi-processor FFT - Johnsson, Jacquemin, Krawitz (1991)   (8 citations)  Self-citation (Corp)

....array among 8 processors are illustrated in Figure 5. We consider the impact of these forms of data allocation on the data motion requirements for the FFT. For multidimensional arrays each axis is often encoded separately, as for instance is the case in the Connection Machine programming systems [21]. Still, there is an issue of how [Figure residue: panels "Consecutive data allocation" and "Cyclic data allocation", elements 0 to 31 over processors P0 to P7] ....
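The two allocation schemes named in the excerpt above, consecutive and cyclic, can be sketched in Python as follows. This is a minimal illustration, not the cited paper's formulation: the function names `consecutive_owner` and `cyclic_owner` are hypothetical, and the sizes (32 elements over 8 processors) are taken from the figure residue in the excerpt.

```python
def consecutive_owner(i, n, p):
    """Processor owning element i when n elements are split into p contiguous blocks."""
    block = -(-n // p)  # ceiling division gives the block length
    return i // block


def cyclic_owner(i, p):
    """Processor owning element i when elements are dealt out round-robin."""
    return i % p


n, p = 32, 8
consecutive = [[i for i in range(n) if consecutive_owner(i, n, p) == q] for q in range(p)]
cyclic = [[i for i in range(n) if cyclic_owner(i, p) == q] for q in range(p)]

print(consecutive[0])  # [0, 1, 2, 3]
print(cyclic[0])       # [0, 8, 16, 24]
```

Consecutive allocation keeps neighboring indices on the same processor, while cyclic allocation spreads them across all processors.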

Thinking Machines Corp. CM Fortran Reference Manual, Version 2.1, 1993.


Implementation of a Portable Nested Data-Parallel.. - Blelloch, Hardwick.. (1993)   (97 citations)

No context found.

Thinking Machines Corporation. CM Fortran Reference Manual. Cambridge, MA, Dec. 1992.


High Performance Fortran Language Specification - Fortran (1992)   (756 citations)

No context found.

Thinking Machines Corporation, Cambridge, Massachusetts. CM Fortran Reference Manual, July 1991.


High Performance Fortran Language Specification - High Performance Fortran (1992)   (756 citations)

No context found.

Thinking Machines Corporation, Cambridge, Massachusetts. CM Fortran Reference Manual, July 1991.



CiteSeer.IST - Copyright Penn State and NEC