Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary (1993).
.... precision of the prediction include dynamic application factors, such as data dependent computation, and dynamic system effects, such as the effect of the memory hierarchy (see discussion) we used as the test beds are the KSR 1 [1], which supports the shared memory programming model, and the CM 5 [2], which supports both message passing and data parallel programming models. The problems we used as test seeds are Gauss elimination (GE), all pairs shortest path (APSP), and a large electromagnetic simulation (EM) application [7]. 3.1. Architectural Characteristics 3.1.1. The Shared Memory ....
....subpage (the basic data transfer unit in the KSR 1). A processor waits for an empty slot to transmit a message. A single bit in the header of the slot identifies it as empty or full as the slots rotate through a ring interface of the processor. 3.1.2. The Connection Machine CM 5. The CM 5 [2] is the newest member of the Thinking Machines Connection Machine family. It is a distributed memory multiprocessor system which can be scaled up to 16K processors and supports both SIMD and MIMD programming models. Each CM 5 node consists of a SPARC processor operating at either 32 or 40 MHz, 32 ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. 1993.
.... for low latency communication by adapting the number of memory elements associated with each processing element (optimal PE granularity), configuring the physical I/O resource to match the application's needs (local memory hierarchy, global network), and by adding special hardware structures [19, 60], such as fast barrier or broadcast support for machine subsets or the entire machine, to optimize performance. For example, experience over the last ten years demonstrates that intraprocessor communication mechanisms (data shared through the cache) are much more efficient than even the best ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. 245 First Street, Cambridge, MA 02154-1264, October 1991.
....language C . In the SPMD model, each processing node executes a portion of the same program, but local memory and machine state can vary across the processors. The SPMD model efficiently simulates the data parallel SIMD model normally associated with massively parallel programming. References [40] and [34] provide an overview for the CM 5, and both [43] and [45] contain detailed descriptions of the data parallel platform. Note that a CM 5 machine with vector units has four vector units per node, and the analysis given here will remain the same. See Figure 2 for the general organization of ....
Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, January 1992.
....2. Background To place virtual memory in context, we need a clear definition of the physical machine, the programming model, the execution model and the operating system. A MIMD machine, with distributed physical memory and constant delay interconnection network is assumed in this study [17]. The I/O subsystem is of primary importance to virtual memory, and we assume a subsystem similar to the Cray T3D: a few I/O gateways that serve all processors [11]. The processors may logically view one big common file or many independent smaller files. Although several parallel programming ....
Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.
....requirements or the availability of a new allocation mechanism or algorithm. Several examples of allocation tools will be discussed in this Section and the possibility of incorporation into the Mini Grid architecture will be examined. 2.2.1 Connection Machine The Connection Machine, or CM5 [4] and [13], was first released by Thinking Machines in October, 1991. It tried to combine the positive aspects of both the MIMD and SIMD machines. The CM5 supports the full data parallel model by providing high performance for branching and synchronization alike [4]. The CM 5 operating system, ....
....The Connection Machine, or CM5 [4] and [13], was first released by Thinking Machines in October, 1991. It tried to combine the positive aspects of both the MIMD and SIMD machines. The CM5 supports the full data parallel model by providing high performance for branching and synchronization alike [4]. The CM 5 operating system, CMOST, is an enhanced version of the UNIX operating system. It supports most of the standards in UNIX and uses the network standards to communicate to all of its processors through three separate network connections. The basic architecture of the CM 5 can be seen in ....
Thinking Machines Corporation. The Connection Machine: CM-5 Technical Summary. Technical report, Thinking Machines Corporation, Cambridge, Massachusetts, October 1991.
....application) under low contention because shared memory offers low overhead data access. Our implementations run on the MIT Alewife multiprocessor [2]. The message passing implementation was ported from a message passing implementation that runs on the Thinking Machines CM 5 family of multicomputers [25]. The original CM 5 implementation, written by Kirk Johnson, won first place in an Internet newsgroup contest [14], the goal of which was to solve the triangle puzzle in the shortest time. Alewife efficiently supports both message passing and cache coherent shared memory programming models in ....
Thinking Machines Corporation. Connection Machine CM-5 Technical Summary. Nov. 1993.
....methodology, called benchmapping, are demonstrated in Chapters 4 and 5 using two benchmapping systems called PERFSIM and BENCHCVL. PERFSIM is a profiler for data parallel Fortran programs. It runs on a workstation and produces the profile of the execution of a program on the Connection Machine CM 5 [110] quicker than the profile can be produced by running the program on a CM 5. BENCHCVL predicts the running time of data parallel programs written in the NESL language [17] on several computer systems. Applications of benchmapping, including program profiling and tuning, making acquisition and ....
....plain slow. PERFSIM is a benchmapping system that accelerates the profiling process by estimating the running time of most of the expensive operations in a program, while refraining from actually performing them. PERFSIM analyzes CM Fortran [109] programs running on the Connection Machine CM 5 [110]. By combining execution of the control structure and scalar operations in a program with analysis of vector operations, PERFSIM can execute a program on a workstation in seconds and generate performance data that would take several minutes or more to generate by running the program on an ....
Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, January 1992.
....or via a direct, dedicated link, as is sometimes done for a parallel file system. Among the possible network choices, HIPPI is currently the most popular one, and most manufacturers of distributed memory parallel systems either provide or have announced a HIPPI connection (e.g., CM 2 [20], CM 5 [7], iPSC 860 [12], NCube2 [14], iWarp [4], Paragon XP S [11], Maspar [2, 15]). As far as an application on the parallel system is concerned, the exact characteristics of the external links do not matter, and the I/O node provides an appropriate abstraction. We can think of the I/O nodes as ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. Thinking Machines Corporation, 1991.
....some of these systems. Many recent systems and proposals advocate provisions for direct user level access to message protocols. The messaging interface is typically either memory mapped or register based. The Connection Machine CM 5 provides access to the network through a memory mapped interface [21]. Register based approaches provide tighter coupling by moving the network interface into the processor and providing direct access to the interface through special registers [5, 9]. One of the problems with the above systems is that they are typically optimized for short messages, thus limiting ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.
....using a custom low latency network. These parallel platforms have been called massively parallel processors (MPPs). Since relatively little extra hardware except the custom network is required, this approach has enjoyed popularity in older machines such as Intel iPSC860 [Int90] and TMC CM 5 [Thi91]. It is still present today in commercial systems such as IBM SP 2. At a high level of detail, there is not much difference between MPPs and a collection of workstations with the exception of the custom network. Recently however, the network technology has caught up with the other components of ....
....by the network hardware did not allow the efficient implementation of active messages with large data blocks. Therefore, the cost of breaking a bulk data transfer into small active messages was too high to realize bandwidth comparable to the one achieved by the default CM 5 messaging library [Thi91]. In contrast, on the COW, the use of the channel interface has been deprecated mainly because the active message layer can support messages up to four Kbytes, which are big enough to offer the raw hardware throughput. While the channel interface is still supported in Blizzard, it is being ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.
....to get good MIMD performance, extracting SLP should not detract from existing MIMD parallel performance. 2.4 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Illiac IV [11] and later with the Thinking Machines CM 1 and CM 2 [25, 26] and the Maspar MP 1 [4, 6]. The association of the term SIMD with this type of computer is what led us to use Superword Level Parallelism when discussing short SIMD operations. SIMD supercomputers were implemented using thousands of small processors that worked synchronously on a single ....
Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-200 Technical Summary, June 1991.
....to get good MIMD performance, extracting SLP should not detract from existing MIMD parallel performance. 2.4 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Illiac IV [11] and later with the Thinking Machines CM 1 and CM 2 [25, 26] and the Maspar MP 1 [4, 6]. The association of the term SIMD with this type of computer is what led us to use Superword Level Parallelism when discussing short SIMD operations. SIMD supercomputers were implemented using thousands of small processors that worked synchronously on a single ....
Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-2 Technical Summary, April 1987.
.... For example, given an object whose purpose is to calculate a surface map from an array of point values, it could be annotated with the attributes fast, Fortran and CM5 to indicate that it was a high performance solution, written in Fortran [40, 126] that can only be run on a Connection Machine CM5 [175]. This form of resolution is more complex than a name matching scheme, as the resolution system has to determine which attributes should be matched, which should be taken as mandatory, and the ordering of attributes with regard to preferences specified by the client. Furthermore, binding of ....
Thinking Machines Corporation. The Connection Machine CM5 Technical Summary, 1991.
....with the uniform case, many aspects are easily extended to the most general, non uniform case as well. Figure 1: 5 point Uniform Grid Computation. In order to perform such 5 point computations over a discretized domain on a distributed memory parallel computer (like the Connection Machine CM5 [Thi91] or a network of high performance workstations), the computational load should be balanced across processors in a way that minimizes interprocessor communication. This communication will occur at the common boundaries of the regions that each processor will occupy. It is therefore necessary to ....
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, October 1991.
....CM 2 [Hil85, Thi91a] and MasPar's MP 2 [Mas91]. Examples of MIMD machines are Thinking Machines' CM 5 [Thi91b, L 92] and Cray Research's T3D [Oed93, Cra93]. From this difference in instruction fetching, several other hardware differences follow as corollaries: Fine Grain vs Coarse Grain: In an SIMD machine, the back end processors do not need to fetch and decode instructions. Therefore, they do not .... (Footnote 7: This figure and others in this section correspond roughly to the CRAY T3D. Do not take these figures too precisely; we only wish to convey a feel for their relative order of magnitude.)
Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.
....are required to get good MIMD performance, extracting a small amount of SLP would not detract from existing MIMD parallel performance. 2.2.3 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Thinking Machines CM1 and CM 2 [28, 29] and Maspar MP 1 [6, 8]. The association of the term SIMD with these types of computers is what led us to utilize the term Superword Level Parallelism when discussing short SIMD parallelism. These supercomputers were implemented using thousands of small processors which worked synchronously on ....
Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-200 Technical Summary, June 1991.
....of parallelism than the vector parallelism associated with traditional vector supercomputers. We denote this parallelism Superword Level Parallelism since parallelism comes in the form of superwords containing packed data. Note that SLP also differs from traditional large scale SIMD parallelism [6, 8, 28]. SIMD supercomputers require large amounts of parallelism in order to achieve speedups, whereas SLP can be profitable when such parallelism is scarce. In some sense, superword level parallelism is actually a restricted type of ILP. ILP techniques have been very successful in the general purpose ....
....are required to get good MIMD performance, extracting a small amount of SLP would not detract from existing MIMD parallel performance. 2.2.3 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Thinking Machines CM1 and CM 2 [28, 29] and Maspar MP 1 [6, 8]. The association of the term SIMD with these types of computers is what led us to utilize the term Superword Level Parallelism when discussing short SIMD parallelism. These supercomputers were implemented using thousands of small processors which worked synchronously on ....
Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-2 Technical Summary, April 1987.
No context found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary (1993).
No context found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.
No context found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.
No context found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.
No context found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.
No context found.
Thinking Machines Corporation. Connection Machine CM-5 Technical Summary, 1992.
No context found.
Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.
No context found.
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.