178 citations found.
Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary (1993).

This paper is cited in the following contexts:

Unknown - Thomasian And Bay   (Correct)

.... precision of the prediction include dynamic application factors, such as data dependent computation, and dynamic system effects, such as the effect of the memory hierarchy (see discussion) we used as the test beds are the KSR 1 [1] which supports shared memory programming model, and the CM 5 [2], which supports both message passing and data parallel programming models. The problems we used as test seeds are Gauss elimination (GE) all pairs shortest path (APSP) and a large electromagnetic simulation (EM) application [7] 3.1. Architectural Characteristics 3.1.1. The Shared Memory ....

....subpage (the basic data transfer unit in the KSR 1) A processor waits for an empty slot to transmit a message. A single bit in the header of the slot identifies it as empty or full as the slots rotate through a ring interface of the processor. 3.1.2. The Connection Machine CM 5. The CM 5 [2] is the newest member of the Thinking Machines Connection Machine family. It is a distributed memory multiprocessor system which can be scaled up to 16K processors and supports both SIMD and MIMD programming models. Each CM 5 node consists of a SPARC processor operating at either 32 or 40 MHz, 32 ....

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. 1993.


MORPH: A System Architecture for Robust High Performance Using .. - Chien, Gupta (1996)   (5 citations)  (Correct)

.... for low latency communication by adapting the number of memory elements associated with each processing element (optimal PE granularity) configuring the physical I O resource to match the applications needs (local memory hierarchy, global network) and by adding special hardware structures [19, 60] such as fast barrier or broadcast support for machine subsets or the entire machine, to optimize performance. For example, experience over the last ten years demonstrates that intraprocessor communication mechanisms (data shared through the cache) are much more efficient than even the best ....

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. 245 First Street, Cambridge, MA 02154-1264, October 1991.


Scalable Data Parallel Algorithms for Texture.. - Bader.. (1993)   (5 citations)  (Correct)

....language C . In the SPMD model, each processing node executes a portion of the same program, but local memory and machine state can vary across the processors. The SPMD model efficiently simulates the data parallel SIMD model normally associated with massively parallel programming. References [40] and [34] provide an overview for the CM 5, and both [43] and [45] contain detailed descriptions of the data parallel platform. Note that a CM 5 machine with vector units has four vector units per node, and the analysis given here will remain the same. See Figure 2 for the general organization of ....

Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, January 1992. 34


A Virtual Memory Model for Parallel Supercomputers - Reis, Scherson (1996)   (Correct)

....2. Background To place virtual memory in context, we need a clear definition of the physical machine, the programming model, the execution model and the operating system. A MIMD machine, with distributed physical memory and constant delay interconnection network is assumed in this study [17]. The I O subsystem is of primary importance to virtual memory, and we assume a subsystem similar to the Cray T3D: a few I O gateways that serve all processors [11] The processors may logically view one big common file or many independent smaller files. Although several parallel programming ....

Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.


Allocation and Scheduling for a Computational Grid - Lewis (2001)   (Correct)

....requirements or the availability of a new allocation mechanism or algorithm. Several examples of allocation tools will be discussed in this Section and the possibility of incorporation into the Mini Grid architecture will be examined. 2.2. 1 Connection Machine The Connection Machine, or CM5 [4] and [13] was first released by Thinking Machines in October, 1991. It tried to combine the positive aspects of both the MIMD and SIMD machines. The CM5 supports the full data parallel model by providing high performance for branching and synchronization alike [4] The CM 5 operating system, ....

....The Connection Machine, or CM5 [4] and [13] was first released by Thinking Machines in October, 1991. It tried to combine the positive aspects of both the MIMD and SIMD machines. The CM5 supports the full data parallel model by providing high performance for branching and synchronization alike [4]. The CM 5 operating system, CMOST, is an enhanced version of the UNIX operating system. It supports most of the standards in UNIX and uses the network standards to communicate to all of its processors through three separate network connections. The basic architecture of the CM 5 can be seen in ....

Thinking Machines Corporation. The Connection Machine: CM-5 Technical Summary. Technical report, Thinking Machines Corporation, Cambridge, Massachusetts, October 1991.


A Case Study of Shared Memory and Message Passing: The Triangle.. - Lew   (Correct)

....application) under low contention because shared memory offers low overhead data access. Our implementations run on the MIT Alewife multiprocessor [2] The message passing implementation was ported from a message passing implementation that runs on Thinking Machines CM 5 family of multicomputers [25]. The original CM 5 implementation written by Kirk Johnson won first place in an Internet newsgroup contest [14] the goal of which was to solve the triangle puzzle in the shortest time. Alewife efficiently supports both message passing and cache coherent shared memory programming models in ....

Thinking Machines Corporation. Connection Machine CM-5 Technical Summary. Nov. 1993.


Quantitative Performance Modeling of Scientific Computations and.. - Toledo (1995)   (2 citations)  (Correct)

....methodology, called benchmapping, are demonstrated in Chapters 4 and 5 using two benchmapping systems called PERFSIM and BENCHCVL.PERFSIM is a profiler for data parallel Fortran programs. It runs on a workstation and produces the profile of the execution of a program on the Connection Machine CM 5 [110] quicker than the profile can be produced by running the program on a CM 5. BENCHCVL predicts the running time of data parallel programs written in the NESL language [17] on several computer systems. Applications of benchmapping, including program profiling and tuning, making acquisition and ....

....plain slow. PERFSIM is a benchmapping system that accelerates the profiling process by estimating the running time of most of the expensive operations in a program, while refraining from actually performing them. PERFSIM analyzes CM Fortran [109] programs running on the Connection Machine CM 5 [110]. By combining execution of the control structure and scalar operations in a program with analysis of vector operations, PERFSIM can execute a program on a workstation and in seconds and generate performance data that would take several minutes or more to generate by running the program on an ....

[Article contains additional citation context not shown here]

Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, January 1992.


Architecture Implications of High-Speed I/O for.. - Gross, Steenkiste (1994)   (Correct)

....or via a direct, dedicated link, as is sometimes done for a parallel file system. Among the possible network choices, HIPPI is currently the most popular one, and most manufacturers of distributed memory parallel systems either provide or have announced a HIPPI connection (e. g, CM 2 [20] CM 5 [7], iSC 860 [12] NCube2 [14] iWarp [4] Paragon XP S [11] Maspar [2, 15] As far as an application on the parallel system is concerned, the exact characteristics of the external links do not matter, and the I O node provides an appropriate abstraction. We can think of the I O nodes as ....

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary. Thinking Machines Corporation, 1991.


Integration of Message Passing and Shared Memory.. - Heinlein.. (1994)   (40 citations)  (Correct)

....some of these systems. Many recent systems and proposals advocate provisions for direct user level access to message protocols. The messaging interface is typically either memory mapped or register based. The Connection Machine CM 5 provides access to the network through a memory mapped interface [21]. Register based approaches provide tighter coupling by moving the network interface into the processor and providing direct access to the interface through special registers [5, 9] One of the problems with the above systems is that they are typically optimized for short messages, thus limiting ....

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.


Fine-Grain Distributed Shared Memory on Clusters of Workstations - Schoinas (1997)   (3 citations)  (Correct)

....using a custom low latency network. These parallel platforms have been called massively parallel processors (MPPs) Since relatively little extra hardware except the custom network is required, this approach has enjoyed popularity in older machines such as Intel iPSC860 [Int90] and TMC CM 5 [Thi91] It is still present today in commercial systems such as IBM SP 2. At a high level of detail, there is not much difference between MPPs and a collection of workstations with the exception of the custom network. Recently however, the network technology has caught up with the other components of ....

....by the network hardware did not allow the efficient implementation of active messages with large data blocks. Therefore, the cost of breaking a bulk data transfer into small active messages was too high to realize bandwidth comparable to the one achieved by the default CM 5 messaging library [Thi91] In contrast, on the COW, the use of the channel interface has been depreciated mainly because the active message layer can support messages up to four Kbytes, which are big enough to offer the raw hardware throughput. While the channel interface is still supported in Blizzard, it is being ....

[Article contains additional citation context not shown here]

Thinking Machines Corporation. The connection machine CM-5 technical summary, 1991.


Exploiting Superword Level Parallelism with Multimedia.. - Larsen (2000)   (20 citations)  (Correct)

....to get good MIMD performance, extracting SLP should not detract from existing MIMD parallel performance. 2. 4 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Illiac IV [11] and later with the Thinking Machines CM 1 and CM 2 [25, 26] and the Maspar MP 1 [4, 6] The association of the term SIMD with this type of computer is what led us to use Superword Level Parallelism when discussing short SIMD operations. SIMD supercomputers were implemented using thousands of small processors that worked synchronously on a single ....

Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-200 Technical Summary, June 1991.


Exploiting Superword Level Parallelism with Multimedia.. - Larsen (2000)   (20 citations)  (Correct)

....to get good MIMD performance, extracting SLP should not detract from existing MIMD parallel performance. 2. 4 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Illiac IV [11] and later with the Thinking Machines CM 1 and CM 2 [25, 26] and the Maspar MP 1 [4, 6] The association of the term SIMD with this type of computer is what led us to use Superword Level Parallelism when discussing short SIMD operations. SIMD supercomputers were implemented using thousands of small processors that worked synchronously on a single ....

Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-2 Technical Summary, April 1987.


The Provision Of Relocation Transparency Through A Formalised.. - Falkner (2000)   (1 citation)  (Correct)

.... For example, given an object whose purpose is to calculate a surface map from an array of point values, it could be annotated with the attributes fast, Fortran and CM5 to indicate that it was a high performance solution, written in Fortran [40, 126] that can only be run on a Connection Machine CM5 [175]. This form of resolution is more complex than a name matching scheme, as the resolution system has to determine which attributes should be matched, which should be taken as mandatory, and the ordering of attributes with regard to preferences specified by the client. Furthermore, binding of ....

Thinking Machines Corporation. The Connection Machine CM5 Technical Summary, 1991.


Distributed Genetic Algorithms for Partitioning Uniform Grids - Christou (1996)   (3 citations)  (Correct)

....with the uniform case, many aspects are easily extended to the most general, non uniform case as well. [Figure 1: 5 point Uniform Grid Computation] In order to perform such 5 point computations over a discretized domain on a distributed memory parallel computer (like the Connection Machine CM5 [Thi91] or a network of high performance workstations) the computational load should be balanced across processors in a way that minimizes interprocessor communication. This communication will occur at the common boundaries of the regions that each processor will occupy. It is therefore necessary to ....

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, October 1991.


A Framework for Parallel Job Scheduling - Subramanian (1995)   (Correct)

....CM 2 [Hil85, Thi91a] and MasPar s MP 2 [Mas91] Examples of MIMD machines are Thinking Machines CM 5 [Thi91b, L 92] and Cray Research s T3D [Oed93, Cra93] From this difference in instruction fetching, several other hardware differences follow as corollaries: Fine Grain vs Coarse Grain: In an SIMD machine, the back end processors do not need to fetch and decode instructions. Therefore, they do not .... [Footnote 7: This figure and others in this section correspond roughly to the CRAY T3D. Do not take these figures too precisely; we only wish to convey a feel for their relative order of magnitude.]

Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.


Exploiting Superword Level Parallelism with Multimedia.. - Larsen, Amarasinghe (2000)   (20 citations)  Self-citation (Ma)   (Correct)

....are required to get good MIMD performance, extracting a small amount of SLP would not detract from existing MIMD parallel performance. 2.2. 3 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Thinking Machines CM1 and CM 2 [28, 29] and Maspar MP 1 [6, 8] The association of the term SIMD with these types of computers is what led us to utilize the term Superword Level Parallelism when discussing short SIMD parallelism. These supercomputers were implemented using thousands of small processors which worked synchronously on ....

Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-200 Technical Summary, June 1991.


Exploiting Superword Level Parallelism with Multimedia.. - Larsen, Amarasinghe (2000)   (20 citations)  Self-citation (Ma)   (Correct)

....of parallelism than the vector parallelism associated with traditional vector supercomputers. We denote this parallelism Superword Level Parallelism since parallelism comes in the form of superwords containing packed data. Note that SLP also differs from traditional large scale SIMD parallelism [6, 8, 28]. SIMD supercomputers require large amounts of parallelism in order to achieve speedups, whereas SLP can be profitable when such parallelism is scarce. In some sense, superword level parallelism is actually a restricted type of ILP. ILP techniques have been very successful in the general purpose ....

....are required to get good MIMD performance, extracting a small amount of SLP would not detract from existing MIMD parallel performance. 2.2. 3 SIMD Parallelism SIMD parallelism came into prominence with the advent of massively parallel supercomputers such as the Thinking Machines CM1 and CM 2 [28, 29] and Maspar MP 1 [6, 8] The association of the term SIMD with these types of computers is what led us to utilize the term Superword Level Parallelism when discussing short SIMD parallelism. These supercomputers were implemented using thousands of small processors which worked synchronously on ....

Thinking Machines Corporation, Cambridge, MA. Connection Machine CM-2 Technical Summary, April 1987.


An Efficient Data Parallel Algorithm for 2-D Convolutions - Sandra Dykes Xiaodong   (Correct)

No context found.

Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary (1993).


Comparative Evaluation and Case Studies of Shared-Memory.. - Data-Parallel Execution..   (Correct)

No context found.

Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.


Evaluation and Measurement of Multiprocessor Latency Patterns - Xiaodong Zhang Yong   (Correct)

No context found.

Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.


Comparative Evaluation and Case Studies of Shared-Memory.. - Data-Parallel Execution..   (Correct)

No context found.

Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.


Performance Predictions on Implicit Communication Systems - Xiaodong Zhang Zhichen   (Correct)

No context found.

Thinking Machines Corporation, The Connection Machine CM-5 Technical Summary, 1993.


The Ganglia Distributed Monitoring System: Design.. - Massie, Chun, Culler (2003)   (12 citations)  (Correct)

No context found.

Thinking Machines Corporation. Connection machine cm-5 technical summary, 1992.


Unresponsiveness-Tolerant Collective Communication - Pakin (2001)   (Correct)

No context found.

Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Summary, October 1991.


Data Locality Optimization of Shared Memory Programs on NUMA.. - Tao   (Correct)

No context found.

Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.

CiteSeer.IST - Copyright Penn State and NEC