| Charles E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Symposium on Parallel Algorithms and Architectures (April 1992). |
....in a broad taxonomy (for instance, as distributed memory or shared memory) experience has shown that each platform is unique, with its own artifacts, constraints, and enhancements. For example, the Thinking Machines CM 5, a distributed memory computer, is interconnected by a fat tree data network [47], but includes a separate network that can be used for fast barrier synchronization. The SGI Origin [46] provides a global address space to its shared memory; however, its non uniform memory access requires the programmer to handle data placement for efficient performance. Distributed memory cluster ....
C. E. Leiserson, Z. S. Abuhamdeh, D. C. Douglas, C. R. Feynman, M. N. Ganmukhi, J. V. Hill, W. D. Hillis, B. C. Kuszmaul, M. A. St. Pierre, D. S. Wells, M. C. Wong-Chan, S.-W. Yang, and R. Zak. The network architecture of the Connection Machine CM-5. J. Parallel & Distributed Comput., 33(2):145--158, 199.
.... , a single link with bandwidth the level of the hierarchy above this one, and a single link with bandwidth to the level of the hierarchy below this one (possibly the local subnetworks) Routing in the SOENet is performed with a randomized algorithm similar to the one used in the CM 5[17, 16]. Unlike the CM 5, which has only long links in its network, our algorithm utilizes extra bandwidth on the short links to load balance the long links. 3.2 Scaling of SOENet Figure 2 illustrates how a SOENet torus network scales from a single node to maximum capacity. For clarity the figure shows ....
C. E. Leiserson et al. The network architecture of the Connection Machine CM-5. Journal of Parallel and Distributed Computing, 33(2):145--158, 1996.
.... , a single link with bandwidth the level of the hierarchy above this one, and a single link with bandwidth to the level of the hierarchy below this one (possibly the local subnetworks) Routing in the SOENet is performed with a randomized algorithm similar to the one used in the CM 5[17, 16]. Unlike the CM 5, which has only long links in its network, our algorithm utilizes extra bandwidth on the short links to load balance the long links. 3.2 Scaling of SOENet Figure 2 illustrates how a SOENet torus network scales from a single node to maximum capacity. For clarity the figure shows ....
C. Leiserson et al. The Network Architecture of the Connection Machine CM-5. Proc. 1992.
....between processors has a start up overhead of v, while the data transfer rate is 1 p. For our complexity analysis we assume that v and p are constant, independent of the link congestion and distance between two nodes. With new techniques such as wormhole routing and randomized routing [15, 14, 8, 17], the distance between communicating processors seems to be less of a determining factor on the amount of time needed to complete the communication. Further, the effect of link contention (due to several messages traversing common links along their routes) is limited due to presence of virtual ....
C. Leiserson et al. The Network Architecture of the Connection Machine CM-5, Proc. 4th Annual ACM Symposium on Parallel Algorithms and Architectures, San Diego, CA, 1992.
....to either the cache or primary memory. The former scheme used, for example, in the CM 5 is optimized for low latency short messages, while the latter used, for example, in the Paragon is optimized for high bandwidth long messages. Table 5. 1 confirms that claim using data obtained from [17] and [35] The latter reference cites the performance of a Paragon running SUNMOS PUMA, an operating system and software architecture geared towards more efficient messaging than is provided with the software bundled with the Paragon. Clock frequencies listed are for the processor clock. Clock ....
C. Leiserson et al. The network architecture of the Connection Machine CM-5. In Proceedings of the Symposium on Parallel Algorithms and Architectures, 1992. Available from ftp://cmns.think.com/doc/Papers/net.ps.Z.
....1: The organization of a CM 2 Sprint node nodes are connected via a fat tree communications network. The CM 5 is an MIMD machine which can run in Single Program Multiple Data (SPMD) mode to simulate SIMD operation. An in depth look at the network architecture of this machine is described in [29]. The nodes operate in parallel and are interconnected by a fat tree data network. The fat tree resembles a quad tree, with each processing node (PN) as a leaf and data routers at all internal connections. In addition, the bandwidth of the fat tree increases as you move up the levels of the tree ....
....A regular grid shift is an example of a block permutation pattern of data movement as shown in Subsection 2.2.2. For the corresponding regular block communications, the CM 5 can achieve bandwidths of 15 megabytes/sec per processor to put messages into and take messages out of the data network [29]. The runtime system (RTS) of the CM 5 will choose a data section for a message of between 1 and 5 words in length. If we assume that the data section of a message, $\beta$, is four 4-byte words, and the header and trailer are an additional 4 bytes, then $\gamma = \frac{16 + 4}{15 \times 2^{20}} \approx 1.27\,\mu\text{s}$ per packet. To support ....
[Article contains additional citation context not shown here]
C. E. Leiserson, Z. S. Abuhamdeh, D. C. Douglas, C. R. Feynman, M. N. Ganmukhi, J. V. Hill, W. D. Hillis, B. C. Kuszmaul, M. A. St. Pierre, D. S. Wells, M. C. Wong, S. W. Yang, and R. Zak. The Network Architecture of the Connection Machine CM-5. (Extended Abstract), July 28, 1992.
....leaves are used to map processes, but distances between them are computed by considering the whole tree. This graph is used Figure 8: The tree leaf graph of height 3. Processors are drawn in black and routers in grey. to represent multi stage machines with constant bandwidth, such as the CM5 [33] for which experiments have shown that bandwidth is constant between every pair of processors and hardly depends on network congestion [35] or the SP 2 with power of two number of nodes. The two additional parameters cluster and weight serve to model heterogeneous architectures for which ....
C. Leiserson, Z. Abuhamdeh, D. Douglas, C. Feynman, M. Ganmukhi, J. Hill, W. Hillis, B. Kuszmaul, M. Pierre, D. Wells, M. Wong, S. Yang, and R. Zak. The network architecture of the Connection Machine CM-5. Technical report, Thinking Machines, July 1992.
....number of nodes and should provide full bisection bandwidth across any arbitrary bisection of the parallel machine. As a tradeoff the performance of a single link in such a network could be a secondary concern. The best example of such a network is the fat tree used in the Thinking Machines CM 5 [11]. Networks for parallel computers should be designed around a sophisticated tradeoff of technology factors (i.e. best possible pin counts, clock speeds) and the links should be as fast as possible, allowing only simple networks like tori or hierarchical rings. Representatives of this line of ....
Charles E. Leiserson, Zahi S. Abuhamdeh, David C. Douglas, Carl R. Feynman, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Bradley C. Kuszmaul, Margaret A. St. Pierre, David S. Wells, Monica C. Wong, Shaw-Wen Yang, and Robert Zak. The Network Architecture of the Connection Machine CM-5. Journal of Parallel and Distributed Computing, 33(2):145--58, March 1996.
....number of nodes and should provide full bisection bandwidth across any arbitrary bisection of the parallel machine. As a tradeoff the performance of a single link in such a network could be a secondary concern. The best example of such a network is the fat tree used in the Thinking Machines CM5 [11]. Networks for parallel computers should be designed around a sophisticated tradeoff of technology factors (i.e. best possible pin counts, clock speeds) and the links should be as fast as possible, allowing only simple networks like tori or hierarchical rings. Representatives of this line of ....
Charles E. Leiserson, Zahi S. Abuhamdeh, David C. Douglas, Carl R. Feynman, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Bradley C. Kuszmaul, Margaret A. St. Pierre, David S. Wells, Monica C. Wong, Shaw-Wen Yang, and Robert Zak. The Network Architecture of the Connection Machine CM-5. Journal of Parallel and Distributed Computing, 33(2):145--58, March 1996.
....in a broad taxonomy (for instance, as distributed memory or shared memory) experience has shown that each platform is unique, with its own artifacts, constraints, and enhancements. For example, the Thinking Machines CM5, a distributed memory computer, is interconnected by a fat tree data network [48], but includes a separate network that can be used for fast barrier synchronization. The SGI Origin [47] provides a global address space to its shared memory; however, its non uniform memory access requires the programmer to handle data placement for efficient performance. Distributed memory ....
C. E. Leiserson, Z. S. Abuhamdeh, D. C. Douglas, C. R. Feynman, M. N. Ganmukhi, J. V. Hill, W. D. Hillis, B. C. Kuszmaul, M. A. St. Pierre, D. S. Wells, M. C. Wong-Chan, S.-W. Yang, and R. Zak. The network architecture of the Connection Machine CM-5. J. Parallel & Distributed Comput., 33(2):145--158, 199.
....a gate level implementation of a fat tree network. 1 Introduction Fat tree networks are established as area universal communication networks due to the seminal work of Charles E. Leiserson [8, 3] culminating in the implementation of the Connection Machine CM 5 at Thinking Machines Corporation [9]. Today, advances in semiconductor technology enable us to integrate multiprocessor machines on a single chip, as explored in the Raw project [12] for example. As the number of processors on a chip increases, employing one or more fat tree networks as interconnection medium is an attractive ....
Charles E. Leiserson, Zahi S. Abuhamdeh, David C. Douglas, Carl R. Feynman, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Bradley C. Kuszmaul, Margaret A. St. Pierre, David S. Wells, Monica C. Wong-Chan, Shaw-Wen Yang, and Robert Zak. The Network Architecture of the Connection Machine CM-5. Journal of Parallel and Distributed Computing, 33(2):145-158, March 1996.
....such that there are many more request packets. As a result, collisions with request packets are more likely than collisions with acknowledgment packets. 5. Related Work Similarly segregated network architectures can also be found in parallel machines such as the Cray T3D [10] and the CM 5 [13]. These machines contain a low-latency network mainly used for synchronization operations in addition to a general purpose network. Compared with Clint, the physical span of these networks is, however, much more limited. The use of multiple networks with different characteristics is examined in ....
C. Leiserson et al.: The Network Architecture of the Connection Machine CM-5. Proc. 1992.
....or the availability of a new allocation mechanism or algorithm. Several examples of allocation tools will be discussed in this Section and the possibility of incorporation into the Mini Grid architecture will be examined. 2.2.1 Connection Machine The Connection Machine, or CM5 [4] and [13], was first released by Thinking Machines in October, 1991. It tried to combine the positive aspects of both the MIMD and SIMD machines. The CM5 supports the full data parallel model by providing high performance for branching and synchronization alike [4]. The CM 5 operating system, CMOST, is an ....
Charles E. Leiserson. The Network Architecture of the Connection Machine CM-5. pages 1-18, October 1992.
....one process, messages can arrive for a process that is swapped out; swapping out a parallel process requires no global coordination, though it may impact performance. Such freedom dramatically reduces the parallel job swapping time needed by traditional MPPs. Traditional MPPs such as the CM 5 [5] support only The smallest allocation unit is two transmit and two receive queues. An inverse table could be associated with each receive queue so that only messages from a specified set of sources will be accepted. Such additional hardware would protect against untrusted OS or sP code at a ....
....sP bus address space Primary Opcode encoding. 2.2.1 sBIU User Space The sBIU User Space uses a 3b wide Secondary Opcode, in sPAddr[4:6] and has the encoding as shown in Table 2. One can think of sPAddr[4] as distinguishing between SRAM Space (cacheable) and the other spaces (uncacheable) sPAddr[5] differentiates QPtr Space from the ShTx and ShRx spaces, and sPAddr[6] differentiates between ShTx and ShRx Spaces. Width (bits) Bit field Content Description 3 [4:6] 0XX SRAM Space. 10X QPtr Space. 110 ShTx Space. 111 ShRx Space. Table 2: sBIU sP bus User Space Secondary Opcode encoding. SRAM ....
[Article contains additional citation context not shown here]
C. E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Proceedings of the 1992.
....the problems that ServerNet solves and remaining issues. I Introduction Ever since the first development of multicomputers a decade ago, interconnection networks have developed dramatically in such areas as connection topologies, switching strategies, routing algorithms, flow control, etc [18, 10, 29, 20, 7, 9]. The main focus of network architects has been on efficient network resource utilization to improve throughput and average latency. Recently, we have observed explosive growth of many new applications requiring continuous data communications on computer networks, for instance, video ....
C. Leiserson et al. The network architecture of the Connection Machine CM-5. In Proceedings of the Symposium on Parallel Algorithms and Architectures, 1992. Available from ftp://cmns.think.com/doc/Papers/net.ps.Z.
....elegant recursive construction developed by Valerio et al. [16] tries to maximize the number of processors when the degree of the internal switches and the diameter of the network are physically constrained. Fat trees have been adopted by several parallel computers as the Connection Machine CM 5 [10], the Data Diffusion Machine [15] and the Meiko CS 2 [14]. Unfortunately, not much is known on the communication performance of the fat trees. Most of the literature deals with the CM 5 and focuses on raw network performance [7], [12], [13]. Typical communication patterns include simple sends and ....
C. E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, June 1992.
....to another memory bank attached to the same vector unit chip (two memory banks are attached to each vector unit chip) • Data transfer from one memory bank to another attached to another vector unit chip on the same node. • Data transfer from one node to another using the CM 5's data network [73]. • Data transfer from one node to another using the CM 5's control network [73]. The control network supports specialized communication patterns such as reductions and broadcasts. The topology of the data network is a 4-ary tree, in which links at different levels of the tree have different ....
....attached to each vector unit chip) • Data transfer from one memory bank to another attached to another vector unit chip on the same node. • Data transfer from one node to another using the CM 5's data network [73]. • Data transfer from one node to another using the CM 5's control network [73]. The control network supports specialized communication patterns such as reductions and broadcasts. The topology of the data network is a 4-ary tree, in which links at different levels of the tree have different capacities. In particular, the capacity of links doubles at every level as we ....
Charles E. Leiserson et al. The network architecture of the Connection Machine CM5. In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, 1992.
No context found.
Charles E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Symposium on Parallel Algorithms and Architectures (April 1992).
No context found.
C. E. Leiserson, Z. Abuhamdeh, D. Douglas, C. Feynman, M. Ganmukhi, J. Hill, D. Hillis, B. Kuszmaul, M. St. Pierre, D. Wells, M. Wong, S. Yang, and R. Zak. The network architecture of the connection machine CM-5. In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, 1992.
No context found.
C. E. Leiserson, Z. Abuhamdeh, D. Douglas, C. Feynman, M. Ganmukhi, J. Hill, D. Hillis, B. Kuszmaul, M. St. Pierre, D. Wells, M. Wong, S. Yang, and R. Zak. The network architecture of the connection machine CM-5. In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, 1992.
No context found.
C. E. Leiserson et al. The network architecture of the Connection Machine CM-5. Parallel and Distributed Computing, 33(2):145--158, March 1996.
No context found.
Charles E. Leiserson, Zahi S. Abuhamdeh, David C. Douglas, Carl R. Feynman, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Bradley C. Kuszmaul, Margaret A. St. Pierre, David S. Wells, Monica C. Wong, Shaw-Wen Yang, and Robert Zak. The network architecture of the Connection Machine CM-5. In SPAA '92, pages 272--285. ACM Press, 1992.
No context found.
Charles E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, June 1992.
No context found.
Charles E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. In Proceedings of the Fourth ACM Symposium on Parallel Algorithms and Architectures, pages 272--285, June 1992.
No context found.
C. Leiserson, Z. Abuhamdeh, D. Douglas, C. Feynman, M. Ganmukhi, J. Hill, W. Hillis, B. Kuszmaul, M. St. Pierre, D. Wells, M. Wong, S. Yang, and R. Zak. The network architecture of the Connection Machine CM-5. In the proceedings of SPAA92, 1992.