11 citations found.
H. H. Hum and G. R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Proc. of Compcon Spring'93, 1993.

This paper is cited in the following contexts:
Analysis of Communications and Overhead Reduction in.. - Roh, Najjar (1995)

....model. There currently exists a wide array of multithreading processor architecture models reflecting a large set of design parameters. These design parameters include: • A blocking [Smi81, ACC 90, DFK 92, WG89] or nonblocking thread execution model [NPA92, SYH 89, SYH 91, HG93, HTG94, Vas94]. Note that in a nonblocking execution the synchronization points are statically determined by the compilers, while in a blocking execution the hardware must support dynamic synchronization. • The use of hardware contexts with .... [Footnote: This work is supported by NSF Grant MIP 9113268.]

H. H. Hum and G. R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Proc. of Compcon Spring'93, 1993.


Code Generations, Evaluations, and Optimizations in Multithreaded.. - Roh (1995)

....the request and its reply represent separate transactions. The non-blocking model requires a simpler processor architecture, but the thread size is necessarily smaller than in the blocking model; examples include Monsoon [PC90], *T [NPA92], EM-4 [SYH 89a, SYH 91], TAM [CSS 91], EARTH [HG93, HTG94], S-TAM [Vas94], and Pebbles. A machine that supports the blocking thread model obviously can support the non-blocking threads, albeit not as efficiently as another machine which is specifically designed to run non-blocking threads. Since the non-blocking threads are smaller and also more ....

....more threads to mask the memory access latency. During periods of limited parallelism and single-thread executions, performance may not fare well. Overall, Tera claims a peak performance of 256 Gigaflops on a 256-node system. 2.1.10 McGill-Concordia EARTH. A primary feature of the EARTH model [HG93, HTG94] (formerly known as MTA) being developed at McGill and Concordia is the use of an off-the-shelf microprocessor as an execution processor. Each node contains a microprocessor and an external synchronization unit that performs the synchronizations and interfaces to the network. Unlike *T, ....

H. H. Hum and G. R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Proc. of Compcon Spring'93, 1993.


The Threaded Communication Library: Preliminary Experiences on .. - Elmasri, al. (1994)   (1 citation)

....and those from Matrix Multiply for small matrices point to the importance of caching incoming data such that the AP does not encounter avoidable cache misses. This can be accomplished by having a common cache in which globally shared data are stored when the CP receives them from the network [HG93]. 5 Related Work. In the area of architectural support for communication/synchronization, many commercial products and academic projects have intelligent devices, either processors or specialized hardware, to handle the non-computational activities in a node of a parallel machine. In Paragon ....

Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multi-threaded architecture. In Digest of Papers, 38th IEEE Computer Society International Conference, COMPCON Spring '93, pages 165--174, San Francisco, California, February 22--26, 1993. IEEE Computer Society Press.


Generation, Optimization and Evaluation of Multi-Threaded.. - Roh, Najjar, Shankar, Böhm

....runs to completion once started, it requires all its inputs to be present before execution starts. This model requires a simpler processor architecture, but the thread size is often smaller than in the blocking model; examples include Monsoon [25] and *T [24], EM-4 and EM-5 [29, 30], TAM [7], MTA [14, 15], S-TAM [38], and Pebbles. Unlike Pebbles, most other non-blocking models rely on variants of the Explicit Token Store architecture [6]. The *T [24] node is centered around the Motorola M88110MP with hardware support for fast context switching among threads. Its successor, *T-NG, is based on the ....

H. H. Hum and G. R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Proc. of Compcon Spring'93, 1993.


Nomadic Threads: A Migrating Multithreaded Approach to Remote .. - Jenks, Gaudiot (1996)   (4 citations)

....remote data access, because threads are sent to the node that contains a needed data item instead of bringing the data item to the node that contains the consumer thread. Nomadic Threads is a thread migration approach. Several thread migration approaches have been proposed previously. One approach [16] uses hardware support and explicit remote data access instructions to implement thread migration. Three remote data access operations are provided: one always causes the thread to transfer to the remote node that holds the data, another always fetches the data to the current node, and the third ....

H. H. J. Hum and G. R. Gao, "Supporting a Dynamic SPMD Model in a Multi-Threaded Architecture," in Proc. Compcon'93, pp. 165-174, 1993.


The Threaded Communication Library: Preliminary Experiences.. - Elmasri, Hum, Gao (1995)   (1 citation)

....and those from Matrix Multiply for small problem sizes point to the importance of caching incoming data such that the AP does not encounter avoidable cache misses. This can be accomplished by having a common cache in which globally shared data are stored when the CP receives them from the network [7]. 5 Related Work. In the area of architectural support for communication/synchronization, many commercial products and academic projects have intelligent devices, either processors or specialized hardware, to handle the non-computational activities in a node of a parallel machine. Examples of ....

H. H. J. Hum and G. R. Gao. Supporting a dynamic SPMD model in a multi-threaded architecture. In Digest of Papers, COMPCON Spring '93, pages 165--174, San Francisco, Calif., Feb. 1993.


Building Multithreaded Architectures with Off-the-Shelf.. - Hum, al. (1993)   (15 citations)  Self-citation (Hum)

....low sending/receiving overheads. Therefore, thread switching generally uses hardware assist in order to lower the associated costs of switching far below the costs of software overheads in conventional parallel processors. In our previous work, some initial thoughts on the MTA model were presented [14]. The intent of this report is to formally define the basic MTA model and present various features which we believe are beneficial to the efficient and effective support of both scientific numerical computations and non-numerical applications. In particular, the MTA model is to specify features ....

....register (sr). The execution pipeline processes instructions in an active thread, using standard RISC register-to-register and load/store operations. Moreover, it can inject messages into the interconnection network and issue synchronization signals to the local SU. [Footnote 1: A version of the MTA in [14] specifies one such mechanism.] [Figure 1: The Multi-Threaded Architecture (labels: EU, SU, event queue, ready queue, cache, PE, Interconnection Network)] The buffer between the SU output and EU input is called the ready queue, and holds the threads that are waiting for execution. A thread becomes ....

Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Digest of Papers, 38th IEEE Computer Society International Conference, COMPCON Spring '93, pages 165--175, San Francisco, California, February 1993.


A Design Study of the EARTH Multiprocessor - Hum (1995)   (31 citations)  Self-citation (Gao)

....answer these questions. In this paper, we address both the issue of nonintrusiveness of multithreading support, and the benefits of hiding communication/synchronization latencies, by performing experiments on an emulation platform of our Efficient Architecture for Running THreads (EARTH) [11, 13]. Each node in an EARTH computer consists of an Execution Unit (EU), which is an off-the-shelf high-end RISC processor for executing threads sequentially, and a Synchronization Unit (SU) supporting dataflow-like thread synchronizations and communication with remote processors. The SU can be ....

....circumvent. One of our recent efforts has been focused on providing a combined hardware/software approach to caching globally shared data. We intend to add minimal hardware to conventional caches and employ the underlying split-phase transaction mechanisms in the maintenance of cached shared data [11]. Another effort is to provide prioritized ready thread ids to increase cache reuse [11]. Also, we intend to have support for algorithmic speculative execution. In speculative execution such as parallel OR search, we need to support non-deterministic arrivals of completion signals from parallel ....

[Article contains additional citation context not shown here]

Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multi-threaded architecture. In Digest of Papers, COMPCON Spring '93, pp. 165--174, San Francisco, Calif., Feb. 1993.


The Multi-Threaded Architecture Multiprocessor - Hum, Maquelin, Theobald.. (1994)   (4 citations)  Self-citation (Hum Gao)

....latencies be hidden while still attaining the expected speedups? In this paper, we address both the issue of nonintrusiveness of multithreading support, and the benefits of hiding communication/synchronization latencies, by presenting our Multi-Threaded Architecture (MTA) model [11, 13] and its implementations. Each node in an MTA consists of an Execution Unit (EU), which is an off-the-shelf high-end RISC processor for executing threads sequentially, and a Synchronization Unit (SU) supporting dataflow-like thread synchronizations and communication with remote processors. The SU ....

....this paper, we have described an MTA which has basic multithreading operations, block move operations, and an automatic load balancing mechanism. We are currently redefining MTA to include support of caching globally shared data. We intend to incorporate a version of the global cache as defined in [11], which minimally extends a conventional cache to include information regarding which processor owns the data in the cache lines. This globally shared data cache contains only data which is shared by more than one node, and the on-chip cache of the EU will be used for data private to the node. Also, ....

Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multithreaded architecture. In Digest of Papers, 38th IEEE Computer Society International Conference, COMPCON Spring '93, pages 165--174, San Francisco, California, February 22--26, 1993. IEEE Computer Society Press.


Costs and Benefits of Multithreading with Off-the-Shelf RISC .. - Olivier Maquelin (1995)   (6 citations)  Self-citation (Hum Gao)

....of the multithreading overheads shows that they could be reduced significantly without having to switch to a custom processor design. 1.1 The EARTH-MANNA system. The results discussed in this paper were gained with our implementation of the EARTH (Efficient Architecture for Running THreads) model [4, 6, 5] on top of the MANNA (Massively parallel Architecture for Non-numerical and Numerical Applications) multiprocessor [3] developed at GMD FIRST in Berlin, Germany. Each node of a MANNA machine consists of two Intel i860 XP RISC CPUs, clocked at 50 MHz, 32 MB of dynamic RAM and a bidirectional ....

Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multi-threaded architecture. In Digest of Papers, 38th IEEE Comp. Soc. Intl. Conf., COMPCON Spring '93, pages 165--174, San Francisco, Calif., Feb. 1993.


Building Multithreaded Architectures with Off-the-Shelf.. - Hum, al. (1994)   (15 citations)  Self-citation (Hum)

....communications on the overall execution time due to the rapid switching between threads, and the possibility of smoothly integrating communication mechanisms into the processor to yield low sending/receiving overheads. In our previous work, some initial thoughts on the MTA model were presented [9, 10]. The intent of this paper is to define formally the basic MTA model and present various features which we believe are beneficial to the efficient and effective support of both numerical and non-numerical applications. The main characteristics of the model are: • Division of the processor into ....

....hardware-wise, the MTA multiprocessor is a distributed memory machine. However, this does not preclude the use of software and/or limited hardware mechanisms for projecting a global address space to the compiler, thereby making the MTA a distributed shared memory machine. A version of the MTA in [9] specifies one such mechanism. The EU consists of a register file and a RISC execution pipeline. The register file contains registers for storing temporary values, and special registers such as a frame pointer register (fp). The pipeline executes instructions in an active thread, and it can also ....

H. H. J. Hum and G. R. Gao, "Supporting a dynamic SPMD model in a multi-threaded architecture," in Digest of Papers, COMPCON Spring '93, San Francisco, Calif., pp. 165--175, Feb. 1993.

CiteSeer.IST - Copyright Penn State and NEC