7 citations found.
V.G. Grafe and J.E. Hoch, The Epsilon-2 multiprocessor system, J. Parall. Distr. Comput., 10 (1990), pp. 309-318.

This paper is cited in the following contexts:
Asynchrony in parallel computing: From dataflow to.. - Silc, Robic, Ungerer (1997)   (2 citations)

....the token store. Another instruction of the same thread is executed every eighth processor cycle. Monsoon allows the use of registers (eight register sets are provided) to store intermediate results within a thread, thereby leaving the pure dataflow execution model. Epsilon-2: Epsilon-2 machine [100,101] developed at Sandia National Laboratories (Livermore, Ca, USA) supports a fully dynamic memory model, allowing single cycle context switches and dynamic parallelization. The system is built around a module consisting of a processor and structure unit, connected via a 4 × 4 crossbar to each ....

V.G. Grafe and J.E. Hoch, The Epsilon-2 multiprocessor system, J. Parall. Distr. Comput., 10 (1990), pp. 309-318.


A Debugger for Id - Caro (1993)

....Japan's Electro Technical Laboratory (ETL) was designed and built based on TTDA, and the Manchester Dataflow Machine [20] was also implemented based on similar dynamic dataflow concepts. Recent hardware projects include Monsoon [30], EM-4 from ETL [24], and Epsilon-2 from Sandia National Labs [19]. We will use the abstract TTDA machine as a basis to explain the operation of a typical pipelined dataflow processor. Figure 2.2 provides a conceptual view of the TTDA pipeline. A datum or token coming from the Token Queue enters the machine at the top. Each token is divided into two parts: a tag ....

V.G. Grafe and J.E. Hoch. The Epsilon-2 Multiprocessor System. Journal of Parallel and Distributed Computing, January 1990.


Design and Implementation of a Packet Switched Routing Chip - Joerg (1990)   (6 citations)

....the volume of traffic that is sent into the main network. By reducing the number of messages sent into the main network, we allow the network to support higher performance nodes. In some machines it may be desirable to closely bundle three components. For example, the Epsilon-2 dataflow machine [5], being designed at Sandia National Labs, may bundle together a processor, a memory, and an IO unit. A way to produce such .... [Figure: Processor and Memory with connections from and to the network; panel B) Processor and Memory bundled together.] ....

V. G. Grafe and J. E. Hoch. The Epsilon-2 Multiprocessor System. To appear in Journal of Parallel and Distributed Computing, December, 1990.


Efficient Implementation of Sequential Loops in Dataflow Computation - Ang (1993)   (4 citations)

....as it was assumed that most loops would be executed in parallel. This assumption was valid for earlier dataflow machines such as the MIT Tagged Token Dataflow Architecture (TTDA) [2] and Sigma-1 [9], but not for the newest generation of dataflow machines including Monsoon [6], EM-4 [11], and Epsilon-2 [7]. On the latter machines, sequential loops use less memory, and can execute in fewer instructions, albeit with lower parallelism than the parallel versions. This characterisation of sequential and parallel loops suggests that programs should have parallel outer loops and sequential inner loops. ....

....The consumption of resources, in the form of token tags, is also the same for both cases. For these machines, it is natural to execute loops in parallel to reap full benefit of parallel processing. With the newest generation of dataflow machines (e.g. Monsoon [17, 6], EM-4 [11], and the Epsilon-2 [7]) the situation is different. These machines use frame memory on each processor to implement token matching (see Section 2.1). This has two consequences for loops: i) every concurrent iteration needs its own frame; ii) sending data between two iterations that do not share (in a time multiplexed ....

[Article contains additional citation context not shown here]

V. G. Grafe and J. E. Hoch. The Epsilon-2 Multiprocessor System. Journal of Parallel and Distributed Computing, 10(4):309--318, 1990.


An Analysis of Latency in Dataflow Execution - Najjar, Miller, Böhm

.... as a conventional processor [EAC87]. Reducing matching store bandwidth through simplified operand matching schemes, coupled with the prospect of exploiting a high degree of locality, has been the catalyst behind the design of new, second generation hybrid architectures [Ian88, SYH 89, GH90, PC90]. By increasing task granularity, hybrid machines execute portions of dataflow graphs as sequential threads, allowing them to exploit thread locality, while reducing matching store bandwidth. Measuring temporal and spatial locality for a single thread of von Neumann code involves tracking ....

V. G. Grafe and J. E. Hoch. The EPSILON-2 multiprocessor system. Journal of Parallel and Distributed Computing, 10(4), 1990.


Generation and Quantitative Evaluation of Dataflow Clusters - Roh, Najjar, Böhm (1993)   (3 citations)

.... to handle a large peak demand for resources (e.g. matching store) which exacerbates the resource management problem [Cul89]. Various multithreaded and hybrid von Neumann dataflow execution models have been proposed that can alleviate these problems [Ian88, PT91, CSS 91, SYH 89, SYH 91, GH90, AE89, NPA92]. These coarse grain execution models recognize the advantages of statically scheduling code at compile time, if it is profitable to do so, rather than dynamically at run time. This simple paradigm shift coupled with explicit allocation of resources (e.g. use of explicit token ....

V. G. Grafe and J. E. Hoch. The EPSILON-2 multiprocessor system. Journal of Parallel and Distributed Computing, 10(4), 1990.


An Evaluation of Medium-Grain Dataflow Code - Najjar, Roh, Böhm (1994)   (1 citation)

.... of parallelism is very large, it forces the machine to handle a large peak demand for resources (e.g. matching store) which exacerbates the resource management problem [8]. Various multithreaded and hybrid von Neumann dataflow execution models have been proposed that can alleviate these problems [21, 33, 10, 36, 37, 17, 1, 31]. These medium coarse grain execution models recognize the advantages of scheduling code at compile time rather than at run time. This is most often accomplished by increasing the task granularity from a single instruction to many instructions and statically scheduling instructions within the ....

V. G. Grafe and J. E. Hoch. The EPSILON-2 multiprocessor system. Journal of Parallel and Distributed Computing, 10(4), 1990.

CiteSeer.IST - Copyright Penn State and NEC