7 citations found.
G. Schmidt. The butterfly parallel processor. In Proc. of ICS, pages 362--365, 1987.


This paper is cited in the following contexts:
Memory Consistency and Event Ordering in Scalable.. - Gharachorloo.. (1990)   (450 citations)  (Correct)

....still providing a reasonable programming model for the programmer. Architectural optimizations that reduce memory latency are especially important for scalable multiprocessor architectures. As a result of the distributed memory and general interconnection networks used by such multiprocessors [8, 9, 12], requests issued by a processor to distinct memory modules may execute out of order. Caching of data further complicates the ordering of accesses by introducing multiple copies of the same location. While memory accesses are atomic in systems with a single copy of data (a new data value becomes ....

G. E. Schmidt. The Butterfly parallel processor. In Proceedings of the Second International Conference on Supercomputing, pages 362-365, 1987.


Comparative Evaluation of Latency Reducing and.. - Gupta, Hennessy.. (1991)   (103 citations)  (Correct)

....each one on its own. Overall, we show that using suitable combinations of the techniques, performance can be improved by 4 to 7 times. 1 Introduction Large scale shared memory multiprocessors are expected to have remote memory reference latencies of several tens to hundreds of processor cycles [18, 22, 25, 30]. The large latencies arise partly due to the increased physical dimensions of the parallel machine and partly due to the ever increasing clock rates at which the individual processors operate. These large memory latencies can quickly offset any performance gains expected from the use of ....

....small bus based multiprocessors through the use of snoopy cache coherence protocols [4], the problem is much more complicated for large scale multiprocessors that use general interconnection networks. As a result, some existing large scale multiprocessors do not provide caches (e.g. BBN Butterfly [25]), others provide caches that must be kept coherent by software (e.g. IBM RP3 [22]), and still others provide full hardware support for coherent caches (e.g. Stanford DASH [18]). In this section we evaluate the performance benefits when both private and shared read-write data are cacheable as ....

G. E. Schmidt. The Butterfly parallel processor. In Proc. Int. Conf. Supercomputing, pages 362--365, 1987.


The Effect of Limited Network Bandwidth and its Utilization.. - Kim, Veidenbaum   (Correct)

....techniques in large scale shared memory multiprocessors, prefetching and weak consistency, their interaction, and their memory and network bandwidth requirements under compiler controlled cache coherence management. For multiprocessor systems without cache coherence hardware, such as BBN Butterfly [25], IBM RP3 [24], IBM SP 2 [27], and Cray T3D [18], a software controlled cache coherence management is the only viable solution. Data prefetching has been proposed as a mechanism to hide a large memory latency [3, 12, 14, 15, 21, 23, 26]. In data prefetching, future data accesses are predicted, and ....

G. E. Schmidt. The butterfly parallel processor. In International Conference on Supercomputing, pages 362--365, 1987.


Job Scheduling in Multiprogrammed Parallel Systems - Feitelson (1997)   (16 citations)  (Correct)

....to n output ports, with a log n delay and a (1/2) n log n component count. Such networks are used both for shared memory machines and for distributed memory machines. In shared memory machines, the input ports are PEs and the output ports are memory modules; examples include the BBN Butterfly [503], the NYU Ultracomputer [244, 243], the IBM RP3 [458], PASM [535, 534], TRAC [83, 372, chap. 7], and Cedar (where clusters are connected to the network rather than individual PEs) [222, 319]. In distributed memory machines, all ports are connected to PEs. Examples include the CM-5 [356], the Meiko ....

G. E. Schmidt, "The butterfly parallel processor". In 2nd Intl. Conf. Supercomputing, vol. I, pp. 362--365, 1987.
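
The delay and component figures quoted in the Feitelson excerpt above can be checked with a short sketch. It assumes a multistage network built from 2x2 switches with n a power of two and logarithms taken base 2; the excerpt itself does not state the switch size, and the function name is hypothetical.

```python
def multistage_network_size(n: int) -> tuple[int, int]:
    """Return (stages, switches) for an n-port multistage network.

    Assumption (not stated in the excerpt): 2x2 switches, n a power of two.
    Each stage holds n/2 switches and there are log2(n) stages, which gives
    a log n delay and a (1/2) n log n component count.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1      # log2(n) for a power of two
    switches = (n // 2) * stages     # (1/2) * n * log2(n)
    return stages, switches


if __name__ == "__main__":
    for ports in (8, 64, 256):
        stages, switches = multistage_network_size(ports)
        print(f"n={ports}: {stages} stages, {switches} switches")
```

Under these assumptions an 8-port network has 3 stages and 12 switches, which may help when comparing it against a crossbar's n^2 crosspoints.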


Architectural and Implementation Tradeoffs in the Design of.. - James Laudon (1992)   (10 citations)  (Correct)

....for memory references is a serious problem for high performance, large scale, shared memory multiprocessors. The combination of large physical distances between processors and memory and the fast cycle time of processors leads to latencies ranging from many tens to hundreds of processor cycles [14, 17, 21]. For these multiprocessors to be effective on a wide range of applications, some mechanism is needed to avoid or hide this memory latency. There are several ways to avoid this large memory latency, including caching of shared data, and restructuring of the application to maintain as much data ....

G. E. Schmidt. The Butterfly parallel processor. In Proceedings of the Second International Conference on Supercomputing, pages 362--365, 1987.


Bootstrapping a Hop-optimal Network in the Weak Sensor.. - Farach-Colton..   (Correct)

No context found.

G. Schmidt. The butterfly parallel processor. In Proc. of ICS, pages 362--365, 1987.


Dynamically Reconfigurable Architecture for a Class of Real-Time.. - Ohkami (1992)   (Correct)

No context found.

G. E. Schmidt, "The Butterfly Parallel Processor," Proceedings of the 2nd International Conference on Supercomputing, Vol.1, 1987, pp.362--365.


CiteSeer.IST - Copyright Penn State and NEC