Cray Research, Incorporated. Cray T3D Technical Summary, October 1993.
....on a multiprocessor. By addressing the needs of the workstation environment, our proposal makes multiple contexts more attractive for commodity microprocessors. 1 Introduction Large scale multiprocessors, such as the one shown in Figure 1, are increasingly built using commodity microprocessors [2, 16]. While these commodity microprocessors provide a relatively low-cost compute node, their performance depends heavily on employing a sophisticated cache hierarchy to insulate the processor from the long remote memory latency. Providing the ability to cache shared data [16] can greatly increase the ....
Cray Research, Incorporated. Cray T3D Technical Summary, October 1993.
....parallelized. The TestAndSet and FetchAndAdd primitives can be implemented in parallel using a combining network, and it has been argued that they can be made no slower than a memory reference to the shared memory [14, 28]. Furthermore, several machines have supported these primitives in hardware [13, 27, 20, 31, 18]. Most current symmetric multiprocessors support the TestAndSet operation directly, but not the FetchAndAdd. On the Sun Enterprise Server, for example, TestAndSet only requires about the same time as a memory reference (15 cycles if in second level cache and 60 cycles if in shared memory) [23] ....
C. R. Inc. Cray T3D: Technical summary, Sept. 1993.
....like the Silicon Graphics Power Series [6]. Whilst this machine cannot deliver the sustained performance mentioned earlier in the paper, the study allowed us to evaluate the effectiveness of parallelising the model. There are now much larger and faster shared memory machines (like the CRAY T3D [7], the Convex SPP [8] and the Kendall Square Research KSR2 [9]) which are capable of delivering the required performance. These machines utilise a distributed virtual shared memory unlike the current shared memory machines. [Figure 2: Sample ozone contours] The generation of ozone contours requires ....
Cray Research, Inc., Cray T3D Technical Summary, September 1993.
....in contrast, does not require any modification to the cache mapping hardware. Instead, it makes a change at the level of memory module mapping, which is much easier. In addition, by using a modifiable mask to specify the pid bits (similar to the mask used in the Cray T3D Block Transfer Engine [4]) the .... [Figure 4: Two ways of designating the pid bits in the physical address such that they do not overlap with the set addressing bits. Panels a) and b) show fields labelled byte in cache line, set addressing bits, pid bits, and tag bits.] ....
Cray Research Inc. Cray T3D Technical Summary, 1993.