Intel, i860 XP Microprocessor Data Book, Intel Corporation, Santa Clara, CA, 1991.
....bottlenecks but will also enhance latency tolerance for data movement between localities. Seamless [FiC92c, FiC92d] is a latency-tolerant, RISC-based multiprocessor architecture based on the data movement programming model [FiC92a]. In Seamless, the concept of a multicomputer (e.g. the iPSC/860 [Int91b, HeG90], CM5 [Thi91] and nCUBE 2 [Pal88, Tro89]) is extended by adding a second processor, the Locality Manager (LM), to each processing element. In Seamless, a processing element is referred to as a locality. While the idea of adding a second processor to handle communications is not unique (e.g. ....
Intel, i860 XP Microprocessor Data Book, Intel Corporation, Santa Clara, CA, 1991.
.... = 1, then any instruction-level concurrency that is exploited increases the utilization beyond 1 since proc = To achieve higher clock rates, modern microprocessors use increasing degrees of pipelining, resulting in higher (3 to 5 cycle floating point latencies are not uncommon [29, 24, 13]). Memory accesses also incur long latencies, especially in multiprocessors where cache miss rates tend to be higher due to sharing and main memory accesses must traverse multistage networks. In order to maintain good utilization, the exploited concurrency proc must equal for the si ....
....like any other pipelined instruction but with a higher and variable latency. Further, we have not assumed any limit on the number of memory loads or cache misses that may be outstanding. While commercially available microprocessors limit the number of outstanding memory operations to two or three [19, 29, 25, 24], we argue strongly in favor of increasing this number for multiprocessor building blocks. We argue that caching will tend to become less effective in large-scale multiprocessors, and that the ability to tolerate large mem will be an important feature of the processor architecture. Simply ....
Intel Corporation. i860 XP Microprocessor Data Book, November 1991. Order Number: 240874-002.
.... Figure 8: Bus-based (i860XP) and DI-based systems. To measure the memory performance, we compare cache refill times over a range of line sizes. As a case study, we compare the bus-based memory interface of the Intel i860XP [3] to our DI microprocessor memory interface. To make the comparison fair, we assume the processors have the same internals, with single-level on-chip cache and approximately the same number of I/O pins (see Figure 8). The performance numbers assumed for the calculation are shown in Figure 9 and ....
Intel Corporation. i860 XP Microprocessor Data Book, 1991.
....performance benefits, even on modest size problems and machines. 6.1 Memory Interface Performance A memory hierarchy based on dynamic interconnection can match and in some cases outperform bus-based approaches. As a case study, we compare the bus-based memory interface of the Intel i860XP [24] to our DI microprocessor memory interface. To make the comparison fair, we assume the processors have the same internals, with single-level on-chip cache and approximately the same number of I/O pins (see Figure 14). The i860XP's memory interface uses 139 pins: 64 data lines, 29 address lines, ....
Intel Corporation. i860 XP Microprocessor Data Book, 1991.
.... ( 1) then any instruction-level concurrency that is exploited increases the utilization beyond 1 since proc = To achieve higher clock rates, modern microprocessors use increasing degrees of pipelining, resulting in higher (3 to 6 cycle floating point latencies are not uncommon [75, 54, 33]). Memory accesses also incur long latencies, especially in multiprocessors where cache miss rates tend to be higher due to sharing and main memory accesses must traverse multistage networks. To maintain good utilization, the exploited concurrency proc must equal for the SI processor, and ....
....be treated like any other pipelined instruction but with a higher and variable latency. Further, no limit was placed on the number of memory loads or cache misses that may be outstanding. While commercially available microprocessors limit the number of outstanding memory operations to two or three [43, 75, 57, 54], the results presented here favor increasing this number for multiprocessor building blocks. It is argued that caching will tend to become less effective in large-scale multiprocessors, and that the ability to tolerate large mem will be an important feature of the processor architecture. Simply ....
Intel Corporation. i860 XP Microprocessor Data Book, November 1991. Order Number: 240874-002.