R. Rau, M. Schlansker, D. Yen, "The Cydra 5 Stride-Insensitive Memory System," In Proceedings of 1989 International Conference on Parallel Processing, pp 242-246, St. Charles, Illinois, August 1989.
....references very predictable. Memory operations issued by the vector unit can be efficiently supported by the use of scatter-gather operations implemented using intelligent memory controllers. Rau et al. advocate a pseudo-random interleaving scheme to make the memory system stride-insensitive [11]. Mathew et al. describe a parallel vector access unit for SDRAM memory systems [10]. Their scheme uses a small set of remapping controllers and multiple interleaved DRAM banks, each with its own bank controller (BC). The remapping controllers support three types of scatter-gather operations and ....
B. R. Rau, M. S. Schlansker and D. W. L. Yen, "The Cydra 5 Stride-Insensitive Memory System," In Proc. Int Conf. on Parallel Processing, 1989, pp. 242-246.
....or bank conflicts and momentary loss in throughput. Techniques which provide statistical guarantees: here the memory management algorithm (MMA) is designed so that the probability of DRAM row or bank conflicts is reduced. These include designs that randomly select memory locations [24], [25], [26], so that the probability of row or bank conflicts in DRAMs is considerably reduced. Under certain conditions, statistical bounds (such as average delay) can be found. Our work on packet buffer design was first described in [27], [28] and has some similarities with previous work done in [29], [30] ....
B.R. Rau, M.S. Schlansker and D.W.L. Yen, "The Cydra 5 stride-insensitive memory system", In Proc. Int Conf on Parallel Processing, 1989, pp.242-246.
....to the different types of implementation of such correspondence. Vector computers use the interleaved storage scheme [3] (Figure 1), whereas other storage schemes such as skewing have been used in array processors [4], and linear transformations [5] have been used in VLIW systems such as the Cydra [6] and scalar multiprocessors such as the RP3 [7] to improve the performance of the memory system. Fig. 1. Interleaved storage scheme (address A split into a module number field and a displacement field). Vector processors frequently require the access to vectors or streams, which are characterized by the address of the ....
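The interleaved storage scheme mentioned in the excerpt above can be illustrated with a minimal Python sketch. It assumes the module number is taken from the low m address bits and the displacement from the remaining upper bits; the excerpt's figure does not survive extraction, so that field order, the function names, and the choice of 8 modules are illustrative assumptions and not details from the cited paper.

```python
def interleave(address: int, m: int) -> tuple[int, int]:
    """Split an address into (module number, displacement).

    Assumption: the low m bits select one of 2**m modules and the
    remaining upper bits give the displacement within that module.
    """
    module = address & ((1 << m) - 1)
    displacement = address >> m
    return module, displacement


def modules_touched(base: int, stride: int, length: int, m: int) -> set[int]:
    """Set of distinct modules hit by a strided vector access."""
    return {interleave(base + i * stride, m)[0] for i in range(length)}


if __name__ == "__main__":
    m = 3  # 2**3 = 8 modules (illustrative)
    for stride in (1, 2, 8):
        hit = modules_touched(base=0, stride=stride, length=8, m=m)
        print(f"stride {stride}: {len(hit)} distinct module(s)")
```

Under this assumption a unit stride spreads eight consecutive references over all eight modules, while a stride equal to the number of modules sends every reference to the same module, which is the stride sensitivity the cited storage schemes try to avoid.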
B.R. Rau, M.S. Schlansker and D.W.L. Yen, "The Cydra 5 Stride-Insensitive Memory System", Int. Conf. on Parallel Processing, pp. 242-246, 1989.
.... access to a vector for a set of strides [Harp91]. Rau [Rau91] analyzes a scheme that assigns storage locations to modules in a pseudo-random fashion, rendering memory performance nearly stride-insensitive; such a memory system has been incorporated into Cydrome's Cydra 5 Departmental Supercomputer [RaSY89]. As with skewed and dynamic schemes, pseudo-random storage schemes benefit from memory module queues. The above studies focus on increasing parallelism for accesses to a single vector beyond that achieved by sequentially interleaved storage. However, for a given storage scheme, it is not clear in ....
Rau-B, Schlansker-M, Yen-D, "The Cydra 5 Stride-Insensitive Memory System", Proc. 1989 Intl. Conf. Parallel Processing, 1989, pp. 242-246.
....used to translate the initial code schedule into a schedule that can tolerate a memory access latency that is Xi F J Pi MII longer, assuming there are J loads in the code. The new code will stall for Xi F J Pi MII fewer cycles on cache misses. The Cydra 5 compilers used this technique [30]. 4 Concluding Remarks This work is based on the premise that the proper focus of performance evaluation studies for scientific computers is on optimum rather than expected performance. Optimum performance studies and metrics have earned a bad reputation because it is easy to derive trivial or ....
B. R. Rau, M. S. Schlansker, and D. W. L. Yen, "The Cydra 5 Stride-Insensitive Memory System," in Proc. of International Conference on Parallel Processing, pp. I-- 242--246, August 1989.
....modulo power of two. Lawrie and Vora proposed a scheme using prime modulus functions [16]. Harper and Jump [11] and Sohi [24] proposed skewing functions. The use of XOR functions was proposed by Frailong et al. [5], and pseudo-random functions were proposed by Raghavan and Hayes [17] and Rau et al. [18], [19]. These schemes each yield a more or less uniform distribution of requests to banks, with varying degrees of theoretical predictability and implementation cost. In principle each of these schemes could be used to construct a conflict-resistant cache by using them as the indexing function. ....
....to , though it may be as small as for this scheme to be distinct from conventional block placement. 2.1.2 Polynomial placement characteristics. The class of polynomial hash functions described above has been studied previously in the context of stride-insensitive interleaved memories (see [18] and [19]). These functions have certain ....
B.R. Rau, M.S. Schlansker and D.W.L Yen, "The Cydra 5 Stride-Insensitive Memory System", In Proc Int. Conf. on Parallel Processing, 1989, pp. 242-246.
....proposed a scheme using prime modulus functions [12]. Harper and Jump [13] and Sohi [14] proposed skewing functions. The use of XOR functions in parallel memory systems was proposed by Frailong et al. [15], and other pseudo-random functions were proposed by Raghavan and Hayes [16] and Rau et al. [17], [18]. These schemes each yield a more or less uniform distribution of requests to banks, with varying degrees of theoretical predictability and implementation cost. In principle each of these schemes could be used to construct a conflict-resistant cache by using them as the indexing function. ....
....the number of input address bits to the polynomial mapping function by ignoring some of the upper bits in A. This does not seriously degrade the quality of the mapping function. Ipoly mapping functions have been studied previously in the context of stride-insensitive interleaved memories (see [17], [18]) and have certain provable characteristics of significant value for cache indexing. In [24] it was demonstrated that a skewed Ipoly cache indexing scheme shows a higher degree of conflict resistance than that exhibited by conventional set associativity or other (non-Ipoly) XOR-based ....
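As a rough illustration of the polynomial mapping idea in the excerpt above, the following Python sketch treats an address as a polynomial over GF(2) and takes its remainder modulo a fixed polynomial. The polynomial 0b1011 (x^3 + x + 1), the parameter `input_bits`, and both function names are hypothetical choices made for this example; the excerpt does not say which polynomial or how many input bits the cited work uses.

```python
def poly_mod(address: int, poly: int) -> int:
    """Remainder of `address` divided by `poly`, both read as GF(2) polynomials.

    Bit i of each integer is the coefficient of x**i; subtraction is XOR.
    """
    degree = poly.bit_length() - 1
    while address.bit_length() > degree:
        # Align the divisor with the leading term and cancel it.
        shift = address.bit_length() - 1 - degree
        address ^= poly << shift
    return address


def ipoly_index(address: int, poly: int, input_bits: int) -> int:
    """Index from the low `input_bits` address bits only (upper bits ignored)."""
    return poly_mod(address & ((1 << input_bits) - 1), poly)


if __name__ == "__main__":
    poly = 0b1011  # x^3 + x + 1, an illustrative degree-3 polynomial
    indices = [ipoly_index(i * 8, poly, input_bits=6) for i in range(8)]
    print(indices)
```

In this small case a stride of 8, which would collapse onto a single index under a plain power-of-two modulus, is spread over all eight index values.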
B. Rau, M. Schlansker, and D. Yen, "The Cydra 5 stride-insensitive memory system," in Proc. Int. Conf. on Parallel Processing, pp. 242--246, 1989.
....González, Mateo Valero, Nigel Topham and Joan M. Parcerisa, On the effectiveness of XOR .... 1 Introduction. The idea of using XOR functions to map memory addresses onto a set of memory modules has been studied extensively in the last decade; for example, see [7], [13], [19], [10], [14], [9], [15] and [21]. It has proven to be an effective way to distribute memory addresses to memory modules in a pseudo-random way. In that context, the aim is to allow multiple memory references to proceed in parallel by maximizing the probability that they will access different memory modules. ....
....is not a lack of associativity, but a defective line placement algorithm which fails to disperse data equitably between the available cache lines. 3.1 XOR mapping schemes. The use of XOR mapping schemes has been studied extensively in the context of interleaved memories [7], [13], [19], [10], [14], [9], [15] and [21], among others. In this paper we consider two types of XOR-based mapping schemes: those chosen in an ad hoc way based on common intuitive notions of how such schemes behave, and a scheme proposed by Rau in [15] which describes a method for constructing XOR mapping schemes based ....
B.R. Rau, M.S. Schlansker and D.W.L Yen, "The Cydra 5 Stride-Insensitive Memory System", In Proc Int. Conf. on Parallel Processing, 1989, pp. 242-246.
....for each reference that the processor makes, of which all but one word is wasted. Far from helping the situation, the cache is now compounding the problem by amplifying the request rate to an already under-designed main memory. This phenomenon has been studied and reported elsewhere, e.g. in [1,2]. However, since data caches are important in achieving good performance on scalar computations with little parallelism, the right compromise, probably, is to provide a data cache that can be bypassed when referencing data structures which have poor locality. Interleaved memory systems. Thus, ....
....function. It is clearly preferable if the computation of the randomized address is inexpensive both in the amount of hardware required as well as in the time taken to do it. From this viewpoint, the idea of randomizing the physical address by XORing it with another bit pattern is very attractive [2,12-14]. This mapping is a permutation and, so, satisfies requirements 2 and 3. Obviously, the bit pattern that is XORed with the physical address must keep changing, else all that we have accomplished is a renaming of the memory modules. 9 Pseudo-Randomly Interleaved Memory. Assume that the m low ....
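The point made in the excerpt above, that XORing with a fixed pattern only renames modules while a changing pattern redistributes references, can be sketched in Python as follows. Taking the changing pattern from the next m address bits is an assumption made purely for illustration; it is not claimed to be the mapping used in the Cydra 5, and the function names are hypothetical.

```python
def fixed_xor_module(address: int, m: int, pattern: int) -> int:
    """Module number = low m bits XOR a constant pattern (a pure renaming)."""
    mask = (1 << m) - 1
    return (address & mask) ^ (pattern & mask)


def xor_module(address: int, m: int) -> int:
    """Module number = low m bits XOR a pattern that changes with the address.

    Assumption: the pattern is simply the next m address bits.
    """
    mask = (1 << m) - 1
    low = address & mask
    pattern = (address >> m) & mask
    return low ^ pattern


if __name__ == "__main__":
    m = 3  # 8 modules (illustrative)
    stride_8 = [i * 8 for i in range(8)]
    print(len({fixed_xor_module(a, m, 0b101) for a in stride_8}))
    print(len({xor_module(a, m) for a in stride_8}))
```

Within any aligned group of 2**m consecutive addresses the pattern is constant, so the mapping stays a permutation of the modules, while across groups the changing pattern breaks up the stride-8 conflict that a fixed pattern leaves in place.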
[Article contains additional citation context not shown here]
B. R. Rau, M. S. Schlansker and D. W. L. Yen, "The Cydra 5 Stride-Insensitive Memory System", Proceedings of the 1989 International Conference on Parallel Processing. Vol. 1. pp. 242-246, August 8-12, 1989.
No context found.
R. Rau, M. Schlansker, D. Yen, "The Cydra 5 Stride-Insensitive Memory System," In Proceedings of 1989 International Conference on Parallel Processing, pp 242-246, St. Charles, Illinois, August 1989.
No context found.
R. Rau, M. Schlansker, D. Yen, "The Cydra 5 Stride-Insensitive Memory System," In Proceedings of 1989 International Conference on Parallel Processing, pp 242-246, St. Charles, Illinois, August 1989.
No context found.
B.R. Rau, M.S. Schlansker and D.W.L. Yen, "The Cydra 5 Stride-Insensitive Memory System", Int. Conf. on Parallel Processing, pp. 242-246, 1989.
No context found.
B. R. Rau, M.S. Schlansker and D.W.L. Yen, "The Cydra 5 stride-insensitive memory system", In Proc. Int Conf. on Parallel Processing, 1989, pp.242-246.
No context found.
B.R. Rau, M.S. Schlansker and D.W.L. Yen, "The Cydra 5 stride-insensitive memory system", In Proc. Int Conf. on Parallel Processing, 1989, pp.242-246.
No context found.
B. R. Rau, M. S. Schlansker, and D. W. L. Yen, "The cydra 5 stride-insensitive memory system," ICPP, pp. 242--246, 1989.