Naraig Manjikian and Tarek S. Abdelrahman. Array data layout for the reduction of cache conflicts. Proc. of 8th International Conference on Parallel and Distributed Computing Systems, Sep. 1995.
....measurements of locality that will allow us to determine what sparse matrices will behave better in terms of all the events in which the locality is involved, for example, the number of cache misses. Other applications of changing the data layout to improve the locality can be found in [16,18,5]. 3 A Quantitative Model of Locality. Let N be the number of rows or columns of a square sparse matrix A and N_Z the number of nonzero elements (entries). Figure 1 shows a code for the row oriented sparse matrix vector product when the matrix is stored in the Compressed Row Storage (CRS) ....
N. Manjikian and T. S. Abdelrahman. Array data layout for the reduction of cache conflicts. In Proc. of the 8th Int. Conference on Parallel and Distributed Computing Systems, 1998.
....but still a significant amount of cache misses are present. Similarly, storage order optimizations [5, 6] are very helpful in reducing the capacity misses. Thus mostly conflict cache misses related to the sub-optimal data layout remain. Array padding has been proposed earlier to reduce the latter [16, 18, 20]. These approaches are useful for reducing the (cross-)conflict misses to some extent. However, existing approaches do not eliminate the majority of the conflict misses. Besides [3, 10, 18, 20], very little has been done to measure the impact of data layout(s) on the cache performance. Thus there ....
.... Note that the complexity in addressing can be removed to a large extent using the address optimizations proposed in [7]. The effectiveness of that stage will be shown in section 4. [Figure: initial and improved memory data organization for the arrays Current[50176], Previous[50176], V4x[3136] and V4y[3136]; cache size = 512 bytes, line size = 16 bytes] ....
[Article contains additional citation context not shown here]
N.Manjikian and T.Abdelrahman, "Array data layout for reduction of cache conflicts", Intl. Conference on Parallel and Distributed Computing Systems, 1995.
.... Most of the work related to efficient utilization of caches has been directed towards optimization of the throughput by means of (local) loop nest transformations to improve data locality [2, 18] and loop blocking transformations to improve the cache utilisation by reducing conflict misses [14, 17, 19]. Some work [22] has been reported on the data organization for improved cache performance in embedded processors but they do not take into account a power oriented model. None of the previous work tries to directly reduce the storage requirements by partially overlapping data, as memory is ....
N.Manjikian and T.Abdelrahman, "Array data layout for reduction of cache conflicts", Int. Conf. on Parallel and Distributed Computing Systems, 1995.
....[35, 14, 15, 12, 11]. Unfortunately, the performance of a tiled program resulting from existing tiling heuristics shows a large amount of instability [32, 28]. Instability comes from the so-called pathological array sizes [4, 10, 22, 2], which result in poor choices of tile sizes. Array padding [1, 23, 24, 30, 31] is a compiler optimization that increases the array sizes and initial locations to avoid the pathological cases. It introduces space overhead but effectively stabilizes program performance. More recent research efforts have investigated the combination of both loop tiling and array padding in the ....
N. Manjikian and T. Abdelrahman. Array data layout for the reduction of cache conflicts. In Proceedings of the 8th International Conference on Parallel and Distributed Computing Systems, 1995.
.... Most of the work related to efficient utilization of caches has been directed towards optimization of the throughput by means of (local) loop nest transformations to improve data locality [2, 18] and loop blocking transformations to improve the cache utilisation by reducing conflict misses [14, 17, 19]. Some work [22] has been reported on the data organization for improved cache performance in embedded processors but they do not take into account a power oriented model. None of the previous work tries to directly reduce the storage requirements by partially overlapping data, as memory is ....
N.Manjikian and T.Abdelrahman, "Array data layout for reduction of cache conflicts", Int. Conf. on Parallel and Distributed Computing Systems, 1995.
No context found.
Naraig Manjikian and Tarek S. Abdelrahman. Array data layout for the reduction of cache conflicts. Proc. of 8th International Conference on Parallel and Distributed Computing Systems, Sep. 1995.
No context found.
N. Manjikian and T. Abdelrahman. Array data layout for the reduction of cache conflicts. In Proceedings of the 8th International Conference on Parallel and Distributed Computing Systems, 1995.
No context found.
N. Manjikian and T. Abdelrahman. Array data layout for the reduction of cache conflicts. In Proceedings of the 8th International Conference on Parallel and Distributed Computing Systems, 1995.
No context found.
N. Manjikian and T. Abdelrahman. Array data layout for the reduction of cache conflicts. In Proceedings of the 8th International Conference on Parallel and Distributed Computing Systems, 1995.