J.J. Navarro, E. García, J.L. Larriba-Pey and T. Juan, "Block Algorithms for Sparse Matrix Computations on High Performance Workstations," Proc. ACM Int'l. Conf. on Supercomputing (ICS'96), pp. 301-309, May 1996.
....the Locality of Numerical Codes In the literature a large number of algorithms for evaluating and optimizing data locality can be found. In the case of dense codes, most approaches are based on decreasing capacity and interference misses by using blocking or other restructuring techniques [9]. It is well known that techniques that improve locality for a problem in a uniprocessor system will obtain an improvement when the code is executed in multiprocessor systems [2, 6, 12]. In the case of the sparse matrix-vector product (SpM×V) a classic method for dealing with the problem ....
J.J. Navarro, E. García, J.L. Larriba-Pey, and T. Juan. Block algorithms for sparse matrix computations on high performance workstations. Proc. IEEE Int'l. Conf. on Supercomputing (ICS'96), pages 301--309, 1996.
....loop interchanging or software pipelining. This work demonstrates the feasibility of modeling such types of complex algorithms that fit hardware improvements of current high performance microprocessors. As an example, an optimized version of the sparse matrix-dense matrix product described in [10] is modeled. In this work we apply the probabilistic model for K-way associative caches with LRU replacement introduced in [11], where only simple algebra kernels such as sparse matrix-vector product and sparse matrix transposition were considered. Optimum block sizes for the memory hierarchy of ....
J.J. Navarro, E. García, J.L. Larriba-Pey and T. Juan, Block Algorithms for Sparse Matrix Computations on High Performance Workstations, in Proc. 10th ACM Int'l. Conf. on Supercomputing (ICS'96), Philadelphia, May 1996, 301--308.
....due to the indirect accesses that arise in the processing of sparse matrices, because of their compressed storage format [1]. Several software techniques for improving memory performance have been proposed, such as blocking, loop unrolling, loop interchanging or software pipelining. Navarro et al. [5] have studied some of these techniques for the sparse matrix-dense matrix product algorithm. They have proposed an optimized version of this algorithm as a result of a series of simulations on a DEC Alpha processor. The traditional approach for cache performance evaluation has been the software ....
....the modeling process. In Sect. 4 the model is validated and used to study the cache behavior of the algorithm as a function of the block dimensions and the main cache parameters. Section 5 concludes the paper. 2 Modeled Algorithm The optimized sparse matrix-dense matrix product code proposed in [5] is shown in Fig. 1. The sparse matrix is stored using the Compressed Row Storage (CRS) format [1]. This format uses three vectors: vector A contains the sparse matrix entries, vector C stores the column of each entry, and vector R indicates at which point of A and C a new row of the sparse matrix ....
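As an illustration of the CRS layout described in the snippet above, the following is a minimal Python sketch of the three vectors A, C and R and of a plain (unblocked) sparse matrix-dense matrix product over them. It is not the optimized code of Fig. 1, which is not reproduced here. The function names `dense_to_crs` and `crs_times_dense` are hypothetical, and the sketch assumes the common convention that R has one extra trailing entry equal to the number of stored nonzeros.

```python
def dense_to_crs(matrix):
    """Build the CRS vectors (A, C, R) from a dense list of rows.

    A holds the nonzero entries, C the column index of each entry,
    and R[i] the position in A and C where row i starts.
    Assumption: R carries a final entry len(A) so that row i spans
    A[R[i]:R[i + 1]].
    """
    A, C, R = [], [], [0]
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                A.append(value)
                C.append(col)
        R.append(len(A))  # start of the next row in A and C
    return A, C, R


def crs_times_dense(A, C, R, B):
    """Return D = S * B, where S is given in CRS form and B is dense."""
    n_rows = len(R) - 1
    n_cols = len(B[0])
    D = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(R[i], R[i + 1]):
            a = A[k]
            b_row = B[C[k]]  # indirect access through the column vector C
            for j in range(n_cols):
                D[i][j] += a * b_row[j]
    return D


if __name__ == "__main__":
    S = [[1, 0, 2],
         [0, 0, 3],
         [4, 5, 0]]
    B = [[1, 2],
         [3, 4],
         [5, 6]]
    A, C, R = dense_to_crs(S)
    print(A, C, R)
    print(crs_times_dense(A, C, R, B))
```

The lookup `B[C[k]]` is the kind of indirect access that the citing papers identify as a source of poor memory performance in sparse codes.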
Navarro, J.J., García, E., Larriba-Pey, J.L., Juan, T.: Block Algorithms for Sparse Matrix Computations on High Performance Workstations. Proc. ACM Int'l. Conf. on Supercomputing (ICS'96) (1996) 301--309.
....memory considering sparse matrices with a uniform or banded distribution is presented. We want to emphasize that an important body of the model is reusable in different algebra kernels. The most important approach to study cache behavior has traditionally been the use of trace driven simulations [7, 9, 12], whose main drawback is the large amount of time needed to process the traces. Another possibility is nowadays provided by the performance monitoring tools of modern microprocessors (built-in hardware counters) that make This work was supported by the Ministry of Education and ....
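To make the idea of a trace driven simulation concrete, the sketch below counts misses for a K-way set-associative cache with LRU replacement by replaying a list of byte addresses. It is only an illustrative toy, not the simulator or the probabilistic model used in any of the cited works. The function name `count_misses` and its parameters are hypothetical.

```python
from collections import OrderedDict


def count_misses(trace, num_sets, ways, line_size):
    """Replay an address trace through a K-way LRU cache and count misses.

    trace     -- iterable of byte addresses (non-negative integers)
    num_sets  -- number of cache sets
    ways      -- associativity K (lines per set)
    line_size -- cache line size in bytes
    """
    # One ordered dictionary per set; the oldest key is the LRU line.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_size
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict the LRU line
            cache_set[line] = True
    return misses


if __name__ == "__main__":
    trace = [0, 64, 0, 128, 64]
    print(count_misses(trace, num_sets=1, ways=2, line_size=64))
    print(count_misses(trace, num_sets=2, ways=1, line_size=64))
```

Every reference in the trace is processed one by one, which is why the running time of such simulations grows with the trace length, the drawback noted in the snippet above.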
J.J. Navarro, E. Garcia-Diego, J.-L. Larriba-Pey, and T. Juan. Block Algorithms for Sparse Matrix Computations on High Performance Workstations. In Proc. of the Int. Conference on Supercomputing, pages 301-308, Philadelphia, Pennsylvania, USA, 1996.