Abstract: Memory Hierarchy Optimizations for multiplication: ... Dot-product, followed by "axpy": ...
Register-level: Take ... to be a block row composed of ... blocks. Question: How to choose ... c?
2x2 code: ... A_i ... for( ... ptr[i] ; ... < ptr[i+1] ; ... 2*2 ) { /* A_i x */ ... val[0*2+0] ... ; ... val[0*2+1] ... ; ... RESET( ind, val ); ... register double y0 = 0, y1 = 0 ; ... += ... * t1 ; ...
Register-Level Blocking (SPARSITY): 3x3 ...
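The abstract fragment above mentions a dot-product followed by an "axpy", applied per block row with 2x2 register blocks. As a minimal sketch of that idea, the C function below computes y = A^T (A x) one block row at a time. It is not the authors' code: the function name `atax_bcsr_2x2`, the parameters `mb` and `n`, and the assumed layout (2x2 blocks stored row-major in `val`, four values per block, with `ind` holding each block's starting column) are illustrative assumptions. Only the array names `ptr`, `ind`, `val` and the `val[r*2+c]` indexing echo the fragment.

```c
#include <assert.h>

/* Hypothetical sketch: y = A^T * (A * x) for a matrix held in 2x2
 * block compressed sparse row (BCSR) form.
 *   mb  - number of block rows (the matrix has 2*mb rows)
 *   n   - number of columns, assumed even; x and y have n entries
 *   ptr - blocks of block row i are ptr[i] .. ptr[i+1]-1
 *   ind - starting column index of each block
 *   val - block values, 4 per block, row-major inside a block
 */
static void atax_bcsr_2x2(int mb, int n, const int *ptr, const int *ind,
                          const double *val, const double *x, double *y)
{
    assert(mb >= 0 && n >= 0 && n % 2 == 0);

    for (int j = 0; j < n; j++)
        y[j] = 0.0;

    for (int i = 0; i < mb; i++) {
        /* The two entries of t = A_i * x stay in scalars. */
        double t0 = 0.0, t1 = 0.0;

        /* Dot-product phase: t = A_i * x */
        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            const double *b = val + 2 * 2 * k;
            double x0 = x[ind[k]], x1 = x[ind[k] + 1];
            t0 += b[0 * 2 + 0] * x0 + b[0 * 2 + 1] * x1;
            t1 += b[1 * 2 + 0] * x0 + b[1 * 2 + 1] * x1;
        }

        /* "axpy" phase: y += A_i^T * t, revisiting the same blocks */
        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            const double *b = val + 2 * 2 * k;
            double y0 = 0.0, y1 = 0.0;
            y0 += b[0 * 2 + 0] * t0;
            y0 += b[1 * 2 + 0] * t1;
            y1 += b[0 * 2 + 1] * t0;
            y1 += b[1 * 2 + 1] * t1;
            y[ind[k]]     += y0;
            y[ind[k] + 1] += y1;
        }
    }
}
```

In this sketch each block row is read twice in quick succession, so the second pass presumably finds the blocks still in cache. A blocked code for another size such as 3x3 would follow the same pattern with nine values per block.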
Cited by:
Memory Hierarchy Optimizations and Performance Bounds .. - Vuduc, Gyulassy.. (2003)
Active bibliography (related documents):
2.9: Performance Optimizations and Bounds for Sparse.. - Vuduc, Demmel, Yelick (2002)
2.0: Performance Tuning and Analysis of Sparse Triangular Solve .. - Richie Bebop Computer (2002)
0.8: Automatic Performance Tuning and Analysis of Sparse .. - Vuduc, Kamil, Hsu, .. (2002)
Similar documents based on text:
0.3: Modeling the Benefits of Mixed Data and Task Parallelism - Chakrabarti, Demmel, Yelick (1995)
0.2: Statistical Modeling of Feedback Data in an Automatic.. - Vuduc, Bilmes, Demmel (2000)
0.2: Optimization of Sparse Matrix Kernels for Data Mining - Im, Yelick (2000)
BibTeX entry:
R. Vuduc, A. Gyulassy, J. W. Demmel, and K. A. Yelick. Memory hierarchy optimizations and performance bounds for sparse A^T A x. Technical Report UCB/CSD-03-1232, University of California, Berkeley, February 2003. http://citeseer.ist.psu.edu/article/vuduc03memory.html
@misc{ vuduc03memory,
author = "R. Vuduc and A. Gyulassy and J. Demmel and K. Yelick",
title = "Memory hierarchy optimizations and performance bounds for sparse $A^T A x$",
text = "R. Vuduc, A. Gyulassy, J. W. Demmel, and K. A. Yelick. Memory hierarchy
optimizations and performance bounds for sparse $A^T A x$. Technical Report
UCB/CSD-03-1232, University of California, Berkeley, February 2003.",
year = "2003",
url = "citeseer.ist.psu.edu/article/vuduc03memory.html" }
Citations (may not include all citations):
576: Authoritative sources in a hyperlinked environment - Kleinberg (1999)
165: SPARSKIT: A basic toolkit for sparse matrix computations - Saad (1994)
124: FFTW: An adaptive software architecture for the FFT - Frigo, Johnson (1998)
117: Applied Numerical Linear Algebra - Demmel (1997)
56: Automated empirical optimizations of software and the ATLAS .. - Whaley, Petitet et al. (2001)
27: Characterizing the behavior of sparse algorithms on caches - Temam, Jalby (1992)
25: Improving memory-system performance of sparse matrix-vector m.. - Toledo (1997)
23: Optimizing the performance of sparse matrix-vector multiplic.. - Im (2000)
21: Arrays in Blitz++ - Veldhuizen (1998)
15: NIST Sparse BLAS: User's Guide - Remington, Pozo (1996)
14: Adaptive use of iterative methods in interior point methods .. - Wang, O'Leary (1995)
13: A Relational Approach to the Automatic Generation of Sequent.. - Stodghill (1997)
12: Automatic nonzero structure analysis - Bik, Wijshoff (1999)
10: FLAME: Formal Linear Algebra Methods Environment - Gunnels, Gustavson et al. (2001)
10: Memory hierarchy performance prediction for sparse blocked a.. - Fraguela, Doallo et al. (1999)
9: Modeling and improving locality for irregular problems: spar.. - Heras, Perez et al. (1999)
7: Anwendung von generativen Programmiertechniken am Beispiel d.. - Neubert (1998)
7: Towards realistic bounds for implicit CFD codes - Gropp, Kaushik et al. (1999)
7: An adaptive software library for fast Fourier transforms - Mirkovic, Mahasoom et al. (2000)
6: On improving the performance of sparse matrix-vector multipl.. - James, White et al. (1997)
6: A rational approach to portable high performance: the Basic .. - Siek, Lumsdaine (1998)
6: Algorithms for sparse matrix computations on high-performanc.. - Navarro, García et al. (1996)
5: Improving performance of sparse matrix-vector multiplication - Pinar, Heath (1999)
5: PSBLAS: A library for parallel linear algebra computation on.. - Filippone, Colajanni (2000)
4: Towards a fast parallel sparse matrix-vector multiplication - Geus, Röllin (1999)
3: Towards an accurate model for collective communications - Vadhiyar, Fagg et al. (2001)
3: Fast automatic generation of DSP algorithms - Püschel, Singer et al. (2001)
Documents on the same site (http://bebop.cs.berkeley.edu/#pubs):
Automatic Performance Tuning of Sparse Matrix Kernels - Vuduc (2003)
Memory Hierarchy Optimizations and Performance Bounds .. - Vuduc, Gyulassy.. (2003)
Statistical Models for Automatic Performance Tuning - Vuduc, Demmel, Bilmes (2001)