Performance Tuning and Analysis of Sparse Triangular Solve
James W. Demmel, Katherine A. Yelick
June 22, 2002
Berkeley Benchmarking and OPtimization (BeBOP) Project
www.cs.berkeley.edu/

 
View or download:
berkeley.edu/./pub...boundsslides.pdf

From: berkeley.edu/#pubs

Abstract: ed BLAS for (3). Automatic Performance Tuning and Analysis of Sparse Triangular Solve -- Blocking (SPARSITY)
[Figure: sparsity (spy) plot, both axes 0 to 50, nz = 598]
4x3 Register Blocking: Store blocks. Multiply/solve block-by-block. Fill in explicit zeros. 1.3x--2.5x speedup on FEM matrices. Reduced storage overhead over, Block ops are fully unrolled -- improves register reuse. Trade-off extra computation for Performance Tuning and Analysis of Sparse Triangular Solve -- Parameter...
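The register-blocking steps named in the abstract (store blocks, fill in explicit zeros, solve block-by-block) can be illustrated with a minimal sketch. This is not the authors' SPARSITY implementation: the function names `to_bcsr_lower` and `bcsr_lower_solve` are hypothetical, the block shape is assumed to be square (b x b) rather than the 4x3 shape mentioned above so that diagonal blocks stay triangular, and the inner loops are left as plain loops where a tuned kernel would unroll them fully.

```python
# Sketch only: block compressed sparse row (BCSR) storage for a lower-triangular
# matrix and a block-by-block forward solve. Names and the square block shape
# are assumptions made for illustration, not taken from the slides.

def to_bcsr_lower(L, b):
    """Store a dense lower-triangular matrix L (list of lists) as b x b blocks.

    A block is kept if it holds any nonzero (diagonal blocks are always kept).
    Kept blocks are stored in full, so zeros inside them are filled in explicitly.
    """
    n = len(L)
    assert n % b == 0, "sketch assumes the dimension is a multiple of b"
    row_ptr, col_idx, blocks = [0], [], []
    for I in range(n // b):
        for J in range(I + 1):  # only blocks on or below the diagonal
            blk = [[L[I * b + i][J * b + j] for j in range(b)] for i in range(b)]
            if J == I or any(v != 0 for row in blk for v in row):
                col_idx.append(J)
                blocks.append(blk)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, blocks


def bcsr_lower_solve(row_ptr, col_idx, blocks, rhs, b):
    """Solve L x = rhs one block row at a time (forward substitution)."""
    x = [float(v) for v in rhs]
    for I in range(len(row_ptr) - 1):
        base = I * b
        for k in range(row_ptr[I], row_ptr[I + 1]):
            J, blk = col_idx[k], blocks[k]
            if J < I:
                # Off-diagonal block: subtract blk * x_J. Explicit zeros cost
                # extra flops here; a tuned kernel would unroll these loops.
                for i in range(b):
                    for j in range(b):
                        x[base + i] -= blk[i][j] * x[J * b + j]
            else:
                # Diagonal block is stored last in its row: small dense solve.
                for i in range(b):
                    for j in range(i):
                        x[base + i] -= blk[i][j] * x[base + j]
                    x[base + i] /= blk[i][i]
    return x


# Example: 4 x 4 lower-triangular system with 2 x 2 blocks.
L_example = [
    [2, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 0],
    [3, 0, 1, 2],
]
rhs_example = [2, 3, 12, 14]
row_ptr, col_idx, blocks = to_bcsr_lower(L_example, 2)
x_example = bcsr_lower_solve(row_ptr, col_idx, blocks, rhs_example, 2)
```

In this example the off-diagonal block holds a single nonzero but is stored as a full 2 x 2 block, which is the extra-computation-for-regularity trade-off the abstract alludes to.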

Active bibliography (related documents):
2.0:   Performance Optimizations and Bounds for Sparse.. - Vuduc, Demmel, Yelick (2002)
2.0:   Memory Hierarchy Optimizations and Performance Bounds.. - Vuduc, Gyulassy.. (2003)
0.6:   Automatic Performance Tuning and Analysis of Sparse .. - Vuduc, Kamil, Hsu, .. (2002)

BibTeX entry:

@misc{ computer-performance,
  author = "James W. Demmel and Katherine A. Yelick",
  title = "Performance Tuning and Analysis of Sparse Triangular Solve",
  note = "Berkeley Benchmarking and OPtimization (BeBOP) Project, www.cs.berkeley.edu/",
  month = jun,
  year = "2002",
  url = "citeseer.ist.psu.edu/661888.html" }
Citations (may not include all citations):
124   FFTW: An adaptive software architecture for the FFT - Frigo, Johnson - 1998
123   Optimizing matrix multiply using PHiPAC: a portable - Bilmes, Asanović et al. - 1997
56   Automated empirical optimizations of software and the ATLAS .. - Whaley, Petitet et al. - 2001
27   Characterizing the behavior of sparse algorithms on caches - Temam, Jalby - 1992
23   Optimizing the performance of sparse matrix-vector multiplic.. - Im - 2000
21   Arrays in blitz - Veldhuizen - 1998
13   A Relational Approach to the Automatic Generation of Sequent.. - Stodghill - 1997
12   Automatic nonzero structure analysis - Bik, Wijshoff - 1999
10   Memory hierarchy performance prediction for sparse blocked a.. - Fraguela, Doallo et al. - 1999
10   FLAME: Formal Linear Algebra Methods Environment - Gunnels, Gustavson et al. - 2001
9   Modeling and improving locality for irregular problems: spar.. - Heras, Perez et al. - 1999
7   An adaptive software library for fast fourier transforms - Mirkovic, Mahasoom et al. - 2000
7   Anwendung von generativen Programmiertechniken am Beispiel d.. - Neubert - 1998
6   Algorithms for sparse matrix computations on high-performanc.. - Navarro, García et al. - 1996
6   A rational approach to portable high performance: the Basic .. - Siek, Lumsdaine - 1998
6   On improving the performance of sparse matrix-vector multipl.. - White, Sadayappan - 1997
3   Fast automatic generation of DSP algorithms - Püschel, Singer et al. - 2001
3   Towards an accurate model for collective communications - Vadhiyar, Fagg et al. - 2001

Documents on the same site (http://bebop.cs.berkeley.edu/#pubs):
Automatic Performance Tuning of Sparse Matrix Kernels - Vuduc (2003)
Memory Hierarchy Optimizations and Performance Bounds .. - Vuduc, Gyulassy.. (2003)
Statistical Models for Automatic Performance Tuning - Vuduc, Demmel, Bilmes (2001)
