
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts (1999)  (14 citations)
M. Kandemir, A. Choudhary, N. Shenoy, P. Banerjee, J. Ramanujam
IEEE Transactions on Parallel and Distributed Systems



View or download:
lsu.edu/pub/jxr/papers/...tpds98mtk.PS

From:  lsu.edu/jxr/group

Abstract: This paper presents a data layout optimization technique for sequential and parallel programs based on the theory of hyperplanes from linear algebra. Given a program, our framework automatically determines suitable memory layouts that can be expressed by hyperplanes for each array that is referenced. We discuss the cases where data transformations are preferable to loop transformations and show that under certain conditions a loop nest can be optimized for perfect spatial locality by using...
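
As a rough illustration of what a memory layout "expressed by hyperplanes" can look like, the sketch below assumes one common convention: a vector g describes a layout for a two-dimensional array when elements d1 and d2 are stored together exactly if g . (d1 - d2) = 0. Under that assumption, (1, 0) corresponds to row-major, (0, 1) to column-major, and (1, -1) to a diagonal layout. The function and constant names are hypothetical, and the truncated abstract does not confirm that this is the paper's exact formulation.

```python
from collections import defaultdict

# Hypothetical layout vectors under the assumed convention g . (d1 - d2) = 0.
ROW_MAJOR = (1, 0)      # elements with equal row index share a hyperplane
COLUMN_MAJOR = (0, 1)   # elements with equal column index share a hyperplane
DIAGONAL = (1, -1)      # elements with equal (i - j) share a hyperplane


def same_hyperplane(g, a, b):
    """Return True if index vectors a and b lie on the same hyperplane of g."""
    return sum(gk * (ak - bk) for gk, ak, bk in zip(g, a, b)) == 0


def hyperplane_groups(rows, cols, g):
    """Group the indices of a rows x cols array by the constant c = g . (i, j)."""
    groups = defaultdict(list)
    for i in range(rows):
        for j in range(cols):
            groups[g[0] * i + g[1] * j].append((i, j))
    return dict(groups)


if __name__ == "__main__":
    # Each group lists the elements that would be stored consecutively.
    for c, members in sorted(hyperplane_groups(2, 3, COLUMN_MAJOR).items()):
        print(c, members)
```

Each group holds the elements that the assumed layout would store next to each other, so a loop nest whose innermost loop walks along one group touches consecutive memory.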

Cited by:
Software Methods to Improve Data Locality and Cache Behavior - Beyls (2004)
Optimizing Program Locality through CMEs and GAs - Vera, Abella.. (2003)
Efficient and Accurate Analytical Modeling of Whole-Program Data .. - Xue, Vera (2003)

Similar documents (at the sentence level):
47.8%:   A Data Layout Optimization Technique Based on Hyperplanes - Kandemir, Choudhary.. (1997)
27.7%:   A Hyperplane Based Approach for Optimizing Spatial .. - Kandemir.. (1998)
5.6%:   Enhancing Spatial Locality using Data Layout.. - Kandemir, Choudhary.. (1997)

Active bibliography (related documents):
0.7:   Compiler Algorithms for Optimizing Locality and.. - Kandemir, Ramanujam.. (1997)
0.7:   An ILP Approach for Optimizing Cache Locality - Kandemir, Banerjee.. (1998)
0.6:   A Matrix-Based Approach to Global Locality Optimization - Kandemir, Choudhary.. (1999)

Similar documents based on text:
0.6:   An Integer Linear Programming Approach for Optimizing.. - Kandemir, Banerjee.. (1999)
0.3:   Towards Automatic Synthesis of High-Performance.. - Cociorva.. (2001)
0.3:   A Combined Communication and Synchronization.. - Kandemir.. (1998)

Related documents from co-citation:
12:   A Data Locality Optimizing Algorithm - Wolf, Lam - 1991
10:   Cache miss equations: a compiler framework for analyzing and tuning memory behav.. - Ghosh, Martonosi et al. - 1998
10:   A fast and accurate approach to analyze cache memory behavior - Vera, Llosa et al. - 2000

BibTeX entry:

M. Kandemir, A. Choudhary, N. Shenoy, P. Banerjee, and J. Ramanujam. A linear algebra framework for automatic determination of optimal data layouts. IEEE Transactions on Parallel and Distributed Systems, 10(2):115--135, February 1999. http://citeseer.ist.psu.edu/kandemir99linear.html

@article{ kandemir99linear,
    author = "M. Kandemir and A. Choudhary and N. Shenoy and P. Banerjee and J. Ramanujam",
    title = "A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts",
    journal = "IEEE Transactions on Parallel and Distributed Systems",
    volume = "10",
    number = "2",
    pages = "115--135",
    year = "1999",
    url = "citeseer.ist.psu.edu/kandemir99linear.html" }





Documents on the same site (http://www.ee.lsu.edu/jxr/group.html):
Tiling of Iteration Spaces for Multicomputers - Ramanujam, Sadayappan (1992)
Beyond Unimodular Transformations - Ramanujam (1995)
Efficient Address and Communication Generation for.. - Venkatachar (1996)
