A. LaMarca and R. Ladner, "The Influence of Caches on the Performance of Heaps," in Proceedings of the Eighth ACM/SIAM Symposium on Discrete Algorithms, pp. 370--379, (New Orleans, LA), 1997.
.... (ii) how to optimize locally as well as globally the use of caches; and (iii) how to minimize the work spent in synchronization (barrier calls). The first two are closely related: good data and task partitioning will ensure good locality; coupling such partitioning with cache-sensitive coding (see [27, 28, 29, 34] for discussions) provides programs that take best advantage of the architecture. Minimizing the work done in synchronization barriers is a fairly simple exercise in program analysis, but turns out to be far more difficult in practice: a tree-based gather and broadcast barrier, for instance, will ....
A. LaMarca and R.E. Ladner. The Influence of Caches on the Performance of Heaps. In Proceedings of the Eighth ACM/SIAM Symposium on Discrete Algorithms, pages 370--379, New Orleans, LA, 1997.
.... (ii) how to optimize locally as well as globally the use of caches; and (iii) how to minimize the work spent in synchronization (barrier calls). The first two are closely related: good data and task partitioning will ensure good locality; coupling such partitioning with cache-sensitive coding (see [27, 28, 29, 34] for discussions) provides programs that take best advantage of the architecture. Minimizing the work done in synchronization barriers is a fairly simple exercise in program analysis, but turns out to be far more difficult in practice: a tree-based gather and broadcast barrier, for instance, will ....
A. LaMarca and R.E. Ladner. The Influence of Caches on the Performance of Heaps. ACM Journal of Experimental Algorithmics, 1(4), 1996.
....these code segments and/or data structures. Trace-driven simulations have also been used to develop analytical models of cache behavior. See [4, 15, 19, 22-24], for example, for some ways in which trace-driven simulators have been used in cache performance enhancement studies. LaMarca and Ladner [13] develop a model for a single-level direct-mapped cache. They use this model to analyze the performance of binary heaps and cache-aligned d-heaps. LaMarca and Ladner [14] optimize the cache performance of several sorting methods. Their cache-optimized heapsort and mergesort codes achieve a speedup ....
A. LaMarca and R. E. Ladner. The influence of caches on the performance of heaps. The ACM Journal of Experimental Algorithms, 1(4), 1996.
....of separate arrays, in which simple iterative loops that traverse an array linearly are preferred over pointer dereferencing, in which code is replicated to process each array separately, etc. While we cannot measure exactly how much we gain from this approach, studies of cache-aware algorithms [1, 13, 21-23, 45] indicate that the gain is likely to be substantial: factors of anywhere from 2 to 40 have been reported. New memory hierarchies show differences in speed between cache and main memory that reach two orders of magnitude (even on uniprocessors). 7.3. Low-level algorithmic changes. Unless the ....
A. LaMarca and R. Ladner. The influence of caches on the performance of heaps. In Proceedings of the 8th Symposium on Discrete Algorithms, pp. 370-379. New Orleans, LA, 1997.
.... to optimize locally as well as globally the use of caches; and (iii) how to minimize the work spent in synchronization (barrier calls). The first two are closely related: good data and task partitioning will ensure good locality; coupling such partitioning with cache-sensitive coding (see [27, 28, 29, 34] for discussions) provides programs that take best advantage of the architecture. Minimizing the work done in synchronization barriers is a fairly simple exercise in program analysis, but turns out to be far more difficult in practice: a tree-based gather and broadcast barrier, for instance, ....
A. LaMarca and R. E. Ladner. The Influence of Caches on the Performance of Heaps. In Proceedings of the Eighth ACM/SIAM Symposium on Discrete Algorithms, pages 370--379, New Orleans, LA, 1997.
.... to optimize locally as well as globally the use of caches; and (iii) how to minimize the work spent in synchronization (barrier calls). The first two are closely related: good data and task partitioning will ensure good locality; coupling such partitioning with cache-sensitive coding (see [27, 28, 29, 34] for discussions) provides programs that take best advantage of the architecture. Minimizing the work done in synchronization barriers is a fairly simple exercise in program analysis, but turns out to be far more difficult in practice: a tree-based gather and broadcast barrier, for instance, ....
A. LaMarca and R. E. Ladner. The Influence of Caches on the Performance of Heaps. ACM Journal of Experimental Algorithmics, 1(4), 1996. http://www.jea.acm.org/1996/LaMarcaInfluence/.
....et al. discusses making pointer-based data structures cache-conscious in [5]. He focuses on providing structure layouts to make tree structures cache-conscious. LaMarca and Ladner developed analytical models and showed simulation results predicting the number of cache misses for the heap in [13]. However, the predictions they made were for an isolated heap, and the model they used was the hold model, in which the heap is static for the majority of operations. In our work, we consider Dijkstra's algorithm and Prim's algorithm, in which the heap is very dynamic. In both Dijkstra's ....
A. LaMarca and R. E. Ladner. The Influence of Caches on the Performance of Heaps. ACM Journal of Experimental Algorithmics, 1, 1996.
....studied before in different contexts. In connection with matrices, significant speedups can be achieved by using layouts optimized for the memory hierarchy; see e.g. the paper by Chatterjee et al. [8] and the references it contains. LaMarca and Ladner consider the question in connection with heaps [16]. Among other things, they repeat an experiment performed by Jones [15] ten years earlier, and demonstrate that due to the increased gaps in access time between levels in the memory hierarchy, the d-ary heap has increased competitiveness relative to the pointer-based priority queues. For search ....
A. LaMarca and R. E. Ladner. The influence of caches on the performance of heaps. ACM Journal of Experimental Algorithms, 1:4, 1996.
....of data structures and specifically which optimizations help the most for a given set of graphs. In a similar vein, work has been done to speed up minimum spanning tree algorithms by concentrating on high-level constructs and passing over architecture-based optimizations [9,10]. LaMarca and Ladner [2] analyze the cache influence on heaps, but understandably keep their discussion to a heap-specific point of view. We build on their work by noticing that certain heap operations occur at a higher frequency than others in the minimum spanning tree problem, and we view their work in this light. ....
....assuming a zero memory access cost. Here, we use a more exact analysis to try and predict how these algorithms will behave on a variety of cache-based architectures. III.1 Cache-aligned heap optimization to Prim's algorithm. We began by implementing an optimization to the heap as suggested by [2]. The idea was to insert empty spaces at the beginning of the heap so that siblings to a node would be in the same cache block. We ran several tests with varying graph structures and found that although these sorts of optimizations produced some effect on the run time when the heap size began to ....
Anthony LaMarca and Richard Ladner. The Influence of Caches on the Performance of Heaps. Journal of Experimental Algorithms Vol. 1, 1996.
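The padding idea mentioned in the context above (empty slots at the start of the heap so that siblings share a cache block) can be illustrated with a small index calculation. This is a minimal sketch, not the cited authors' code: it assumes a 0-indexed array-based d-ary heap whose storage starts on a cache-block boundary and whose blocks hold exactly `block_elems` elements. The function names and the `pad` parameter are hypothetical.

```python
def children(i, d):
    """Logical indices of the d children of node i in a 0-indexed d-ary heap."""
    return range(d * i + 1, d * i + d + 1)


def block_of(i, pad, block_elems):
    """Cache block holding logical index i when `pad` empty slots precede the root."""
    return (i + pad) // block_elems


def sibling_blocks(i, d, pad, block_elems):
    """Set of cache blocks touched when scanning all children of node i."""
    return {block_of(c, pad, block_elems) for c in children(i, d)}


# Assumed example: a 4-ary heap with 4 elements per cache block.
# Without padding, each sibling group straddles two blocks.
unpadded = [len(sibling_blocks(i, 4, 0, 4)) for i in range(100)]
# With d - 1 = 3 padding slots, each sibling group fits in one block.
padded = [len(sibling_blocks(i, 4, 3, 4)) for i in range(100)]
```

Under these assumptions, padding by d - 1 slots moves the children of node i to physical positions d(i + 1) through d(i + 1) + d - 1, which is exactly one aligned block.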
.... example is hashing: most textbooks on data structures still advocate using double hashing in preference to linear probing, whereas experimental data clearly indicates that linear probing is the faster method, thanks to its good locality (see [BMQ98]). Pioneering studies by Ladner and his coworkers [LL96, LL97] established that cache optimization was feasible, algorithmically interesting, and worthwhile, even for such old friends as sorting algorithms [ACVW01, LL97, RR99, XZK00] and priority queues [LL96, San00]; indeed, even matrix multiplication, which has been optimized in numerical libraries ....
.... thanks to its good locality (see [BMQ98]). Pioneering studies by Ladner and his coworkers [LL96, LL97] established that cache optimization was feasible, algorithmically interesting, and worthwhile, even for such old friends as sorting algorithms [ACVW01, LL97, RR99, XZK00] and priority queues [LL96, San00]; indeed, even matrix multiplication, which has been optimized in numerical libraries for over 40 years (including optimizations for paging behavior), is amenable to such techniques [ERS90]. Ad hoc reduction in memory usage and improvement in patterns of memory addressing have been ....
A. LaMarca and R. Ladner, The influence of caches on the performance of heaps, ACM J. Exp. Algorithmics 1 (1996), www.jea.acm.org/1996/LaMarcaInfluence/.
....queues. However, due to the usage of explicit or implicit pointers the performance of these structures deteriorates on a two-level memory system. It has been observed by several researchers that a d-ary heap performs better than the normal binary heap on multi-level memory systems (see, e.g., [22, 24]). For instance, a B-ary heap reduces the number of I/Os from $O(\log_2 (N/B))$ (cf. [4]) to $O(\log_B (N/B))$ per operation [24]. Of course, a B-tree [8, 11] could also be used as a priority queue, with which a similar I/O performance is achieved. However, in a virtual memory environment a B-ary ....
A. LaMarca and R. E. Ladner. The influence of caches on the performance of heaps. The ACM Journal of Experimental Algorithmics, volume 1, article 4, 1996.
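The effect of fan-out on path length, which underlies the comparison of binary and B-ary heaps in the context above, can be made concrete by counting tree levels. This is a minimal sketch under stated assumptions: the function name `heap_levels` is hypothetical, the fan-out of 16 is an arbitrary illustrative value, and the count is of tree levels only, not of I/Os in the cited papers' cost model.

```python
def heap_levels(n, d):
    """Number of levels in a complete d-ary tree holding n elements."""
    levels, capacity, width = 0, 0, 1
    while capacity < n:
        capacity += width  # add one more full level
        width *= d         # the next level is d times wider
        levels += 1
    return levels


# Assumed example sizes: one million elements, fan-out 2 versus fan-out 16.
binary_levels = heap_levels(10**6, 2)
wide_levels = heap_levels(10**6, 16)
```

A root-to-leaf path in the wider heap visits fewer levels, at the price of comparing more children per level. When the children of a node share one block, that extra work stays within a single block transfer.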
....maps contemporaneously accessed elements to non-conflicting regions of the cache. Previous research has attacked the processor-memory gap using the above techniques. Wolf and Lam [WL91] exploited cache reference locality to improve the performance of matrix multiplication. LaMarca and Ladner [LL96, LL97] considered the effects of caches on sorting algorithms and improved performance by restructuring these algorithms to exploit caches. In addition, they constructed a cache-conscious heap structure that clustered and aligned heap elements to cache blocks. [CLH98] demonstrated that ....
Anthony LaMarca and Richard E. Ladner. The influence of caches on the performance of heaps. ACM Journal of Experimental Algorithmics, 1, 1996.
....to store a graph. Thus they attempt to normalize for machine effects by using running times relative to the time needed to scan the adjacency structure. More recently several researchers have focused on designing algorithms to improve cache performance by improving the locality of the algorithms [20, 17, 18]. Lebeck and Wood focused on recoding the SPEC benchmarks and also developed a cache profiler to help in the design of faster algorithms. LaMarca and Ladner came up with improved heap and later sorting algorithms by improving locality. They also developed a new methodology for analyzing cache ....
A. LaMarca and R. Ladner. The influence of caches on the performance of heaps. Journal of Experimental Algorithms, Vol.1, 1996.
....the concept of bulk I/O in the experimental section to refine our analysis and better evaluate the I/O behavior of the tested data structures. Previous Work. It has been observed by several researchers that d-ary heaps perform better than the classical binary heaps on multi-level memory systems [16, 14]. Consequently, a variety of external PQs already known in the literature follow this design paradigm by using a multi-way tree as a basic structure. Buffer trees [2, 11] and (M/B)-ary heaps [9, 13] achieve optimal $O((1/B) \log_{M/B} (N/B))$ amortized I/Os per operation, in a sequence of total N ....
A. LaMarca and R.E. Ladner, `The influence of caches on the performances of heaps', Tech. Report 96-02-03, UW University, 1996. To appear in Journal of Experimental Algorithmics.
....fulfilled. We let B denote the size of the blocks transferred between the cache and the memory level above it, and M the capacity of the cache, both measured in elements being manipulated. Recent research papers indicate that a binary heap is not the fastest priority-queue structure (see, e.g., [LaMarca and Ladner 1996; Sanders 1999]) and heapsort based on a binary heap is not the fastest sorting method (see, e.g., [Moret and Shapiro 1991; LaMarca and Ladner 1999]), but we are mainly interested in the performance engineering aspects of the problem. The reason for choosing the heap construction as the topic of ....
....for this: 1) In the average case the number of instructions executed by Williams' program is linear [Hayward and McDiarmid 1991], which is guaranteed in the worst case by Floyd's program. 2) Williams' program accesses the memory more locally than Floyd's program, as shown experimentally by LaMarca and Ladner [1996]. Let $N$ denote the size of the heap being constructed. We prove analytically that, if $M \ge rB\lfloor \log_2 N \rfloor$ for some real number $r > 1$, Williams' program incurs never more than $(2r/(r-1))N/B + O(\log_2 N)$ cache misses. On the other hand, there exists an input which makes Floyd's program to incur ....
LaMarca, A. and Ladner, R. E. 1996. The influence of caches on the performance of heaps. The ACM Journal of Experimental Algorithmics 1, Article 4.
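The two heap-construction strategies compared in the context above can be sketched in their common textbook forms. This is an illustrative sketch only, not the programs analyzed in the citing paper: it assumes a binary min-heap stored in a 0-indexed Python list, and all function names are hypothetical.

```python
def sift_up(a, i):
    """Move a[i] toward the root until its parent is no larger (min-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if a[parent] <= a[i]:
            break
        a[parent], a[i] = a[i], a[parent]
        i = parent


def sift_down(a, i, n):
    """Move a[i] toward the leaves until both children are no smaller."""
    while True:
        child = 2 * i + 1
        if child >= n:
            break
        if child + 1 < n and a[child + 1] < a[child]:
            child += 1  # pick the smaller of the two children
        if a[i] <= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child


def build_williams(items):
    """Repeated insertion: each element in turn joins the heap and is sifted up."""
    a = list(items)
    for i in range(1, len(a)):
        sift_up(a, i)
    return a


def build_floyd(items):
    """Bottom-up construction: sift down from the last internal node to the root."""
    a = list(items)
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    return a


def is_min_heap(a):
    """Check the heap property for every non-root position."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))
```

In this sketch the repeated-insertion loop scans the array front to back and each sift-up walks toward the root, while the bottom-up loop starts in the middle of the array and each sift-down walks toward the leaves. The sketch shows the two access patterns only and makes no claim about measured cache behavior.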
A. LaMarca and R. Ladner, "The Influence of Caches on the Performance of Heaps," in Proceedings of the Eighth ACM/SIAM Symposium on Discrete Algorithms, pp. 370--379, (New Orleans, LA), 1997.
A. LaMarca and R. Ladner, "The Influence of Caches on the Performance of Heaps," ACM Journal of Experimental Algorithmics 1(4), 1996. http://www.jea.acm.org/1996/LaMarcaInfluence/.
A. LaMarca and R. Ladner. The influence of caches on the performance of heaps. ACM Journal of Experimental Algorithmics, 1(4), 1996. Online at www.jea.acm.org/1996/LaMarcaInfluence/.
A. LaMarca and R. Ladner. The influence of caches on the performance of heaps. ACM Journal of Experimental Algorithmics, 1(4), 1996. www.jea.acm.org/1996/LaMarcaInfluence/.
LaMarca, A., & Ladner, R., "The influence of caches on the performance of heaps," ACM J. Exp. Algorithmics 1, 4 (1996), www.jea.acm.org/1996/LaMarcaInfluence.
LaMarca, A., and Ladner, R.E., "The influence of caches on the performance of heaps," ACM J. Exp. Algorithmics 1, 4 (1996), www.jea.acm.org/1996/LaMarcaInfluence/