Results 1 - 10 of 36
Uniprocessor Garbage Collection Techniques
- SUBMITTED TO ACM COMPUTING SURVEYS
Cited by 416 (5 self)
We survey basic garbage collection algorithms, and variations such as incremental and generational collection; we then discuss low-level implementation considerations and the relationships between storage management systems, languages, and compilers. Throughout, we attempt to present a unified view based on abstract traversal strategies, addressing issues of conservatism, opportunism, and immediacy of reclamation; we also point out a variety of implementation details that are likely to have a significant impact on performance.
Dynamic storage allocation: A survey and critical review
, 1995
Cited by 187 (6 self)
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.
Cache-Conscious Structure Layout
, 1999
Cited by 164 (8 self)
Hardware trends have produced an increasing disparity between processor speeds and memory access times. While a variety of techniques for tolerating or reducing memory latency have been proposed, these are rarely successful for pointer-manipulating programs. This paper explores a complementary approach that attacks the source (poor reference locality) of the problem rather than its manifestation (memory latency). It demonstrates that careful data organization and layout provides an essential mechanism to improve the cache locality of pointer-manipulating programs and, consequently, their performance. It explores two placement techniques, clustering and coloring, that improve cache performance by increasing a pointer structure's spatial and temporal locality, and by reducing cache conflicts. To reduce the cost of applying these techniques, this paper discusses two strategies, cache-conscious reorganization and cache-conscious allocation, and describes two semi-automatic tools, ccmorph and ccmalloc, that use these strategies to produce cache-conscious pointer structure layouts. ccmorph is a transparent tree reorganizer that utilizes topology information to cluster and color the structure. ccmalloc is a cache-conscious heap allocator that attempts to co-locate contemporaneously accessed data elements in the same physical cache block. Our evaluations, with microbenchmarks, several small benchmarks, and a couple of large real-world applications, demonstrate that the cache-conscious structure layouts produced by ccmorph and ccmalloc offer large performance benefits, in most cases significantly outperforming state-of-the-art prefetching.
Using generational garbage collection to implement cache-conscious data placement
- In Proceedings of the International Symposium on Memory Management
, 1998
Cited by 90 (11 self)
The cost of accessing main memory is increasing. Machine designers have tried to mitigate the consequences of the processor and memory technology trends underlying this increasing gap with a variety of techniques to reduce or tolerate memory latency. These techniques, unfortunately, are only occasionally successful for pointer-manipulating programs. Recent research has demonstrated the value of a complementary approach, in which pointer-based data structures are reorganized to improve cache locality. This paper studies a technique for using a generational garbage collector to reorganize data
Oil and Water? High Performance Garbage Collection in Java with MMTk
- In ICSE 2004, 26th International Conference on Software Engineering
, 2004
Cited by 81 (18 self)
Increasingly popular languages such as Java and C# require efficient garbage collection. This paper presents the design, implementation, and evaluation of MMTk, a Memory Management Toolkit for and in Java. MMTk is an efficient, composable, extensible, and portable framework for building garbage collectors. MMTk uses design patterns and compiler cooperation to combine modularity and efficiency. The resulting system is more robust, easier to maintain, and has fewer defects than monolithic collectors. Experimental comparisons with monolithic Java and C implementations reveal MMTk has significant performance advantages as well. Performance-critical system software typically uses monolithic C at the expense of flexibility. Our results refute common wisdom that only this approach attains efficiency, and suggest that performance-critical software can embrace modular design and high-level languages.
Pointer Swizzling at Page Fault Time: Efficiently and Compatibly Supporting Huge Address Spaces on Standard Hardware
- Computer Architecture News
, 1992
Cited by 78 (0 self)
Pointer swizzling at page fault time is a novel address translation mechanism that exploits conventional address translation hardware. It can support huge address spaces efficiently without long hardware addresses; such large address spaces are attractive for persistent object stores, distributed shared memories, and shared address space operating systems. This swizzling scheme can be used to provide data compatibility across machines with different word sizes, and even to provide binary code compatibility across machines with different hardware address sizes. Pointers are translated ("swizzled") from a long format to a shorter hardware-supported format at page fault time. No extra hardware is required, and no continual software overhead is incurred by presence checks or indirection of pointers. This pagewise technique exploits temporal and spatial locality in much the same way as a normal virtual memory; this gives it many desirable performance characteristics, especially given the tr...
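The translation step this abstract describes can be sketched roughly as follows, in Python. This is only an illustrative model, not the paper's implementation: the `SwizzlingStore` class, its method names, and the 4096-byte `PAGE_SIZE` are all hypothetical, and a Python dictionary stands in for the hardware page tables. The sketch also assumes that a page referenced by a swizzled pointer gets a short address reserved for it without being loaded.

```python
PAGE_SIZE = 4096  # assumed page size, chosen only for this sketch


class SwizzlingStore:
    """Toy model: long (persistent) page ids are mapped to short virtual pages."""

    def __init__(self, persistent_pages):
        # persistent_pages: long page id -> list of long pointers,
        # each pointer being a (long_page_id, offset) pair.
        self.persistent = persistent_pages
        self.page_table = {}   # long page id -> short virtual page number
        self.resident = {}     # short virtual page number -> swizzled short addresses
        self.next_vpage = 1

    def _reserve(self, long_page):
        """Assign a short virtual page to a long page id without loading it."""
        if long_page not in self.page_table:
            self.page_table[long_page] = self.next_vpage
            self.next_vpage += 1
        return self.page_table[long_page]

    def fault_in(self, long_page):
        """Load a page, rewriting every long pointer in it to a short address."""
        vpage = self._reserve(long_page)
        if vpage not in self.resident:
            self.resident[vpage] = [
                self._reserve(target) * PAGE_SIZE + offset
                for target, offset in self.persistent[long_page]
            ]
        return vpage


# Page 2**70 holds one pointer into page 2**80 at offset 16.
store = SwizzlingStore({2**70: [(2**80, 16)], 2**80: []})
vpage = store.fault_in(2**70)
```

After the fault, the loaded page contains only short addresses, and the page it points to has a reserved short address but is not yet resident; touching it would trigger the next fault.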
Improving the Cache Locality of Memory Allocation
, 1993
Cited by 69 (8 self)
The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programmers concern themselves with details of memory allocators; most assume that memory allocators provided by the system perform well. This paper presents a performance evaluation of the reference locality of dynamic storage allocation algorithms based on trace-driven simulation of five large allocation-intensive C programs. In this paper, we show how the design of a memory allocator can significantly affect the reference locality for various applications. Our measurements show that poor locality in sequential-fit allocation algorithms reduces program performance by increasing both paging and cache miss rates. While increased paging can be debilitating on any architecture, cache miss rates are also important for modern computer architectures. We show that algorithms attempting to be space-efficient by coalescing adjacent free objects show poor reference locality, possibly negating the benef...
The Case for Compressed Caching in Virtual Memory Systems
- IN PROCEEDINGS OF THE 1999 USENIX ANNUAL TECHNICAL CONFERENCE
, 1999
Cited by 43 (2 self)
Compressed caching uses part of the available RAM to hold pages in compressed form, effectively adding a new level to the virtual memory hierarchy. This level attempts to bridge the huge performance gap between normal (uncompressed) RAM and disk. Unfortunately, previous studies did not show a consistent benefit from the use of compressed virtual memory. In this study, we show that technology trends favor compressed virtual memory: it is attractive now, offering reduction of paging costs of several tens of percent, and it will be increasingly attractive as CPU speeds increase faster than disk speeds. Two of the elements of our approach are innovative. First, we introduce novel compression algorithms suited to compressing in-memory data representations. These algorithms are competitive with more mature Ziv-Lempel compressors, and complement them. Second, we adaptively determine how much memory (if any) should be compressed by keeping track of recent program behavior. This solves the...
A Comparative Performance Evaluation of Write Barrier Implementations
, 1992
Cited by 41 (11 self)
Generational garbage collectors are able to achieve very small pause times by concentrating on the youngest (most recently allocated) objects when collecting, since objects have been observed to die young in many systems. Generational collectors must keep track of all pointers from older to younger generations, by "monitoring" all stores into the heap. This write barrier has been implemented in a number of ways, varying essentially in the granularity of the information observed and stored. Here we examine a range of write barrier implementations and evaluate their relative performance within a generation scavenging garbage collector for Smalltalk.
1 Introduction
Generational collectors achieve short collection pause times partly because they separate heap-allocated objects into two or more generations and do not process all generations during each collection. Empirical studies have shown that in many programs most objects die young, so separating objects by age and focusing collecti...
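As a rough illustration of the write barrier this abstract describes, the Python sketch below records old-to-young pointers in a remembered set at object granularity. The `Obj` and `Heap` classes and their method names are hypothetical, invented for this example; they are not taken from the paper, which compares several barrier variants inside a Smalltalk collector.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so objects can live in sets
class Obj:
    """A heap object with a generation number (0 = youngest) and pointer fields."""
    name: str
    generation: int = 0
    fields: dict = field(default_factory=dict)


class Heap:
    def __init__(self):
        # Remembered set: older objects that may hold pointers into a younger generation.
        self.remembered = set()

    def write(self, src, slot, target):
        """Store target into src.fields[slot], passing through the write barrier."""
        if target is not None and src.generation > target.generation:
            self.remembered.add(src)  # old-to-young pointer: remember the source object
        src.fields[slot] = target

    def young_roots(self):
        """Younger objects directly referenced from remembered older objects."""
        return {
            t
            for s in self.remembered
            for t in s.fields.values()
            if t is not None and t.generation < s.generation
        }


heap = Heap()
old = Obj("old", generation=1)
young = Obj("young")
heap.write(old, "child", young)   # old-to-young store: recorded by the barrier
heap.write(young, "parent", old)  # young-to-old store: not recorded
```

A collection of the young generation can then treat `heap.young_roots()` as extra roots instead of scanning the whole old generation. Recording whole objects is only one possible granularity; finer or coarser choices trade barrier cost against scanning cost.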
Connectivity-Based Garbage Collection
, 2003
Cited by 34 (7 self)
We introduce a new family of connectivity-based garbage collectors (Cbgc) that are based on potential object-connectivity properties. The key feature of these collectors is that the placement of objects into partitions is determined by performing one of several forms of connectivity analyses on the program. This enables partial garbage collections, as in generational collectors, but without the need for any write barrier.

