22 citations found.
A. Diwan, D. Tarditi, and E. Moss. Memory system performance of programs with intensive heap allocation. ACM Trans. Comput. Syst., 13(3):244--273, Aug. 1995.

This paper is cited in the following contexts:
The Cache Behaviour of Large Lazy Functional - Stock   (Correct)

....as is done in a write allocate cache; the rest of the line will soon be overwritten. It would be better to write the word directly to the D1 cache and invalidate the rest of the cache line. This can be achieved by using a write allocate cache with sub block placement, as noted by Diwan et al. [6]. However, such caches are now a rarity. An equally effective approach would be to use a write invalidate instruction instead of a normal write. Some architectures have write invalidate instructions, but unfortunately the x86 is not one of them. An alternative is to use prefetching. Because writes ....

....better performance than write no allocate caches, because most data is referenced almost immediately after allocation. Diwan, Tarditi and Moss made very detailed simulation of Standard ML programs, including the effects of parts of the memory system usually ignored such as the write buffer and TLB [6]. They compared different write miss strategies, and found that using sub block placement cut cache miss rates significantly. Gonçalves and Appel also made detailed measurements of Standard ML programs [8]. They found the miss rates of SML/NJ programs could be lower than SPEC92 C and Fortran ....

A. Diwan, D. Tarditi, and E. Moss. Memory-system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244-273, Aug. 1995.


Real-Time Performance of Dynamic Memory Allocation Algorithms - Puaut (2002)   (2 citations)  (Correct)

....satisfy future requests for larger blocks (fragmentation problem). The goal when designing an allocator is usually to minimize this wasted space without undue time cost. The time and memory performance of dynamic memory allocators are very often evaluated using simulation, using either real traces [1, 7] or synthetic traces [5, 15]. Most efforts on the evaluation of dynamic memory allocators have focused on evaluating the average behavior of allocators, with respect to allocation times and wasted memory due .... (14th Euromicro Conference on Real-Time Systems, Vienna, Austria, June 2002, pages 41-49.)

A. Diwan, D. Tarditi, and E. Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, Aug. 1995.


Real-Time Performance of Dynamic Memory Allocation Algorithms - Puaut (2002)   (2 citations)  (Correct)

....blocks. This problem is known as fragmentation. The goal when designing an allocator is usually to minimize this wasted space without undue time cost, or vice versa. The time and memory performance of dynamic memory allocators are very often evaluated using simulation, using either real traces [DTM95, NG95, Joh97] or synthetic traces [Knu73, ZG94]. Most efforts on the evaluation of dynamic memory allocators have focused on evaluating the average behavior of allocators, with respect to allocation times and wasted memory due to fragmentation. While the average execution time of a program is suited as a ....

A. Diwan, D. Tarditi, and E. Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


Properties Of Age-Based Automatic Memory Reclamation Algorithms - Stefanovic (1999)   (8 citations)  (Correct)

.... management of generations have reported good performance [Caudill and Wirfs-Brock, 1986; Courts, 1988; Shaw, 1988; Sobalvarro, 1988; Ungar and Jackson, 1988; Wilson, 1989; Wilson and Moher, 1989; Appel, 1989b; Wilson et al. 1991; Hudson et al. 1991; Stefanovic, 1993b; Stefanovic and Moss, 1994; Diwan et al. 1995; Barrett and Zorn, 1995]. The performance of a collector is affected by a number of factors: the number of generations in the system; the promotion policy, or how long an object must remain in one generation before it is advanced into the next older; when to initiate collection; and how to track ....

....This focuses attention on the primary costs, namely copying and pointer maintenance. However, we cannot measure certain other costs, without integrating our implementation with a live object system. The chief of these costs is the effect on locality [Zorn, 1991; Reinhold, 1993; Goncalves, 1995; Diwan et al. 1995; Appel and Shao, 1996]. The schemes that we found to perform well with respect to copying cost also share the property that they do not change the relative order of objects; think of them as compacting. Therefore, they do not adversely affect the locality of access within the mutator. Unlike ....

Amer Diwan, David Tarditi, and J. Eliot B. Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


Stack-Based Typed Assembly Language - Morrisett, Crary, Walker, Glew (1998)   (44 citations)  (Correct)

....are also compelling reasons for providing support for stacks. First, Appel and Shao's work did not consider imperative languages, such as Java, where the ability to share environments is greatly reduced nor did it consider languages that do not require garbage collection. Second, Tarditi and Diwan [13, 12] have shown that with some cache architectures, heap allocation of continuations (as in SML/NJ) can have substantial overhead due to a loss of locality. Third, stack based activation records can have a smaller memory footprint than heap based activation records. Finally, many machine architectures ....

Amer Diwan, David Tarditi, and Eliot Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


Validating an Architectural Simulator - Nahum (1996)   (Correct)

....goal of this simulator has been to accurately model performance costs for our SGI machine. Much of the simulation literature discusses the tradeoff between speed and accuracy, and describes techniques for making simulations fast. However, accuracy is rarely discussed (notable exceptions include [2, 4, 5]) and the tradeoff between accuracy and speed has not been quantitatively evaluated. Given that our simulator is meant to capture performance costs, it must be more than an emulator that duplicates the execution semantics of the hardware or counts events such as cache misses. The simulator should ....

Amer Diwan, David Tarditi, and Eliot Moss. Memory-system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, 1995.


Research Demonstration of a Hardware Reference-Counting Heap - Wise, Heck, Hess, Hunt, Ost (1997)   (2 citations)  (Correct)

....requires parallelism to keep apace. Conventional garbage collection, however, assembles global knowledge ("Nothing points here"); thus, any multiprocessor realization requires much synchronization, which is not a constraint on uniprocessors where garbage collection is thought to be perfected [1, 11]. The asynchronous, atomic transactions of reference counting [8] make it the heap manager of choice for a multiprocessor or multitasking system. It is commonly used to recover available sectors from a shared disk, where a traversing collector is only rarely used (e.g. UNIX's fsck). In contrast, ....

....then it must apply both to large computations, where the overhead of process scheduling can be amortized, and to new problems so the cost for software revision can be justified. However, exactly this class of large, parallel programs is where current garbage collection technology fades [11]. With storage management essential to modern programming languages, like Smalltalk, ML, Lisp, Haskell, and Java, we must either abandon languages that depend on automatic storage management and cast our parallel programs in the likes of C, or find strategies for managing dynamic storage on ....

A. Diwan, D. Tarditi, & E. Moss. Memory-system performance of programs with intensive heap allocation. ACM Trans. Comput. Sys. 13, 3 (August 1995), 244--273.


Stack-Based Typed Assembly Language - Greg Morrisett (1998)   (44 citations)  (Correct)

....reasons for providing support for stacks. First, almost all compilers use a stack based architecture. Second, Tarditi and Diwan have shown that with the wrong kind of cache architecture, heap allocation of continuations (as in SML/NJ) can have substantial overhead due to a loss of locality [DTM95, DTM94]. Third, Appel and Shao do not consider imperative languages, such as Java, where the ability to share environments is greatly reduced nor do they consider languages that do not require garbage collection. Finally, many machine architectures have hardware devices that expect programs to ....

Amer Diwan, David Tarditi, and Eliot Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


Networking Support For High-Performance Servers - Nahum (1997)   (Correct)

....and negative individual values from canceling each other out. Note the average error is under 5 percent, with the worst case error being about 15 percent. We are aware of only a very few pieces of work that use trace driven or execution driven simulation that actually validate their simulators [10, 25, 36]. Our accuracy is comparable to theirs. More details about the construction and validation of the simulator can be found in the appendix. 4.4 Characterization and Analysis In this Section, we present our characterization and analysis of memory reference behavior of network protocols under a ....

....goal of this simulator has been to accurately model performance costs for our SGI machine. Much of the simulation literature discusses the tradeoff between speed and accuracy, and describes techniques for making simulations fast. However, accuracy is rarely discussed (notable exceptions include [10, 25, 36]) and the tradeoff between accuracy and speed has not been quantitatively evaluated. Given that our simulator is meant to capture performance costs, it must be more than an emulator that duplicates the execution semantics of the hardware or counts events such as cache misses. The simulator ....

Diwan, A., Tarditi, D., and Moss, E. Memory-system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, 1995.


Comparing Mostly-Copying and Mark-Sweep Conservative Collection - Smith, Morrisett (1998)   (13 citations)  (Correct)

....with p, and hence collection is typically proportional to the size of the live data. 2.2.2 Design Details Our implementation of the above algorithm is designed with one overarching aim: fast allocation. Several studies in the literature have suggested that allocation costs can be significant [13, 12, 15], and are in fact one of the main advantages of copying collection. We were interested in determining whether we could reap these same benefits in a mostly copying setting. To achieve fast allocation, clients allocate objects contiguously on a single page until the page is full. Thus MCC achieves ....

A. Diwan, D. Tarditi, and E. Moss. Memory-system performance of programs with intensive heap allocation. Transactions on Computer Systems, 13(3):244--273, Aug. 1995.


Stack-Based Typed Assembly Language - Morrisett, Crary, Glew, Walker (1998)   (44 citations)  (Correct)

....also compelling reasons for providing support for stacks. First, Appel and Shao's work did not consider imperative languages, such as Java, where the ability to share environments is greatly reduced, nor did it consider languages that do not require garbage collection. Second, Tarditi and Diwan [14, 13] have shown that with some cache architectures, heap allocation of continuations (as in SML/NJ) can have substantial overhead due to a loss of locality. Third, stack based activation records can have a smaller memory footprint than heap based activation records. Finally, many machine ....

Amer Diwan, David Tarditi, and Eliot Moss. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


A Smalltalk Memory Profiler and its Performance Enhancement - Jingyu Sun (1997)   (1 citation)  (Correct)

....times for the usual situation in which the user is monitoring performance in specific classes. Work is underway to improve the performance further. 1. Introduction. Performance has always been a concern with object oriented software. Much attention has been focused on garbage collection [CM 96, DTM 95, Zorn 93, Ungar 87] and method dispatch [AGS 94, Driesen 93, HC 92, Rose 88] but little work has been done on other aspects of performance. We believe that important performance gains can be achieved by reducing the amount of memory allocated by o-o programs. A reduction in memory allocations ....

Diwan, Amer; Tarditi, David; Moss, Eliot, "Memory system performance of programs with intensive heap allocation," ACM Transactions on Computer Systems 13:3, August 1995, p. 244-273.


Comparing Mostly-Copying and Mark-Sweep Conservative Collection - Smith, Morrisett (1998)   (13 citations)  (Correct)

....hence collection is typically proportional to the size of the live data. 2.2.2 Design Details Our implementation of the above algorithm is designed with one overarching aim: allocation should be fast. Several studies in the literature have suggested that allocation costs can be very significant [13, 12, 15], and are in fact one of the main advantages of copying collection. We were interested in determining whether we could reap these same benefits in a mostly copying setting. To achieve fast allocation, clients allocate objects contiguously on a single page until the page is full. Thus MCC achieves ....

A. Diwan, D. Tarditi, and E. Moss. Memory-system performance of programs with intensive heap allocation. Transactions on Computer Systems, 13(3):244--273, Aug. 1995.


The Structure and Performance of Interpreters - Romer, Lee, Voelker, Wolman.. (1996)   (37 citations)  (Correct)

.... Wortman 75, Elshoff 76]. Similar studies on CISCs provided the rationale for moving to RISC processors [Clark Levy 82, Hennessy Patterson 90]. More recently, researchers have looked at the interaction of the memory system and various object oriented and functional languages [Calder et al. 94, Diwan et al. 95, Goncalves Appel 95, Holzle Ungar 95]. Researchers have also studied the interaction of particular classes of applications with architecture: for example, [Maynard et al. 94] and [Uhlig et al. 95] studied the memory system behavior of commercial and productivity applications. In a similar ....

Diwan, A., Tarditi, D., and Moss, E. Memory System Performance of Programs with Intensive Heap Allocation. ACM Transactions on Computer Systems, 13(3):244--273, August 1995.


Understanding and Improving the Performance of Modern Programming.. - Diwan (1997)   (1 citation)  Self-citation (Diwan)   (Correct)

....the critical path, and indirectly by diluting the control flow information available to the compiler. Finally, garbage collection degrades performance by changing the memory requirements and memory behavior of programs and by requiring bookkeeping code to support garbage collection. In prior work [60, 40, 103, 41, 104, 42], we have addressed the performance of garbage collection and thus we do not discuss it in this dissertation. Since objects, method invocations, and garbage collection degrade performance, programs written in modern languages typically run slower than equivalent programs written in traditional ....

Diwan, A., Tarditi, D., and Moss, E. Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273, Aug. 1995.


Age-Based Garbage Collection - Stefanovic, McKinley, Moss (1998)   (4 citations)  Self-citation (Moss)   (Correct)

....and indeed a number of altogether different schemes. This focuses attention on the primary cost, namely the copying cost. However, we cannot, without an actual implementation, measure certain other costs. The chief of these is probably the effect on locality [Zorn, 1991; Reinhold, 1993; Diwan et al. 1995]. The schemes that we found to perform well with respect to copying cost also share the property that they do not change the relative order of objects; think of them as compacting. Therefore, they do not adversely affect the locality of access within the mutator. Unlike traditional generational ....

Diwan, A., Tarditi, D., and Moss, J. E. B. (1995). Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273.


Resume - Diwan   Self-citation (Diwan Tarditi Moss)   (Correct)

....work on understanding and improving the performance of garbage collection. I conducted two performance studies of programs compiled with the SML/NJ compiler. In the first study, I tested the commonly held belief that garbage collection leads to poor memory system performance (Diwan et al. 1994; Diwan et al. 1995). I showed this belief to be false: given the right memory system hardware, such as that in the DECstation 5000, garbage collection can have excellent memory system performance. In the second study, I itemized the cost of a garbage collection implementation into its components to discover the ....

....For my dissertation work, I devised techniques to reduce performance overhead of two features of modern programming languages: method invocations and linked structures. To aid in this study, I built the whole-program optimizer (WPO), a test bed for experimenting with whole program optimizations (Diwan, 1995). To eliminate the overhead of method invocations, I implemented a range of analyses and transformations in the WPO. A novel aspect of these analyses is that they are simpler and faster than most existing analyses (Diwan et al. 1996). However, despite their simplicity, the analyses are highly ....

Diwan, A., Tarditi, D., and Moss, E. (1995). Memory system performance of programs with intensive heap allocation. ACM Transactions on Computer Systems, 13(3):244--273.


TIL: A Type-Directed Optimizing Compiler for ML - Tarditi, Morrisett, Cheng (1995)   (139 citations)  Self-citation (Tarditi)   (Correct)

....Further performance analysis of TIL appears in Morrisett's [33] and Tarditi's theses [45]. 5.1 Benchmarks. Table 1 describes the benchmark programs, which range in size from 62 lines to about 2000 lines of code. Some of these programs have been used previously for measuring ML performance [5, 16]. The benchmarks cover a range of application areas including scientific computing, list processing, systems programming, and compilers. We compiled the programs as single closed modules. For Lexgen and Simple, which are standard benchmarks [5], we eliminated functors by hand because TIL does not ....

Amer Diwan, David Tarditi, and Eliot Moss. Memory-System Performance of Programs with Intensive Heap Allocation. Transactions on Computer Systems, August 1995.


Quantifying the Performance of Garbage Collection vs. Explicit .. - Hertz, Berger (2005)   (Correct)

No context found.

A. Diwan, D. Tarditi, and E. Moss. Memory system performance of programs with intensive heap allocation. ACM Trans. Comput. Syst., 13(3):244--273, Aug. 1995.

CiteSeer.IST - Copyright Penn State and NEC