11 citations found.
Dirk Grunwald, Richard Neves. "Whole-program optimization for time and space efficient threads". In Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, Cambridge, Massachusetts. 1996

This paper is cited in the following contexts:
Mondrian Memory Protection - Witchel, Cates, Asanovic (2002)   (21 citations)

....combined with byte-level translation, we discover additional opportunities for implementing system services. We explored one application in detail, zero-copy networking, in Section 5.3. A persistent problem for supporting large numbers of user threads is the space occupied by each thread's stack [11]. Each thread needs enough stack to operate, but reserving too much stack space wastes memory. With paged virtual memory, stack must be allocated in page-sized chunks. This strategy requires a lot of physical memory to support many threads, even though most threads don't need a page worth of stack ....

D. Grunwald and R. Neves. Whole-program optimization for time and space efficient threads. In Proceedings of ASPLOS-VII, Oct 1996.
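
The stack-space argument in the excerpt above can be illustrated with a little arithmetic. The following is only a sketch: the 4096-byte page size, the 100,000-thread count, the 200 bytes of stack actually used, and the 64-byte finer granule are all assumed values chosen for illustration, and `stack_memory` is a hypothetical helper, not something taken from the cited papers.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)


def stack_memory(num_threads, bytes_used_per_thread, granule):
    """Total bytes reserved when each stack is rounded up to a multiple of granule."""
    # Ceiling division: number of granules needed to hold one thread's stack.
    granules = -(-bytes_used_per_thread // granule)
    return num_threads * granules * granule


# Assumed workload: many threads, each needing far less than one page of stack.
paged = stack_memory(100_000, 200, PAGE_SIZE)  # page-granularity allocation
fine = stack_memory(100_000, 200, 64)          # hypothetical 64-byte granularity

print(f"page-sized chunks: {paged / 2**20:.1f} MiB")
print(f"64-byte chunks:    {fine / 2**20:.1f} MiB")
```

Under these assumptions, every thread is charged a full page even though it uses only a small fraction of it, which is the waste the excerpt describes.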


Portable High-Performance Programs - Frigo (1999)   (1 citation)

....Figure 2-5. The generated C code has the same general structure as the C elision, with a few additional statements. In lines 4-5, an activation frame is allocated for fib and initialized. The Cilk runtime system uses activation frames to represent procedure instances. Using techniques similar to [72, 73], our inlined allocator typically takes only a few cycles. The frame is initialized in line 5 by storing a pointer to a static structure, called a signature, describing fib. The first spawn in fib is translated into lines 12-18. In lines 12-13, the state of the fib procedure is saved into the ....

D. GRUNWALD AND R. NEVES, Whole-program optimization for time and space efficient threads, in Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Cambridge, Massachusetts, Oct. 1996, pp. 50--59.


Pthreads for Dynamic and Irregular Parallelism - Narlikar, Blelloch (1998)   (4 citations)

....the user to determine the default thread stack size may be useful. However, predicting the required stack size can be difficult for some applications. In such cases, instead of conservatively allocating an extremely large stack, a technique such as stacklets [22] or whole-program optimization [24] could be used to dynamically and efficiently extend stacks. ....

[Table fragment captured with the excerpt. Columns: Benchmark; Problem Size; Coarse gr Speedup; Fine gr orig sched Speedup, Threads; Fine gr new sched Speedup, Threads. Values as extracted: Matrix Mult., 1024 × 1024: 3.65, 1977, 6.56, 59. Barnes Hut, N = 100K, Plummer: 7.53, 5.76, 860, 7.80, 34. FMM, N = 10K, 5 ....]

Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 50--59, Cambridge, Massachusetts, 1--5 October 1996. ACM Press.


Portable High-Performance Programs - Frigo (1999)   (1 citation)

....Figure 2-5. The generated C code has the same general structure as the C elision, with a few additional statements. In lines 4-5, an activation frame is allocated for fib and initialized. The Cilk runtime system uses activation frames to represent procedure instances. Using techniques similar to [72, 73], our inlined allocator typically takes only a few cycles. The frame is initialized in line 5 by storing a pointer to a static structure, called a signature, describing fib. The first spawn in fib is translated into lines 12-18. In lines 12-13, the state of the fib procedure is saved into the ....

D. GRUNWALD AND R. NEVES, Whole-program optimization for time and space efficient threads, in Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Cambridge, Massachusetts, Oct. 1996, pp. 50--59.


Cilk: Efficient Multithreaded Computing - Randall (1998)   (3 citations)

....Figure 3-2. The generated C code has the same general structure as the C elision, with a few additional statements. In lines 4-5, an activation frame is allocated for fib and initialized. The Cilk runtime system uses activation frames to represent procedure instances. Using techniques similar to [49, 50], our inlined allocator typically takes only a few cycles. The frame is initialized in line 5 by storing a pointer to a static structure, called a signature, describing fib. The first spawn in fib is translated into lines 12-17. In lines 12-13, the state of the fib procedure is saved into the ....

Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 50--59, Cambridge, Massachusetts, October 1996.


Better Operating System Features for Faster Network Servers - Gaurav Banga (1998)   (22 citations)

....the use and performance of purely multithreaded servers, even though several important high-performance servers, e.g. AltaVista, have successfully adopted this approach. There has been a lot of research in the runtime systems community on improving the performance of massively threaded applications [9, 15] by reducing the storage management overhead. However, these approaches have not yet been applied to general-purpose operating systems. In this paper, we will concentrate on providing support for an event-based control model. 3.3 Scheduling and resource management Most operating systems treat a ....

D. Grunwald and R. Neves. Whole-Program Optimization for Time and Space Efficient Threads. In Proc. of the 7th Intl. Conf. on Arch. Support for Prog. Lang. and Operating Systems, Cambridge, MA, Oct. 1996.


The Implementation of the Cilk-5 Multithreaded Language - Frigo, Leiserson, Randall (1998)   (70 citations)

....in Figure 3. The generated C code has the same general structure as the C elision, with a few additional statements. In lines 4-5, an activation frame is allocated for fib and initialized. The Cilk runtime system uses activation frames to represent procedure instances. Using techniques similar to [16, 17], our inlined allocator typically takes only a few cycles. The frame is initialized in line 5 by storing a pointer to a static structure, called a signature, describing fib. The first spawn in fib is translated into lines 12-18. In lines 12-13, the state of the fib procedure is saved into the ....

Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 50--59, Cambridge, Massachusetts, October 1996.


Exploiting Dead Value Information - Martin, Roth, Fischer (1997)   (19 citations)

....normally require that the architectural processor state, including the values of all architectural registers, be preserved. While the cost of saving and restoring this state is not significant for context switches [15], it dominates thread switch overhead, especially for fine-grained threaded code [1, 9]. Non-preemptive switches are implemented using a procedure call interface, allowing the compiler to generate specialized save and restore code at these well-defined switch points based on static liveness information [9]. Preemptive switches are not amenable to such static analysis or optimization ....

.... dominates thread switch overhead, especially for fine-grained threaded code [1, 9]. Non-preemptive switches are implemented using a procedure call interface, allowing the compiler to generate specialized save and restore code at these well-defined switch points based on static liveness information [9]. Preemptive switches are not amenable to such static analysis or optimization and must conservatively save and restore all registers. We propose that DVI be used in multi-threaded programs to optimize saves and restores dynamically. Unlike previously proposed solutions, our solution does not ....

[Article contains additional citation context not shown here]

Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 50--59, 1--5 October 1996.


Better Operating System Features for Faster Network Servers - Banga, Druschel, Mogul (1998)   (22 citations)

....of purely multi-threaded servers, even though several important high-performance servers, such as the AltaVista front end [8], have successfully adopted this approach. There has been a lot of research in the runtime systems community on improving the performance of massively threaded applications [10, 15, 19] by reducing the storage management overhead. However, these approaches have not yet been applied to general-purpose operating systems. In this paper, we will concentrate on providing operating system support for an event-based control model. 3.3 Scheduling and resource management Most operating ....

....necessary to exploit the full power of a multiprocessor. A hybrid model, using a moderate number of threads and an event-based notification mechanism, may be best for Internet servers. As mentioned in Section 3.2, the issues involved in efficiently supporting threads are relatively well understood [3, 10, 19]. Thus, we concentrate here on efficient support for event-driven servers. 4.1 Resource containers We propose a new model for fine-grained resource management and scheduling. This model is based on a new operating system abstraction called a resource con Application domain really extends ....

D. Grunwald and R. Neves. Whole-Program Optimization for Time and Space Efficient Threads. In Proceedings of the 7th Intl. Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, Oct. 1996.


Pthreads for Dynamic Parallelism - Narlikar, Blelloch (1998)

....size of 1MB was chosen in Solaris. For example, the space required for a recursive function in a thread may vary depending on the input data. For such cases, an alternate strategy to conserve stack space is required to efficiently support a large number of threads, such as the one suggested in [22]. 3. LIFO scheduler. Next, we modified the scheduling queue to be last-in-first-out (LIFO) instead of FIFO. The motivation for this change was to reduce the total number of active threads created by the library at any time during the execution. A FIFO queue executes the threads in a breadth-first ....

Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 50--59, Cambridge, Massachusetts, 1--5 October 1996. ACM Press.
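
The motivation for the LIFO scheduler in the excerpt above can be sketched with a toy simulation. This is not the authors' scheduler: it assumes a hypothetical workload in which every thread forks two children down to a fixed depth, and it counts only threads waiting in the ready queue. The function name `peak_ready_threads` and the depth of 10 are illustrative choices.

```python
from collections import deque


def peak_ready_threads(depth, lifo):
    """Peak ready-queue length for a binary fork tree of the given depth."""
    queue = deque([0])  # each entry is the tree level of a ready thread
    peak = 1
    while queue:
        # LIFO takes the most recently created thread; FIFO takes the oldest.
        level = queue.pop() if lifo else queue.popleft()
        if level < depth:
            # The running thread forks two children, then finishes.
            queue.append(level + 1)
            queue.append(level + 1)
        peak = max(peak, len(queue))
    return peak


print("FIFO peak:", peak_ready_threads(10, lifo=False))
print("LIFO peak:", peak_ready_threads(10, lifo=True))
```

In this toy model the FIFO queue expands the tree level by level, so its peak grows exponentially with depth, while the LIFO queue follows one branch at a time and its peak grows only linearly.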


Using Runtime Information for Adapting Enterprise Java.. Servers - Mircea Trofin

No context found.

Dirk Grunwald, Richard Neves. "Whole-program optimization for time and space efficient threads". In Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, Cambridge, Massachusetts. 1996
