38 citations found.
E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum and M. S. Lam, "Compiler-Directed Page Coloring for Multiprocessors," Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), pp. 244--255, Oct 1996.

This paper is cited in the following contexts:

Characterizing the Memory Behavior of Java Workloads: A .. - Shuf, Serrano, Gupta.. (2000)   (15 citations)  (Correct)

....and validate our simulation results with the help of performance monitor counters on a system with a PowerPC 604e processor. We use virtual addresses in the simulation and assume that the operating system will employ one of the standard page mapping techniques (e.g. page coloring or bin hopping [7, 22]) to reduce the number of conflicts between pages. We used a heap size of 1 GB, which leads to relatively few garbage collections while executing these applications, and is consistent with the choice of large heap sizes in production environments. Hence, our measurements largely reflect the ....

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Proc. of ASPLOS VII, pages 244--255, Oct. 1996.


Colorable Memory - Liedtke (1996)   (Correct)

....cache by coloring strategies. Recent research reported page coloring to influence second level cache miss rates and thus overall system performance. Kessler and Hill [1992] investigated static page coloring and dynamic bin hopping techniques to avoid color conflicts in the second level cache. Bugnion et al. [1996] got even better results on multiprocessors by compiler directed page coloring. Bershad et al. [1994] (see also [Romer et al. 1994]) improved cache performance by monitoring cache misses and dynamically recoloring pages. Liedtke et al. [1996] used coloring techniques to partition the second level ....

Bugnion, E., Anderson, J. M., Mowry, T. C., Rosenblum, M., and Lam, M. S. 1996. Compiler-directed page coloring for multiprocessors. In 7th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Cambridge, MA, pp. 244--255.


Scoped Behaviour for Optimized Distributed Data Sharing - Lu (2000)   (Correct)

....a relatively small range compared to the total run time. Variations have also been observed between the average time of one parallel job and another parallel job using the exact same executable and dataset. These variations are believed to be due to page placement and data cache conflicts [Bugnion et al. 1996]. Since there is no (known) way to control page placement under AIX, parallel jobs are repeated multiple times and the best time (which is still an average of five runs) is reported. Although the times reported are best cases, it should also be noted that the reported real times are also ....

E. Bugnion, J.M. Anderson, T.C. Mowry, M. Rosenblum, and M.S. Lam. Compiler-Directed Page Coloring for Multiprocessors. Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 244--255, Cambridge, Massachusetts, 1--5 October 1996. ACM Press.


A Memory-layout Oriented Run-time Technique for Locality.. - Yan, Zhang, Zhang   (Correct)

....the memory reference patterns of loops and cache architectural information. Compared with a uniprocessor system, a cache coherent shared memory system has more complicated factors that should be considered for locality exploitation, such as data sharing and load imbalance. More recently, reference [2] uses a run time system to color the virtual pages of a program based on both machine-specific parameters and a summary of the array access patterns generated by the high level compiler. This approach still depends on the compile-time static analysis of data access patterns. 1.4. Organization of ....

....that each iteration accesses a contiguous region on each array. In addition, the following hint also should be provided to assist task partitioning. Hint 5: The number of processors, p. [Figure: (a) hints on memory layouts of two accessed arrays, double B[100] and A[200], with the access dimension on B and on A; (b) physical memory layout. Memory layout of A: size = 200 x 8, starting at A[0] = 1000. Memory layout of B: size = 100 x 8, starting at B[0] ....]

[Article contains additional citation context not shown here]

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. Proceedings of ASPLOS'96, pages 244--255, Oct. 1996.


Software Support For Improving Locality in Scientific Codes - Han, Rivera, Tseng (2000)   (8 citations)  (Correct)

.... combining array transpose and loop transformations to improve locality for parallel programs [15]. Amarasinghe et al. demonstrated the utility of array reindexing for parallel applications [3]. Bugnion et al. presented page coloring as a technique to improve cache utilization on multiprocessors [9]. Many researchers have investigated support for irregular computations. Irregular computations generally perform reductions. The importance of identifying and parallelizing reductions in scientific applications is well established [30, 49, 59, 70]. Researchers have investigated both efficient ....

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), Boston, MA, October 1996.


Reducing Cache Misses Using Hardware and Software Page.. - Sherwood, Calder, Emer (1999)   (9 citations)  (Correct)

.... with operating system support to move pages [2]. Software techniques include program restructuring to improve data [6, 26, 7, 25, 29] or instruction cache performance [12, 15, 16, 24, 28], and compiler directed page coloring for multiprocessors to eliminate 2nd level cache misses for arrays [3]. When performing placement of instructions and data, the software approaches have been shown to eliminate a significant amount of cache misses for virtually indexed caches. For a physically indexed cache, the operating system needs to provide support for page col ....

....as our default page allocation algorithm for our baseline configuration. 3.2 Software Guided Page Placement. Custom operating systems, such as Exokernel [10] and V [14], have been designed that allow applications to provide their own page replacement and page mapping policies. Bugnion et al. [3] recently examined using compiler directed page coloring for arrays on multiprocessors. Their approach generates, at run time, a preferred coloring for data pages containing arrays. The coloring is generated using compiler generated analysis for the access patterns and the array sizes provided at ....

[Article contains additional citation context not shown here]

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, October 1996.


Reducing Cache Conflicts by Partitioning and Privatizing Shared.. - Li   (Correct)

....page coloring in the Irix and the NT operating systems, respectively. DEC has implemented bin hopping in the OSF/1 system. With compiler-directed page coloring (or CDPC), the OS takes array reference information from the compiler and then chooses a smart page mapping to avoid bin conflicts [5]. Our experiments show the APP technique to significantly improve cache performance under the page coloring scheme, but we have not experimented with other page mapping schemes. The array padding technique modifies array dimensions in order to change the way array elements are mapped to the cache ....

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum and M. S. Lam. Compiler-directed page coloring for multiprocessors. In ASPLOS VIII, October, 1996.


Themis: Enforcing Titanium Consistency on the NOW - Miyamoto, Liblit (1997)   (1 citation)  (Correct)

....models, though simpler to understand, may severely limit the optimizations that can take place, hampering performance [3]. A balance between the extremes should be reached. 1.1 The Java Consistency Model. The Java consistency model is defined in the original Java language specification [2]. The model is described in terms of actions that may be executed by different parts of the system and the ordering that must be imposed on these actions. This paper interprets the model to be: 1. Locally sequentially consistent. For a single processor, all reads and writes to a given memory ....

....hand optimize its overlap with computation. In the process, however, the original, easily understandable algorithm becomes clouded and hidden. To test the effectiveness of the OrderCache, a naive CG method on OrderCache was pitted against four different versions of CG running on the normal Split C [2] Active Messages runtime (libsplit c). A new version of the Split C runtime (libOC c) re-implemented certain calls to channel through the OrderCache. The mapping from Split C language features into the libOC c runtime is as follows: Split C notation Split C Terminology OrderCache call = ....

[Article contains additional citation context not shown here]

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, M. Lam, Compiler-Directed Page Coloring for Multiprocessors, Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), October, 1996.


Validating an Architectural Simulator - Nahum (1996)   (Correct)

....caches such as those on chip in the R4400, this is accurate. However, in our SGI systems, the second level cache is physically indexed. The mapping between virtual and physical addresses on the real system is determined by the IRIX operating system. It is reported that IRIX 5.3 uses page coloring [3, 9] as a virtual to physical mapping strategy. In page coloring, whenever a new virtual to physical mapping is created, the OS attempts to assign a free physical page so that both the virtual and physical addresses map to the same bin in a physically indexed cache. This way, pages that are adjacent ....

Edouard Bugnion, Jennifer M. Anderson, Todd C. Mowry, Mendel Rosenblum, and Monica S. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), Cambridge MA, October 1996.
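
The page coloring policy described in the context above can be sketched as follows. This is only an illustrative sketch, not IRIX's implementation: the function names, the modulo color rule, and the fall-back to the lowest free page are all assumptions made for the example.

```python
def num_colors(cache_bytes: int, associativity: int, page_bytes: int) -> int:
    """Number of page-sized bins (colors) in a physically indexed cache."""
    return cache_bytes // (associativity * page_bytes)


def page_color(page_number: int, colors: int) -> int:
    """Color (cache bin) of a page, assuming simple modulo indexing."""
    return page_number % colors


def allocate_colored(virtual_page: int, free_pages: set, colors: int):
    """Pick a free physical page whose color matches the virtual page.

    Hypothetical policy: if no free page has the wanted color, fall back
    to the lowest-numbered free page. Returns None if no page is free.
    """
    want = page_color(virtual_page, colors)
    for phys in sorted(free_pages):
        if page_color(phys, colors) == want:
            free_pages.remove(phys)
            return phys
    if not free_pages:
        return None
    fallback = min(free_pages)
    free_pages.remove(fallback)
    return fallback


# Example: 1 MB direct-mapped cache with 4 KB pages gives 256 colors.
colors = num_colors(1 << 20, 1, 4096)
free = {5, 260, 513}
first = allocate_colored(4, free, colors)   # physical page 260 has color 4
second = allocate_colored(4, free, colors)  # no color-4 page left, falls back
```

With this rule, virtual pages that are adjacent get different colors, so they do not compete for the same cache bin as long as matching free pages exist.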


Reducing Cache Misses for CC-NUMA by Careful Page-Mapping - Jian Huang   (Correct)

....reducing cache set conflicts by properly mapping a virtual page to a physical page. Previous works show that the page assignment by the OS can affect the number of set conflicts in caches, and hence program performance, on uniprocessor machines as well as multiprocessors with uniform memory access [1][5]. We expect a similar, if not greater, performance impact on CC-NUMA due to the higher cache miss penalty. Various techniques have been proposed and used in page mapping, including page coloring, bin hopping, best bin, hierarchical method [5], compiler-assisted page coloring [1] and dynamic ....

....memory access [1][5]. We expect a similar, if not greater, performance impact on CC-NUMA due to the higher cache miss penalty. Various techniques have been proposed and used in page mapping, including page coloring, bin hopping, best bin, hierarchical method [5], compiler-assisted page coloring [1] and dynamic re-mapping [9]. SGI adopts the page coloring scheme in its products, while DEC ships OSF/1 with bin hopping. In this paper, we study page mapping techniques in the context of CC-NUMA. We compare their performance and show that page coloring lags behind considerably. A renovated ....

E. Bugnion, et al. Compiler-directed page coloring for multiprocessors. In Proc. of the 7th Int. Sym. on Architectural Support for Programming Languages and Operating Systems, 10/96.


Maximizing Multiprocessor Performance with the SUIF.. - Hall, Anderson.. (1996)   (172 citations)  Self-citation (Bugnion)   (Correct)

No context found.

E. Bugnion et al., "Compiler-Directed Page Coloring for Multiprocessors," Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems, ACM Press, New York, 1996, pp. 244-257.


Achieving High Performance on Digital AlphaServers .. - Anderson, Hall..   Self-citation (Bugnion Anderson Lam)   (Correct)

....An example of how data transformations are used to make data contiguous is shown in Figure 3. Directing Page Placement to Make Data Contiguous. To make the data across multiple arrays that is accessed by the same processor contiguous in shared memory, SUIF uses compiler directed page coloring [7]. With CDPC, the compiler uses its knowledge of the access patterns to direct the operating system's page allocation policy to make each processor's data contiguous in the physical address space. The operating system uses these hints to determine the virtual to physical page mapping at page ....

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), pages 244--257, Cambridge, MA, October 1996.


Memory Forwarding: Enabling Aggressive Data Layout.. - Mowry, Luk (1998)   Self-citation (Mowry)   (Correct)

....the virtual memory system. The operating system can relocate an entire page of memory in the physical address space without breaking the program by simply copying the page and updating its virtual to physical mapping. One cache optimization which exploits this flexibility is page coloring [5, 25], whereby the operating system attempts to avoid mapping conflicts in the large secondary or tertiary caches. Hence we see that by adding a layer of indirection within the memory system, we can move data safely and transparently without any special language or compiler support. Unfortunately, the ....

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. In ASPLOS-VII, pages 244--255, October 1996.


Maximizing Multiprocessor Performance with the SUIF Compiler - Mary Hall (1996)   (170 citations)  Self-citation (Bugnion Anderson Lam)   (Correct)

....so that all data accessed by the same processor is in the same plane of the array. To make the data across multiple arrays that is accessed by the same processor contiguous, we use a technique that involves the cooperation of the compiler and operating system called compiler directed page coloring [5]. The compiler uses its knowledge of the access patterns to direct the operating system's page allocation policy to make each processor's data contiguous in the physical address space. The operating system uses these hints to determine the virtual to physical page mapping at page allocation time. ....

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), Cambridge, MA, October 1996.


The Potential for Using Thread-Level Data Speculation to.. - Steffan, Mowry (1998)   (72 citations)  Self-citation (Mowry)   (Correct)

....to only write parallel programs from now on is unrealistic. Instead, the preferred solution would be for the compiler to parallelize programs automatically. Unfortunately, compilers have only been successful so far at parallelizing the numeric applications commonly run on supercomputers [1, 7, 16]. For single chip multiprocessing to have an impact on most users, we must also find a way to automatically parallelize non numeric applications. One of the primary challenges in automatic parallelization is determining whether data dependences exist between two potential threads that would ....

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-Directed Page Coloring for Multiprocessors. In Proceedings of ASPLOS-VII, pages 244--255, October 1996.


Exploiting Cache Locality At Run-Time - Yan (1998)   (Correct)

No context found.

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum and M. S. Lam, "Compiler-Directed Page Coloring for Multiprocessors," Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), pp. 244--255, Oct 1996.


MESA: Reducing Cache Conflicts by Integrating Static and.. - Xiaoning Ding Dimitrios   (Correct)

No context found.

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-Directed Page Coloring for Multiprocessors. ACM SIGPLAN Notices, 31(9):244--255, Sept. 1996.


A Compiler Perspective on Architectural Evolutions - Nicholas Mitchell Larry (1997)   (2 citations)  (Correct)

No context found.

Edouard Bugnion, Jennifer M. Anderson, Todd C. Mowry, Mendel Rosenblum, and Monica Lam, "Compiler-directed page coloring for multiprocessors," in ASPLOS-VII, Cambridge, MA, Oct. 1996.


Next-Generation Memory Systems - Wang (2004)   (Correct)

No context found.

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 244--257, Cambridge, MA, October 1996.


Cache Optimization for Coarse Grain Task - Parallel Processing Using   (Correct)

No context found.

E. Bugnion, J. M. Anderson, T. C. Mowry, M. R. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. Proc. of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems, Oct. 1996.


Improving the Speed vs. Accuracy Tradeoff for Simulating.. - Durbhakula (1998)   (Correct)

No context found.

E. Bugnion et al. Compiler-Directed Page Coloring for Multiprocessors. In Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems, 1996.


Exploiting Thread-Level Parallelism On . . . - Lo (1998)   (Correct)

No context found.

E. Bugnion, J. M. Anderson, T. C. Mowry, M. Rosenblum, and M. S. Lam. Compiler-directed page coloring for multiprocessors. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 244--255, October 1997.


Compiling for Instruction Cache Performance on a.. - Kumar, Tullsen (2002)   (Correct)

No context found.

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 1996.


Page-Mapping Techniques for CC-NUMA Multiprocessors - Jian Huang   (Correct)

No context found.

E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum and M. S. Lam. Compiler-directed page coloring for multiprocessors. To appear in Proc. of the 7th Int. Sym. on Architectural Support for Programming Languages and Operating Systems, October 1996.


A Framework for Qualitative Performance Prediction - Hsu, Kremer (1998)   (Correct)

No context found.

Edouard Bugnion, Jennifer M. Anderson, Todd C. Mowry, Mendel Rosenblum, and Monica S. Lam. Compiler-directed page coloring for multiprocessors. In Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), pages 244--255, Cambridge, Massachusetts, October 1996. ACM Press.

CiteSeer.IST - Copyright Penn State and NEC