29 citations found.
C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the 6th ACM Symposium on Principles and Practice of Parallel Programming, (Las Vegas), pp. 90--99, June 1997.

This paper is cited in the following contexts:

Performance Evaluation of a Transactional DSM System - Wende, Schoettner.. (2002)   (7 citations)

....independent variables reside on the same page and are alternatingly accessed by different processors. As a result the page is exchanged repeatedly between the processors. It is crucial to choose an appropriate page size. Large pages can speed up memory access due to the locality of data [7]. On the other side, false sharing increases when larger memory pages are used. 2.1 IVY IVY [2] was one of the first proposed DSM systems. It is a user-level implementation running on a group of network processors, the Apollo Domain system. It implements a page based DSM allowing multiple ....

A. Cox W. Zwaenepoel K. Rajamani C. Amza. Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory. In Principles and Practice of Parallel Programming, 1997.


Efficient Runtime Support for Cluster-Based Distributed Shared.. - Speight (1997)   (3 citations)

.... Even though fine grained access control provides a partial solution to the problem of false sharing, recent studies have shown that while the extra number of bytes sent as a result of false sharing may be high, the increase in the overall number of messages is relatively small [4]. 5.2 Hardware and Hybrid DSM Systems Release consistency was developed as part of the DASH multiprocessor project [36]. In DASH, writes are pipelined to mask the latency of write operations. In contrast, most software DSM systems such as Brazos, Munin, and TreadMarks buffer writes until the ....

....there is a high degree of false or write-write sharing. Brazos does not employ such an adaptive mechanism, but a large reduction in the memory overhead to maintain coherence is achieved in Brazos through the use of multicast. The effects of false sharing in page based DSM systems was examined in [4]. The authors conclude that while false sharing contributes significantly to the number of extra bytes that must be sent to maintain coherence, the number of messages resulting from false sharing is few for many applications. A protocol that aggregates pages together to capture the benefits of ....

C. Amza, K. Rajamany, A. Cox, and W. Zwaenepoel. Tradeoffs between false sharing and aggregation in software distributed shared memory. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 90--99, June 1997.


Active Correlation Tracking - Thitikamol, Keleher (1999)   (1 citation)

....consistency [11] rather than one of many high performance relaxed consistency models. This makes comparisons difficult, as sequentially consistent systems suffer from both false and true sharing. Relaxed consistency models hide false sharing effectively without recourse to multi threading [12]. Thread scheduling algorithms on modern systems, therefore, only address performance problems due to true sharing. Furthermore, the level of false sharing in both systems is higher than a typical sequentially consistent system, as neither system incorporates a delta interval mechanism. This ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the Principles and Practice of Parallel Programming, 1997.


Update Protocols and Cluster-based Shared Memory - Peter Keleher Keleher (1999)   (8 citations)

....more detail in Section 4. Although AIX's default virtual memory page size is 4k bytes, we used 8k pages in CVM by the simple expedient of ensuring that all page protection changes use an 8k granularity. The larger page size generally increases performance by aggregating data into fewer messages [2]. 3.3 Base Results Figure 5 shows speedups of our application using four protocols, lmw i, lmw u, bar i, and bar u. Table 2 shows the number of diff creations, misses (remote faults) that cause network traffic, the number of data and synchronization requests sent (there are an equal number of ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the Principles and Practice of Parallel Programming, 1997.
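
The 8k-granularity expedient mentioned in the context above can be illustrated with a little address arithmetic. This is only a minimal sketch under stated assumptions: the names `GRANULARITY` and `align_to_granularity` are hypothetical and do not come from CVM. It shows one way a protection request could be widened so that it always covers whole 8k units, even when the system page is 4k.

```python
GRANULARITY = 8192  # assumed 8k coherence unit; the text gives 4k as the AIX default page size


def align_to_granularity(addr, length, granularity=GRANULARITY):
    """Widen the byte range [addr, addr + length) to whole granularity units.

    Returns (start, size), both multiples of `granularity`.
    Hypothetical helper for illustration; not part of any real DSM system.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # Round the start address down to a unit boundary.
    start = (addr // granularity) * granularity
    # Round the end address up to a unit boundary (ceiling division).
    end = -(-(addr + length) // granularity) * granularity
    return start, end - start
```

A runtime following this idea would pass the returned `start` and `size` to its page protection call instead of the original range, so both 4k halves of each 8k unit always change protection together.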


A DSM Cluster Architecture Supporting Aggressive Computation in.. - Graham (2001)

....of Li and Hudak [26] a large body of knowledge has been created focusing, primarily, on improving the performance of such systems by reducing the number and size of messages sent. This has been accomplished using weak consistency protocols (e.g. [14, 24, 7]) as well as other optimizations (e.g. [15, 2, 3, 27, 30, 8]). Work has also been done on decreasing the overhead of the transmission protocol used to send the required consistency maintenance messages [34, 33, 10]. 2.3. Active Networks Active networks [13, 29, 32, 36] allow customized (possibly application specific) programs to execute in the network. By ....

C. Amza, A. L. Cox, K. Ramajamni, and W. Zwaenepoel. Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory. In Proc. of the Sixth ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPOPP'97), pages 90--99, June 1997.


Minimizing Consistency Traffic in a Versioned Object.. - Grimstrup, Graham

....pioneering work of Li and Hudak [15] a large body of knowledge has been created focusing, primarily, on improving the performance of such systems by reducing the number and size of messages sent. This has been accomplished using weak consistency protocols [7, 13, 4] as well as other optimizations [8, 1, 2, 21]. Work has also been done on decreasing the overhead of the transmission protocols used to send consistency maintenance messages [23, 22, 6] and on the underlying networks themselves (e.g. Gigabit Ethernet, Myrinet [5]). 2.2. LOTEC Sui and Graham [20] described a DSM based object programming ....

C. Amza, A. L. Cox, K. Ramajamni, and W. Zwaenepoel. Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory. In Proc. 6th ACM Symp. on Principles and Practice of Parallel Programming, 1997.


LOTEC: A Simple DSM Consistency Protocol for Nested Object.. - Graham, Sui (1999)   (1 citation)

....supported in part by the Natural Sciences and Engineering Research Council of Canada under grant OGP 0194227. such systems by reducing the number and size of messages sent. This has been accomplished using weak consistency protocols [CBZ91, KCZ92, BZS93] as well as other optimizations [CBZ95, ACDZ97, ACRZ97, Lu97, SB98, BPA98]. Work has also been done on decreasing the overhead of the transmission protocol used to send the required consistency maintenance messages [vECGS92, vEBBV95, Buo98] and on the underlying networks themselves (e.g. Gigabit Ethernet, Myrinet [BCF 95]). Despite this, the ....

C. Amza, A. L. Cox, K. Ramajamni, and W. Zwaenepoel. Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory. In Proc. of the Sixth ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPOPP'97), pages 90--99, June 1997.


ADSM: A Hybrid DSM Protocol that Efficiently Adapts to.. - Monnerat, Bianchini (1997)   (1 citation)

....[Figure 5: Application Speedups.] sion is consistent with previous work [3] that shows that pages in Water exhibit a mixture of true and false sharing, such that the false sharing does not significantly hurt performance. MigDepth also exhibits such a mixture, behaving in the same way as Water. The other applications in our suite cannot benefit from our implementation of ....

C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel. Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory. Technical report, Rice University, submitted for publication, 1997.


Locality and Performance of Page- and Object-Based DSMs - Buck, Keleher (1998)   (10 citations)

....in the cost of operating system primitives, such as communication and page protection system calls. We use two representative protocols, one based on CRL's (C Region Library) [3] protocol, and another based on the lazy multi writer protocol [5] used by CVM (Coherent Virtual Machine) and TreadMarks [16]. We implemented both protocols in CVM (which provides a framework for implementing software DSM protocols) and then ported them to the simulator. This latter step was relatively simple because the simulator is basically a variant of CVM that uses a thread library to context switch between ....

....because software DSMs can easily use page sizes larger than the system page size merely by passing appropriate arguments to operating system calls. We have since confirmed this result by adding a command line page size parameter to CVM, and other researchers have observed the same phenomenon [16]. Performance improves by an average of 25% between the 512 byte page size that is more typical of object size and 8k. Over the same span, 83% of remote misses are eliminated, while bandwidth requirements remain roughly the same. Table 3.3 shows the results of an experiment that can be used to ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the Principles and Practice of Parallel Programming, 1997.


Thread Migration and Communication Minimization in DSM Systems - Thitikamol, Keleher (1999)   (14 citations)

....synchronization or program order. In practice, most shared memory programs require little or no modifications to meet these requirements. From the perspective of the mechanisms discussed in the rest of this paper, the most important attribute of D CVM is that its protocols tolerate false sharing [7] well. The majority of our experiments were run on an eight-processor SP 2. Each node is a 66.7 MHz POWER2 processor with 64K first level caches and 128 MBytes of memory per node. The processors are connected by a 40 MByte/sec switch. The operating system is AIX 4.1.4. D CVM runs on UDP/IP over ....

....context of sequential consistency rather than a relaxed consistency model. This makes comparisons with our system difficult, as sequentially consistent systems suffer from both false and true sharing. Relaxed consistency models hide false sharing effectively without recourse to multi threading [7]. Thread scheduling algorithms on modern systems, therefore, only address performance problems due to true sharing. Both systems implement forms of passive correlation scheduling, in which remote page faults are used to gain information about data sharing between threads. As discussed in ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the Principles and Practice of Parallel Programming, 1997.


A High-Level Abstraction of Shared Accesses - Keleher (2000)   (1 citation)

....entry consistency. Assuming that all data was initially zero, the P3's reads of y, z, and t would return 2,2,1 for LRC, 2,2,0 for scope consistency, and 2,0,0 for entry consistency. 27 shown that relaxed consistency models largely eliminate the effects of false sharing in page based systems [4, 18]. Aside from differences due to the use of page based or diff based write collection mechanisms, the primary performance differences between scope, entry, and lazy release consistency are due to the degree to which faults are eliminated by moving updates prior to their use [9]. We discuss below ....

....while this information is assumed to be known by the compiler in TreadMarks. Our work also allows the user to discover and manipulate shared modifications at a high level. Recent work at Rice has investigated automatic determination of extent like objects in shared memory applications [4]. We have concentrated our discussion on software SDSM systems, but it may also be relevant in the context of hardware shared memory systems. For instance, the prefetch and poststore primitives of the KSR 2 [1] implement user initiated data movement on top of the underlying consistency protocols. ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the Principles and Practice of Parallel Programming, 1997.


Multiple-Writer Entry Consistency - Sandhu, Brecht, Moscoso (1998)

....difference in performance between LRC and MEC in our study. However, larger problem sizes in SOR favor MEC over LRC in our study. The use of larger virtual memory pages for obtaining data aggregating effects in LRC has been previously considered by Amza et al. [3]. While larger virtual memory pages improve LRC performance for many applications, their effectiveness is limited by the increase in false sharing in others. On the other hand, MEC is capable of simultaneously eliminating false sharing along page boundaries and achieving much larger data ....

C. Amza, A.L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory", Proceedings of the Sixth Conference on Principles and Practice of Parallel Programming, pp. 90-99, June 1997.


Adjusting Single-/Multiple-writer to False Sharing in Software .. - Xianghui Xie And (1999)   (1 citation)

....is the first system to propose the use of multiple protocols for different shared data objects. However, it leaves the choice of protocols to the user. The work by Amza et al. on Adaptive TreadMarks improves the performance well by adapting the protocols between single and multiple writer[1][2]. It uses the ownership refusal protocol to automatically implement the transform from single writer to multiple writer and vice versa in their adaptive protocols. The multiple writer method uses the TreadMarks twinning and diffing mechanism, while the single writer is similar to that of CVM. It ....

Cristiana Amza, Alan Cox, Karthick Rajamani, and Willy Zwaenepoel. Tradeoffs between False sharing and Aggregation in Software Distributed Shared Memory. Proceedings of PPOPP, 90-99, 1997.


Dynamic Adaptation of Sharing Granularity in DSM Systems - Itzkovitz, Niv, Schuster (1999)

....data structures. Much of this work was done while the author was with the Computer Science Department at the Technion, Israel. However, in many cases a fixed granularity setting seems insufficient. Several recent studies deal with the tradeoff between false sharing elimination and aggregation [1, 11, 19]. On one hand, fine granularity can eliminate false sharing and fragmentation, and reduce unnecessary network traffic. On the other hand, coarse granularity reduces the number of page faults and other expensive consistency protocol operations. Therefore, in addition to setting the granularity ....

....in Barnes are replaced by only 117 pct hits. 5 Related Work Several researchers have investigated the effects of the sharing granularity on the performance of dsm systems, and concluded that changing the sharing granularity dynamically improves the performance of many applications. Amza et al. [1] claim that for many applications, the benefits of coarse sharing granularity usually outweighs the false sharing penalty. They suggest increasing the consistency unit in a page based dsm to a multiple of the virtual memory page size, unless the page faults behavior reveals an increase in false ....

[Article contains additional citation context not shown here]

C. Amza, A. L. Cox, K. Ramajamni, and W. Zwaenepoel. Tradeoffs between false sharing and aggregation in software distributed shared memory. In Proc. of the Sixth ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPOPP'97), pages 90--99, June 1997.


Reducing Coherence-Related Communication in Software.. - Speight, Bennett (1998)   (2 citations)

....degree of false or write-write sharing. Brazos does not employ such an adaptive mechanism, but a large reduction in the memory overhead to maintain coherence is achieved in Brazos through the use of multicast. Amza et al. [38] also examined the effects of false sharing in page based DSM systems [39]. The authors conclude that while false sharing contributes significantly to the number of extra bytes that must be sent to maintain coherence, the number of messages resulting from false sharing is few for many applications. A protocol that aggregates pages together to capture the benefits of ....

C. Amza, K. Rajamany, A. Cox, and W. Zwaenepoel, "Tradeoffs between false sharing and aggregation in software distributed shared memory," in Proceedings of the 6th ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pp. 90--99, June 1997.


An Interaction of Coherence Protocols and Memory Consistency.. - Shi, Hu, Tang (1997)

....atomicity is more expensive than that in hardware DSM systems. Therefore, reducing the frequency of communication and message traffic is more important in software DSM systems than in hardware DSM systems. Furthermore, since the coherence unit in software DSM system is page or larger than page[9], the false sharing problem is more serious than that in hardware DSM systems. As such, how to eliminate the false sharing problem effectively is also an important issue in software DSM systems. Before discussing the solution for false sharing, we first give a clear description about false ....

C. Amza, A. L. Cox, K. Rajamani, and W. Zwaenepoel. Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory. To appear in Proceedings of the 6th Symposium on Principles and Practice of Parallel Programming, June 1997.


Data Prefetching for Software DSMs - Bianchini, Pinto, Amorim (1998)   (10 citations)

....Adaptive can provide speedup improvements as significant as 34% on 16 processors. Adaptive is not the first runtime only prefetching technique proposed thus far in the literature; the techniques of Bianchini et al. (B) [3], Karlsson and Stenstrom (KS) [7], and Amza et al. (Dynamic Aggregation) [1] are the best known previous proposals. The KS technique is similar in flavor but not directly comparable to Adaptive. A direct comparison of Adaptive against the other two strategies shows that our technique performs as well or better than B for all applications in our suite. In addition, ....

.... Work Most of the previously published work on reducing the overhead of data accesses in software DSMs has been concentrated on update-based coherence strategies, e.g. [6, 11, 9]. Prefetching and multithreading techniques for software DSMs, on the other hand, have received little attention so far [5, 3, 7, 1, 10]. Due to space limitations, we only focus on runtime based prefetching techniques here. A more complete discussion of the related work can be found in [4]. Dynamic aggregation [1] has been studied in this paper. Karlsson and Stenstrom [7] proposed a prefetching technique (called KS here) that is ....

[Article contains additional citation context not shown here]

C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel. Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory. In Proceedings of the 6th PPoPP, June 1997.


MultiView and Millipage - Fine-Grain Sharing in Page-Based DSMs - Itzkovitz, Schuster (1999)   (16 citations)

....cause them to suffer increased coherence traffic because of false sharing. [14] the conventional wisdom remains that the overhead of false sharing, as well as fine-grained true sharing, in page based consistency protocols is the primary factor limiting the performance of software dsm [6]; page based systems heavily depend on relaxed models to alleviate problems such as false sharing that arise due to the page-sized coherence granularity [22]. As a consequence: conventional wisdom holds that fine grained performance and false sharing doom page based approaches [5] 1 ....

C. Amza, A.L. Cox, K. Ramajamni, and W. Zwaenepoel. Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory. In Proc. of the Sixth ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPOPP'97), pages 90--99, June 1997.


Multiple-Writer Entry Consistency - Harjinder Sandhu (1998)

....study and no significant difference in performance between LRC and MEC in our study. However, larger problem sizes in SOR are shown to favor MEC over LRC in our study. The use of larger virtual memory pages for obtaining data aggregating effects in LRC has been previously considered by Amza et al. [3]. While larger virtual memory pages improve LRC performance for many applications, their effectiveness is limited by the increase in false sharing in others. On the other hand, MEC is capable of simultaneously eliminating false sharing along page boundaries and achieving much larger data ....

C. Amza, A.L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory", Proceedings of the Sixth Conference on Principles and Practice of Parallel Programming, pp. 90-99, June 1997.


Efficiently Adapting to Sharing Patterns in Software DSMs - Monnerat, Bianchini (1998)   (23 citations)

....while ADSM only uses its lock data association to adapt to sharing patterns. The Brazos system and the AEC protocol also use a combination of invalidates and updates to keep shared data coherent. Prefetching techniques attempt to achieve the same benefits as update based approaches. A few studies [3, 5, 8, 18] have evaluated prefetching techniques for software DSMs. In the Dynamic Aggregation (DA) technique [3], for instance, nodes compute page groups (sets of pages that are accessed in between synchronization points) at each synchronization. The diffs for all pages of a group are requested on the ....

....also use a combination of invalidates and updates to keep shared data coherent. Prefetching techniques attempt to achieve the same benefits as update based approaches. A few studies [3, 5, 8, 18] have evaluated prefetching techniques for software DSMs. In the Dynamic Aggregation (DA) technique [3], for instance, nodes compute page groups (sets of pages that are accessed in between synchronization points) at each synchronization. The diffs for all pages of a group are requested on the first fault on any of the group's pages. The extent to which prefetching techniques can outperform update ....

C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel. Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, June 1997.
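
The page-group idea described in the context above can be sketched in a few lines. This is a simplified illustration, not the protocol from the cited paper: the class `PageGroupTracker` and its methods are hypothetical names, faults stand in for page accesses, and the sketch ignores details a real system would need (which pages are already valid, where diffs live, how groups are split or aged).

```python
class PageGroupTracker:
    """Toy model of dynamic aggregation by page groups (illustrative only)."""

    def __init__(self):
        self._current = set()  # pages faulted on since the last synchronization
        self._group_of = {}    # page number -> frozenset of pages in its group

    def on_fault(self, page):
        """Record a fault and return the sorted pages whose diffs to request."""
        self._current.add(page)
        # Before any group is known, a page forms a group by itself.
        group = self._group_of.get(page, frozenset({page}))
        return sorted(group)

    def synchronize(self):
        """At a synchronization point, turn the interval's pages into a group."""
        if self._current:
            group = frozenset(self._current)
            for page in group:
                self._group_of[page] = group
        self._current = set()
```

In this model, pages 3 and 7 faulted on within one interval become a group at the next `synchronize()`, so a later fault on page 7 returns `[3, 7]` and both diffs can be requested in one exchange.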


A transactional DSM Operating System in Java - Schoettner, Traub, Schulthess (1998)

....independent variables reside in the same page and are alternatingly accessed by different processors. As a result the page is exchanged again and again between the processors. It is crucial to choose the right page size. Larger pages can speed up memory access due to the locality of data [7]. On the other side the probability of false sharing increases when larger memory pages are used. 2.1 IVY IVY [3] was one of the first proposed DSM systems. It is a user level implementation running on a group of network processors, the Apollo Domain system. It implements a page based DSM ....

K. Rajamani C. Amza, A. Cox and W. Zwaenepoel. Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory. In Principles and Practice of Parallel Programming, 1997.


Evaluation of Cluster Interconnects for a Distributed Shared.. - Roy, Chaudhary (1999)

.... [Figure 1: Roundtrip times for UDP packets.] [Figure 2: Throughput for UDP packets; axes: Throughput (Mbps) versus Packet Size (bytes).] Ethernet [12], and an 8 processor IBM SP 2 [13]. CVM results were presented on the IBM SP2 and a DEC Alpha cluster [14]. The raw roundtrip latencies and streaming bandwidth that can be obtained for UDP packets are shown in Figures 1 and 2. The results were obtained using two nodes on an unloaded network, using ....

C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the ACM Symposium on the Principles and Practice of Parallel Programming, (Las Vegas), pp. 90 -- 99, June 1997.


The Region Trap Library: Handling Traps on.. - Brecht, Sandhu (1999)   (3 citations)

....of the study presented in this section. Consequently, a number of factors that may also influence the choice of whether to use pages or regions for managing shared data have not been considered here. This includes the use of page aggregation techniques, which increase the effective size of a page [3] and would likely improve the performance of the VM case for some of the coarse grained applications used in this study, and the use of other coherence protocols such as scope consistency [12] which captures some of the advantages of object based protocols such as entry consistency. 6 Discussion ....

C. Amza, A.L. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs between False Sharing and Aggregation in Software Distributed Shared Memory", Proceedings of the Sixth Conference on Principles and Practice of Parallel Programming, pp. 90-99, June 1997.


Adaptive Protocols for Software Distributed Shared Memory - Amza, Cox, Dwarkadas.. (1999)   (32 citations)  Self-citation (Amza Cox Rajamani Zwaenepoel)

....We have demonstrated that communication aggregation is the key to improving performance in both invalidate and update protocols. Adding dynamic aggregation to the invalidate protocol provides the same benefits as using an update protocol, without the risk of sending extra messages. Amza et al. [3] investigated the benefits of dynamic page aggregation. They did not, however, combine aggregation with other forms of adaptation. Lu et al. [18] found that aggregation is the main reason that message passing programs outperform (software) shared memory programs. Overall, they found that for six ....

C. Amza, A.L. Cox, K. Rajamani, and W. Zwaenepoel. Tradeoffs between false sharing and aggregation in software distributed shared memory. In Proceedings of the 6th Symposium on the Principles and Practice of Parallel Programming, pages 90--99, June 1997.


Strings: A High-Performance Distributed Shared Memory for.. - Roy, Chaudhary (1998)   (3 citations)

No context found.

C. Amza, A. Cox, K. Rajamani, and W. Zwaenepoel, "Tradeoffs Between False Sharing and Aggregation in Software Distributed Shared Memory," in Proceedings of the 6th ACM Symposium on Principles and Practice of Parallel Programming, (Las Vegas), pp. 90--99, June 1997.
