Results 1 - 10 of 7,231
Table 5: Execution time ratios between software-only and hardware-only directory protocols under sequential consistency and release consistency. Water LU Ocean MP3D
1997
"... In PAGE 19: ... For Ocean, the longer read stall time originates from a combination of longer delays in the first-level write buffer and higher contention in the network interface of the home nodes. The resulting execution time ratios between SW and HW, and between SW-RC and HW-RC are presented in Table 5. For three of the applications, LU, Ocean, and MP3D, the ETR increases when release consistency is applied as we predicted in Section 3.... ..."
Cited by 1
Table 1: Speedups for SC vs. ERC, an early implementation of release consistency for SVM in the Munin system. 16 processors [9].
1999
"... In PAGE 1: ...ion and application of coherence operations (e.g. invalidations) to be postponed to synchronization points, greatly reducing the impact of false sharing and the frequency of coherence operations. Among the many relaxed consistency models proposed for hardware-coherent systems, release consistency [14] which separates acquire from release synchronization was the first to inspire a major breakthrough in the performance of software shared memory systems [10] (see Table1 1). Under release consistency and models inspired by it, processors can thus continue to share a page without any communication of data or coherence information until a synchronization point.... In PAGE 9: ...8 3.9 Table1 0: Speedups for LRC vs ScC. 8 processors [34] 4 Conclusions There has been a lot of progress in shared virtual memory over the last several years.... ..."
Cited by 35
Table 2. Parameters of the DSPN of the adaptive release consistency protocol
1995
"... In PAGE 5: ... Note, the DSPN of the AC protocol contains the DSPN of a sequential consistency protocol as a submodel by removing the places Define local queue, Process define local, and LOCAL as well as the corresponding transitions. Table 2 states the model parameters of the DSPN of the AC protocol according to the hardware parameter stated in Section 2. The time corresponding to 1,000 processor cycles is chosen as the basic time unit.... In PAGE 12: ... To provide an increasing load from left to right in all figure presented below the x-axes correspond to the rates at which memory requests are issued. In all experiments one memory request rates varies and the others have the default values given in Table 2 or Table 3. Figure 6 plots curves for the overall processor utilization for AC with different ratios between the define global and define local request rates.... In PAGE 12: ... For SC a read request rate of 0.01 is assumed, which correspond to the default value of AC given in Table 2. We observe that in case of a reasonable high degree of buffering (define global request rate = 0.... ..."
Cited by 1
Table 1: Comparison between Lazy Release Consistency and Data Merging mechanisms
1992
"... In PAGE 14: ... Of all the related work discussed here, the LRC algorithm is the most closely related to the data merging mechanism described in this paper. Table 1 contains a summary comparison of the two mechanisms. 4 Examples Any realistic memory subsystem makes compromises.... ..."
Table 1 shows the average line utilization for the applications studied. Applications with low line utilization will have little opportunity for write grouping, but in these cases the single word updates will not result in excessive network traffic, as illustrated in figure 1.
"... In PAGE 11: ... Table 1: Application characteristics The applications studied use two different types of data synchronization. The first is the typical coarse-grain (Block) synchronization using a release consistency memory model [2].... ..."
Table 1. DSM design issues.
"... In PAGE 35: ... In our presentation, we use examples from the DSM systems listed and briefly described in the sidebar on page 55. Table 1 compares how design issues are handled in a selected subset of the systems. Design choices A DSM system designer must make choices regarding structure, granularity, access, coherence semantics, scalability, and heterogeneity.... In PAGE 37: ... Relaxed coherence semantics allows more efficient shared access because it requires less synchronization and less data movement. However, programs that depend on a stronger form of coherence may not perform correctly if executed in a system that supports only a weaker form. Figure 2 gives brief definitions of strict, sequential, processor, weak, and release consistency, and illustrates the hierarchical relationship among these types of coherence. Table 1 indicates the coherence semantics supported by some current DSM systems. Figure 2.... In PAGE 37: ... Each type of operator is guaranteed to be processor consistent DSM systems This partial listing gives the name of the DSM system, the principal developers of the system, the site and duration of their research, and a brief description of the system. Table 1 gives more information about the systems followed with an asterisk. Agora (Bisiani and Forin, Carnegie Mellon University, 1987-): A heterogeneous DSM system that allows data structures to be shared across machines.... ..."
Table 1: Base simulated configuration.
1999
"... In PAGE 9: ... We model both an ILP uniprocessor and an ILP-based CC-NUMA multiprocessor with release consistency. Table 1 summarizes the base simulated configuration. The cache sizes are scaled based on application input sizes according to the methodology of Woo et al.... ..."
Cited by 21
Table 1. Base simulated configuration.
1999
"... In PAGE 6: ... We model both an ILP uniprocessor and an ILP-based CC-NUMA multiprocessor with release consistency. Table 1 summarizes the base configuration. The cache sizes are scaled based on application input sizes according to the methodology of Woo et al.... ..."
Cited by 21