| K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. Proceedings of the 1988. |
....facility, the MBP has various features to reduce overheads on the DSM and to lower the hardware cost of the system. The details are given in Chapter 2. 1.2 Software DSM In 1988 K. Li developed a new software solution for the generation of shared virtual spaces on clusters of PCs [32] by exploiting the memory page management mechanisms of the processors. In recent generations, high end microprocessors have included page management units, and illegal access to pages which are not mapped or allowed for the allocated task (process) is detected by page traps. K. Li exploited this ....
....memory references and as main memory for local data. The system thus has a hierarchical cache system: the on-chip caches of the processors (L1 caches) are at level 1, snoop caches (L2 caches) are at level 2, and the memory banks of the nodes are at level 3. Page-level directory schemes such as IVY [32] were used in the MBP systems instead of cache-block-level directory schemes [31] in order to reduce the amount of directory memory and to use translation lookaside buffers (TLBs) to accelerate consistency-preserving operations. The unit of data transmission, however, is the size of an L1 cache ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the 1988.
....over 8-word blocks, allowing different blocks within the same mapped page to be in different states. This fine-grain control over data is similar to that provided in hardware-based cache-coherent multiprocessors, and alleviates the false sharing that exists in other software data coherence systems [21]. The two block status bits are used to encode the following four states: INVALID: The block may not be read, written, or placed in the hardware cache. READ ONLY: The block may be read, but not written. READ WRITE: The block may be read or written. DIRTY: The block may be read or written, ....
....KSR-1 [9] perform a function similar to that of the block status bits of the M-Machine. Implementing remote memory access and coherence completely in software on a conventional processor would involve delays much greater than those shown in Table 1, as evidenced by experience with the Ivy system [21]. The M-Machine's fast exception handling in a dedicated H-Thread avoids the delay associated with context switching and allows the user thread to execute in parallel with the exception handler. The GTLB avoids the overhead of manual translation and the cost of a system call to access the network. ....
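The four block states named in the excerpt above can be sketched as a two-bit encoding with read and write permission checks. This is only an illustration: the bit values and the names `BlockState`, `may_read`, and `may_write` are assumptions, since the excerpt does not give the actual encoding.

```python
from enum import IntEnum


class BlockState(IntEnum):
    # Hypothetical 2-bit values; the excerpt does not state the real encoding.
    INVALID = 0b00
    READ_ONLY = 0b01
    READ_WRITE = 0b10
    DIRTY = 0b11


def may_read(state: BlockState) -> bool:
    """Every state except INVALID permits reads."""
    return state != BlockState.INVALID


def may_write(state: BlockState) -> bool:
    """Only READ_WRITE and DIRTY permit writes."""
    return state in (BlockState.READ_WRITE, BlockState.DIRTY)


# Blocks within one mapped page may each be in a different state.
page_blocks = [BlockState.INVALID, BlockState.READ_ONLY, BlockState.DIRTY]
readable = [may_read(s) for s in page_blocks]
writable = [may_write(s) for s in page_blocks]
```

Keeping the state per block rather than per page is what lets two blocks of the same page differ, which is the property the excerpt credits with reducing false sharing.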
LI, K. Ivy: A shared virtual memory system for parallel computing. In International Conference on Parallel Processing (1988), pp. 94-101.
....but no consistency. Moving the distribution functionality into the OS is an interesting alternative. This can be achieved by a Distributed Shared Memory (DSM) mechanism providing a virtual address space shared among tasks on loosely coupled processors, as introduced by Keedy [1] and Li [2]. The application programmer is offered a transparent view of shared data on several nodes connected via a network. Regular pointers are used for both local and remote memory accesses. The OS and memory management hardware jointly detect a remote memory access, fetch the desired memory block and ....
....software- or hardware-based systems and hybrid architectures have been developed [4]. We do not attempt to give a comprehensive perspective of the state of DSM systems in this section. However, because Plurix is a page-based system, we briefly review some representative page-based systems: IVY [2], Mirage [5], and TreadMarks [6]. Page-based DSM systems detect memory accesses to pages by using the protection features of the MMU. MMU hardware support can substantially speed up program execution in comparison to software-based implementations, but it is afflicted by the false sharing problem. ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the International Conference on Parallel Processing, 1988.
....Machine (JVM) like JavaOS does [1]. We thereby gain efficiency and speed. Language-based OS development has been successfully demonstrated by native Oberon [2] and others. The well-known Distributed Shared Memory (DSM) paradigm offers a natural solution for distributing data among several nodes [3]. Applications running on top of a DSM are not aware of data locations. Any reference can point to either a local or a remote memory block. During program execution the OS detects a remote memory access and automatically fetches the desired memory block. Plurix implements a page-based DSM using ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", Int. Conference on Parallel Processing, 1988.
....a lean distributed Operating System (OS) from scratch for the PC platform. We suggest that the distribution functionality should be moved into the OS and not be reimplemented by each distributed application. This is achieved using the well-known Distributed Shared Memory (DSM) paradigm [1] [2]. A DSM provides a virtual address space shared among tasks on loosely coupled nodes. The distribution of data across several computers is not noticed by the application programmer. Any reference can point to either local or remote memory blocks. During program execution the OS or run-time environment ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", In Proceedings of the International Conference on Parallel Processing, 1988.
....evolution and version management for different type generations. KEY WORDS: Type Evolution, Persistence, Version Management, Object Oriented Languages. 1. INTRODUCTION The Plurix Operating System (OS) uses the well-known Distributed Shared Memory (DSM) paradigm introduced by Keedy [1] and Li [2] and implements a new memory consistency model using restartable transactions in combination with an optimistic synchronization scheme [3]. A primary research goal is to investigate the DSM as a general-purpose communication medium, e.g. for simplified development of distributed applications. ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", International Conference on Parallel Processing, 1988.
....efficiency and speed. Furthermore, Plurix is not implemented on top of an existing OS, discarding any overhead caused by commercially justified backward compatibility. The well-known Distributed Shared Memory (DSM) paradigm offers a natural solution for distributing data among several nodes [3] [4]. Applications running on top of the Plurix DSM are not aware of data locations. Any reference can point to either local or remote memory blocks. During program execution the OS detects a remote memory access and automatically fetches the desired block. A file system can be avoided by ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", In Proceedings of the International Conference on Parallel Processing, 1988.
....results of an early prototype which was shown at the CeBIT 2000 trade fair. Keywords: Distributed Shared Memory, Network Protocols, Transactions, Optimistic Concurrency Control, Operating Systems, Plurix. 1. INTRODUCTION Plurix supports the concept of distributed virtual storage [5][6] where distributed memory access is transparent to the application. Remote object access is resolved automatically by the OS. This DSM concept simplifies development of distributed applications because explicit communication as used by Java RMI, DCOM or CORBA is avoided. Programming of distributed ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the International Conference on Parallel Processing, 1988.
....static software DSM approach appears to remain fairly limited. Dynamic Approaches Dynamic software DSM systems typically support a more general programming model than their static counterparts, typically allowing multiple independent threads of control to operate within the shared address space [4, 5, 11, 21, 37, 52, 75]. Given mechanisms for inter thread synchronization (e.g. semaphores, barriers) a programmer is able to express essentially any form of parallelism. For the most part, these systems utilize a data shipping paradigm in which threads of computation are relatively immobile and data items (or ....
.... Mostly Software Many software DSM systems are actually mostly-software systems in which the hit/miss check functionality is implemented in hardware (e.g. by leveraging virtual memory protection mechanisms to provide access control). Typical examples of mostly-software systems include Ivy [52], Munin [11], and TreadMarks [38]; coherence units in these systems are the size of virtual memory pages. Blizzard [70] implements a similar scheme on the CM-5 at the granularity of individual cache lines. By manipulating the error correcting code bits associated with every memory block, Blizzard ....
Kai Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the International Conference on Parallel Processing, pages 94--101, 1988.
....for PHD. hierarchy worth of messages plus one direct data delivery message need to be sent for PHD as opposed to four traversals of the hierarchy worth of messages for the strict hierarchy. 1.4.3 Ownership The concept of ownership [10] as used in this protocol was derived from both Li [16] [17] and Totty [27]. An owner of a block is responsible for it. Any other node can only have a copy of the block, which can be asynchronously thrown away in order to make room for other blocks. That node can then inform the rest of the system at its leisure without affecting the correctness of the ....
Kai Li. IVY: A shared virtual memory system for parallel computing. In International Conference on Parallel Processing, pages 94-101, 1988.
....is thus given the illusion of a large global address space encompassing all available memory, eliminating the task of explicitly moving data between processes located on separate machines. Both hardware DSM systems (e.g. Alewife [15], DASH [36], FLASH [31]) and software DSM systems (e.g. Ivy [37], Munin [13], TreadMarks [28]) have been implemented. The majority of software DSM systems use page-based memory protection hardware and the low-level message passing facilities of the host operating system to implement the necessary shared memory abstractions. Programmers write programs using ....
....this section we present a brief discussion of previous efforts to reduce the amount of communication necessary in software DSM systems. A detailed discussion of related work can be found in Chapter 5. 1.1.1 Using Relaxed Consistency Models to Reduce Communication Early DSM systems, such as IVY [37], enforced sequential consistency [32] to maintain coherence between processes in a DSM system. Sequential consistency requires that each write be globally performed before the process issuing the write is allowed to proceed. This restriction severely limits the performance achievable by these ....
K. Li. Ivy: A shared virtual memory system for parallel computing. In Proceedings of the 1988.
....by high bandwidth and low latency networks, the processors can also be used for parallel computing. To establish a truly general purpose and user friendly system, one of the main problems is to provide users with a single system image. By adopting the technique of distributed shared memory [12], for example, we can provide a single addressing space for the whole system so that communication for transferring data between processors is completely transparent to the client programs. In this paper we discuss another very important issue relating to the provision of single system image, that ....
K. Li, IVY: A shared virtual memory system for parallel computing, Proceedings of International Conference on Parallel Processing, 1988, pp.94-101.
....thread migration and affinity-based self-scheduling for load balancing [30, 44, 39] hardware support to ....
[Footnote 1: At that time, main memory is expensive and small, while the ratio of access speed of disk and remote memory is 1:10.]
Systems          Heterogeneity  Granularity   Large Memory  Consistency Model
IVY [28]         No             Page-based    Yes           Sequential
Mermaid [48]     Yes            Page-based    No            Sequential
ORCA [2]         No             Object-based  No            Relaxed
CRL [21]         No             Object-based  No            Relaxed
TreadMarks [23]  No             Page-based    No            Relaxed
HLRC [49]        No             Page-based    No            Relaxed
Shasta [35]      No             Object-based  No            Sequential
JIAJIA [19]      No             Page-based    Yes           ....
K. Li. Ivy: A shared virtual memory system for parallel computing. In Proc. of the
....shared memory model is known as a powerful parallel programming model that provides a programmer a simple abstraction for information exchange among parallel tasks. A distributed system that has the capability to access globally shared data objects is called a DSM (Distributed Shared Memory) system [1][2][3][4]. A DSM system has an advantage in scalability as well as ease of parallel programming. In realizing shared memory in a distributed environment, access latency to data owned by another node is a critical issue. Generally, to reduce such access latency, the remote ....
K. Li, IVY: A Shared Virtual Memory System for Parallel Computing, Proc. of the
....A problem that arises in home migration is that there must be some mechanism for all hosts to locate the new home of a migrated page. The most straightforward solution is to broadcast the new home of migrated pages to all hosts. Broadcast, however, is very expensive. The probowner method [20] provides a message-saving alternative but introduces a more complex protocol in which a page-faulting processor may need more than one round trip to get the faulting page. To reduce message overhead, the decision to migrate the home of a page is made at the barrier manager and the page ....
....sends diffs of a modified page to its home, and home migration makes host 0 the bottleneck and degrades the performance in ILINK. 5 Related Work Many software DSM systems have been built since Kai Li proposed the concept of shared virtual memory and implemented the first software DSM system, Ivy [20]. Some frequently cited software DSMs include Midway [3], Munin [4], TreadMarks [15], CVM [16], Cashmere-2L [26], Quarks [18], and Brazos [25]. Two important technical bases for the recent prosperity of software DSMs are the multiple-writer protocol, which was first implemented in Munin, and the lazy ....
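The trade-off described in the excerpt above, fewer messages than broadcast at the cost of possibly several forwarding steps, can be illustrated with a simplified simulation of probable-owner hints. The function name `write_fault`, the dictionary representation, and the rule that every node on the forwarding path redirects its hint to the requester are assumptions made for this sketch, not the protocol of any cited system.

```python
def write_fault(prob_owner: dict[int, int], requester: int) -> int:
    """Follow probable-owner hints from `requester` to the true owner.

    A node is the true owner when its hint points to itself. Afterwards the
    requester becomes the owner and every node visited redirects its hint
    to the requester. Returns the number of forwarding steps taken.
    """
    path = []
    node = requester
    while prob_owner[node] != node:
        path.append(node)
        node = prob_owner[node]
    path.append(node)  # the old owner ends the chain
    for visited in path:
        prob_owner[visited] = requester
    return len(path) - 1


# Stale hints form the chain 3 -> 0 -> 1 -> 2, where node 2 owns the page.
hints = {0: 1, 1: 2, 2: 2, 3: 0}
first = write_fault(hints, requester=3)   # three forwarding steps
second = write_fault(hints, requester=1)  # hints were updated, so one step
```

The first fault needs several steps because the hints are stale; the second is short because the earlier request updated them along the way.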
K.Li, "IVY: A Shared Virtual Memory System for Parallel Computing", in Proc. of the
....shared memory systems for executing certain classes of workloads. SW DSM technology can also be used to connect several large hardware distributed shared memory (HW DSM) systems and thereby extend their upper scalability limit. Most SW DSM systems keep coherence between page-sized coherence units [26], [8], [21], [40], [41]. The normal per-page access privilege of the memory management unit offers a cheap access control mechanism for these SW DSM systems. The large page-sized coherence units in the earlier SW DSM systems created extra false sharing and caused frequent page transfers of large ....
....[Figure 8: Application speedups for Sun Enterprise E6000, 2-node CC-NUMA, and 2-node DSZOOM-WF. Applications: FFT, LU-c, LU-nc, Radix, Barnes, FMM, Ocean-c, Ocean-nc, Radiosity, Raytrace, Water-nsq, Water-sp, Average. Configurations: E6000 16 CPUs, CC-NUMA 2x8, DSZOOM-WF 2x8.] ....2L [41], [10], CRL [19], GeNIMA [5], Ivy [26], [27], MGS [44], Munin [8], Shasta [35], [34], [32], [33], [10], Sirocco-S [36], SoftFLASH [11], and TreadMarks [21]. Most of them suffer from synchronous interrupt protocol processing. We believe that many of these implementations would benefit from a more efficient protocol implementation; such ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the
....memory systems for executing certain classes of workloads. SW DSM technology can also be used to connect several large hardware distributed shared memory (HW DSM) systems and thereby extend their upper scalability limit. Most SW DSM systems keep coherence between page-sized coherence units [Li88], [CBZ91], [KCDZ94], [SB97], [SDH+97]. The normal per-page access privilege of the memory management unit offers a cheap access control mechanism for these SW DSM systems. The large page-sized coherence units in the earlier SW DSM systems created extra false sharing and caused frequent ....
....values from Table 2. Network delay is about 3 microseconds for all DSZOOM-EMU configurations. 5 Related Work Many different SW DSM implementations have been proposed over the years: Blizzard-S [SFL+94], Brazos [SB97], Cashmere-2L [SDH+97], [DGK+99], CRL [JKW95], GeNIMA [BLS99], Ivy [Li88], [LH89], MGS [YKA96], Munin [CBZ91], Shasta [SGT96], [SGA97], [SG97a], [SG97b], [DGK+99], Sirocco-S [SFH+98], SoftFLASH [ENCH96], and TreadMarks [KCDZ94]. Most of them suffer from synchronous interrupt protocol processing. We believe that many of these implementations would benefit from ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the
....number of techniques for providing a shared memory abstraction on parallel systems with physically distributed memories have emerged. Distributed shared memory (DSM) with a single address space and a coherence control mechanism is one of the major approaches for coping with large communication latencies [1, 2, 3, 4, 5]. As an alternative, we have developed a new parallel processor architecture integrating a fine-grain communication mechanism into its processing elements, based on an extended dataflow model. The EM-X distributed-memory multiprocessor is an implementation of our concept of next generation MPP. So ....
....the one of them can improve the performance by 32 with CL. So far, a large number of distributed shared memory systems have been studied, ranging from hardware DSM (i.e. cache-coherent non-uniform memory access; CC-NUMA) such as MIT Alewife [1] and SGI Origin 2000 [2] to software DSM such as IVY [3], TreadMarks [4], and Shasta [5]. There have also been some interesting approaches for software-based and hardware-assisted systems. Examples include Blizzard-E [12], which uses ECC bits, and the CASHMERe [13] project, which employs remote memory writes. But the ECC approach is restricted to only checking invalid ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the International Conference on Parallel Processing, pages 94--101, 1988.
....logic of CC and microcode on MBP light. In the rest of this section, we explain the addressing system and how to invoke the functions of the intelligent memory system, before going into the detail of CC. 2.3.1 Addressing system The JUMP-1 memory system adopts a variant of shared virtual memory [4]. A cluster memory is used not only as a main memory but also as a cache for cluster memories of remote clusters. Remember that each processor has a private secondary cache. Thus the cluster memory also functions as a shared third-level cache. The cache block size of the third-level cache is ....
K. Li. Ivy: A shared virtual memory system for parallel computing. In Proc. ICPP '88, pages 94--101, St. Charles, 1988.
....before relation, as defined by [Lam79]. 1.3 Related Techniques for Implementing DSM Systems In this section, we give a brief introduction to the implementation techniques of software distributed shared memory systems related to this thesis. 1.3.1 Single Writer Protocol [Li88, LH89] uses the page-based mechanism to implement a software distributed shared memory system called IVY. The pages of the shared memory are cached in all processes. Each cache in a process is protected by the operating system. Any access to an invalid cache by the application generates a segment ....
K. Li. IVY: A shared virtual memory system for parallel computing. In Proceedings of the 1988 International Conference on Parallel Processing, 1988.
....like AURC [1]. It can be implemented on workstation clusters or multicomputers with conventional network interfaces. In the ADSM, the action for a shared read is different from that for a shared write.
• Shared read: The shared read is based on a cache-based shared virtual memory system [4]. The shared read is executed as a load instruction from the shared page. Only when the processor reads from a shared page that is not allocated or is invalid are instructions for consistency management invoked by the read trap routine.
• Shared write: The shared write is realized as ....
....locations, there are many opportunities to coalesce a sequence of instructions for consistency management. In our implementation on the AP1000, SAURC is the best consistency protocol. 6 Related Work 6.1 OS-based Software DSM A number of software page-based shared memory models have been proposed [4, 2, 1]. Writing to the shared location is executed as a single store instruction. In this respect, the ADSM differs from the existing OS-based software DSM. The consistency management instructions are executed in the write trap routine. In order to execute consistency management instructions, the ....
Kai Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the 1988 ICPP, pages 94--101, August 1988.
.... Applications using software distributed shared memory (DSM) can run without the trouble of unnecessary memory copies and address translation that occur with the inspector-executor mechanism [22]. Most existing software DSM systems are designed on the assumption of using sequential compilers [23, 20, 19]. An executable object made by a sequential compiler issues a shared memory access only as an ordinary memory access (load/store). To utilize bandwidth, a runtime system has to buffer the remote memory access. There is another approach where a programmer can specify optimal granularity, protocol, ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the 1988 ICPP, pages 94--101, Aug. 1988.
....this feature has to be implemented in software. While systems built around Distributed Shared Memory (DSM) like IVY [Li88], Munin [DCZ90], TreadMarks [ACD 95], Quarks [K96], and Clouds [DLA 90] provide a more natural programming model, they still suffer from the high cost of distributed synchronization and the inability to provide suitable fault tolerance. A mature system that uses a variant of the DSM concept is ....
[Footnote 2: Effective and innovative harnessing of the computing power of workstation networks is one of the primary research goals of the MILAN project in general and Calypso in particular.]
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the 1988 International Conference on Parallel Processing, Volume II, pages 94-101, August 1988.
....false sharing problem associated with distributed RAID. Any shared memory system may face the false sharing problem. The problem was first encountered in a cached multiprocessor system [26], [40]. It becomes worse in NUMA (non-uniform memory access) machines with distributed shared memory (DSM) [1], [29]. Different data objects within the same coherent block (cache line or memory page) may be falsely shared if the cache line or page size becomes too large. False sharing has thus been blamed for contributing additional memory consistency overhead in multiprocessors. In the case of clusters of ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", Proceedings of 1988 International Conference on Parallel Processing, Vol. II, 1988, pp.94-101.
....encodes the high level concepts in low level primitives only to have the compiler attempt to reconstruct the higher level structure. The next layer of software support implements the abstraction of shared memory on a set of message passing machines. Shared Virtual Memory systems such as Ivy [8] and Munin [4] use the virtual memory hardware to implement a page based coherence protocol. The page fault hardware is used to detect accesses to remote or invalid pages, and the page fault handler can be used to move or copy pages from remote processors. Shared virtual memory systems deal with ....
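The fault-driven page copy described in the excerpt above can be mimicked in a small simulation. This is a sketch under stated assumptions: `Node`, `PageFault`, `raw_read`, `read`, and the `owners` map are hypothetical names, and a Python exception stands in for the hardware page-fault trap that Ivy and Munin rely on.

```python
class PageFault(Exception):
    """Stands in for the hardware page-fault trap."""

    def __init__(self, page: int):
        super().__init__(page)
        self.page = page


class Node:
    def __init__(self, name: str):
        self.name = name
        self.pages = {}     # page number -> bytes held locally
        self.valid = set()  # pages that can be accessed without faulting


def raw_read(node: Node, page: int, offset: int) -> int:
    """Plain memory access: traps if the page is not valid locally."""
    if page not in node.valid:
        raise PageFault(page)
    return node.pages[page][offset]


def read(node: Node, page: int, offset: int, owners: dict) -> int:
    """Access wrapped by a fault handler that copies the page from its owner."""
    try:
        return raw_read(node, page, offset)
    except PageFault as fault:
        owner = owners[fault.page]
        node.pages[fault.page] = bytes(owner.pages[fault.page])  # copy the page
        node.valid.add(fault.page)
        return raw_read(node, page, offset)  # retry the faulting access


a, b = Node("a"), Node("b")
a.pages[0] = b"hello"
a.valid.add(0)
owners = {0: a}              # hypothetical directory: page 0 is owned by a
value = read(b, 0, 1, owners)  # b faults, copies page 0, then reads it
```

After the first access the page is valid on `b`, so later reads of the same page proceed without invoking the handler.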
Kai Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the 1988 International Conference on Parallel Processing, pages II 94--101, August 1988.
....(SVM) system, provides an abstraction of a single shared space on top of the physically distributed memories present on a network of workstations. It has been extensively studied in the past decade since it combines the programmability of shared memory systems and the scalability of distributed systems [1]. However, a performance gap between software DSM systems and message passing platforms remains, which greatly hinders the prevalence of software DSM systems. In general, communication overhead and coherence-induced overhead are the two main culprits for performance loss in software DSM ....
....such as relaxed memory consistency models [2, 3, 4] to reduce the frequency of communication, multiple-writer protocols to reduce the effect of false sharing [5], and the help of hardware [6], among others. The performance of recent software DSM systems has made great progress in comparison with early systems [1]. However, according to some recent performance evaluation results [7], there remains a long way to go before software DSM systems are readily available. Where should we focus our efforts in the future? This question is important enough that it should be answered soon. However, up to now, there is no clear ....
K. Li. Ivy: A shared virtual memory system for parallel computing. In Proc. of the 1988 Int'l Conf. on Parallel Processing (ICPP'88), volume II, pages 94--101, August 1988.
....Xthreads [5, 7]. The results show that the latency of operations such as creation, context switching, and migration in Ythreads is close to the latency in Xthreads. 2 Related work Recent multi-threaded systems supporting dynamic thread migration for distributed memory systems include IVY [3] and Amber [2]. As far as we know, thread migration on these two systems appears to be limited to homogeneous processors. Two approaches to heavy-weight process migration in a heterogeneous environment can be found in [8] and [10], respectively. In the former approach, the migrant process ....
....[Figure 3: The translated code (labels s00, s01, findstate; calls to ythreadyield and ythreaddestroy; tables pfunc and actsize). Diagram labels: Scheduler, call (resume), return (suspend).] ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the ICPP, 1988.
....are the candidates for heaps being associated with a segment of this category. This allows a programmer to assign only those data to performance degrading memory areas, which cannot be handled otherwise due to their size. Instances of VSMSegment are segments in a virtually shared memory region [21, 3, 9], which can be shared remotely. Given these building blocks made up of several kinds of heap strategies and segment types, one can construct many heaps with tailored characteristics. This can be achieved easily by inheritance mechanisms, as sketched below. class FixedHeap: public MemoryArea, ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. Proceedings of the 1988 International Conference on Parallel Processing, 2:94--101, August 1988.
....There have been two approaches to implement distributed shared memory at the software level: one relies on virtual memory page protection and the other on a compiler to provide software write detection. Traditional software based shared memory systems rely on virtual memory page protection [47, 4, 41]. Detection of a data access is done by protecting the virtual memory pages and catching the page fault signal generated by the operating system. A write operation sets a dirty bit for each page, indicating that the change needs to be propagated. The granularity of shared data segments is ....
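The write-detection scheme in the excerpt above (protect pages, catch the fault on the first write, set a per-page dirty bit) can be sketched as a simulation. The class `ProtectedMemory`, the 4096-byte page size, and the `collect_dirty` step are assumptions for illustration; a real system would use the operating system's page protection and fault signal rather than an explicit check.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ProtectedMemory:
    def __init__(self, num_pages: int):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.write_protected = [True] * num_pages  # all pages start protected
        self.dirty = [False] * num_pages

    def _write_fault(self, page: int) -> None:
        # Fault handler: record the write, then unprotect the page so that
        # later writes to it proceed without faulting again.
        self.dirty[page] = True
        self.write_protected[page] = False

    def write(self, addr: int, value: int) -> None:
        page = addr // PAGE_SIZE
        if self.write_protected[page]:
            self._write_fault(page)
        self.data[addr] = value

    def collect_dirty(self) -> list:
        """Return the pages whose changes need propagating and re-protect them."""
        pages = [p for p, is_dirty in enumerate(self.dirty) if is_dirty]
        for p in pages:
            self.dirty[p] = False
            self.write_protected[p] = True
        return pages


mem = ProtectedMemory(num_pages=4)
mem.write(10, 1)                 # first write to page 0 faults
mem.write(11, 2)                 # second write to page 0 does not fault
mem.write(2 * PAGE_SIZE + 5, 7)  # first write to page 2 faults
changed = mem.collect_dirty()
```

Because detection happens per page, two writes to unrelated data on the same page are indistinguishable here, which matches the page granularity the excerpt mentions.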
....or the system needs an account on each machine participating in the computation. These factors severely limit their use as a metacomputing framework for the Web. Another class of systems for distributed computing focuses on providing distributed shared memory across loosely coupled machines. IVY [47] and TreadMarks [41] are representatives of such systems. Cilk [11] is a comprehensive system providing resource management and fault tolerance in addition to DSM. However, it makes similar assumptions about the file system and user privileges as message passing systems, which limits its ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the International Conference on Parallel Processing, 1988.
....workstations has been studied by several groups. These systems are called DSM (distributed shared memory) systems. Kai Li's work with Shiva and Ivy provides a complete shared memory abstraction to the programmer on distributed memory multiprocessors [LS89] and a network of Apollo workstations [Li88]. His systems are based on a software implementation of caching at page-size granularity, and he uses a directory-based invalidation scheme to ensure data coherence. The work of Francioni, Poplawski, and Pahwa [FPP88] avoids the problem of maintaining memory consistency by caching only ....
....ensures that all memory references are to local memory. Unlike Am Am N, the Am Ar N policy creates multiple copies of shared pages in preference to moving existing copies. This is essentially the same strategy used by the various distributed virtual memory systems discussed in the literature [Li88, LS89, RAK89, BCZ90b] though the page coherence strategies differ in some cases. The Am Ar N policy performs quite poorly for most of our test applications. When run under the Am Ar N policy, the fish, hough, psolu, hh3d, and gauss applications all experience a page bouncing condition in which ....
Kai Li. IVY: A shared virtual memory system for parallel computing. In Proceedings of the 1988 International Conference on Parallel Processing, volume 2, pages 94--101, 1988.
....with a large cache line size has been studied in the context of the VMP multiprocessor both experimentally and with simulation [18]. The idea of providing a shared memory abstraction on distributed memory, message passing architectures in software has been studied by several groups. Kai Li's work [36, 37], based on on-demand copying of pages between memories (page-granularity software caching) with a directory-based invalidation coherency scheme, provides a complete shared memory abstraction to the programmer. Clouds [40] is an object-oriented system that provides a shared memory abstraction on a ....
....[13] is approximated by Am N Cp. Cox and Fowler's policies in [19] are essentially Pm Prm TDi and Pm Prm N NS. The simple strategy described by Bolosky, Scott, and Fitzgerald [14] (with threshold of 1) is WmWrm N NS. The distributed shared memory forms of migration and replication policies (e.g. [36]) are Am Am N and Am Ar N. Our results show that there are memory management policies implemented in our system that can improve the performance of programs written using the simpler uniform memory access (UMA) programming model. While achieving the level of performance of a highly tuned NUMA ....
K. Li. IVY: A shared virtual memory system for parallel computing. In Proceedings of the 1988 International Conference on Parallel Processing, volume 2, pages 94--101, 1988.
....program and when experimenting with different optimization strategies. On distributed memory platforms, the lack of hardware support to directly access remote memories has prompted a variety of software based, logically shared systems. Broadly speaking, there are distributed shared memory (DSM) [Li, 1988, Bennett et al. 1990b, Amza et al. 1996] and distributed shared data (DSD) [Bal et al. 1992, Sandhu et al. 1993, Johnson et al. 1995] systems. Some hybrid systems combine properties of both DSM and DSD. At one end of the spectrum, DSM systems typically emulate hardware based shared memory ....
....SR model, with the same disadvantage of requiring the programmer to explicitly bind memory regions to a region identifier. 2.2.4 Distributed Shared Memory: TreadMarks A variety of distributed shared memory (DSM) systems have been designed, implemented, and evaluated [Stumm and Zhou, 1990]. Ivy [Li, 1988] was the first DSM system. Since then, other systems, including Midway [Bershad et al. 1993] and Blizzard [Schoinas et al. 1994] have also been described in the literature. The TreadMarks system [Keleher et al. 1994, ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. Proceedings of the 1988 International Conference on Parallel Processing, volume II, pages 94--101, August 1988.
....which hides distribution and communication. The model makes programming easier; however, to be useful, it should be implemented efficiently on disjoint memories. Several existing systems use replication for implementing shared memory. The best known example is the Shared Virtual Memory of Li [20]. The unit of replication is a page; several copies of the same page may be stored on different processors. Stumm and Zhou also provide a good comparison of existing implementation techniques of distributed shared memories [27]. This paper concerns another model, namely, the shared object model. ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", Proc. 1988 Int. Conf. Parallel Processing, pp. 94-101, Vol.II, August 1988.
....of techniques that could allow each approach to benefit from the best features of the other. 1. Introduction One of the most contentious debates in the software distributed shared memory (DSM) community has been between the proponents of object based systems [1-3] and page based systems [4-6]. The former advocate using program objects as a natural consistency granularity. The advantages are clear: specifying objects by name limits the scope of the consistency action to the object's extent. This limitation reduces the amount of data that needs to be transferred, allows the data to be ....
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing," in Proceedings of the 1988 International Conference on Parallel Processing, August 1988.
.... and prototype [20] uses the traditional virtual memory access protection mechanisms to detect access misses and implements a sequential consistency model [17]. The main advantage of the approach is that it implements shared memory entirely in software on a network of commodity workstations [19] to run applications developed for hardware shared memory multiprocessors. A disadvantage is that it restricts the coherence granularity to be a virtual memory page size. For systems with large page sizes, false sharing and fragmentation will occur in applications with multiple writer, fine grained ....
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the
No context found.
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. Proceedings of the 1988.
No context found.
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proceedings of the International Conference on Parallel Processing, pp.94-101, August 1988.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", Proceedings of the International Conference on Parallel Processing, 1988.
No context found.
K. Li. Ivy: A shared virtual memory system for parallel computing. In Proceedings of the 1988.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", In Proceedings of the International Conference on Parallel Processing, 1988.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", In Proceedings of the International Conference on Parallel Processing, 1988.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing", in: Proceedings of the International Conference on Parallel Computing, 1988.
No context found.
K. Li. IVY: A shared virtual memory system for parallel computing. Proceedings of the 1988.
No context found.
K. Li. IVY: A shared virtual memory system for parallel computing. In Proc. of the International Conference on Parallel Processing, pages 94--101, 1988.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing," in Proceedings of the International Conference on Parallel Computing, 1988, pp. 94-101.
No context found.
Kai Li. Ivy: A shared virtual memory system for parallel computing. In Proceedings of the 1988 International Conference on Parallel Processing, volume II Software, pages 94--101, August 1988.
No context found.
K. Li. IVY: A Shared Virtual Memory System for Parallel Computing. In Proc. of the 1988.
No context found.
K. Li, IVY: a shared virtual memory system for parallel computing, in: Proceedings of the International Conference on Parallel Processing (ICPP), vol. 2, 1989, pp. 94--101.
No context found.
Kai Li, `IVY: a shared virtual memory system for parallel computing', Proceedings 1988 International Conference on Parallel Processing, volume 2, August 1988, pp. 94--101.
No context found.
K. Li, "IVY: A Shared Virtual Memory System for Parallel Computing," Proceedings of the 1988 International Conference on Parallel Processing, August 1988, pp. 94-101.