Andrew Jonathan Bennett. Parallel Graph Reduction for Shared-Memory Architectures. PhD thesis, Imperial College, September 1993.
....• a prototype implementation of an example ADT, called futurespace, which demonstrates how the idea can be used for general purpose portable parallel programming in conventional languages. Simulation results • Studied caching mechanisms to support parallel functional program execution ([Ben93], [BK93]) • In high bandwidth, high latency architectures, communication costs can be dramatically reduced using very large cache lines because of spatial locality. • With conventional cache coherency protocols, this benefit is very limited because contention for write access to cache ....
Andrew Jonathan Bennett. Parallel Graph Reduction for Shared-Memory Architectures. PhD thesis, Imperial College, September 1993.
....of the PGR regime. We are particularly concerned with modern multiprocessors which have relatively high latency and high bandwidth interconnection networks, since these characteristics greatly influence the behaviour and speedup of programs. Preliminary results from this work were given in [7], and have been presented in conference papers [8, 9]. This paper reports results from a larger and more interesting benchmark suite, and includes more detailed analyses of simulation results, weak consistency and interconnection network bandwidth issues. The remainder of the paper is structured ....
....design The performance and behaviour of a set of benchmark programs executing under PGR with three types of shared memory (ideal, invalidation, and two level ownership) was studied using a series of simulation experiments. A comprehensive description of the experimental design can be found in [7]. Here the most important aspects are summarised. We have chosen to use simulation rather than an implementation on real hardware since it offers a number of distinct advantages: it allows the behaviour of the system to be closely monitored without affecting its behaviour, permitting, for ....
[Article contains additional citation context not shown here]
Andrew J. Bennett. Parallel graph reduction for shared-memory architectures. PhD thesis, Department of Computing, Imperial College, London, July 1993.
....greatly simplifies the task of locating the owner of a line: it is implicit in the address. 4 Experimental Design The performance and behaviour of the new protocol have been assessed using a series of simulation experiments. A comprehensive description of the experimental design can be found in [4]. Here the most important aspects are discussed. We have chosen to use execution driven simulation, which eliminates the validity problems of trace driven simulation since the simulated processors read the data as it is at the simulated time at which the reference is made. A simplified ....
Andrew J. Bennett. Parallel graph reduction for shared-memory architectures. PhD thesis, Department of Computing, Imperial College, London, 1993.
....copy set of each cache line in the shared memory in order to determine the latency of each heap reference. Further information is also associated with each cell and cache line to enable the performance of the cache system to be closely monitored. The simulation scheme is more fully described in [4]. 5.1 Architectural Model We basically assume a 32 bit non pipelined load store architecture, with the following assumptions: • The stack, private data and code regions of each process are served by a separate, perfect cache system; each read or write to these regions has a latency of one ....
Andrew J. Bennett. Parallel graph reduction for shared-memory architectures. PhD thesis, Department of Computing, Imperial College, London, 1993.
No context found.
Andrew Jonathan Bennett. Parallel Graph Reduction for Shared-Memory Architectures. PhD thesis, Imperial College, September 1993.