J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


This paper is cited in the following contexts:
Shape-based Cost Analysis of Skeletal Parallel Programs - Hayashi

....in a window surrounding the corresponding pixel in the input image, and split and merge (SAM) where an image is partitioned into slices, an operation is applied to each slice then the results are merged. Performance models for the skeletons are derived in terms of the WPRAM computational model [65] and the execution time for a skeleton is presented as a generic higher order complexity function. The time complexity of the particular application is derived when the skeleton with a specific set of parameters is instantiated. The approach is illustrated by some examples from image processing, ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


Concurrent Sharing through Abstract Data-types: A Case Study - Goodeve et al. (1996)   (1 citation)

....application. ing the problem. By choosing a suitably small task size, an effective load balance across the machine may be obtained. Figure 5 shows the speedup and efficiencies obtained from executing this algorithm using the shared queue on a simulator for a weak shared memory machine, the WPRAM [16]. The underlying machine model consists of many processors with their own local memories connected by a CLOS network configuration [4]. The WPRAM embodies a performance model that assumes O(log 2 n) time implementations of combining operations and O(log n) time memory operations. The WPRAM ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


BSP Scheduling of Regular Patterns of Computation - Calinescu (1997)

....must consistently permit both the design of general purpose parallel architectures and the development of portable software for them. Candidates for this role have not delayed their appearance: the XPRAM model of Valiant [66], the LogP model of Culler et al. [17], the WPRAM model of Nash et al. [51] are but a few examples of possible such models. The most interesting proposal, however, is the bulk synchronous parallel (BSP) model introduced by Valiant in [65]. Since its emergence in the early 1990s, the BSP model has continuously attracted the attention of many parallel algorithm and ....

J.M. Nash et al. Parallel algorithm design on the WPRAM model. Technical Report 94.24, Division of Computer Science, University of Leeds, July 1994.


A BSP Scheduling Tool for Loop Nest Parallelisation - Calinescu (1997)

.... models, which encompass many of the characteristics that made the von Neumann model so successful in the realm of sequential computing, include the bulk synchronous parallel (BSP) model of Valiant [30, 24], the LogP model of Culler et al. [12], the Weakly coherent PRAM (WPRAM) model of Nash et al. [27], etc. Due to its simplicity, elegance and generality, the BSP model represents one of the most attractive of these proposals. This position is further justified by the recent release of a worldwide standard BSP programming library [16] and of a BSP application development toolset [15, 17] ....

J.M. Nash et al. Parallel algorithm design on the WPRAM model. Technical Report 94.24, Division of Computer Science, University of Leeds, July 1994.


Communication-Efficient Parallel Sorting - Goodrich (1996)   (74 citations)

....model, which would allow for arbitrary broadcasts, or even a CRCW BSP model, which would also allow for messages to the same location to be combined (using some arbitration rule). These models correspond to models Gibbons [21] calls the CREW phase PRAM and CRCW phase PRAM. In fact, Nash et al. [35] consider even more powerful bulk synchronous models in which messages can be combined using more powerful combining functions (such as add or bitwise and). We personally feel that anything more powerful than a weak CREW BSP computer is probably not a realistic parallel model, given the ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel algorithm design on the WPRAM model. Technical Report 94.24, School of Computer Science, University of Leeds, 1994.


Constructive Solid Geometry Using Algorithmic Skeletons - Davy, Deldari, Dew (1996)   Self-citation (Dew Davy)

....effects of tree imbalance are eliminated and a naturally load balanced algorithm is achieved. Fuller details can be found in [2] and the second author's forthcoming PhD thesis. A parallel PMC program using this tree contraction technique was implemented in C. The target system was the WPRAM [16], a general purpose scalable shared memory computational model developed at the University of Leeds. Currently the WPRAM exists only as a simulator, though an implementation using Inmos T9000 transputers is under way. The simulator includes a detailed performance model, parameterised by the ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel algorithm design on the WPRAM model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


Implementation Issues Relating to the WPRAM Model for.. - Nash, Dew, Davy, Dyer (1996)   Self-citation (Nash Dew Dyer Davy)

....use of supersteps in the BSP makes it most applicable to data parallel problems in which the patterns of synchronisation are very regular. The LogP model provides more flexibility, but operating at the message passing level can result in highly complex code. The WPRAM model (Weakly coherent PRAM) [21, 24] also provides a realistic cost model, allowing performance prediction, but provides greater flexibility and operates at a higher level of abstraction than message passing. In contrast to the BSP and LogP models, the WPRAM uses a weakly coherent shared address space [8] to communicate information. ....

.... algorithms, through the ability of these operations to replace the more common use of locking mechanisms (with the implied sequentialisation of the associated code). Previous work on the WPRAM model has emphasised algorithm design and performance prediction, in such areas as linear programming [24], computational geometry [7, 25], sorting [25], image processing [6] and solid modelling [5]. In addition, the ability to support highly concurrent data structures, such as a FIFO queue [22], has been demonstrated, together with the application to dynamic load balancing [23]. Many of these ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


Dynamic Load Balancing using a Highly Concurrent Shared Data .. - Nash, Dew, Davy, Dyer (1996)   Self-citation (Nash Dew Dyer Davy)

....router and IBM SP2 high performance switch) which provide very high bandwidths. Example machines are the Cray T3D/E, IBM SP2, Intel Paragon and the forthcoming Tera MTA, which scale in performance to 100s or 1000s of processors. 2 The WPRAM Model The aim of the WPRAM (Weakly Coherent PRAM) [2] is to provide an instruction set enabling the development of scalable algorithms, based around a shared memory abstraction, which can support diverse forms of parallelism. The model is targeted at the above class of scalable machines, with no worse than logarithmic latencies and linear increases ....

J.M. Nash, P.M. Dew, M.E. Dyer and J.R. Davy, Parallel Algorithm Design on the WPRAM Model. In Abstract Machine Models for Highly Parallel Computers, Oxford University Press, 1995, pp 83-102.


The Consistency Properties of a Scalable, Concurrent Queue - Goodeve, Davy, Dew, Nash (1996)   (1 citation)  Self-citation (Nash Dew Davy)

....are well understood[11, Implementation techniques are known for these primitives that offer high scalable throughput, whilst retaining a latency that only grows slowly with system size. There is good reason to believe that future generation systems will support these primitives efficiently[19, 23, 18, . Tag variables provide a basic pair wise synchronisation mechanism. A write operation on a tag stores a value into the variable and sets an active flag. The set operation does not alter the value, but just sets the active flag. A read operation on the tag only completes once the active flag is ....

....has blocking semantics which does not guarantee liveness in the presence of failures. 3 Performance The queue has been designed to deliver throughput that scales with the size of a system, thus effectively supporting scalable algorithms. The algorithms were developed using the WPRAM model[19, , which has a well defined execution cost model. Based on this model, the complexity of the enqueue and dequeue operations are both O(n/p log p) for n concurrent enqueueing and dequeuing processes on a p processor machine[20, To practically evaluate the performance of the queue, a simple ....

[Article contains additional citation context not shown here]

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


The Performance of Parallel Algorithmic Skeletons - Deldarie, Davy, Dew (1995)   (8 citations)  Self-citation (Dew Davy)

....terms of the isoefficiency function. Case studies illustrating these ideas are presented from the field of image processing, using two skeletons derived from existing special purpose parallel languages. Performance models for these skeletons are developed in terms of the WPRAM computational model [7]. From these, the time complexity and isoefficiency function are derived for a number of applications. Measured performance on the WPRAM simulator shows a close match to theoretical predictions. 2. Algorithmic Skeletons A skeleton provides an interface and an implementation for a set of ....

....described using functional languages, this is not essential. A simple experimental testbed has been developed in which customising functions are defined in C, enabling their easy incorporation in the parallel templates. 4. Implementation on the WPRAM model The WPRAM computational model [7], developed at Leeds University, is a variant of the well known PRAM model. It is based on the idea of Bulk Synchronous Parallelism (BSP) [11], extended to exploit more fully the features of MIMD machines. BSP is conceived as a universal model, providing a bridging level between parallel languages ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers. Oxford University Press, 1995. (forthcoming).


Scalable Caching Techniques for a Weakly Coherent Memory - Zamanifar, Nash, Dew (1995)   Self-citation (Nash Dew)

....to update their set of columns. The use of a barrier operation guarantees the coherency of the new data, and supports the required program ordering. The use of multiple copies of some of the shared variables is used to reduce the potential number of synchronisation points within the solution [24]. The algorithm completes when there are no negative objective coefficient values remaining. 3.2 Using a one level memory hierarchy for sharing data Figure 4 shows the related pseudo code based on the one level memory hierarchy. The same set of shared variables are used. It is assumed that each ....

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.


Practical Structured Parallelism Using BMF - Crooke (1998)

No context found.

J. M. Nash, P. M. Dew, M. E. Dyer, and J. R. Davy. Parallel Algorithm Design on the WPRAM Model. In J. R. Davy and P. M. Dew, editors, Abstract Machine Models for Highly Parallel Computers, pages 83--102. Oxford University Press, 1995.
