W. D. Hillis and G. L. Steele Jr. Data Parallel Algorithms. Communications of the ACM, Vol. 29, No. 12, pages 1170--1183, December 1986.


This paper is cited in the following contexts:


On the Implementation of an Inference Algorithm in Java - Rus (1999)   (Correct)

.... systems that support a true message passing implementation tend to provide natural support for parallelism and distribution [11, 23]. Those systems that implement messages as either statically or dynamically bound procedure calls do not provide such support for parallelism and distribution [22] (although their conceptual model implies that they do). Another important concept underlying object orientation is the idea of a class. Classes in object oriented systems provide a means of classifying objects [10]. Different objects that belong to a common class share identical behavior. The ....

Hillis, W., and Steele, G. Data parallel algorithms. Communications of the ACM 29, 12 (December 1986), 1170-1183.


Copyright by Manish Gupta, 1992 - Automatic Data Partitioning   (Correct)

....the target parallel program for a multicomputer. For most compilers, this parallel program corresponds to the SPMD (Single Program Multiple Data) model [39] where all processors execute the same program, but operate on distinct data items, thus enabling the exploitation of data parallelism [28]. These research efforts include the Fortran D compiler [30, 31] and the Superb compiler [81] both accepting Fortran 77 as the base language. The Crystal compiler [15] and the Id Nouveau compiler [62] are targeted for single assignment languages. Numerous other compilers, Dataparallel C [59] ....

W. Hillis and G. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


Mixed Programming Metaphors in a Shared Dataspace Model of.. - Roman, Cunningham (2003)   (13 citations)  (Correct)

.... to organize the computation dynamically in response to the unpredictable structure of the data being processed (e.g. on a region by region basis in the labeling problem). Neither Linda nor the traditional approaches to concurrent computation, such as the UNITY paradigm and the data parallel [17] computing style used to write Connection Machine algorithms, can accomplish this. Linda's limitations are the result of a language development philosophy different from that of Swarm, a philosophy which favors an efficient implementation over programming convenience and the capability to reason ....

W. D. Hillis and G. L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


Data Compiling for Systems of Affine Recurrence Equations - Mongenet (1994)   (Correct)

....systems of affine recurrence equations. Keywords: parallelism, mapping, systems of recurrence equations, data parallelism, data compilation, communications. 1: Introduction. Data parallelism is often considered as a general programming model in the context of massively parallel architectures ([7]). This model is founded on the virtualization of data and processes. A program is composed of parallel operations in which the mechanisms managing the data accesses are hidden. To be efficiently used, this programming model requires powerful tools such as parallelizing compilers. Such a compiler ....

Hillis W.D., Steele G.L., Data parallel algorithms. Communications of the ACM, 29(12), pp. 1170--1183, 1986.


Representing and Executing Agent-Based Systems - Fisher (1994)   (24 citations)  (Correct)

.... is that, in recent years, low level mechanisms for efficient broadcast have been developed in many computer systems [4]. Further, not only is broadcast one of the basic communication mechanisms on local area networks [19], but also the advent of novel parallel architectures (e.g. data parallelism [14, 15]) has meant that more powerful programming techniques based upon broadcast communication are beginning to be developed. 4 Implementing Agent Behaviour. Having considered both the representation of behaviour within individual agents, and the communication mechanism between agents, we will now ....

W. D. Hillis and G. L. Steele. Data parallel algorithms. Comm. ACM, 29(12):1170--1183, December 1986.


Scalable Peer-to-Peer Indexing with Constant State - Considine, Florio (2002)   (4 citations)  (Correct)

....3.2.2 Group Maintenance. Once the ring topology has been constructed, the next key component to maintaining the jump pointers is to estimate the value of m. Once m is estimated, we can divide the ring into groups. This is essentially the well known pointer jumping technique in parallel computing [4]. [Figure 3: Visual comparison between the finger tables of Chord and the jump pointers of our topology. (a) Chord topology; (b) Our topology.] x = successor.predecessor if self.hash x.hash successor.hash successor = x notify(successor) a) Chord stabilization function if predecessor.hash ....

....changed size estimates which caused the jump pointers to be reorganized again. Since this is a distinct problem in its own right, we defer this question to Appendix A where we present a novel solution based on a distributed version of skip lists [13] and the classic pointer doubling approach [4]. For the rest of this discussion, we assume that the network size is known with sufficient accuracy to pick an appropriate m within O(log n) rounds of any changes at the cost of a constant number of edges per node. To speed up the process of assigning ranks, we first divide the ring We ....

HILLIS, W. D., AND GUY L. STEELE, J. Data parallel algorithms. Communications of the ACM 29, 12 (1986), 1170-- 1183.


Executing Multithreaded Programs Efficiently - Blumofe (1995)   (12 citations)  (Correct)

....Likewise, the Cilk programming model and runtime system including Cilk NOW build on ideas found in earlier systems. In this section, we look at other theoretical results and systems that address scheduling issues for dynamic parallel computation. We shall not look at data parallel systems [8, 53] nor at systems focused on infrastructure such as distributed shared memory [4, 6, 29, 39, 59, 60, 66, 73, 87, 92, 93] or message passing [43, 96, 104, 105]. Substantial research has been reported in the theoretical literature concerning the scheduling of ....

W. Hillis and G. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


The Message-Driven Processor: A Multicomputer Processing.. - William Dally Roy (1992)   (85 citations)  (Correct)

....require hundreds of instructions to create a new process. This cost prohibits the use of fine grain programming models where processes typically last only a few tens of instructions. The MDP supports a broad range of parallel programming models (including shared memory [16], data parallel [17], dataflow [26], actors [1], and explicit message passing [5]) by providing low overhead primitive mechanisms for communication, synchronization, and naming. Communication mechanisms are provided that permit a user level task on one node to send a message to any other node in a 4K node machine in ....

W. Daniel Hillis and Guy L. Steele. Data Parallel Algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


Condition Graphs - Barklund, Hagner, Wafin (1988)   (Correct)

....in CIM; (i) if any of the resolvents produced is the empty clause then a solution has been found, (ii) if there is no link to a goal atom then the proof has failed, otherwise (iii) the CG is suspended, that is, no tests are ground. 6. CIM ON THE CONNECTION MACHINE. The Connection Machine (CM) [10,11] is a novel parallel computer architecture, grown out of the observation that a fundamental bottleneck in computing is the communication between processors and memory. In the Connection Machine each processor is comparatively simple and has direct access to only a small amount of memory. As a ....

W. D. Hillis, G. L. Steele Jr., "Data Parallel Algorithms," Communications of the ACM 29 (December 1986): 1170--1183.


A Framework for Parallel Job Scheduling - Subramanian (1995)   (Correct)

.... Nets [Rei85] • Alternating Turing Machines [CKS81, Ruz80] • Boolean circuits [CSV84, SV84] • Systolic Arrays [Kun82] • Associative Processors [Pot92, SKA92] • PRAM (several varieties: EREW, CREW, several kinds of CRCW) [FW78, Gol82, SS79] • V-RAM (data parallel) model [Ble90, HS86]. Each model sprang from a different research community in response to completely different problems. As a result, these models are so far removed from each other that it is often difficult to translate research advances from one area to another. Fortunately, there finally seem to be some hints ....

W.D. Hillis and G.L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


Analysis of Parallelism in Recursive Functions on Recursive Data .. - Ahn, Han   (Correct)

....functions with complex data flow. 1 Introduction Data parallelism implements parallel computation by simultaneously applying the same operation to each element of a data collection. This paradigm is considered as a parallel model which can solve many difficulties of parallel programming [2, 10, 20]. A single stream of program execution guarantees easier programming and better readability. Massive parallelism can be easily obtained by distributing large data collections. In functional languages, data parallelism is expressed with the following two components. parallel data collections ....

W.D. Hillis and G. Steele. Data Parallel Algorithms. Communications of the ACM, 29(12):1170--1183, 1989.


A Location-Independent ASOCS Model - Rudolph (1991)   (2 citations)  (Correct)

....Parallelism. The main goals in creating parallel systems are increased computational speed and ease of programming. Much work in recent years has gone into extending the conventional von Neumann (fetch and execute) computing paradigm to handle parallel schemes [Feng81], [Hayn82], [Hill85], [Hill86], [Hwang87a], [Hwang87b], [Walt87]. Machines that use the von Neumann paradigm in parallel are classified as conventional parallel machines. Examples of these are the Connection Machine [Hill85], the Supercomputer [Hwang87a], pipelined processors and systolic arrays. Research has shown that the ....

Hillis, W.D., Guy L. Steele, Jr. "Data Parallel Algorithms." Communications of the ACM, Vol. 29, #12, pp. 1170-1183. (December 1986).


Fortran D Language Specification - Fox, Hiranandani, Kennedy, Koelbel.. (1991)   (61 citations)  (Correct)

....FO90] and Delirium [LS91] are valuable when used to coordinate coarse grained functional parallelism. However, these languages do not meet the needs of computational scientists because they are inefficient for capturing fine grain data parallelism (of the type described by Hillis and Steele [HS86] and Karp [Kar87]). This is mainly due to the fact that existing parallel languages lack both language and compiler support to assist in efficient data placement [PB90]. Parallelism must also be explicitly specified because these languages do not provide compilers that can automatically detect ....

W. Hillis and G. Steele, Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


The Data Field Model - Lisper, Hammarlund (2001)   (Correct)

....can be very convenient. This model is an instance of collection oriented programming [58]. The classical example is APL [16], which provides arrays and a rich set of operations on them. Indexed data structures are very important in high performance computing. The data parallel programming model [25] is a collection oriented paradigm for explicit parallelism, originally for SIMD architectures where distributed entities, like arrays indexed by processor coordinates, are manipulated in parallel. Many historical data parallel languages, like C* and *Lisp for the Connection Machine [64, 65] ....

W. D. Hillis and G. L. Steele, Jr. Data parallel algorithms. Comm. ACM, 29(12):1170--1183, Dec. 1986.


Parallel Algorithmic Techniques for Combinatorial Computation - Eppstein, Galil (1988)   (26 citations)  (Correct)

.... nevertheless we hope that the techniques described will be useful not only on shared memory machines, but also on other types of parallel computers, either through simulations of shared memory [48, 70, 35, 57] or through analogous techniques for different models of parallel computation (e.g. see [30]). 1. THE MODEL OF PARALLELISM. The model of parallel computation we use is the shared memory parallel random access machine (PRAM). This model consists of a collection of identical processors and a separate collection of memory cells; any processor can access any memory cell in unit time. ....

W.D. Hillis and G.L. Steele, Data Parallel Algorithms. C. ACM 29(12), 1986, 1170--1183.


Generating Parallel Programs from Skeleton Based Specifications - Parsons, Rabhi (1998)   (1 citation)  (Correct)

....interaction between points in the data space, and therefore communication. Specifying and implementing this communication is an important challenge in designing skeleton based programming systems. 3.1.1 SIT Algorithms. The SIT skeleton is powerful enough to capture many data parallel algorithms [24]. One example is the Jacobi Iterative Solver [30, p. 1019]. This algorithm operates on a rectangular grid, where every point contains a value derived from some discretisation technique. Transformation steps are performed on every point in the local data space which does not lie on the boundary of ....

W.D. Hillis and G.L. Steele Jr. Data Parallel Algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


Design, Implementation and Evaluation of ParaDict, a Data.. - Gabarro, Silvestre (2001)   (Correct)

....on 2-3 trees given by W. Paul, U. Vishkin and H. Wagener in [18]. Since these algorithms were originally described in an informal style, our task has been done in two steps. First, we wrote the algorithms in a pidgin data parallel language close to the one given by D. Hillis and G. Steele in [12]. Widely applying stepwise refinement development techniques, we obtained clear, complete and readable high level algorithms. In the second step, we coded these algorithms into the C programming language [24]. To evaluate the resulting code, the running time of some usual operations has been ....

....and quite challenging for a C programmer. As a consequence, the implementation of the algorithms was done in two steps. In the first step, we transform the informal algorithms into an unambiguous description, somehow related to the data parallel pidgin language defined by Hillis and Steele [12]. We follow the directions given by Gabarro and Gavalda [8] dealing with the correctness of data parallel algorithms and apply the same modularity and descending design techniques well known in the case of designing sequential algorithms. As in [9], the result was a clear and readable high-level ....

D. Hillis and G. Steele. Data parallel algorithms. CACM, 29(12):1170--1183, 1986.


Parallel Functional Programming on Recursively Defined Data.. - Nishimura, Ohori (1993)   (Correct)

....calls still requires a series of n communications that must be processed sequentially. However, there is a data parallel algorithm, originally due to Wyllie (1979), which computes the sum of an n element integer list in O(log n) parallel steps. Here we use the presentation by Hillis and Steele (Hillis & Steele Jr. 1986). The main idea behind the algorithm is to use an extra chum pointer to maintain information about the necessary value to complete a partially computed recursive call. The following pseudo code describes such an algorithm that computes the suffix sum of an integer list with extra chum pointer, where ....

Hillis, W.D., & Steele Jr., Guy L. (1986). Data parallel algorithms. Communications of ACM, 29(12), 1170-1183.
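
The pointer-doubling list sum described in the Nishimura and Ohori context above can be sketched in Python. This is a sequential simulation under stated assumptions, not the pseudo code from either paper: the names `suffix_sums`, `values`, `next_index`, and `chum` are hypothetical, the list is stored as index arrays, and each loop iteration stands in for one synchronous parallel step.

```python
def suffix_sums(values, next_index):
    """Suffix sums of a linked list by pointer doubling.

    values[i] is the value stored in cell i; next_index[i] is the index of
    the following cell, or None at the tail. Each round simulates one
    synchronous step: every cell reads its chum's state from the previous
    round, then all cells update together.
    """
    total = list(values)
    chum = list(next_index)
    rounds = 0
    while any(c is not None for c in chum):
        new_total = list(total)
        new_chum = list(chum)
        for i, c in enumerate(chum):  # conceptually "for all cells in parallel"
            if c is not None:
                new_total[i] = total[i] + total[c]  # absorb the chum's partial sum
                new_chum[i] = chum[c]               # jump twice as far next round
        total, chum = new_total, new_chum
        rounds += 1
    return total, rounds


if __name__ == "__main__":
    # Cells 0 -> 1 -> 2 -> 3 -> 4 holding 1..5.
    print(suffix_sums([1, 2, 3, 4, 5], [1, 2, 3, 4, None]))
```

Because the distance covered by each chum pointer doubles every round, a list of n cells finishes in about log2(n) rounds; the snapshot copies (`new_total`, `new_chum`) are what stand in for lockstep execution.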


Compiling for Massively Parallel Machines - Philippsen, Tichy (1991)   (10 citations)  (Correct)

....among synchronous and asynchronous execution mode at any level of granularity. Thus, programs can use SIMD mode for a synchronous algorithm, or use MIMD mode where asynchronous concurrency is more appropriate. The two modes can even be intermixed freely. The data parallel approach, discussed in [9] and exemplified in languages such as *LISP, C*, and MPL is currently quite successful, because it has reduced machine dependence of parallel programs. Data parallelism extends a synchronous, SIMD model with a global name space, which obviates the need for explicit message passing between ....

....of SIMD machines with an important generalization for parallel branches. As an example, consider the computation of all postfix sums of a vector V of length N. The program should place into V[i] the sum of all elements V[i] ... V[N-1]. A recursive doubling technique as in reference [9] computes all postfix sums in O(log N) time, where N is the length of the vector.

    VAR V: ARRAY[0..N-1] OF REAL;
    VAR s: CARDINAL;
    BEGIN
      s := 1;
      WHILE s < N DO
        FORALL i: [0..N-1] IN SYNC
          IF (i+s) < N THEN V[i] := V[i] + V[i+s] END
        END;
        s := s * 2
      END
    END
    ....

Hillis WD, Steele GL. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


Modula-2* and its Compilation - Philippsen, Tichy (1991)   (3 citations)  (Correct)

....and asynchronous execution mode at any level of granularity. Thus, programs can use SIMD mode where proper synchronization is difficult, or use MIMD mode where synchronization is simple or infrequent. The two modes can even be intermixed freely. The data parallel approach, discussed in [6] and exemplified in languages such as *LISP, C*, and MPL is currently quite successful, because it has reduced machine dependence of parallel programs. Data parallelism extends a synchronous, SIMD model with a global name space, which obviates the need for explicit message passing between ....

....of SIMD machines with an important generalization for parallel branches. As an example, consider the computation of all postfix sums of a vector V of length N. The program should place into V[i] the sum of all elements V[i] ... V[N-1]. A recursive doubling technique as in reference [6] computes all postfix sums in O(log N) time, where N is the length of the vector. Figure 1 illustrates the process. The program operates by computing partial sums of length s = 2^j, where j counts the iterations. The inner forall creates N processes. Note that there is a one to one mapping ....

W. Daniel Hillis and Guy L. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.
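
The recursive doubling computation of postfix sums described in the Philippsen and Tichy context above can be sketched in Python as a sequential simulation. The function name `postfix_sums` and the explicit snapshot `prev` are assumptions made for illustration; the snapshot plays the role of the synchronous forall, so every element reads values from before the current step.

```python
def postfix_sums(v):
    """Return a list whose entry i is v[i] + v[i+1] + ... + v[N-1].

    After iteration j, entry i holds the sum of up to 2**(j+1) consecutive
    elements starting at i, so about log2(N) iterations are enough.
    """
    v = list(v)
    n = len(v)
    s = 1  # current partial-sum length, doubles each iteration
    while s < n:
        prev = list(v)  # snapshot: simulates all N processes reading in lockstep
        for i in range(n):  # conceptually one process per element
            if i + s < n:
                v[i] = prev[i] + prev[i + s]
        s *= 2
    return v


if __name__ == "__main__":
    print(postfix_sums([1, 2, 3, 4]))
```

Without the snapshot, a sequential loop could read an element that had already been updated in the same step, which is the hazard that synchronous execution removes.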


CoPa: a Parallel Programming Language for Collections - Suciu, Tannen (1998)   (Correct)

....for sets. There has been a good deal of work on data parallel techniques and languages that are all or in part data parallel: there exist parallel extensions of FORTRAN, like High Performance Fortran [43] and PTRAN [3, 24]; parallel extensions of C, like C [69] C [45]; of Lisp, like CM Lisp [39] and Paralation Lisp [60, 51]; and applicative parallel programming languages, like NESL [8, 10, 11], Sisal [31, 62, 30], Crystal [21], Proteus [54, 34, 59], and Data parallel ML [32, 37, 38]. None of these have been concerned with query constructs and their integration in such languages. In this ....

Daniel Hillis and Guy Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


The Data-Parallel Programming Model: a Semantic Perspective - Bouge (1992)   (3 citations)  (Correct)

....to clarify in the area of sequential computing, and we can see now that there is no fixed connection between architectures and languages. It is considered as a desirable feature that all languages be available on all (most ) architectures. A similar awareness is now emerging in parallel computing [13]. But there is still a long way to a complete separation between parallel architectures and parallel languages. We have tried to demonstrate that data parallelism is a first step in this direction. Another remark is that such a separation is desirable only if compilers are intelligent enough to ....

W.D. Hillis, G. Steele, Data Parallel Algorithms, Comm. ACM 29, 12 (1989) 1170--1183.


C++ and Massively Parallel Computers - Lickly, Hatcher   (Correct)

....across parallel architectures and leverages the existing compiler technology for translating data parallel programs onto both SIMD and MIMD hardware. 1 Introduction. The data parallel programming model is based upon the simultaneous execution of the same operation across a set of data [9]. Most scientific and engineering problems, and many others, have data parallel solutions. The model's single locus of control simplifies design, implementation, and debugging of programs. The model is high level, which greatly enhances the portability of programs across sequential, SIMD, and MIMD ....

W. Hillis and G. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170-- 1183, December 1986.


Protocol Compilation: High-Performance Communication for Parallel .. - Felten (1993)   (25 citations)  (Correct)

....code. This model is supported directly by SIMD machines, including the Connection Machine CM-1 and CM-2 architectures and the MasPar MP-1 and MP-2; the existence of these machines, which run only data parallel code, has fostered the development of a rich variety of data parallel algorithms [Hillis Steele 86]. Recent research has shown how (SIMD) data parallel programs can be run efficiently on MIMD hardware [Hatcher Quinn 91, TMC 91]. This is done by using compile time analysis to determine where and when synchronization and communication are needed to maintain the illusion of lockstep behavior ....

W. D. Hillis and G. L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


The Cranium Network Interface Architecture: Support for Message.. - McKenzie (1997)   (Correct)

.... dense matrix multiplication (DMM) is called Cannon's algorithm [88]. The algorithm is very regular and well suited to special purpose systolic cellular automata hardware [89]. It has been implemented on general purpose multicomputers using a variety of parallel high level languages including C [90], Spot [91] and Orca C [92]. Here is a brief description of the algorithm. For simplicity, the two input matrices A and B are square and each contains n × n elements. The desired product C is AB and is also an n × n matrix. The initialization step in Cannon's algorithm requires the array ....

W. Daniel Hillis and Guy L. Steele, Jr. Data parallel algorithms. Communications of the ACM 29(12), December 1986, pp. 1170-1183.


A Distributed Search Program for the 3x+1 Problem - Leavens (1989)   (Correct)

.... best way to proceed would be to have each processor check a single number rather than an interval, have each processor that found a peak record that peak, and then have each processor work on the next number as determined by adding the number of processors 1 to the number it worked on before [HS86] This algorithm would be difficult to manage if the processors could not move in lock step. Moreover, it would be difficult to synchronize the recording of peaks, since it is possible that two processors would find a peak during the same step of the algorithm. 2.2.8 Availability As the system ....

W. Daniel Hillis and Guy L. Steele Jr. Data Parallel Algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.
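
The lockstep scheme described in the Leavens context above (each processor checks one number, then advances by the number of processors) can be sketched as a sequential Python simulation. Several things here are assumptions for illustration only: the excerpt does not define "peak", so the sketch treats a peak as a number whose 3x+1 trajectory needs more steps to reach 1 than any smaller number, and the names `collatz_steps` and `lockstep_peak_search` are hypothetical.

```python
def collatz_steps(n):
    """Number of 3x+1 steps needed to bring n down to 1."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps


def lockstep_peak_search(limit, nprocs):
    """Simulate nprocs processors scanning 1..limit in lockstep.

    Processor p starts on number p + 1 and, after every round, moves on by
    adding nprocs. Peaks found in the same round are recorded in increasing
    numeric order, which is one simple way to serialize simultaneous finds.
    """
    peaks = []
    best = -1
    current = [p + 1 for p in range(nprocs)]
    while current[0] <= limit:
        # One lockstep round: every active processor checks its own number.
        results = [(n, collatz_steps(n)) for n in current if n <= limit]
        for n, steps in results:
            if steps > best:  # assumed definition of a "peak" (see lead-in)
                best = steps
                peaks.append(n)
        current = [n + nprocs for n in current]
    return peaks


if __name__ == "__main__":
    print(lockstep_peak_search(10, 4))
```

Because every round covers a contiguous block of numbers and results are merged in order, the recorded peaks do not depend on the processor count in this simulation.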


Implementation Manual, Version 4.0 - Er Si On   (Correct)

....these numbers should only be regarded as performance indicators of SNNS but may not be quoted as machine architecture benchmarks. The use of simulators for neural nets almost demands the use of parallel computers like the Connection Machine CM-2 [Hil85, HS86a, HS86b] or the MasPar MP-1, because of the inherent parallelism of the algorithm. The use of parallel computers will therefore surely increase the above values. 2.3 Layer Model of the Simulator Kernel. The simulator kernel is internally structured in three layers. Each higher layer represents a higher ....

W.D. Hillis and G.L. Steele. Data parallel algorithms. ACM, 29(12):1170--1183, 1986.


Graph Augmentation And Related Problems: Theory And Practice - Hsu (1993)   (6 citations)  (Correct)

....the implementation of parallel algorithms. In view of the above, a natural strategy is to use parallel algorithms developed on an abstract parallel machine model. Several abstract models that are closely related to real parallel machine architectures have been proposed [Ble89, DNS81, GMR93, HS86, Sch80, CKP+93]. Instead of using a new model, we have performed a direct implementation of parallel algorithms based on the popular PRAM model [JaJ92, KR90, Rei93]. Although the PRAM is an idealized theoretical model that does not capture the real cost of performing inter processor ....

W. D. Hillis and G. L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29:1170--1183, 1986.


Parallel Sets: An Object-Oriented Methodology for Massively.. - Kilian (1992)   (7 citations)  (Correct)

....The basic strategy is divide and conquer. We link objects in small chains and merge the smaller chains to create longer ones. The algorithm assumes a CREW SIMD architecture similar to that of the Connection Machine [Hil85]. No assumption is made on the distribution of objects (other algorithms ([HS86], [HS87]) assume objects to be linked are within some neighborhood of one another). [Figure B.1: Linking like objects together.] [Figure B.2: Matching process.] B.1 ....

W. Daniel Hillis and Guy L. Steele Jr. Data Parallel Algorithms. Communications of the ACM, 29(12), December 1986.


From equations to hardware. Towards the systematic .. - Charot, Frison.. (1992)   (Correct)

....language defines a new storage class specifier, systolic, to allocate variables in each cell of the network. A statement operating on systolic variables describes a simultaneous execution of this operation on all the cells of the network. This programming model is referred to as data parallel [HS86]. Figure 7 provides an example of such a statement: the statement B = A is executed on the host, whenever the statement y = x is a parallel component wise assignment of systolic variable x to y. This example also shows that, despite the fact that these operations are written sequentially, they can ....

W. Hillis and G. Steele. Data Parallel Algorithms. ACM, 29(12):1170--1183, dec 1986.


Architectures Systoliques Et Parallelisme . . . - Raimbault, al. (1993)   (Correct)

....on systolic variables produces a simultaneous execution of this operation on all the cells. An instruction manipulating scalar variables runs only on the host. This parallel programming model is referred to as data parallelism [14]. The variables involved in an expression must be of the same class. The exchange of values between systolic variables and scalar variables is expressed explicitly by means of communication operators described in section 3.4. The notion of class, as it has just been ....

W. Hillis and G. Steele. Data Parallel Algorithms. ACM, 29(12):1170--1183, dec 1986.


I/O and Computation Overlap on SIMD Systolic Arrays - Lavenier, Raimbault, al. (1993)   (Correct)

.... variables used for controlling the execution sequence, such as loop counters and boolean conditions, should exist only once on the host computer (static or auto storage class). Statements operating on cell variables (systolic storage class) perform parallel execution in a data parallel fashion [4]. Mixed expressions, which attempt to combine host and network variables, are rejected by the compiler according to the locality principle. Because the cells do not have independent control, while loops, if tests and computed loop bounds on the cells are ....

W. Hillis and G. Steele. Data Parallel Algorithms. ACM, 29(12):1170--1183, dec 1986.


Specifying Problems in a Paradigm Based Parallel Programming.. - Parsons, Rabhi (1995)   (Correct)

....a single instance for the entire problem. The data set is repeatedly transformed until some termination condition is satisfied with a transformation step consisting of one or more operations on either local or global data. The paradigm is powerful enough to capture many data parallel algorithms [10]. Other examples of problems which fit this algorithmic paradigm include iterative PDE solvers [11] and n body simulations [12] 2. Specifying Problems To provide the maximum advantage from using a paradigm based programming approach, the method of specification should relate as closely as ....

W.D. Hillis and G.L. Steele Jr., Data Parallel Algorithms, Communications of the ACM 29(12), pp. 1170-1183, 1986.


The Data-Parallel Programming Model: a Semantic Perspective - Bouge (1996)   (3 citations)  (Correct)

....a combination of these regular communications, even in apparently irregular areas. Striking examples of this approach applied to linked lists can be found in a famous paper of Steele and Hillis [19]. In fact, a complete methodology of programming can be developed with these only communication primitives [3]. The macroscopic viewpoint is the basis for data parallel array manipulation languages. We must of course cite their (glorious) ancestors APL [22] and Actus [28]. The macroscopic ....

....(most ) architectures. This effort has paved the way for the emergence of programming models with no immediate operational counterparts: functional programming, logic programming, some aspects of object oriented programming, etc. A similar awareness is now slowly emerging in parallel computing [19]. In some sense, the emergence of PVM and now MPI is a first step in this direction: at least, a standard low level programming model for MIMD Distributed Memory (MIMD DM) architectures is now widely accepted. But this programming model is still very close to the execution model, very much as the ....

W.D. Hillis and G.L. Steele, Jr. Data-parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


An Implementation of Back-Propagation Learning on GF11, a.. - Witbrock, Zagha   (Correct)

....with Processors Configured in a Ring. 4.4.2. Summing Weight Changes in a Tree. GF11's powerful communication facility allows a more efficient approach to summing weight changes. We use several switch configurations to sum weight changes in a binary tree using a standard data parallel algorithm [8]. On step i (starting at zero) processor P sends its weight change to processor (P + 2^i) mod NPROCS and adds the weight change it received to its weight change. After log2 NPROCS steps, the total weight change in each processor contains the sum of the individual weight changes. See figure ....

Hillis, W.D. and Steele, G.L., "Data Parallel Algorithms", Communications of the ACM, December, 1986.
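
The summing scheme described in the Witbrock and Zagha excerpt can be sketched as a small sequential simulation. This is only an illustration under stated assumptions: the partner on step i is taken to be (P + 2^i) mod NPROCS, the processor count is assumed to be a power of two, and the function name allreduce_sum is hypothetical rather than anything from the GF11 implementation.

```python
def allreduce_sum(values):
    """Simulate the log-step summation on len(values) processors.

    Assumption: on step i, processor p sends its running total to
    processor (p + 2**i) % n, so p receives from (p - 2**i) % n.
    Assumption: n is a power of two, so every processor ends with the full sum.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of processors must be a power of two")

    totals = list(values)
    distance = 1  # equals 2**i on step i
    while distance < n:
        # All sends happen "simultaneously": read the old totals first.
        received = [totals[(p - distance) % n] for p in range(n)]
        totals = [totals[p] + received[p] for p in range(n)]
        distance *= 2
    return totals


if __name__ == "__main__":
    print(allreduce_sum([1, 2, 3, 4]))
```

After step i each simulated processor holds the sum of the 2^(i+1) values ending at its own index (wrapping around), which is why log2 NPROCS steps suffice.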


A Parallel Object-Oriented System for Realizing Reusable and.. - Lim (1993)   (7 citations)  (Correct)

....parallel language that supports a shared address model for both shared-memory and distributed memory multiprocessors, and makes NUMA an integral part of the language design. In addition to the usual control parallel constructs (e.g. threads [35]), pSather also supports a form of data parallelism [131] (Section 2.6) which is usable for both irregular, dynamic data structures (e.g. trees) and the regular arrays. It decouples the execution model from the execution mode of single instruction multiple data (SIMD) machines. The semantics of data parallel and control parallel constructs are ....

W. Daniel Hillis and Guy L. Steele, Jr. Data Parallel Algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


A Shared-Memory Multiprocessor Implementation of.. - Suciu, Huelsbergen (1994)   (Correct)

....programming languages that primarily provide portability across many general architectures and through many generations of a specific architecture. Performance, especially parallel performance, is currently a less pressing concern when programming commodity machines. The data parallel model [HS86] of computation extends existing languages (e.g. Fortran [KLS 91] and C [Thi93]) with new constructs that operate on aggregate data in parallel. However, with new language constructs come new semantics (cf. HPF and C). We favor the introduction of parallel programming on machines with a ....

....even further, from O(n^2) to O(n) by sequentializing the parallel map at the next level (in the innerWith function), the resulting T = O(n^2) algorithm did not perform as well, due to the total O(n^2) synchronizations required. 7 Related Work. Data parallel programming is advocated in [HS86], where a collection of data parallel algorithms can be found. Our data parallel primitives and the complexity measures T, W are in the spirit of NESL described in [Ble93, BC93]. NESL is a parallel functional language with sequences as central types. It has only some limited form of higher order ....

D. Hillis and G. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


Three-Dimensional Monte Carlo Device Simulation for.. - Architectures Henry..   (Correct)

No context found.

D. Hillis and Jr. G. Steele. Data Parallel Algorithms. Communications of the ACM, Vol. 29, pages 1170--1183, December 1986.


Self-Stabilizing Structured Ring Topology P2P Systems - And (2005)   (Correct)

No context found.

W. D. Hillis and J. Guy L. Steele. Data parallel algorithms. Commun. ACM, 30(1):78--78, 1987.


Evaluating Parallel Algorithms: Theoretical and Practical Aspects - Natvig (1996)   (Correct)

No context found.

W. Daniel Hillis and Guy L. Jr. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


Patterns for Parallel Application Programs - Massingill, Mattson, Sanders (1999)   (4 citations)  (Correct)

No context found.

W.D. Hillis and G.L. Steele, Jr. "Data Parallel Algorithms" Comm. ACM, Vol 29 No 12 pp 1170-1183.


Class Notes : Programming Parallel Algorithms - Cs Fall Guy (1993)   (1 citation)  (Correct)

No context found.

W. Daniel Hillis and Guy L. Steele Jr. Data parallel algorithms. Comm. ACM, 29(12), December 1986.


OOPAL: Integrating Array Programming in Object-Oriented.. - Mougin, Ducasse (2003)   (Correct)

No context found.

W. D. Hillis and J. Guy L. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, 1986.


Unresponsiveness-Tolerant Collective Communication - Pakin (2001)   (Correct)

No context found.

W. Daniel Hillis and Guy L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986. Available from http://www.acm.org/pubs/articles/journals/cacm/1986-29-12/p1170-hillis/p1170-hillis.pdf.


Reasoning about Synchronic Groups - Gruia-Catalin Roman And (1992)   (1 citation)  (Correct)

No context found.

W. D. Hillis and G. L. Steele Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.


Thinking in Parallel: Some Basic Data-Parallel Algorithms and.. - Vishkin (2002)   (1 citation)  (Correct)

No context found.

W.D. Hillis and G.L. Steele. Data parallel algorithms. Comm. ACM, 29(12):1170--1183, 1986.


Unknown -   (Correct)

No context found.

D. Hillis and G. Steele Jr., Data parallel algorithms. Comm. of the ACM, 29:1170-1183, 1986.


Computational Structure of the N-body Problem - Katzenelson (1989)   (21 citations)  (Correct)

No context found.

D. Hillis and G. Steele Jr., Data parallel algorithms. Comm. of the ACM, 29:1170-1183, 1986.


Descriptive Simplicity in Parallel Computing - Marr (1997)   (Correct)

No context found.

W Daniel Hillis and Jr Guy L Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170--1183, December 1986.



CiteSeer.IST - Copyright Penn State and NEC