NESL: A nested data-parallel language (1521)

by Guy Blelloch

Models and Languages for Parallel Computation

by David B. Skillicorn, Domenico Talia - ACM Computing Surveys, 1998
Cited by 168 (4 self)
We survey parallel programming models and languages using six criteria: a model should be easy to program, have a software development methodology, be architecture-independent, be easy to understand, guarantee performance, and provide information about the cost of programs. ... We consider programming models in six categories, depending on the level of abstraction they provide.

Region streams: functional macroprogramming for sensor networks

by Ryan Newton, Matt Welsh - In Proceedings of the 1st International Workshop on Data Management for Sensor Networks: in conjunction with VLDB 2004, 2004
Cited by 138 (8 self)
Sensor networks present a number of novel programming challenges for application developers. Their inherent limitations of computational power, communication bandwidth, and energy demand new approaches to programming that shield the developer from low-level details of resource management, concurrency, and in-network processing. We argue that sensor networks should be programmed at the global level, allowing the compiler to automatically generate nodal behaviors from a high-level specification of the network’s global behavior. This paper presents the design of a functional macroprogramming language for sensor networks, called Regiment. The essential data model in Regiment is based on region streams, which represent spatially distributed, time-varying collections of node state. A region stream might represent the set of sensor values across all nodes in an area or the aggregation of sensor values within that area. Regiment is a purely functional language, which gives the compiler considerable leeway in terms of realizing region stream operations across sensor nodes and exploiting redundancy within the network. We describe the initial design and implementation of Regiment, including a compiler that transforms a macroprogram into an efficient nodal program based on a token machine. We present a progression of simple programs that illustrate the power of Regiment to succinctly represent robust, adaptive sensor network applications. Copyright 2004, held by the author(s)

Citation Context

...ative languages. Prominent (call-by-value) functional languages include Lisp, Scheme and OCaml. Functional languages have been used to explore high-level programming for parallel machines (such as NESL [6] and *LISP [26]) and for distributed machines [24]. In our system, we get the most benefit from restricting ourselves to a purely functional (effect free), call-by-need language similar to Haskell [16]....

Accelerator: using data parallelism to program GPUs for general-purpose uses

by David Tarditi, Sidd Puri, Jose Oglesby - In Proceedings of the 12th international conference on Architectural, 2006
Cited by 117 (0 self)
GPUs are difficult to program for general-purpose uses. Programmers can either learn graphics APIs and convert their applications to use graphics pipeline operations or they can use stream programming abstractions of GPUs. We describe Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses instead. Programmers use a conventional imperative programming language and a library that provides only high-level data-parallel operations. No aspects of GPUs are exposed to programmers. The library implementation compiles the data-parallel operations on the fly to optimized GPU pixel shader code and API calls. We describe the compilation techniques used to do this. We evaluate the effectiveness of using data parallelism to program GPUs by providing results for a set of compute-intensive benchmarks. We compare the performance of Accelerator versions of the benchmarks against hand-written pixel shaders. The speeds of the Accelerator versions are typically within 50% of the speeds of hand-written pixel shader code. Some benchmarks significantly outperform C versions on a CPU: they are up to 18 times faster than C code running on a CPU.

Citation Context

...e brought forth new language proposals such as the Paralation model [18]. Over the years many specialized high-level languages have been developed for numerically intensive computation including NESL [1], ZPL [9] and Single Assignment C (SaC) [20]. Accelerator shares the use of high-level aggregate data structures with all of these approaches. 3. Background GPUs are special-purpose processors designe...

Powerlist: a structure for parallel recursion

by Jayadev Misra - ACM Transactions on Programming Languages and Systems, 1994
Cited by 66 (2 self)
Many data parallel algorithms – Fast Fourier Transform, Batcher’s sorting schemes and prefix sum – exhibit recursive structure. We propose a data structure, powerlist, that permits succinct descriptions of such algorithms, highlighting the roles of both parallelism and recursion. Simple algebraic properties of this data structure can be exploited to derive properties of these algorithms and establish equivalence of different algorithms that solve the same problem.

Citation Context

...em may be seen in [25]. They have constructs similar to tie and zip, though they allow unbalanced decompositions of lists. An effective method of programming with vectors has been proposed by Blelloch [5, 6]. He proposes a small set of "vector-scan" instructions that may be used as primitives in describing parallel algorithms. Unlike our method he is able to control the division of the list and the numbe...

Kaapi: A thread scheduling runtime system for data flow computations on cluster of multi-processors

by Thierry Gautier, Xavier Besseron, Laurent Pigeon - In PASCO ’07: Proceedings of the 2007 international workshop on Parallel symbolic computation, 2007
Cited by 64 (14 self)
The high availability of multiprocessor clusters for computer science seems to be very attractive to the engineer because, at a first level, such computers aggregate high performances. Nevertheless, obtaining peak performances on irregular applications such as computer algebra problems remains a challenging problem. The delay to access memory is non uniform and the irregularity of computations requires to use scheduling algorithms in order to automatically balance the workload among the processors. This paper focuses on the runtime support implementation to exploit with great efficiency the computation resources of a multiprocessor cluster. The originality of our approach relies on the implementation of an efficient work-stealing algorithm for a macro data flow computation based on minor extension of POSIX thread interface.

Citation Context

...core, multi-processor, cluster 1. INTRODUCTION Multithreaded languages have been proposed as a general approach to model dynamic, unstructured parallelism. They include data parallel ones – e.g. NESL [4] –, data flow – ID [9] –, macro dataflow – Athapascan [15] –, languages with fork-join based constructs – Cilk [7] – or with additional synchronization primitives – Jade [34], EARTH [19] –. Efficient ex...

Transforming High-Level Data-Parallel Programs into Vector Operations

by Jan F. Prins, Daniel W. Palmer - Proceedings Principles and Practices of Parallel Programming 93, ACM, 1993
Cited by 54 (21 self)
Fully-parallel execution of a high-level data-parallel language based on nested sequences, higher order functions and generalized iterators can be realized in the vector model using a suitable representation of nested sequences and a small set of transformational rules to distribute iterators through the constructs of the language.

Citation Context

...utility of nested parallelism in this setting is limited. In [BS90] it is shown how to compile a subset of Paralation LISP into vector model code. More recently, the nested vector model language NESL [Blel92] has used similar techniques to yield vector model code. Compared to these approaches, the translation of Proteus includes translation of function values (which are critical elements of the higher-ord...

Optimal Evaluation of Array Expressions on Massively Parallel Machines

by Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, Shang-Hua Teng - ACM Trans. Prog. Lang. Syst., 1992
Cited by 50 (11 self)
Abstract not found

C**: A Large-Grain, Object-Oriented, Data-Parallel Programming Language

by James R. Larus, Brad Richards, Guhan Viswanathan - Languages and Compilers for Parallel Computing (5th International Workshop), 1992
Cited by 47 (3 self)
C** is a new data-parallel programming language based on a new computation model called large-grain data parallelism. C** overcomes many disadvantages of existing data-parallel languages, yet retains their distinctive and advantageous programming style and deterministic behavior. This style makes data parallelism well-suited for massively-parallel computation. Large-grain data parallelism enhances data parallelism by permitting a wider range of algorithms to be expressed naturally. C** is an object-oriented programming language that inherits data abstraction features from C++. Existing scientific programming languages do not provide modern programming facilities such as operator extensibility, abstract datatypes, or object-oriented programming. C** (and its sequential subset C++) support modern programming practices and enable a single language to be used for all parts of large, complex programs and libraries. This technical report consists of three parts. The body of t...

Citation Context

...mit nested parallelism, subsequent work lifted this restriction [3]. 2.1.4 NESL Blelloch's language NESL is a strongly-typed, applicative data-parallel language designed to support nested parallelism [2]. In NESL, a programmer applies a pure function (i.e., without side-effects) to a one-dimensional aggregate (vector) that contains arbitrary elements. The function application results are collected in...

The Cilk System for Parallel Multithreaded Computing

by Christopher F. Joerg, 1996
Cited by 43 (2 self)
Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel programming. In particular, it is still difficult to build efficient implementations of parallel applications whose communication patterns are either highly irregular or dependent upon dynamic information. Multithreading has become an increasingly popular way to implement these dynamic, asynchronous, concurrent programs. Cilk (pronounced "silk") is our C-based multithreaded computing system that provides provably good performance guarantees. This thesis describes the evolution of the Cilk language and runtime system, and describes applications which affected the evolution of the system.

Citation Context

...ctable as well. A careful programmer can therefore write a program and be confident that the program's performance will scale as the machine size grows. Blelloch has taken this a step further in NESL [Ble93], where every built-in function has two complexity measures, which a programmer can use to derive the asymptotic running time of his program. Although the data-parallel paradigm is quite popular, it h...

Parallelization in Calculational Forms

by Zhenjiang Hu, Masato Takeichi, Wei-ngan Chin - In 25th ACM Symposium on Principles of Programming Languages, 1998
Cited by 40 (27 self)
The problems involved in developing efficient parallel programs have proved harder than those in developing efficient sequential ones, both for programmers and for compilers. Although program calculation has been found to be a promising way to solve these problems in the sequential world, we believe that it needs much more effort to study its effective use in the parallel world. In this paper, we propose a calculational framework for the derivation of efficient parallel programs with two main innovations:
- We propose a novel inductive synthesis lemma based on which an elementary but powerful parallelization theorem is developed.
- We make the first attempt to construct a calculational algorithm for parallelization, deriving associative operators from data type definition and making full use of existing fusion and tupling calculations.
Being more constructive, our method is not only helpful in the design of efficient parallel programs in general but also promising in the construc...

Citation Context

...Another direction for enhancement is to generalize the present result to nested linear data types. As amply demonstrated by the NESL work [Ble92], nested sequences are particularly important for expressing irregular problems, such as sparse matrices. Effective parallelization of such problems depends on the ability to parallelize flattened vers...
