Results 1 -
3 of
3
Nested Algorithmic Skeletons from Higher Order Functions
, 2000
"... Algorithmic skeletons provide a promising basis for the automatic utilisation of parallelism at sites of higher-order function use through static program analysis. However, decisions about whether or not to realise particular higher-order function instances as skeletons must be based on informati ..."
Abstract
-
Cited by 25 (12 self)
- Add to MetaCart
Algorithmic skeletons provide a promising basis for the automatic utilisation of parallelism at sites of higher-order function use through static program analysis. However, decisions about whether or not to realise particular higher-order function instances as skeletons must be based on information about available processing resources, and such resources may change subsequent to program analysis. In principle, nested higher-order functions may be realised as nested skeletons. However, where higher-order function arguments result from partially applied functions, free-variable bindings must be identified and communicated through the corresponding skeleton hierarchy to where those arguments are actually applied. Here, a skeleton based parallelising compiler from Standard ML to native code is presented. Hybrid skeletons, which can change from parallel to serial evaluation at run-time, are considered and mechanisms for their nesting are discussed. Compilation stages are illustra...
An Interactive Approach to Profiling Parallel Functional Programs
- In The Draft Proceedings of the International Workshop on the Implementation of Functional Languages
, 1998
"... . The full details of a parallel computation can be very complex. To understand and improve performance one useful resource is a log-file recording all the computational events that change the number or status of parallel tasks. Raw log-files are not easy reading for the programmer, so profiling too ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
. The full details of a parallel computation can be very complex. To understand and improve performance one useful resource is a log-file recording all the computational events that change the number or status of parallel tasks. Raw log-files are not easy reading for the programmer, so profiling tools present only graphical summary charts. These charts sometimes show just what the programmer needs to know, but they often become too crowded, or fail to show enough, or both. Moreover, the charts are static so the programmer has little control over what is displayed. In this paper we discuss a tool that combines the advantages of graphical representation of computations with a query interface. The programmer interactively extracts specific information about the computation. Results of queries can be displayed in a graphical form; and parameters of subsequent queries can be specified by pointing within a past display. 1 Introduction It is often claimed that as functional languages are ref...
Engineering a Parallel Compiler for Standard ML
- In Proceedings of the 10th International Workshop on Implementations of Functional Language
, 1998
"... . We present the design and partial implementation of an automated parallelising compiler for Standard ML using algorithmic skeletons. Source programs are parsed and elaborated using the ML Kit compiler and a small set of higher order functions are automatically detected and replaced with parallel e ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
. We present the design and partial implementation of an automated parallelising compiler for Standard ML using algorithmic skeletons. Source programs are parsed and elaborated using the ML Kit compiler and a small set of higher order functions are automatically detected and replaced with parallel equivalents. Without the presence of performance predictions, the compiler simply instantiates all instances of the known HOFs with parallel equivalents. The resulting SML program is then output as Objective Caml and compiled with the parallel skeleton harnesses giving an executable which runs on networks of workstations. The parallel harnesses are implemented in C with MPI providing the communications subsystem. A substantial parallel exemplar is presented, an implementation of the Canny edge tracking algorithm from computer vision which is parallelised by implementing the map skeleton over a set of subimages. 1 Introduction We are investigating the development of a fully automatic parallel...

