Fox, Geoffrey, "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems", Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Vol. 2, ACM Press, pp. 897-955, 1988.
....time reducing monitoring intrusions, we propose a notion of time based on accurate built in cycle counters. We avoid unnecessary synchronizations through a highly precise synchronization at the start of the monitoring session and, as the execution progresses, relaxing to a loose synchronization [7] in which the monitoring tool synchronizes sampling by looking at the highly accurate cycle counters in the CPUs. The built in cycle counters mechanism acts as virtual barriers. The cycle counters have to be delivered as time stamps in the performance packets containing the sample information sent ....
G.C. Fox. What have we learnt from using real parallel machines to solve real problems? In Proc. of the 3rd conference on Hypercube concurrent computers and applications, Vol. 2, pages 897--955, Pasadena, CA, January 1988.
....propose a notion of time based on accurate built in cycle counters rather than on frequent timestamp messages. We avoid unnecessary synchronizations through a highly precise synchronization at the start of the monitoring session and, as the execution progresses, relaxing to a loose synchronization [7] in which the monitoring tool synchronizes sampling by looking at the highly accurate cycle counters in the CPUs. The built in cycle counters mechanism acts as virtual barriers. The cycle counters have to be delivered as timestamps in the performance packets containing the sample information sent ....
G.C. Fox. What have we learnt from using real parallel machines to solve real problems? In Proceedings of the 3rd conference on Hypercube concurrent computers and applications, Vol. 2, pages 897--955, Pasadena, CA, USA, Jan 1988.
....time reducing monitoring intrusions, we propose a notion of time based on accurate built in cycle counters. We avoid unnecessary synchronizations through a highly precise synchronization at the start of the monitoring session and, as the execution progresses, relaxing to a loose synchronization [7] in which the monitoring tool synchronizes sampling by looking at the highly accurate cycle counters in the CPUs. The built in cycle counters mechanism acts as virtual barriers. The cycle counters have to be delivered as timestamps in the performance packets containing the sample information sent ....
G.C. Fox. What have we learnt from using real parallel machines to solve real problems? In Proceedings of the 3rd conference on Hypercube concurrent computers and applications, Vol. 2, pages 897--955, Pasadena, CA, USA, Jan 1988.
....dramatically simplifies the state space of a program compared to that of an MIMD program with thousands of independent loci of control. 3) There is a wide range of data parallel algorithms. Most parallel algorithms in textbooks are data parallel (compare for instance [1, 7]). According to Fox [6], more than 80% of the 84 existing, parallel applications he examined fall in the class of synchronous, data parallel programs. Furthermore, systolic algorithms as well as vector algorithms are special cases of data parallel algorithms. But data parallelism, at least as defined by current ....
Fox GC. What have we learnt from using real parallel machines to solve real problems. In Proc. of the Third Conf. on Hypercube Concurrent Computers and Applications, vol 2, pp 897--955, Pasadena, CA, 1988. ACM Press, New York.
....dramatically simplifies the state space of a program compared to that of an MIMD program with thousands of independent loci of control. 3) There is a wide range of data parallel algorithms. Most parallel algorithms in textbooks are data parallel (compare for instance [1, 5]). According to Fox [4], more than 80% of the 84 existing, parallel applications he examined fall in the class of synchronous, data parallel programs. Furthermore, systolic algorithms as well as vector algorithms are special cases of data parallel algorithms. But data parallelism, at least as defined by current ....
Geoffrey C. Fox. What have we learnt from using real parallel machines to solve real problems? In Proc. of the Third Conference on Hypercube Concurrent Computers and Applications, volume 2, pages 897--955, Pasadena, CA, 1988. ACM Press, New York.
....In addition, the majority of successful scientific applications of parallel computers are amenable to data parallel implementation. In a study of 84 scientific applications of high performance computers, Fox found that 70 applications, or 83% of those considered, fit the data parallel model [Fox 88]. In the data parallel model, a program has a single locus of control, a single program counter that advances through the program. This single execution stream invokes small pieces of parallel code. This model is supported directly by SIMD machines, including the Connection Machine CM-1 and ....
G. C. Fox. What have we learnt from using real parallel machines to solve real problems. In Proceedings of Third Conference on Hypercube Concurrent Computers and Applications, pages 897--955, 1988.
....parallel language and then compiled to a specific communication architecture. To make this task manageable, we focus on the class of data parallel languages, and we pick the C language as one representative for our experiments. The data parallel model is an important one; a study by Fox [Fox 88] has shown that a majority of existing scientific applications fit that model well. As a framework for our architectural studies, we concentrate on MIMD parallel computers and three competing communication architectures: message passing, remote memory access and cache coherent shared memory. ....
....counter. This execution model avoids race conditions and thus greatly simplifies the understanding and debugging of data parallel programs. While the data parallel model of execution is not as general as arbitrary MIMD computation, it is nonetheless a very powerful model. A study by Fox [Fox 88] showed that 70 out of 84 scientific applications studied, or over 80%, fit the data parallel model. Klaiber and Frankel [Klaiber Frankel 93] have demonstrated that even an application that intuitively does not seem to fit the data parallel model, a distributed event driven simulation ....
G. C. Fox. What have we learnt from using real parallel machines to solve real problems. In Proceedings of Third Conference on Hypercube Concurrent Computers and Applications, pages 897--955, 1988.
....to the independent execution of disconnected components of the same program. Synchronous and loosely synchronous problems present a somehow different synchronization structure, but both rely on data decomposition techniques. According to an extensive analysis of 84 real applications presented in [14], it was estimated that these two classes of problems dominated scientific and engineering applications being used in 76 percent of the applications. Asynchronous problems, which are for instance represented by event driven simulations, represented 10 percent of the studied problems. Finally, ....
G. Fox. What Have We Learnt from Using Real Parallel Machines to Solve Real Problems. In Proceedings 3rd Conf. Hypercube Concurrent Computers and Applications, 1988.
....chosen a data parallel style of programming, where parallelism is expressed as (parallel) operations over (large) sets of data. A large fraction of the existing parallel algorithms for PRAM and other machine models [46] are either data parallel in nature or can be easily converted to such a form [9, 32, 58]. Data parallel languages have historically been linked with SIMD parallel computers, and researchers have largely shied from implementing such languages on MIMD parallel machines. A naive implementation of data parallelism on a MIMD machine has the following performance bottlenecks, which affect ....
....Vectors are enclosed in square brackets. Segment descriptors are represented in lengths form enclosed in parentheses. Name Description and example make segdes#len# Returns a segment descriptor given the lengths of the segments. All elements of the input must be non negative. len =[32] result = 32) lengths#segdes# Returns the lengths of the segments given the segment descriptor. segdes = 301) result = 301] Table 3: Illustration of the segment descriptor manipulation instructions. Segment descriptors are shown in lengths form enclosed in parentheses. Vectors are ....
FOX, G. C. What have we learnt from using real parallel machines to solve real problems? In Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Volume II (Pasadena, CA, Jan. 1988), G. C. Fox, Ed., pp. 897--955.
....increasingly apparent that most of the available parallelism is at the data level. More precisely, most massive scale parallelism is achieved by processing numerous data points that can be processed concurrently, rather than by the presence of numerous functions that interact to solve a problem [Fox, 1988]. In addition, even in applications where functional parallelism exists at a massive scale, it is difficult to write, coordinate, and debug thousands of concurrent interacting threads. Since SIMD machines have been built specifically to take advantage of data parallelism, the move to massive ....
....patterns, each pattern must be broadcast separately, time sharing the use of the control unit. This phenomenon is called the SIMD control overhead. Fox found in an extensive survey of massively parallel applications that more than 50% of applications were loosely synchronous or asynchronous [Fox, 1988]. Such applications require a more flexible control organization to execute efficiently. Accordingly, MIMD machines have been viewed as the only practical alternative for general purpose massively parallel processing. Consequently, research has focused on making the coordination among the MIMD ....
[Article contains additional citation context not shown here]
G. Fox. What have we learnt from using real parallel machines to solve real problems? Technical Report C3P--522, Caltech Concurrent Computation Program, California Institute of Technology, Pasadena, CA 91125, March 1988.
....algorithms have several advantages over their control parallel counterparts: • Data parallel algorithms are more common. Although data parallelism is not appropriate for all applications, it has been estimated that 90% of scientific and engineering applications are data parallel in nature [17]. • Data parallel algorithms are easier to design and debug. Control parallel algorithms tend to suffer from time related errors such as deadlocks and data incoherence [35]. Data parallel algorithms, however, are logically synchronous and inherently deterministic. • Data parallel algorithms ....
G. Fox. What have we learnt from using real parallel machines to solve real problems? Technical Report C3P-522, California Institute of Technology, December 1989.
....simplifies the state space of a data parallel program compared to that of a MIMD program with thousands of independent loci of control. 3) There is a wide range of data parallel algorithms. Most parallel algorithms in textbooks are data parallel (compare for instance [1, 8]). According to Fox [7], more than 80% of the 84 existing, parallel applications he examined fall in the class of synchronous, data parallel programs. Furthermore, systolic algorithms as well as vector algorithms are special cases of data parallel algorithms. But data parallelism, at least as defined by current ....
Geoffrey C. Fox. What have we learnt from using real parallel machines to solve real problems. In Proc. of the Third Conf. on Hypercube Concurrent Computers and Applications, volume 2, pages 897--955, Pasadena, CA, 1988. ACM Press, New York.
....algorithms have several advantages over their control parallel counterparts: • Data parallel algorithms are more common. Although data parallelism is not appropriate for all applications, it has been estimated that 90% of scientific and engineering applications are data parallel in nature [17]. • Data parallel algorithms are easier to design and debug. Control parallel algorithms tend to suffer from time related errors such as deadlocks and data incoherence [35]. Data parallel algorithms, however, are logically synchronous and inherently deterministic. • Data parallel algorithms ....
G. Fox. What have we learnt from using real parallel machines to solve real problems? Technical Report C3P-522, California Institute of Technology, December 1989.
....algorithms have several advantages over their control parallel counterparts: Data parallel algorithms are more common. Although data parallelism is not appropriate for all applications, it has been estimated that 90% of scientific and engineering problems are amenable to data parallel solutions [28]. Data parallel algorithms are easier to design and debug. Control parallel algorithms tend to suffer from time related errors such as deadlocks and data incoherence [49]. Data parallel algorithms, however, are logically synchronous and inherently deterministic. Data parallel algorithms are ....
....performing administrative functions, the APACHE system can be used for programs that conform to the master slave model; the slave program is simply treated as if it were a true SPMD program and the master program is disregarded. This first simplifying assumption is not a serious limitation. Fox [28] reports that 90 percent of scientific problems are amenable to SPMD solutions. Also, Geist [32] indicates that most PVM programs conform to a master slave model of computation. 2. The nodes and network connections of the parallel machine are homogeneous. Although one of the most important ....
G. C. Fox. What Have We Learnt From Using Real Parallel Machines to Solve Real Problems? Tech. Rep. C3P-522 (December 1989), California Institute of Technology.
....performance issues along with conclusions and future work. 2 Scalable Algorithms Scalable, data parallel applications are an important sub class of problems which can be solved on multicomputers. It has been estimated that 90% of scientific and engineering applications are data parallel in nature [13]. Many of these programs can be scaled up by varying the number and size of data elements in order to improve accuracy and extend results. Scalable algorithms are able to utilize increasing numbers of processing elements efficiently. The number of processors that a fixed problem size application ....
G. Fox. What have we learnt from using real parallel machines to solve real problems? Technical Report C3P-522, Caltech, December 1989.
....have chosen a data parallel style of programming, where parallelism is expressed as (parallel) operations over (large) sets of data. A large fraction of the existing parallel algorithms for PRAM and other machine models are either data parallel in nature or can be easily converted to such a form [5, 16, 31]. Data parallel languages have historically been linked with SIMD parallel computers, and researchers have largely shied from implementing such languages on MIMD parallel machines. A naive implementation of data parallelism on a MIMD machine has the following performance bottlenecks, which affect ....
Geoffrey C. Fox. What Have We Learnt from Using Real Parallel Machines to Solve Real Problems? In Geoffrey Fox, editor, Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Volume II, pages 897--955, Pasadena, CA, January 1988.
....entire problem. While a control parallel implementation provides fine control of the computation (in essence, an assembly language of parallel programming), data parallel systems are generally more easy to program, and are capable of expressing effective solutions to a wide variety of problems (Fox, 1988). Two significant extensions for parallel programming support the data parallel paradigm. The most recent standard for Fortran, denoted Fortran 90, contains support for operating on whole arrays without an enclosing DO loop: conceptually, the specified operation is performed on each element of the ....
Fox, G. (1988). What have we learnt from using real parallel machines to solve real problems.
....parallel algorithms on a DEC MPP 12000SX massively parallel system (the MasPar MP1). The MasPar ASIMD [BN], [Ka] (Autonomous Single Instruction Multiple Data) architecture is advertised as having all the beauty of SIMD and the power of MIMD. It allows loosely synchronous data parallelism [Fo] by providing each processing element with processor autonomy, which specifically enhances each PE with execution autonomy, addressing autonomy, connection autonomy and I/O autonomy. The MPP model resembles the PRAM in that it consists of a number of PEs connected to a global shared memory, ....
Fox G.C., "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems?", Caltech Report C3P-522, December 1989.
....parallel programs in a Single Program Multiple Data (SPMD) style, where data is distributed and the identical program is run on each processor. In a study by Geoffrey Fox, 83 out of 84 scientific problems addressed at the California Institute of Technology could be expressed in the SPMD style [22]. Dataflow computer architectures [4], [5], [26], [30], [43] have been primarily associated with functional programming and exploitation of control parallelism. However, the advantages of dataflow computers are not specific to efficient implementation of functional languages, and recent research has ....
Geoffrey C. Fox. What have we learnt from using real parallel machines to solve real problems? In Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, pages 897--955, New York, 1988. ACM.
....performance issues along with conclusions and future work. 2 Scalable Algorithms Scalable, data parallel applications are an important sub class of problems which can be solved on multicomputers. It has been estimated that 90% of scientific and engineering applications are data parallel in nature [15]. Many of these programs can be scaled up by varying the number and size of data elements. Scalable algorithms are able to utilize increasing numbers of processing elements efficiently. The number of processors that a fixed problem size application can effectively use is limited by the fraction of ....
G. Fox, "What have we learnt from using real parallel machines to solve real problems?," Tech. Rep. C3P-522, Caltech, December 1989.
....in [Fox:92e], [Fox:94b], [Fox:94h], [Fox:94c], [Fox:94i], [Mills:93a]. Here, we summarize relevant features of it in Section 2. Section 3 reviews and extends a classification of problem architectures originally developed in 1988 from a rather complete survey of parallel applications at the time [Fox:88b], [Fox:88tt], [Fox:91g], [Fox:94a]. In Section 4, we show how the different problem categories or architectures are addressed by parallel software systems with different capabilities. We give illustrative examples, but not an exhaustive list of existing software systems with these ....
....or loosely synchronous class. It is interesting that massively parallel distributed memory MIMD machines that have an asynchronous hardware architecture are perhaps most relevant for loosely synchronous scientific problems. We have looked at many more applications since the detailed survey in [Fox:88b], and the general picture described above remains valid. Industrial applications have less synchronous and more loosely synchronous problems than academic problems. We have recently recognized that many complicated problems are mixtures of the basic classifications. The first major example with ....
Fox, G. C. "What have we learnt from using real parallel machines to solve real problems?," in G. C. Fox, editor, The Third Conference on Hypercube Concurrent Computers and Applications, Volume 2, pages 897--955. ACM Press, New York, January 1988. Caltech Report C3P-522.
....essential • See SIMNET or DSI with very loosely coupled distributed event driven simulation [Figure 8: The Asynchronous Problem Class] ....architecture are perhaps most relevant for loosely synchronous scientific problems. We have looked at many more applications since the detailed survey in [Fox:88b], and the general picture described above remains valid. Industrial applications have less synchronous and more loosely synchronous problems than academic problems. We have recently recognized that many complicated problems are mixtures of the basic classifications. The first major example with ....
Fox, G. C. "What have we learnt from using real parallel machines to solve real problems?," in G. C. Fox, editor, The Third Conference on Hypercube Concurrent Computers and Applications, Volume 2, pages 897--955. ACM Press, New York, January 1988. Caltech Report C3P-522.
No context found.
Fox, Geoffrey, "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems", Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Vol. 2, ACM Press, pp. 897-955, 1988.
No context found.
Fox, Geoffrey, "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems", Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Vol. 2, ACM Press, pp. 897-955, 1988.
No context found.
Geoffrey C. Fox. What Have We Learnt from Using Real Parallel Machines to Solve Real Problems? In Geoffrey Fox, editor, Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, Pasadena, CA, Volume II, pages 897--955, New York, NY, January 1988. ACM Press.