M. Nilsson and H. Tanaka, "MIMD Execution by SIMD Computers," Journal of Information Processing, Information Processing Society of Japan, vol. 13, no. 1, 1990, pp. 58-61.
....the number of iterations necessary to compute a point of the Mandelbrot set [1]. In [2] nested loops typical for numerical applications are discussed. A very general (but often inefficient) solution is to write the program in a RISC-like machine language which can be stored and interpreted locally [3, 4]. Let s be a stack containing the root node of a subtree to be processed on this processor IF isSolution(top(s)) THEN printSolution(top(s)) WHILE noMoreSiblings(top(s)) DO top(s) nextSibling(top(s)) ELSE push(firstSuccessor(top(s))) Figure 1: Nonrecursive generic depth first search. ....
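The flattened Figure 1 above is incomplete in this excerpt, so the following is only a minimal Python sketch of a nonrecursive depth-first search in the same spirit, not a reconstruction of the authors' figure. The tree encoding (a dict from node name to child list), the function name `dfs_solutions`, and the sibling index kept on the stack are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple


def dfs_solutions(
    children: Dict[str, List[str]],
    root: str,
    is_solution: Callable[[str], bool],
) -> List[str]:
    """Iterative depth-first search over the subtree rooted at `root`.

    Assumed encoding: `children` maps a node to its ordered child list;
    nodes missing from the dict are leaves.
    """
    found: List[str] = []
    # Each stack entry is (node, position of node among its siblings).
    stack: List[Tuple[str, int]] = [(root, 0)]
    while stack:
        node, _ = stack[-1]
        if is_solution(node):
            found.append(node)
        kids = children.get(node, [])
        if kids:
            # Corresponds to push(firstSuccessor(top(s))).
            stack.append((kids[0], 0))
            continue
        # Leaf: discard nodes that have no more siblings, then
        # replace the top with its next sibling if one exists.
        while stack:
            node, idx = stack.pop()
            if not stack:
                break  # the root itself was popped; search is finished
            siblings = children[stack[-1][0]]
            if idx + 1 < len(siblings):
                stack.append((siblings[idx + 1], idx + 1))
                break
    return found


# Hypothetical example tree: a -> (b, c), b -> (d, e).
example_tree = {"a": ["b", "c"], "b": ["d", "e"]}
example_hits = dfs_solutions(example_tree, "a", lambda n: n in {"c", "d"})
```

Because the only state is the explicit stack, the loop body has the single unnested form that a SIMD control unit can step through for all processors at once.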
....the program (possibly dynamically) into a sequence of several test loops each with a different set of tests but, since the loops can be investigated one at a time, we can restrict ourselves to one loop. A simple proof that a single unnested loop is always sufficient can be taken from [3, 4] by observing that an interpreter for a machine instruction set has this form. By using a locally stored machine program and program counter, each PE can have its own flow of control. Due to its large interpretation overhead, this approach has not raised very much interest yet. However, the ....
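The interpreter idea in this excerpt can be illustrated with a small simulation. This is a hedged sketch, not the scheme from [3, 4]: the three-opcode instruction set (`ADD`, `JNZ`, `HALT`), the `PE` class, and the function `run_simd_interpreter` are hypothetical, and every program is assumed to end in `HALT`.

```python
from dataclasses import dataclass
from typing import List, Tuple

Instr = Tuple[str, int]  # (opcode, operand) in a hypothetical toy instruction set
OPCODES = ("ADD", "JNZ", "HALT")


@dataclass
class PE:
    """One processing element with a locally stored program and its own pc."""
    program: List[Instr]
    acc: int = 0
    pc: int = 0
    halted: bool = False


def run_simd_interpreter(pes: List[PE], max_cycles: int = 1000) -> int:
    """Run all PEs in lockstep; return the number of interpreter cycles used."""
    cycles = 0
    while cycles < max_cycles and not all(pe.halted for pe in pes):
        # Fetch phase: each PE reads an instruction through its own pc.
        fetched = [None if pe.halted else pe.program[pe.pc] for pe in pes]
        # Execute phase: the control unit broadcasts every opcode in turn;
        # a PE takes part only when the broadcast matches what it fetched.
        for op in OPCODES:
            for pe, instr in zip(pes, fetched):
                if instr is None or instr[0] != op:
                    continue
                arg = instr[1]
                if op == "ADD":
                    pe.acc += arg
                    pe.pc += 1
                elif op == "JNZ":
                    pe.pc = arg if pe.acc != 0 else pe.pc + 1
                else:  # HALT
                    pe.halted = True
        cycles += 1
    return cycles


# Two PEs running different programs: a countdown loop and a straight line.
countdown = PE(program=[("ADD", -1), ("JNZ", 0), ("HALT", 0)], acc=3)
straight = PE(program=[("ADD", 5), ("HALT", 0)])
cycles_used = run_simd_interpreter([countdown, straight])
```

The outer `while` is the single unnested loop; the inner pass over `OPCODES` is the interpretation overhead the excerpt refers to, since every opcode is broadcast each cycle regardless of how many PEs need it.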
M. Nilsson, H. Tanaka. MIMD execution by SIMD computers. Journal of Information Processing 13-1, 58--61, 1990.
....execution within the cycle. The throughput of the multiple fetch model is sensitive to this order if the operation probabilities are independent (i.e., the probability profile is uniform). For nonuniform profiles, the optimal order can be found using an algorithm developed by Nilsson and Tanaka [Nilsson and Tanaka, 1988b]. 7.5 Operation Granularity. In this section, the two interpretation models are analyzed in order to determine the characteristics of operation sets suitable for each organization. The effect of the operation granularity on the relative performance of the two organizations is studied. The ....
....number of operations executed in each cycle , relative , 64 can be defined as multiple single , where single = M (Eq. 7.4) Clearly, 1 relative k. For a given probability matrix, there is an algorithm for finding the optimal order of operations for the multiple fetch model [Nilsson and Tanaka, 1988b]. Thus, there is a single optimal multiple value and, consequently, a single relative value associated with an operation set. We now consider the cost of the two organizations. Both the single fetch and multiple fetch organizations have an associated cost that is independent of the ....
M. Nilsson and H. Tanaka. MIMD execution by SIMD computers. Information Processing Letters, 13(1):58--61, 1988.
....2 productive tests per iteration are performed. The test loop Search; GetNextChoice; Search; GetNextChoice; MakeChoicepoint; Solution on the other hand, takes 15 units but about 4 productive tests per iteration are performed; it is almost two times more efficient. Similar ideas are discussed in [2, 1, 8]. In [8] it is argued that duplicating tests is useless since some PEs are actually delayed due to large deviations from the average control flow. But this is not always a problem. Often all PEs have quite similar control flow characteristics, and even if there are PEs which are delayed, this only ....
....per iteration are performed. The test loop Search; GetNextChoice; Search; GetNextChoice; MakeChoicepoint; Solution on the other hand, takes 15 units but about 4 productive tests per iteration are performed; it is almost two times more efficient. Similar ideas are discussed in [2, 1, 8]. In [8] it is argued that duplicating tests is useless since some PEs are actually delayed due to large deviations from the average control flow. But this is not always a problem. Often all PEs have quite similar control flow characteristics, and even if there are PEs which are delayed, this only means ....
M. Nilsson and H. Tanaka. MIMD execution by SIMD computers. Journal of Information Processing, 13(1):58--61, 1990.
....2 productive tests per iteration are performed. The test loop Search; GetNextChoice; Search; GetNextChoice; MakeChoicepoint; Solution on the other hand, takes 15 units but about 4 productive tests per iteration are performed; it is almost two times more efficient. Similar ideas are discussed in [3, 1, 10]. In [10] it is argued that duplicating tests is useless since some PEs are actually delayed due to large deviations from the average control flow. But this is not always a problem. Often all PEs have quite similar control flow characteristics, and even if there are PEs which are delayed, this ....
....tests per iteration are performed. The test loop Search; GetNextChoice; Search; GetNextChoice; MakeChoicepoint; Solution on the other hand, takes 15 units but about 4 productive tests per iteration are performed; it is almost two times more efficient. Similar ideas are discussed in [3, 1, 10]. In [10] it is argued that duplicating tests is useless since some PEs are actually delayed due to large deviations from the average control flow. But this is not always a problem. Often all PEs have quite similar control flow characteristics, and even if there are PEs which are delayed, this only means ....
M. Nilsson and H. Tanaka, MIMD execution by SIMD computers, Journal of Information Processing 13 (1) (1990) 58--61.
....to allow most PEs to prefetch their instructions and operands. Another approach to speeding the instruction interpretation process is to chain together sets of MINTABS instructions to achieve a higher utilization of the execute control signals generated by the central control algorithm. Nilsson [11] and Tanaka have done some preliminary theoretical work in this area. • Modification of the MINTABS instruction set. The current MINTABS instruction set was selected primarily to develop a minimal number of distinct operations that must occur in the central control algorithm. Because execution ....
Nilsson, M., and Tanaka, H. MIMD execution by SIMD computers. Journal of Information Processing 13, 1 (1990).
....of logic circuits on a SIMD machine [6, 20]. The major difference between these works and our approach is that we handle general asynchronous and loosely synchronous problems instead of studying individual problems. The instruction level approach has been studied by a number of researchers [8, 9, 22, 25, 41, 40]. As mentioned above, the major restriction of this approach is that the entire instruction set must be cycled through to execute one instruction step for every processor. A common method to reduce the average number of instructions emanated in each execution cycle is to perform global or s to ....
....degree of divergence for certain applications. Having a barrier at the end of each WHERE statement (as well as each FORALL statement) is a good idea [8]. Other work includes an adaptive algorithm which changes the order of instructions emanated to maximize the expected number of active processors [25]. Besides this issue, load balancing and processor suspension are also unsolved problems. The applications implemented in these systems are non-communicating [41] or have a barrier at the end of the program [9]. Collins has discussed the communication issue and proposed a scheme to delay execution ....
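The sentence about global ORs is cut off in this excerpt, so the following Python sketch rests on an assumption: that the global OR is used to detect whether any PE has fetched a given instruction, so the control unit can skip broadcasting instructions nobody needs. The function name `broadcasts_needed` and the five-opcode instruction set are hypothetical.

```python
from typing import Sequence


def broadcasts_needed(
    fetched_opcodes: Sequence[str],
    instruction_set: Sequence[str],
    use_global_or: bool,
) -> int:
    """Count how many opcodes the control unit emits in one execution cycle."""
    emitted = 0
    for op in instruction_set:
        # Global OR across all PEs: does at least one PE want this opcode?
        wanted = any(fetched == op for fetched in fetched_opcodes)
        if use_global_or and not wanted:
            continue  # skip an opcode that no PE fetched
        emitted += 1
    return emitted


# Hypothetical instruction set and one cycle's worth of fetched opcodes.
toy_instruction_set = ("ADD", "SUB", "LOAD", "STORE", "JMP")
fetched_this_cycle = ["ADD", "ADD", "JMP"]
without_or = broadcasts_needed(fetched_this_cycle, toy_instruction_set, False)
with_or = broadcasts_needed(fetched_this_cycle, toy_instruction_set, True)
```

In this toy cycle the full sweep emits every opcode in the set, while the filtered sweep emits only the opcodes that at least one PE fetched. The saving shrinks as the PEs diverge, which is the divergence concern raised in the excerpt.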
M. Nilsson and H. Tanaka. MIMD execution by SIMD computers. Journal of Information Processing, 13(1), 1990.
.... instruction cycle, then an MIMD-like program could efficiently execute on the SIMD machine [8]. The instruction level approach implements this idea directly: the instructions are interpreted in parallel across all of the processors by control signals emanating from the central control unit [3, 4, 12, 14, 23]. The major constraint of this approach is that the central control unit has to cycle through almost the entire instruction set for each instruction execution because each processor may execute different instructions. A common method to reduce the average number of instructions emanated in each ....
M. Nilsson and H. Tanaka. MIMD execution by SIMD computers. Journal of Information Processing, 13(1), 1990.
....data parallel programs into a form suitable for execution on MIMD processors. Thus, their compilers allow a MIMD machine to support both SIMD and MIMD processing. Likewise, several investigators have also proposed the interpretation of task program parallelism on the PEs of a SIMD machine [1, 4, 6, 15, 20, 23, 25]. Thus, the SIMD control unit can execute traditional data parallel programs or, by loading programs and data into the PE memories, concurrently interpret different programs (tasks) on each PE. In contrast to these software solutions, some efforts have been directed to building hardware solutions ....
....of the code where significant branching occurs are executed via MIMD interpretation. In experiments with the CM-2, Collins reports the need for very few parallel branches (in two examples only 6) before this approach shows any speedup. The more general problem of MIMD interpretation is discussed in [1, 4, 6, 15, 20, 23, 25]. Nilsson reports the performance of two distinct MIMD instruction sets designed for the interpretation of logic programs [18, 19]. In addition, Nilsson also proposes to dynamically adjust the instruction interpretation cycle based upon the program's behavior [20]. The work of [1, 4, 6, 23, 25] all ....
Nilsson, M., and Tanaka, H. MIMD execution by SIMD computers. Information Processing Letters 13, 1 (1988), 58-61.