28 citations found.
F. Berman, and L. Snyder, "On Mapping Parallel Algorithms Into Parallel Architectures," In Proceedings of the 1984.


This paper is cited in the following contexts:


A Lower Bound on Embedding Large Hypercubes into Small.. - Gupta, Boals, Sherwani (1990) (1 citation)

....an edge in G. The load is defined as the maximum number of nodes of G assigned to any node of H. We say that an embedding achieves a balanced load when the load equals ⌈N/M⌉. Embeddings of a guest network G into a host network H of the same topology, but smaller size, have previously been studied in [2, 5, 6, 7]. In [5] Fishburn and Finkel consider various architectures for specific values of N and M. Berman and Snyder [2] present embeddings by performing contractions which guarantee a dilation of 1, but do not achieve a balanced load. In [7] Gupta and Hambrusch present efficient balanced load ....

....achieves a balanced load when the load equals ⌈N/M⌉. Embeddings of a guest network G into a host network H of the same topology, but smaller size, have previously been studied in [2, 5, 6, 7]. In [5] Fishburn and Finkel consider various architectures for specific values of N and M. Berman and Snyder [2] present embeddings by performing contractions which guarantee a dilation of 1, but do not achieve a balanced load. In [7] Gupta and Hambrusch present efficient balanced load embeddings of complete binary trees for all the values of N and M. More recently, Sang and Sudborough [13] have ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. Journal of Parallel and Distributed Computing, 4:439--458, 1987.


On Improving the Performance of Tree Machines - Gupta, Wang (1995)

....any arbitrary algorithm from T is to be simulated on CT . We can attack this question by formulating the problem of simulation as a graph embedding problem. In graph embedding problems, tree T is embedded into CT so that the cost measures dilation, congestion and node utilization are minimized [3, 4, 12]. In [6] we present an efficient embedding of T into CT so that the embedding achieves a dilation, congestion and node utilization of two. This implies that no more than two nodes of T are simulated by any node of CT and the simulation of any general algorithm from T onto CT incurs a slow down ....

F. Berman and L. Snyder. On Mapping Parallel Algorithms into Parallel Architectures. Journal of Parallel and Distributed Computing, 4:439--458, 1987.


Load Balanced Tree Embeddings - Gupta, Hambrusch (1991) (2 citations)

....with dependencies to processors. The objective of a mapping is to minimize execution time. A general approach is to distribute work evenly among the processors and to minimize interprocessor communication. Graph embeddings have been used successfully as models for developing efficient mappings [4, 5, 8, 19, 20, 23, 27] and for understanding the computational equivalence between parallel architectures [2, 6, 7, 16, 18, 24]. In this paper we consider the problem of mapping, via graph embeddings, a complete binary tree T of size n to that of a complete binary tree H of size m with n m. Embeddings of one ....

....16, 18, 24]. In this paper we consider the problem of mapping, via graph embeddings, a complete binary tree T of size n to that of a complete binary tree H of size m with n m. Embeddings of one architecture into another one of the same topology, but smaller size, have previously been studied in [5, 12, 25]. Furthermore, a number of solutions for mapping task graphs onto a parallel machine can be viewed as embeddings into architectures of smaller size and different topology [9, 11, 19, 26]. A complete binary tree is a binary tree in which all leaves are on the same level and every interior node has ....

[Article contains additional citation context not shown here]

F. Berman and L. Snyder. On Mapping Parallel Algorithms into Parallel Architectures. JPDC, 4:439--458, 1987.


Multiple Network Embeddings into Hypercubes - Gupta, Hambrusch (1996) (6 citations)

....computational equivalence (or non equivalence) between networks of different topology, but efficient embeddings lead to efficient simulations of algorithms originally designed for G on host H. Embeddings and their implications to parallel processings have recently been studied extensively [1, 2, 4, 6, 11, 13, 14, 19]. The hypercube architecture has proven to be a versatile and suitable architecture for designing and implementing parallel algorithms [12, 17] and it allows for efficient embeddings of many networks (e.g. trees, meshes, butterflies, pyramids) [3, 5, 6, 15, 19]. It is not surprising that many ....

....problem with a totally unbalanced load distribution can be viewed as embedding an n node guest network G_i into an m PE host with m = n/r, where the m PE host is now a subcube of the hypercube H. Embeddings of a single network into one of smaller size, but the same topology, are described in [2, 7, 8, 9, 10, 20]. We first describe how to embed r complete binary trees into H with a dilation of 2 and a congestion of 2. Recall that the congestion was 5 in the balanced load embedding. The main idea in the embedding is to contract the guest networks and then embed the contracted networks into H. Let T 0 ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. Journal of Parallel and Distributed Computing, 4:439--458, 1987.


A Mapping Heuristic for Minimizing Network Contention - Perego (1997)

....In general, a mapping algorithm must solve two main sub problems: a) the topological variation problem, arising when the G_p and G_a graphs are not isomorphic, and b) the cardinality variation problem, arising because the number of nodes of G_p may be greater than the number of processors of G_a [6]. Due to the NP-hardness of the problem, to date many exact and approximated solutions dealing with static process based models of the mapping problem have been proposed [20]. Most proposals can be grouped on the basis of algorithm structure into the following four classes: Branch and Bound ....

F. Berman and L. Snyder. On Mapping Parallel Algorithms into Parallel Architectures. Journal of Parallel and Distributed Computing, 4(5):439--458, October 1987.


Efficient Algorithms for Scheduling and Mapping of Parallel.. - Kwok (1994)

....among the processors of a distributed computing system to optimize overall system performance. In contrast, the scheduling and mapping problem requires the allocation of multiple interacting tasks of a single parallel program in order to minimize the completion time on the parallel computer system [10], [15], [17], [45], [55], [65]. While job scheduling requires dynamic run time scheduling that is not a priori decidable, the scheduling and mapping problem can be addressed in both static [4], [7], [9], [18], [67], [39], [42], [43], [44], [50], [51], [52], [56], [58], [61], [62], [70] as well as ....

F. Berman and L. Snyder, "On Mapping Parallel Algorithms into Parallel Architectures," Journal of Parallel and Distributed Computing, vol. 4, pp. 439-458, 1987.


Design, Implementation and Evaluation of Parallel.. - Choudhary, Liao.. (1 citation)

....in the system. Ideally, to achieve maximum parallelism, the load must be evenly distributed across the processors. The problem of statically mapping the workload of a parallel algorithm to processors in a distributed memory system has been studied under different problem models, such as [1, 2]. The mapping policies are adequate when an application consists of a single task, and the computational load can be determined statically. These static mapping policies do not model applications consisting of a sequence of tasks (algorithms) where the output of one task becomes the input to ....

F. Berman and L. Snyder, "On Mapping Parallel Algorithms into Parallel Architectures," Journal of Parallel and Distributed Computing, vol. 4, pp. 439--458, 1987.


On the Embedding of a Class of Regular Graphs in a Faulty.. - Tseng, Lai (1993)

....processor allocation, hypercube, binary reflected trees, fault tolerance. 1 Introduction. Embedding a guest graph into a host graph, or the graph embedding problem, has long been recognized as being suitable for modeling the problem of processor allocation in a parallel or distributed system [1, 2]. Because of the importance and popularity of the hypercube as a network architecture for concurrent computers, the problem of embedding in a hypercube has received much attention from researchers and has been intensively studied for various guest graphs, such as rings [12], trees [20, 21] ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. J. of Parallel and Distrib. Comput., 4:439--458, 1987.


Design, Implementation and Evaluation of Parallel.. - Choudhary, Liao.. (1998) (1 citation)

....in the system. Ideally, to achieve maximum parallelism, the load must be evenly distributed across the processors. The problem of statically mapping the workload of a parallel algorithm to processors in a distributed memory system has been studied under different problem models, such as [1, 2]. These static mapping policies do not model applications consisting of a sequence of tasks (algorithms) ....

F. Berman and L. Snyder. "On Mapping Parallel Algorithms into Parallel Architectures," Journal of Parallel and Distributed Computing, 4:439--458, 1987.


Models of Machines and Modules for Mapping to Minimise.. - Norman, Thanisch

....the communication overhead. In the case of a multicomputer, the overhead is often defined as some mismatch between the links of the process graph and the links of the processor graph. The algorithms can either assume as many processors as processes [Bokhari 1981] or can assume multiprocessing [Berman and Snyder 1987]. Udiavar and Stiles [1990] use simulated annealing to solve a mapping problem in which communications costs between processors are variable, dependent upon the distances between processors in a graph the topology of which is a parameter in the optimisation. We can describe a model similar ....

....that messages are routed between processors according to some deterministic routing strategy. The overall cost of a set of message transfers is the time to completion which is the maximum of any time to completion allowing for contention. This approach is taken by Lee and Aggarwal [1987] and Berman and Snyder [1987]. The second extension is to consider the problem when there are more processes than processors. Berman and Snyder [1987] extend the analysis to consider multiprocessing, having contracted the process graph to have the same number of processors as the processor graph. 8.4 Introducing Precedence ....

[Article contains additional citation context not shown here]

Berman, F. and Snyder, L. (1987). On mapping parallel algorithms into parallel architectures. J. Parallel Dist. Comput., 4:439--458.


Models of Machines and Computation for Mapping in Multicomputers - Norman, Thanisch (1993) (55 citations)

....messages are routed between processors according to some deterministic routing strategy. The overall cost of a set of message transfers is the time to completion which is the maximum of any time to completion allowing for message contention. This approach is taken by Lee and Aggarwal [1987] and Berman and Snyder [1987]. The second extension is to consider the problem when there are more processes than processors. Berman and Snyder [1987] extend the analysis to consider multiprocessing, having contracted the process graph to have the same number of processors as the processor graph. An alternative is to ....

....message transfers is the time to completion which is the maximum of any time to completion allowing for message contention. This approach is taken by Lee and Aggarwal [1987] and Berman and Snyder [1987]. The second extension is to consider the problem when there are more processes than processors. Berman and Snyder [1987] extend the analysis to consider multiprocessing, having contracted the process graph to have the same number of processors as the processor graph. An alternative is to consider the problem of building a multicomputer whose structure matches the computation. Thanisch and Norman [1990] point out ....

Berman, F. and Snyder, L. (1987). On mapping parallel algorithms into parallel architectures. J. Parallel Dist. Comput., 4:439--458.


A Tight Layout of the Butterfly Network - Avior, Calamoneri, Even, Litman, .. (1996) (11 citations)

....in the study of the VLSI layout problem for integrated circuits [13] as well as in the study of algorithms for drawing graphs. Further, each such layout is a restricted form of embedding of a graph in the grid, hence contributes to the study of the mapping problem for parallel architectures [2, 5], particularly the problem of mapping parallel programs onto mesh structured parallel architectures; cf. [12]. The fields of graph embedding and VLSI layout have developed powerful techniques which produce embeddings and layouts which are quite efficient, often within constant factors of optimal; ....

F. Berman and L. Snyder (1987): On mapping parallel algorithms into parallel architectures. J. Parallel Distr. Comput. 4, 439-458.


Multi-Domain WDM Network Structures for Large-Scale.. - Khaled Aly

....patterns of parallel algorithms and the various network topologies [20]. Parallel computer architectures with fixed topologies have been proposed and implemented to support potential applications [21]. Different algorithm structures may be mapped onto an architecture possessing a fixed topology [22]. To attain more versatility in parallel computing, reconfigurable interconnection networks have been sought through three main approaches: topology embedding, permutation networks and embedded switch lattices. Embedding one interconnection network onto another while maintaining unity expansion ....

F. Berman, "On mapping parallel algorithms into parallel architectures," Journal of Parallel and Distributed Computing, vol. 4, pp. 439--458, 1987.


Reconfigurable Parallel Computer Architecture Based on.. - Khaled Aly (1992)

....of efficient parallel algorithms and the various network topologies [1]. Parallel computer architectures with fixed topologies have been proposed and implemented to support potential applications [2]. Different algorithm structures may be mapped onto an architecture possessing a fixed topology [3]. To attain more versatility in parallel computing, reconfigurable interconnection networks have been sought through three main approaches: topology embedding, permutation networks and embedded switch lattices. This work was supported by the National Science Foundation under Grants CCR 9010774 ....

F. Berman, "On mapping parallel algorithms into parallel architectures," J. Parallel and Distributed Systems, vol. 4, pp. 439--458, 1987.


A Fast Recursive Mapping Algorithm - Chen, Eshaghian (1995)

....to be matched against the system graph in order to minimize the overall execution time. This problem has been known to be NP-complete in its general form as well as in several restricted forms [10]. In an attempt to solve the problem in a general case, a number of heuristics have been introduced [2, 23, 1, 16, 10]. Bokhari in [2] searches for the best matching of the edges of the undirected task graph versus the system graph. This heuristic algorithm is based on local search and pairwise exchange. Lee and Aggarwal's mapping strategy is another example of this approach but considers directed task graph ....

....[16]. They both assume the number of nodes of the task graph to be no greater than that of the system graph. The time complexities of both algorithms are O(N^3). To reduce the complexity of the mapping problem, a number of approaches such as graph contraction and clustering have been studied [1, 15, 22, 25, 26]. However, in all of these graph matching based techniques, only the task graph is clustered and the entire task graph is then matched against the entire system graph. In this paper, we will present a new mapping technique which not only clusters the task graph but also clusters the system graph ....

[Article contains additional citation context not shown here]

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. Journal of Parallel and Distributed Computing, 4:439--458, 1987.


Video Signal Processing and Coding on Data-Parallel.. - Moulin, Ogielski.. (1995) (2 citations)

....many VSP tasks can be efficiently implemented when the local PE memories are explicitly considered. The physical interconnection topology need not limit the possible communication patterns. The emulation of one network by another enables software portability across different parallel architectures [11, 12]. It has been convenient to classify parallel computer architectures into two families: the data parallel machines and program parallel machines. • In a data parallel (SIMD) computer all PEs execute essentially the same program, although they may contain different data. There is a ....

Berman, F., and Snyder, L. On mapping parallel algorithms into parallel architectures. J. of Parallel and Distributed Computing 4 (1987), pp. 439-458.


On Embedding Ternary Trees into Boolean Hypercubes (Extended.. - Gupta, al.

....the congestion of the embedding. The expansion ε is defined to be the ratio of the number of PEs in H to the number of nodes in G. Graph embeddings minimizing δ, and ε for various pairs of graphs G and H and their implications to parallel processing have recently been studied extensively [1, 2, 3, 7, 10, 11, 13, 15, 14, 19]. In this paper our main focus is to study the problem of embedding when G is a complete ternary (3-ary) tree and H is a boolean hypercube. The problem of efficiently embedding a k-ary tree into hypercube with k 3 has largely remained unsolved, even though optimal embeddings (i.e. embeddings ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. JPDC, 4:439--458, 1987.


A Methodology for Initiating Arbitrary Structured Programs in.. - Cotronis (1995)

....Methodology for Initiating Arbitrary Structured Programs in PARIX by Interpreting Graphs J.Y. Cotronis Dept. of Informatics, Univ. of Athens, Panepistimiopolis, 157 71 Athens, GREECE. tel. 30 1 7230172 fax: 30 1 7219 561 e mail: cotronis di.uoa.ariadne t.gr Abstract. We present a distributed program design methodology by which we overcome the programming overhead effort required for creating arbitrary structured parallel applications in PARIX. We develop general program module ....

....Methodology for Initiating Arbitrary Structured Programs in PARIX by Interpreting Graphs J.Y. Cotronis Dept. of Informatics, Univ. of Athens, Panepistimiopolis, 157 71 Athens, GREECE. tel. 30 1 7230172 fax: 30 1 7219 561 e mail: cotronis di.uoa.ariadne t.gr Abstract. We present a distributed program design methodology by which we overcome the programming overhead effort required for creating arbitrary structured parallel applications in PARIX. We develop general program module creation and ....

[Article contains additional citation context not shown here]

F. Berman, L.Snyder, `On mapping parallel algorithms into parallel architectures', J. Parall. Distrib. Comput. 4, 5, 439-458.


Composition of Specifications of Message Passing.. - Cotronis, Tsiatsoulis (1997)

.... th Hellenic Conference on Informatics, December 1997, Athens, Greece. Composition of Specifications of Message Passing Applications Composed by the Ensemble Methodology J.Y. Cotronis and Z. Tsiatsoulis Department of Informatics, University of Athens, Panepistimiopolis, 157 71 Athens, Greece. Tel. 301 7230172, fax: 301 7219561, e mail: ....

....th Hellenic Conference on Informatics, December 1997, Athens, Greece. Composition of Specifications of Message Passing Applications Composed by the Ensemble Methodology J.Y. Cotronis and Z. Tsiatsoulis Department of Informatics, University of Athens, Panepistimiopolis, 157 71 Athens, Greece. Tel. 301 7230172, fax: 301 7219561, e mail: cotronis, zack di.uoa.gr Abstract We present a specification composition technique which supports the message passing composition of applications by the Ensemble methodology. In Ensemble, applications are built by composing ....

[Article contains additional citation context not shown here]

F. Berman and L. Snyder, "On Mapping Parallel Algorithms into Parallel Architectures", Journal of Parallel and Distributed Computing, 4(5), 439-458.


Compressing Cube-Connected Cycles and Butterfly Networks - Klasing, Lüling, Monien (1990)

....needing little simulation time) on the smaller target network. Solutions to this problem, which is commonly modeled as a graph embedding problem, have been proposed so far for common network structures like hypercubes, binary trees, meshes, shuffle exchange networks, deBruijn networks, etc. in [1, 3, 4, 5, 6, 7, 8, 10, 16, 17]. So far, only partial results are known about two classes of networks which are very important for practical purposes, namely the cube connected cycles (CCC) as introduced in [18] and the butterfly network (BFN). In [4, 8, 17] embeddings with optimum dilation and load are presented in the ....

....The authors also restrict themselves to special kinds of embeddings of a very regular structure, like coverings [4], homogeneous emulations [8] and homomorphisms [17]. Because of the very restricted nature, Bodlaender [4] and Peine [17] are also able to classify their embeddings completely. In [3], a general procedure is described for mapping parallel algorithms into parallel architectures. This procedure is applied to the CCC network achieving dilation 1, but very high load. Also, only special kinds of embeddings, so called contractions, are considered. This paper investigates the ....

F. Berman, L. Snyder, "On mapping parallel algorithms into parallel architectures", Journal of Parallel and Distributed Computing, vol. 4 (1987), pp. 439-458.


Multiple Network Embeddings into Hypercubes (Extended Abstract) - Gupta, al. (1989)

....computational equivalence (or non equivalence) between networks of different topology, but efficient embeddings lead to efficient simulations of algorithms originally designed for G on host H. Embeddings and their implications to parallel processings have recently been studied extensively [2, 4, 6, 8, 10, 11, 14]. The hypercube architecture has proven to be a versatile and suitable architecture for designing and implementing parallel algorithms and it allows for efficient embeddings of many networks (e.g. trees, meshes, butterflies, pyramids) [3, 5, 6, 13, 14]. It is not surprising that many ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. J. of Parallel and Distributed Computing, V-4:439--458, 1987.


Evaluation of Two Programming Paradigms for.. - Chen, Eshaghian.. (1995)

....each task module is assigned a priority. Whenever a processor is available, a task module with the highest priority is selected from the list and assigned to a processor. MH has a time complexity of O(M^2 N^3). The mapping problem can also be addressed as a graph matching problem [3, 18, 1, 11]. The input to the mapping problem is two graphs. The first graph is called the task graph which is similar to the data flow representation of the execution process, where each node is a task module and edges represent dependency and flow of data. The second graph is called the system graph which ....

....as the matching of these two graphs such that the overall execution time is minimized. This problem has been known to be NP-complete in its general form as well as several restricted forms [11]. In an attempt to solve the problem in a general case, a number of heuristics have been introduced [3, 18, 1, 11]. Lee and Aggarwal's mapping strategy is an example of this approach [18]. It assumed the number of nodes of the task graph to be no greater than that of the system graph. Its time complexity is O(N^3). To reduce the complexity of the mapping problem, a number of approaches such as graph ....

F. Berman and L. Snyder. On mapping parallel algorithms into parallel architectures. Journal of Parallel and Distributed Computing, pages 439--458, 4 1987.


An Optimized Hardware Architecture and Communication Protocol.. - Shoemaker (1997) (2 citations)

....the final system maps the tasks and schedules their communications across the system [Mage89]. Graph Description Language (GDL): GDL is another configuration language that allows users to specify network communication. It is designed for the Prep-P system that targets reconfigurable networks [Berm87]. Graph Abstractions for Concurrent Programming (GARP): GARP is designed as a set of rules to fully specify a graph of network communications in a parallel application. It is used as notation in traditional parallel languages to allow directed graphs to be extracted easily from the parallel ....

F. Berman and L. Snyder. "On Mapping Parallel Algorithms Into Parallel Architectures", Journal of Parallel and Distributed Computing. Vol. 4, No. 5, pages 439-458, October 1987.


A Communication-Ordered Task Graph Allocation Algorithm - Evans, Kessler (1992)

No context found.

F. Berman, and L. Snyder, "On Mapping Parallel Algorithms Into Parallel Architectures," In Proceedings of the 1984.


Message-Passing Program Development by Ensemble - Cotronis (1997) (2 citations)

No context found.

F. Berman, L.Snyder, `On mapping parallel algorithms into parallel architectures', J. P. Distr. Comput. 4, 5, 439-458.



CiteSeer.IST - Copyright Penn State and NEC