Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....for small size and number of messages. The LP-based algorithm is better when the number of messages is large. For all other cases, the DS-based strategy is better than the other two strategies. Airfoil Mesh: This test set contains communication matrices generated by a graph-partitioning algorithm [9] applied to computational grids generated for fluid dynamics .... [Figure, panels (a) and (b): x-axis "Message size (2^x bytes)", ticks 4 to 12; y-axis ticks 0.0 to 2.0; series NS, LP, DS] ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....belong to the class of NP-complete problems [4]; hence exact solutions are computationally intractable for large problems. However, good suboptimal solutions are sufficient for effective parallelization of most applications. There are a number of partitioning algorithms available in the literature [5, 7, 10, 11, 19]. This paper is focused on a subclass of applications in which the computational graph is such that the vertices correspond to two- or three-dimensional coordinates, and the interaction between computations is limited to vertices that are physically proximate. Examples of such applications ....
N. Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....we use to generate these test sets is described in [10]. The message lengths used in our tests are COM(i, j) multiplied by the variable msg_unit in order to study the effect of message size on each scheme. 2. This test set contains communication matrices generated by graph partitioning algorithms [7]; the samples represent fluid dynamics simulations of a part of an airplane (Figure 11) with different granularities (2800 point, 9428 point, and 53961 point). In order to observe the NICE primitives' performance with different message sizes, we multiplied the matrices in this test set by a ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....to the previous one, except that the message sizes are nonuniform, where the size is equal to COM(i, j) multiplied by msg_unit. The different values of msg_unit used in this test set are $2^k$ for $4 \le k \le 13$. 3. This test set contains communication matrices generated by graph partitioning algorithms [8]; the samples represent fluid dynamics simulations of a part of an airplane (Figure 6) with different granularities (2800 point, 9428 point, and 53961 point). In order to observe the algorithm's performance with different message sizes, we have multiplied the matrices in this test set by a ....
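The message-size scaling described in the excerpt above can be sketched as follows. This is a minimal illustration, assuming that msg_unit takes the values 2^k for k from 4 to 13 (our reading of the excerpt) and that COM is a small integer matrix; the function name `scale_messages` and the sample matrix are hypothetical and do not come from the cited papers.

```python
def scale_messages(com, msg_unit):
    """Return per-pair message sizes: each COM(i, j) entry times msg_unit."""
    return [[entry * msg_unit for entry in row] for row in com]


# Hypothetical 3-processor communication matrix (illustrative only).
com = [
    [0, 2, 1],
    [1, 0, 0],
    [3, 1, 0],
]

# Assumed msg_unit values: 2^k for k = 4..13.
msg_units = [2 ** k for k in range(4, 14)]

# One scaled matrix per msg_unit value.
sizes_by_unit = {unit: scale_messages(com, unit) for unit in msg_units}
```

Keeping COM fixed and varying only msg_unit changes the absolute message sizes while preserving their relative nonuniformity, which is what lets the effect of message size be studied in isolation.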
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....inferred. [Figure 1: The partitioning of irregular mesh; partitions labeled P0 through P7] However, we emphasize that good suboptimal solutions are sufficient for effective parallelization of a large class of irregular problems. There are a large number of partitioning algorithms available in the literature [2], [6], [9]. Depending on the requirements of the application, one may be more useful than the other. The following are some important features of a partitioning algorithm. 1. Cost of partitioning vs. quality: For a given application, a cheaper algorithm generating a solution of reasonable quality may be ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....4. For machines like the iPSC/860, it is worthwhile exploiting pairwise bidirectional communication to achieve concurrent send and receive. There is a large amount of literature on how to partition the task graph so as to minimize the communication cost. Many of these methods are iterative in nature [10]. After a particular threshold any improvement in partitioning is expensive. For problems which require runtime partitioning, it is critical that this partitioning be completed extremely fast. For such problems, the gains provided by effective communication scheduling may far outperform the gains ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....NLP. This observation reveals that when the variance in message size is large, it is worthwhile maintaining the heap structure. [Figure 13: The unstructured grid used for our simulations] 2. The second test set (applications) contains communication matrices generated by load balancing algorithms [14] on some realistic data samples for a 32-node hypercube. The samples represent fluid dynamics simulations of a part of an airplane (Figure 13) with different granularities (2800 point, 3681 point, 9428 point, and 53961 point). We will only present the results of the 53961 point samples. In order to ....
....of the same size, in every phase, can pay reasonable dividends in terms of communication cost (although at a higher computation cost). There is a large amount of literature on how to partition the task graph so as to minimize the communication cost. Many of these methods are iterative in nature [14]. After a particular threshold any improvement in partitioning is expensive. For problems which require runtime partitioning, it is critical that this partitioning be completed extremely fast. For such problems, the gains provided by effective communication scheduling may far outperform the ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....belong to the class of NP-complete problems [9]; hence exact solutions are computationally intractable for large problems. However, good suboptimal solutions are sufficient for effective parallelization of most applications. There are a number of partitioning algorithms available in the literature [1, 10, 14, 18, 19, 28]. This list is by no means complete. In a large number of such problems the computational structure (or dependencies) can be constructed only during execution [5]. For such cases these graphs must be constructed at runtime; thus it is important that the partitioning of data be done at runtime. ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....Because the computational pattern is available only at run time, this may not be done directly by the compiler; instead, calls to a run time environment may need to be generated to do the partitioning dynamically. Several algorithms are available in the literature to perform this partitioning [14]. Partitioning for dynamic problems requires optimization methods that quickly and reliably produce reasonable but not exact results. Partitioning such applications can be posed as a graph partitioning problem necessarily based on the computational graph for each phase. The partitioning problem is ....
....cost of mapping versus the quality of mapping, depending on the needs of a given application. On the other hand, in direct methods, the cost of mapping and the quality of mapping are fixed for a given problem. 3. Parallelizability: Partitioning methods such as genetic-algorithm-based partitioners [14] are inherently parallel. On the other hand, parallelizing spectral bisection methods is relatively more difficult (in fact, a good parallelization of such methods depends on the partitioning of the matrix structure based on the problem to be partitioned). 4. Incremental updates: For many ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....must also be computed. The computational pattern may only be available at run time, so this may not be done directly by the compiler; instead, calls to a run time environment need to be generated to do the partitioning. Several algorithms are available in the literature to perform this partitioning (see [15] for a detailed list of such references). The partitioning described in Figure 2 generates an 8 × 8 communication matrix COM (Table 1). A 1 in the (i, j) entry represents that processor P_i needs to communicate to processor P_j. Each message is of different size and each processor may send ....
....to observe the case where a few processors have a small amount of large messages, while other processors have a bulk of small messages. The total amount of data to be sent by every processor is equal. 3. The third test set contains communication matrices generated by graph partitioning algorithms [15]; the samples represent fluid dynamics simulations of a part of an airplane (Figure 12) with different granularities (2800 point, 3681 point, 9428 point, and 53961 point). We will only present the results of the 2800 point and 53961 point samples. In order to observe the algorithms' performance with ....
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.
....must also be computed. The computational pattern may only be available at runtime and may not be done directly by the compiler; instead, calls to a runtime environment need to be generated to do the partitioning. Several algorithms are available in the literature to perform this partitioning (see [16] for a detailed list of such references). The partitioning described in Figure 2 generates an 8 × 8 communication matrix COM (Table 1). A 1 in the (i, j) entry represents the fact that processor P_i needs to communicate to processor P_j. Each message is of different size and each ....
....while other processors have a bulk of small messages. The total amount of data to be sent by every processor is equal. The different values of msg_unit used for our experiments are $2^k$ for $4 \le k \le 14$. 3. This test set contains communication matrices generated by graph partitioning algorithms [16]; the samples represent fluid dynamics simulations of a part of an airplane (Figure 12) with different granularities (2800 point and 53961 point). In order to observe the algorithm's performance with different message sizes, we multiplied the matrices in this test set by a variable msg_unit. The ....
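The communication matrix described in the excerpt above can be read as a directed adjacency matrix. The sketch below is illustrative only: it assumes a 0/1 matrix in which entry (i, j) = 1 means processor P_i sends to P_j, and it uses a hypothetical 4 × 4 matrix rather than the 8 × 8 matrix of the cited paper's Table 1, which is not reproduced here. The function names are also hypothetical.

```python
def send_targets(com, i):
    """Processors that P_i must send to: columns j where COM(i, j) is 1."""
    return [j for j, flag in enumerate(com[i]) if flag]


def receive_sources(com, j):
    """Processors that P_j receives from: rows i where COM(i, j) is 1."""
    return [i for i, row in enumerate(com) if row[j]]


# Hypothetical 4 x 4 communication matrix (not taken from the source).
com = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 0],
]

targets_of_p0 = send_targets(com, 0)
sources_of_p1 = receive_sources(com, 1)
```

Reading rows gives each processor's outgoing messages and reading columns gives its incoming ones, which is the information a communication scheduler would need to pair sends with receives.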
Nashat Mansour. Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing. PhD thesis, Syracuse University, Syracuse, NY 13244, 1992.