28 citations found.
S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), July 1995. http://HTTP.CS.Berkeley.EDU/~yelick/soumen/mixed-spaa95.ps.


This paper is cited in the following contexts:


Library Support for Hierarchical Multi-Processor Tasks - Rauber, Rünger (2002)   (Correct)

....[Figure 8: Runtimes of the different execution schemes of the extrapolation method with 4 different stepsizes on a Cray T3E; processor counts not recoverable.] ....[2, 19] for an overview of systems and approaches and see [4] for a detailed investigation of the benefits of combining task and data parallel executions. Most closely related to our work concerning the parallel programming model are approaches which combine multiprocessor task and data parallelism. Several models support the programmer in writing ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), pages 74--83, 1995.


Parallelizing the Divide and Conquer Algorithm for the.. - Tisseur, Dongarra (2000)   (2 citations)  (Correct)

....the tree is well balanced among the processors, then the implementation should still achieve 85% efficiency even in the worst case of bad distribution of deflations. However, it is worth considering this problem for our implementation. A possible issue is dynamic splitting versus static splitting [6]. A task list is used to keep track of the various parts of the matrix during the decomposition process and make use of data and task parallelism. This approach has been investigated for the parallel implementation of the spectral divide and conquer algorithm for the unsymmetric eigenvalue ....

Soumen Chakrabarti, James Demmel, and Katherine Yelick. Modeling the benefits of mixed data and task parallelism. Technical Report CS-95-289, Department of Computer Science, University of Tennessee, Knoxville, TN, USA, May 1995. LAPACK Working Note 97.


G. Research - Computer Science Research   (Correct)

....language framework to aid in analysis and new language constructs to obviate the need for some analyses. For the optimization problem, we will use a combination of static and dynamic performance information. We have extensive experience using analytical models to optimize parallel programs [18, 93, 56, 113]. Some compiler projects also use performance models to help choose tile sizes in automatic cache optimizations, for example. These models are useful for eliminating bad algorithmic choices, and to a lesser extent in choosing the best one. The models are limited by their accuracy of reflecting ....

....level that is sensible to the programmer, either by creating a high level model or by refining an analytical one supplied in advance. The information in performance models will be used by the runtime system and operating system to make decisions about degree of parallelism that should be used [18], whether to load balance computation [19] or whether to move entire jobs to other clusters. In Titanium we will adapt our language and compiler analyses in two ways to address memory hierarchy optimizations on array based programs: 1) through the type system, the compiler will prove there are ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), Santa Barbara, California, July 1995.


Scheduling of Multiprocessor Tasks for Numerical Applications - Rauber, Rünger (1996)   (Correct)

....the tasks. Efficient dynamic scheduling algorithms for different networks are studied by Feldmann, Sgall, and Teng in [10]. A good overview of theoretical work in this area can be found in [35]. Practical approaches for the scheduling of M tasks are presented by Chakrabarti, Demmel, and Yelick in [3], where an upper bound on benefits of mixed (data and task) parallelism for divide and conquer task trees and for irregular task graphs are given. [Section 7: Conclusions] Many algorithms from the area of scientific computing exhibit two levels of parallelism, expressed as an upper level of M tasks each ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), pages 74--83, 1995.


A Parallel Divide And Conquer Algorithm For The Symmetric.. - Tisseur, Dongarra (1999)   (Correct)

....faster set of processes (those that experience deflation) will have to wait for the other set of processes before beginning the next merge. This reduces the speed up gained through the use of the tree. A possible issue concerning load balancing the work is dynamic splitting versus static splitting [6]. In dynamic splitting, a task list is used to keep track of the various parts of the matrix during the decomposition process and to make use of data and task parallelism. This approach has been investigated for the parallel implementation of the spectral divide and conquer algorithm for the ....

S. Chakrabarti, J. Demmel, and K. Yelick, Modeling the Benefits of Mixed Data and Task Parallelism, Technical Report CS-95-289, Department of Computer Science, University of Tennessee, Knoxville, TN, 1995. LAPACK Working Note 97.


Optimal Use of Mixed Task and Data Parallelism for.. - Jaspal Subhlok.. (2000)   (11 citations)  (Correct)

....and task (or function) parallel computing. Compiler and runtime support for task and data parallel computing is an active area of research, and several solutions have been proposed [4, 5, 9, 10, 19, 21]. Recent research has also examined the benefits of mixed task and data parallel programming [3, 7, 18]. This paper specifically addresses the mapping of applications composed of a linear chain of data parallel tasks that act on a stream of input data sets. In this model, each task repeatedly receives input from its predecessor task, performs its computation, and sends the output to its successor ....

CHAKRABARTI, S., DEMMEL, J., AND YELICK, K. Modeling the benefits of mixed data and task parallelism. In Seventh Annual ACM Symposium on Parallel Algorithms and Architectures (Santa Barbara, CA, July 1995).


Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector.. - Dhillon (1997)   (4 citations)  (Correct)

.... This method was designed to work well on parallel computers, offering both task and data parallelism [46]. Efficient parallel implementations are not straightforward to program, and the decision to switch from task to data parallelism depends on the characteristics of the underlying machine [17]. Due to such complications, all the currently available parallel software libraries, such as ScaLAPACK [22] and PeIGS [52], use algorithms based on bisection and inverse iteration. A drawback of the current divide and conquer software in LAPACK is that it needs extra workspace of more than $2n^2$ ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), Santa Barbara, California, July 1995.


Integrating Task and Data Parallelism Using Shared Objects - Hassen, Bal (1996)   (8 citations)  (Correct)

....threads of control (processes or tasks) that can synchronize and communicate in arbitrary ways. Recently, interest has arisen in integrating task and data parallelism [3, 5, 7, 9, 11, 12, 18, 19]. [Footnote: This research is supported in part by a PIONIER grant from the Netherlands Organization for Scientific Research (N.W.O.).] Such integration offers several advantages. First, programmers can use a single language to write either a data parallel or a task parallel program, whichever is most suitable for the application at hand. Second, and even more important, many applications can exploit both types of parallelism in ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In ACM Symp. on Parallel Algorithms and Architectures, 1995.


Space and Time Efficient Execution of Parallel Irregular.. - Cong Fu (1997)   (4 citations)  (Correct)

....[Figure 1: The stages of run time parallelization in RAPID. Stage labels: data dependence graph (DDG); dependence-complete task graph; task assignments and data object owners; schedules and iterative asynchronous execution. Task-graph node and processor listings from the figure omitted.] ....

[Article contains additional citation context not shown here]

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the Benefits of Mixed Data and Task Parallelism. In Proceedings of 7th ACM Symposium on Parallel Algorithms and Architectures, pages 74--83, July 1995.


A Model for Speedup of Parallel Programs - Downey (1997)   (11 citations)  (Correct)

....following functional form: S(n) 1 Gamma n ) 1 Gamma ) with 0 1. The motivation for this model is to facilitate analysis. Again, the parameter has no semantic content. No prior study has demonstrated that a proposed model describes the behavior of real programs. Chakrabarti et al. [2] propose a model for efficiency of data parallel tasks; they use measurements of ScaLAPACK programs to validate this model. Many of the allocation strategies that have been proposed for malleable jobs assume that the scheduler knows the average parallelism of all jobs [16] [8] [15] [17] [3] [12] ....

Soumen Chakrabarti, James Demmel, and Katherine Yelick. Modeling the benefits of mixed data and task parallelism. In Seventh Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '95), July 1995.


An Efficient Implementation of Nested Data Parallelism for.. - Hardwick (1996)   (4 citations)  (Correct)

....it tackles many of the same problems, namely run time adaptation to changing data layouts, use of sequential code to improve efficiency, and minimizing the overhead of parallel code. Additionally, Chakrabarti and others have analyzed the theoretical benefits of mixed data and control parallelism [12]. They conclude that best results are obtained when communication is slow or when there is a large number of processors, and that a single switch between data and control parallelism can achieve most of the benefits of a more general model. Most work on parallel divide and conquer algorithms has ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Proceedings of ACM Symposium on Parallel Algorithms and Architectures, July 1995.


Optimal Latency-Throughput Tradeoffs for Data Parallel Pipelines - Subhlok (1996)   (17 citations)  (Correct)

....data and task (or function) parallel computing. Compiler and runtime support for task and data parallel computing is an active area of research, and several solutions have been proposed [3, 4, 8, 9, 14]. Recent research has also examined the benefits of mixed task and data parallel programming [2, 6, 13]. This paper specifically addresses the mapping of applications composed of a linear chain of data parallel tasks that act on a stream of input data sets. Each task repeatedly receives input from its predecessor task, performs its computation, and sends the output to its successor task. The first ....

Chakrabarti, S., Demmel, J., and Yelick, K. Modeling the benefits of mixed data and task parallelism. In Seventh Annual ACM Symposium on Parallel Algorithms and Architectures (Santa Barbara, CA, July 1995).


Parallelizing the Divide and Conquer Algorithm for the.. - Tisseur, Dongarra (1998)   (2 citations)  (Correct)

....the tree is well balanced among the processors, then the implementation should still achieve 85% efficiency even in the worst case of bad distribution of deflations. However, it is worth considering this problem for our implementation. A possible issue is dynamic splitting versus static splitting [6]. A task list is used to keep track of the various parts of the matrix during the decomposition process and make use of data and task parallelism. This approach has been investigated for the parallel implementation of the spectral divide and conquer algorithm for the unsymmetric eigenvalue ....

Soumen Chakrabarti, James Demmel, and Katherine Yelick. Modeling the benefits of mixed data and task parallelism. Technical Report CS-95-289, Department of Computer Science, University of Tennessee, Knoxville, TN, USA, May 1995. LAPACK Working Note 97.


Partitioning and Scheduling for Parallel Image Processing.. - Lee, Yang, Wang (1995)   (1 citation)  (Correct)

....and repartitionable machine (PASM). Library based parallel systems were discussed in [8, 15]. Our work differs from the above in exploiting both the task and data parallelism on distributed memory architectures. Scheduling task and data parallelism has recently been studied for program compilation [3, 13, 18, 19, 24]. The optimization function in [18, 19] is for maximizing the throughput with fixed task parameters. The work in [3, 13] deals with DAGs of fixed data distribution and processor parameters, but not with graphs with loops. In [5] techniques for optimizing the data distribution are presented for ....

....both the task and data parallelism on distributed memory architectures. Scheduling task and data parallelism has recently been studied for program compilation [3, 13, 18, 19, 24]. The optimization function in [18, 19] is for maximizing the throughput with fixed task parameters. The work in [3, 13] deals with DAGs of fixed data distribution and processor parameters, but not with graphs with loops. In [5] techniques for optimizing the data distribution are presented for nested loops, which can be viewed as the optimization for one macro task. In contrast, our current scheduling scheme deals ....

S. Chakrabarti, J. Demmel, and K. Yelick, Modeling the Benefits of Mixed Data and Task Parallelism, To appear in Proc. of ACM SPAA, Santa Barbara, 1995.


PHANTOM: Parallelization of Hierarchical Applications usiNg.. - Goil (1996)   (Correct)

....called Data Parallelism. Of course, it is possible to use a combination of these strategies for optimal scheduling, and such a strategy is referred to as Mixed Parallelism. Several researchers have worked on exploiting mixed parallelism, both in theory [BB90, FST92, LT94, TWY92] and in practice [CDY95, Cha91, RSB94, SSOG93]. In a number of problems, all the tasks may not be known in advance but may be generated dynamically as existing tasks are processed. This is the case with problems whose efficient solutions use the divide and conquer strategy. The execution of an instance of such a problem ....

....in the number of processors. If the time required to divide the subtasks is significantly higher than the cost of redistribution, communication time due to allocation of the subtasks can be ignored. Such an assumption is often made in the literature [CDY95]. Unfortunately, it is not valid for several important problems, which include quicksort, quickhull, construction of quad octrees and multidimensional binary search trees. In this chapter, we propose a new strategy called Concatenated Parallelism for efficient solution of problems resulting in ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. Technical report, University of California, Berkeley, 1995.


Space/Time-Efficient Scheduling and Execution of Parallel.. - Tao Yang   (Correct)

....programming tools such as RAPID. It is still challenging to develop a fully automatic system. In the future, it is interesting to study automatic generation of coarse-grained DAGs from sequential code [Cosnard and Loi 1995], extend our results for more complicated dependence structures [Chakrabarti et al. 1995; Girkar and Polychronopoulos 1992; Ramaswamy et al. 1994], and investigate use of the proposed techniques in performance engineered parallel systems [DARPA 1998]. While massively parallel distributed memory machines will still be valuable for high end large scale application problems in the ....

Chakrabarti, S., Demmel, J., and Yelick, K. 1995. Modeling the Benefits of Mixed Data and Task Parallelism. In Proceedings of 7th ACM Symposium on Parallel Algorithms and Architectures. 74--83.


Concatenated Parallelism: A Technique for Efficient.. - Aluru, Goil, Ranka (1996)   (1 citation)  (Correct)

....the other is called Data Parallelism. Of course, it is possible to use a combination of these strategies for optimal scheduling, and such a strategy is referred to as Mixed Parallelism. Several researchers have worked on exploiting mixed parallelism, both in theory [3, 6, 11, 17] and in practice [4, 5, 13, 16]. In a number of problems, all the tasks may not be known in advance but may be generated dynamically as existing tasks are processed. This is the case with problems whose efficient solutions use the divide and conquer strategy. The execution of an instance of such a problem can be represented by ....

....parallel algorithm often decreases with increase in the number of processors. If the time required to divide the subtasks is significantly higher than the cost of redistribution, communication time due to allocation of the subtasks can be ignored. Such an assumption is often made in the literature [4]. Unfortunately, it is not valid for several important problems, which include quicksort, quickhull, construction of quad octrees and multidimensional binary search trees. In this paper, we propose a new strategy called Concatenated Parallelism for efficient solution of problems resulting in ....

S. Chakrabarti, J. Demmel and K. Yelick, Modeling the benefits of mixed data and task parallelism, Computer Science Division, University of California, Berkeley.


Array Prefetching for Irregular Array Accesses in Titanium - Su, Yelick (2004)   Self-citation (Yelick)   (Correct)

No context found.

S. Chakrabarti, J. Demmel, and K. Yelick, "Modeling the benefits of mixed data and task parallelism", Symposium on Parallel Algorithms and Architectures, 1995.


Efficient Resource Scheduling in Multiprocessors - Chakrabarti (1996)   (1 citation)  Self-citation (Chakrabarti)   (Correct)

....to be scheduled dynamically, and therefore cannot use the expensive optimization algorithms, such as linear programming, that are used in compile time scheduling. In Chapter 3 we will describe a simple and effective heuristic for scheduling divide and conquer applications with mixed parallelism [26]. The algorithm classifies the tasks into two types. The large problems near the root are allocated all the processors in turn, while small problems close to the leaves are packed in a task parallel fashion, each task being assigned exactly one processor. There is some internal frontier at which ....

....There is also task parallelism between unordered nodes. Recently algorithms have been designed for trading between locality and load balance in this scenario [33]. We will come back to similar problems in Chapter 4. The switching technique has been independently discovered after our paper [26] was published in a different context: that of scheduling tasks with penalties. Every task has a running time, and a penalty for rejection; the goal is to minimize the sum of the makespan of accepted tasks and the penalty of rejected tasks [15]. While their main result is an on line algorithm for ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA). ACM, 1995.
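
The switching heuristic described in the first excerpt above can be illustrated with a rough sketch. It is not the authors' algorithm or cost model: it assumes a complete binary divide and conquer tree, a hypothetical data-parallel cost of `t / p + overhead` per task, and it packs each whole subtree below the frontier onto one processor. The names `switched_makespan`, `best_switch_level`, `level_times`, and `overhead` are invented for this illustration.

```python
from math import ceil


def switched_makespan(level_times, p, switch_level, overhead=0.0):
    """Estimate the makespan of a complete binary task tree with one switch.

    level_times[l] is the sequential time of ONE task at depth l (root = 0).
    Tasks above switch_level run data parallel on all p processors, one
    after another. Subtrees rooted at switch_level each run sequentially
    on a single processor (task parallel).
    The cost t / p + overhead is an assumed model, not taken from the paper.
    """
    depth = len(level_times)
    # Data-parallel phase: 2**l tasks at depth l, each using all p processors.
    data_phase = sum(
        (2 ** l) * (level_times[l] / p + overhead) for l in range(switch_level)
    )
    if switch_level >= depth:
        return data_phase
    # Sequential time of one subtree rooted at the frontier.
    subtree_time = sum(
        (2 ** (l - switch_level)) * level_times[l]
        for l in range(switch_level, depth)
    )
    # Identical subtrees are packed in rounds of p, one subtree per processor.
    rounds = ceil((2 ** switch_level) / p)
    return data_phase + rounds * subtree_time


def best_switch_level(level_times, p, overhead=0.0):
    """Return the frontier depth that minimizes the estimated makespan."""
    return min(
        range(len(level_times) + 1),
        key=lambda k: switched_makespan(level_times, p, k, overhead),
    )


if __name__ == "__main__":
    times = [8.0, 4.0, 2.0, 1.0]  # hypothetical per-task times by depth
    for k in range(len(times) + 1):
        print(k, switched_makespan(times, p=4, switch_level=k, overhead=0.5))
    print("best frontier:", best_switch_level(times, p=4, overhead=0.5))
```

With these made-up numbers the estimate is lowest when the frontier sits at depth 2, between pure task parallelism (frontier 0) and pure data parallelism (frontier 4), which is the qualitative behavior the excerpt describes.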


ScaLAPACK: A Portable Linear Algebra Library for.. - Blackford, Choi.. (1995)   (14 citations)  Self-citation (Demmel)   (Correct)

....HQR algorithm in the last paragraph (matrix inversion, matrix multiply and QR factorization) but also requires more floating point operations; it remains to be seen for which problems and on which machines which algorithm is faster. The sign function also entails a dynamic load balancing scheme [7] to implement its divide and conquer approach most efficiently. [Section 3.7: Singular Value Decomposition] Let A be a general real m by n matrix. The singular value decomposition (SVD) of A is the factorization $A = U S V^T$, where U and V are orthogonal, and $S = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, $r = \min(m, n)$ ....

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), July 1995.


Execution Time of Symmetric Eigensolvers - Stanley (1997)   (7 citations)  (Correct)

No context found.

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), July 1995. http://HTTP.CS.Berkeley.EDU/~yelick/soumen/mixed-spaa95.ps.


Orthogonal Processor Groups for Message-Passing Programs - Rauber, Reilein, Rünger (2001)   (Correct)

No context found.

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In Symposium on Parallel Algorithms and Architectures (SPAA), pages 74--83, 1995.


Practical Parallel Divide-and-Conquer Algorithms - Hardwick (1997)   (1 citation)  (Correct)

No context found.

Soumen Chakrabarti, James Demmel, and Katherine Yelick. Modeling the benefits of mixed data and task parallelism. In Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, July 1995.


A Distributed Memory Implementation of the Nonsymmetric.. - Dongarra, Henry, Watkins (1996)   (Correct)

No context found.

Chakrabarti, S., Demmel, J., Yelick, K., Modeling the Benefits of Mixed Data and Task Parallelism, Seventh Annual ACM Symposium on Parallel Algorithms and Architectures, July 17-19, 1995, UC Santa Barbara, CA.


CiteSeer.IST - Copyright Penn State and NEC