10 citations found.
C. Fu and T. Yang. Space and time efficient execution of parallel irregular computations. In sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'97), Las Vegas, June 1997.


This paper is cited in the following contexts:
The Design, Implementation, and Evaluation of Jade - Rinard, Lam (1998)

....must derive information equivalent to this task graph before it can determine the structure of the final factored matrix. Several researchers have built two-phase systems that schedule a programmer-provided task graph, then execute the scheduled task computation [Dongarra and Sorensen 1987; Fu and Yang 1997]. Because these systems have the entire task graph available when they schedule the computation, they may be able to generate very efficient schedules. They also eliminate scheduling overhead during the execution of the computation. The primary limitation is that these systems are not suitable for ....

Fu, C. and Yang, T. 1997. Space and time efficient execution of parallel irregular computations. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.


Low Memory Cost Dynamic Scheduling of Large Coarse Grain.. - Cosnard, Jeannot, Rougeot (1998)

....at run time. Cilk can handle big size problems but, communication cost is not taken into consideration. The Cilk system performs very well for tree like style computation (min max search, backtrack exploration, etc.) but, it has not been designed for scientific loop nest computations. In [2, 13] run time methods to schedule task graphs are described addressing the problem of processor memory requirement, but these works do not consider DAG memory requirement. In [1] a tool CASCH, is presented. It allows to generate a schedule and a parallel code for a sequential program. Nevertheless ....

C. Fu and T. Yang. Space and time efficient execution of parallel irregular computations. In sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'97), Las Vegas, June 1997.


Low Memory Cost Dynamic Scheduling of Large Coarse Grain.. - Cosnard, Jeannot, Rougeot (1998)

....at run time. Cilk can handle big size problems but, communication cost is not taken into consideration. The Cilk system performs very well for tree like style computation (min max search, backtrack exploration, etc.) but, it has not been designed for scientific loop nest computations. In [2, 13] run time methods to schedule task graphs are described addressing the problem of processor memory requirement, but these works do not consider DAG memory requirement. In [1] a tool CASCH, is presented. It allows to generate a schedule and a parallel code for a sequential program. Nevertheless ....

C. Fu and T. Yang. Space and time efficient execution of parallel irregular computations. In sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'97), Las Vegas, June 1997.


Low Memory Cost Dynamic Scheduling of Large Coarse Grain.. - Cosnard, Jeannot, Rougeot (1998)

....scheduled at run time. Cilk can handle big size problems but communication cost is not taken into consideration. The Cilk system gives good results with tree like style computation (min max search, backtrack exploration, etc.) but it has not been designed for scientific loop nest computations. In [2, 13] run time methods to schedule task graphs are described addressing the problem of processor memory requirement, but these works do not consider DAG memory requirement. In [1] a tool CASCH, is presented. It allows to generate a schedule and a parallel code for a sequential program. Nevertheless ....

C. Fu and T. Yang. Space and time efficient execution of parallel irregular computations. In sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'97), Las Vegas, June 1997.


Symbolic Partitioning and Scheduling of Parameterized Task.. - Cosnard, Jeannot   Self-citation (Yang)

....prediction and code optimization for parallel applications [1, 3, 8, 11, 13]. Recently task graphs are adopted by several DARPA research teams for performance prediction and modeling of parallel applications on current and future machine architectures. The previous work on task scheduling [12, 16, 22, 25] for performance prediction and optimization has mainly focused on searching on a graph corresponding to a particular problem instance. The scheduling complexity and the derived solution depend on the graph size. Thus the scheme is not 1 ....

C. Fu and T. Yang. Space and time efficient execution of parallel irregular computations. In sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'97), Las Vegas, June 1997.


Elimination Forest Guided 2D Sparse LU Factorization - Shen, Jiao, Yang (1998)   (2 citations)  Self-citation (Yang)

.... caused by dynamic pivoting; 2) identify data regularity from the sparse structure obtained by the symbolic factorization so that efficient dense operations can be used to perform most of the computation; 3) make use of graph scheduling techniques and efficient run time support called RAPID [9, 11] to exploit irregular parallelism. The preliminary experiments are encouraging and good performance results are obtained with 1D data mapping for a set of nonsymmetric benchmark matrices. We have achieved up to 1.35 GFLOPS with RAPID code on 64 Cray T3E 300 MHz nodes. Our previous design uses task ....

C. Fu and T. Yang. Space and Time Efficient Execution of Parallel Irregular Computations. In Proceedings of ACM Symposium on Principles & Practice of Parallel Programming, June 1997.


Efficient Sparse LU Factorization with Lazy Space Allocation - Jiang, Richman, Shen, Yang (1999)   (1 citation)  Self-citation (Yang)

.... distributed memory machines, in [9, 10] we proposed an approach that adopts a static symbolic factorization scheme [12] to avoid data structure variation, identifies data regularity to maximize the use of BLAS 3 operations, and utilizes graph scheduling techniques and efficient run time support [11] to exploit irregular parallelism. Recently [16] we have further studied the properties of elimination forests to guide supernode partitioning amalgamation and execution scheduling. The new code with 2D mapping, called S , effectively clusters dense structures without introducing too many ....

C. Fu and T. Yang. Space and Time Efficient Execution of Parallel Irregular Computations. In Proceedings of ACM Symposium on Principles & Practice of Parallel Programming, June 1997.


Compile/Run-time Support for Threaded MPI Execution on.. - Hong Tang (1999)   (3 citations)  Self-citation (Yang)

....operations used in our design should also be available in other SMMs such as SUN Enterprise. We plan to investigate this issue. We also plan to extend our compile time support for C Fortran and examine the usefulness of our techniques for irregular computation with chaotic communication patterns [15, 28]. TMPI is a proof of concept system to demonstrate the effectiveness of our techniques, and we plan to add more MPI functions to TMPI. Acknowledgment This work was supported in part by NSF CCR 9702640 and by DARPA through UMD (ONR Contract Number N6600197C8534). We would like to thank Anurag ....

C. Fu and T. Yang. Space and Time Efficient Execution of Parallel Irregular Computations. In Proceedings of ACM Symposium on Principles & Practice of Parallel Programming, pages 57--68, June 1997.


Efficient Sparse LU Factorization with Partial Pivoting on.. - Fu, Jiao, Yang (1998)   (10 citations)  Self-citation (Fu Yang)

....the 2-D code if there is sufficient memory since the scheduling and execution techniques for the 2-D code are simple, and are not competitive to graph scheduling. Recently we have conducted research on developing space efficient scheduling algorithms while retaining good time efficiency [18]. It is still an open problem to develop advanced scheduling techniques that better exploit parallelism for 2-D sparse LU factorization with partial pivoting. There are other issues which are related to this work and need to be further studied, for example, alternative for parallel sparse LU based ....

C. Fu and T. Yang. Space and Time Efficient Execution of Parallel Irregular Computations. In Proc. of 6th ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 57--68, June 1997.


Compile/Run-time Support for Threaded MPI Execution on.. - Tang, Shen, Yang (1999)   (3 citations)  Self-citation (Yang)

....operations used in our design should also be available in other SMMs such as SUN Enterprise. We plan to investigate this issue. We also plan to extend our compile time support for C Fortran and examine the usefulness of our techniques for irregular computation with chaotic communication patterns [15, 28]. TMPI is a proof of concept system to demonstrate the effectiveness of our techniques, and we plan to add more MPI functions to TMPI. Acknowledgment This work was supported in part by NSF CCR 9702640 and by DARPA through UMD (ONR Contract Number N6600197C8534). We would like to thank Anurag ....

C. Fu and T. Yang. Space and Time Efficient Execution of Parallel Irregular Computations. In Proceedings of ACM Symposium on Principles & Practice of Parallel Programming, pages 57--68, June 1997.
