Shen K, Tang H, Yang T. Adaptive two-level thread management for fast MPI execution on shared memory machines. Proceedings of the IEEE/ACM Supercomputing Conference (SC), Seattle, WA, 1999. IEEE Computer Society Press: Los Alamitos, CA, 1999.


This paper is cited in the following contexts:
Optimizing Threaded MPI Execution on SMP Clusters - Hong Tang And (2001)   (2 citations)  Self-citation (Tang Yang)   (Correct)

....communicate with its peers. As a result, global variables declared in an MPI program are private to each MPI node. It is natural to map an MPI node to a process. However, communication between processes has to go through operating system kernels, which could be very costly. Our previous studies [16, 18] show that process based implementations can suffer large performance loss on multiprogrammed shared memory machines (SMMs). Mapping each MPI node to a thread opens the possibility of fast synchronization through address space sharing. This approach requires a compiler to transform an MPI program ....

....loss on multiprogrammed shared memory machines (SMMs). Mapping each MPI node to a thread opens the possibility of fast synchronization through address space sharing. This approach requires a compiler to transform an MPI program into a thread safe form. As demonstrated in our previous TMPI work [16, 18], the above approach can deliver significant performance gain for a large class of MPI C programs on multiprogrammed SMMs. Extending a threaded MPI implementation for a single SMM to support an SMP cluster is not straightforward. In an SMP cluster environment, processes (threads) within the same ....

[Article contains additional citation context not shown here]

K. Shen, H. Tang, and T. Yang. Adaptive two-level thread Management for fast MPI execution on shared memory machines. In Proceedings of ACM/IEEE SuperComputing '99, New York, November 1999. ACM/IEEE. Available from www.cs.ucsb.edu/research/tmpi.


Program Transformation and Runtime Support for Threaded MPI.. - Hong Tang Kai (2000)   (3 citations)  Self-citation (Shen Tang Yang)   (Correct)

....potential advantage is that when we use a user level thread to execute an MPI node, we can dynamically control the number of active kernel threads to match the number of available physical processors in order to minimize kernel level context switch cost. Recently [Shen et al. 1999] we have studied this idea and we find that minimizing unnecessary use of kernel level threads in a multiprogrammed environment can lead to an additional 88% performance improvement. TMPI is a proof of concept system intended for demonstrating the effectiveness of our techniques. Our current ....

SHEN, K., TANG, H., AND YANG, T. 1999. Adaptive two-level thread Management for fast MPI execution on shared memory machines. In Proceedings of ACM/IEEE SuperComputing '99. ACM/IEEE, New York. Will be available from www.cs.ucsb.edu/research/tmpi.


Program Transformation and Runtime Support for Threaded MPI.. - Tang, Shen, Yang (1999)   (3 citations)  Self-citation (Shen Tang Yang)   (Correct)

....potential advantage is that managing MPI nodes in terms of threads can allow us to dynamically switch kernel level and user level threads based on the number of available physical processors since context switch of kernel level threads is more expensive than that of user level threads. Recently [32] we have verified this advantage and avoiding use of unnecessary kernel threads in a multiprogrammed environment can lead to an additional 88% performance improvement. TMPI is a proof of concept system to demonstrate the effectiveness of our techniques, and we plan to add more MPI functions to ....

K. Shen, H. Tang, and T. Yang. Adaptive Two-level Thread Management for Fast MPI Execution on Shared Memory Machines. In Proc. of ACM/IEEE SuperComputing'99 (SC'99), November 1999. Will be available from www.cs.ucsb.edu/research/tmpi.


Loosely Coordinated Coscheduling In The Context Of . . . - Sodan (2005)   (Correct)

No context found.

Shen K, Tang H, Yang T. Adaptive two-level thread management for fast MPI execution on shared memory machines. Proceedings of the IEEE/ACM Supercomputing Conference (SC), Seattle, WA, 1999. IEEE Computer Society Press: Los Alamitos, CA, 1999.
