
Adaptive Two-level Thread Management for Fast MPI Execution on Shared Memory Machines (1999)  (4 citations)
Kai Shen, Hong Tang, Tao Yang




 
View or download:
ucsb.edu/research/tmpi/SC99.ps.gz
ucsb.edu/~kshen/papers/SC99.ps

From:  ucsb.edu/~tyang/papers/


Abstract: This paper addresses performance portability of MPI code on multiprogrammed shared memory machines. Conventional MPI implementations map each MPI node to an OS process, which suffers severe performance degradation in multiprogrammed environments. Our previous work (TMPI) developed compile-time and run-time techniques to support threaded MPI execution by mapping each MPI node to a kernel thread. However, kernel threads have a higher context switch cost than user-level threads, and this leads to longer...

Context of citations to this paper:

...of available physical processors since context switch of kernel level threads is more expensive than that of user level threads. Recently [32] we have verified this advantage and avoiding use of unnecessary kernel threads in a multiprogrammed environment can lead to an...

...kernel threads to match the number of available physical processors in order to minimize kernel level context switch cost. Recently [Shen et al. 1999] we have studied this idea and we find that minimizing unnecessary use of kernel level threads in a multiprogrammed environment...

Cited by:
Loosely Coordinated Coscheduling In The Context Of . . . - Sodan (2005)
Optimizing Threaded MPI Execution on SMP Clusters - Hong Tang And (2001)
Program Transformation and Runtime Support for Threaded MPI.. - Hong Tang Kai (2000)

Active bibliography (related documents):
0.8:   Compile/Run-time Support for Threaded MPI Execution on.. - Hong Tang (1999)
0.4:   S+: Efficient 2D Sparse LU Factorization on Parallel Machines - Shen, Yang, Jiao
0.3:   Fast Synchronization on Scalable Cache-Coherent.. - Nikolopoulos..

Similar documents based on text:
0.5:   A Flexible QoS Framework for Cluster-based Network Services - Kai Shen Hong (2002)
0.3:   Integrated Resource Management for Cluster-based Internet.. - Kai Shen Hong (2002)
0.2:   Personal Authentication Based On Generalized Symmetric Max.. - Zhang, Chen (2003)

Related documents from co-citation:
3:   mpi-forum - Forum, www
3:   MPI-SIM: Using Parallel Simulation to Evaluate MPI Programs - Bagrodia, Prakash - 1998
3:   TPVM: Distributed Concurrent Computing with Lightweight Processes - Ferrari, Sunderam - 1995

BibTeX entry:

K. Shen, H. Tang, and T. Yang. Adaptive Two-level Thread Management for Fast MPI Execution on Shared Memory Machines. In Proc. of ACM/IEEE SuperComputing'99 (SC'99), November 1999. Will be available from www.cs.ucsb.edu/research/tmpi. http://citeseer.ist.psu.edu/shen99adaptive.html

@inproceedings{ shen99adaptive,
    author = "Kai Shen and Hong Tang and Tao Yang",
    title = "Adaptive Two-level Thread Management for Fast {MPI} Execution on Shared Memory Machines",
    booktitle = "Proc. of ACM/IEEE SuperComputing'99 (SC'99)",
    pages = "??--??",
    month = nov,
    year = "1999",
    url = "citeseer.ist.psu.edu/shen99adaptive.html" }
Citations (may not include all citations):
663   The Grid: Blueprint for a New Computing Infrastructure - Foster, Kesselman - 1998
447   Exokernel: An Operating System Architecture for Application-.. - Engler, Kaashoek et al. - 1995
447   MPI: The Complete Reference - Snir, Otto et al. - 1996
304   Scheduler Activations: Effective Kernel Support for User-lev.. - Anderson, Bershad et al. - 1992
246   Portable Implementation of the MPI Message Passing Interface.. - Gropp, Lusk et al. - 1996
198   Scheduling Techniques for Concurrent Systems - Ousterhout - 1982
197   The Performance of Spin Lock Alternatives for Shared-memory .. - Anderson - 1990
190   Process Control and Scheduling Issues for Multiprogrammed Sh.. - Tucker, Gupta - 1989
187   Netsolve: a network server for solving computational science.. - Casanova, Dongarra - 1996
181   ACM Transactions on Programming Languages and Systems - Herlihy - 1991
137   Performance of Multiprogrammed Multiprocessor Scheduling Alg.. - Leutenegger, Vernon - 1990
102   Empirical studies of competitive spinning for a shared-memor.. - Karlin, Li et al. - 1991
98   Processor Scheduling in Shared Memory Multiprocessors - Zahorjan, McCann - 1990
75   Competitive Randomized Algorithms for Non-Uniform Problems - Karlin, Manasse et al. - 1989
74   The Implications of Cache Affinity on Processor Scheduling f.. - Vaswani, Zahorjan - 1991
64   SunOS Multi-thread Architecture - Powell, Kleiman et al. - 1991
45   Multiprogramming on Multiprocessors - Crovella, Das et al. - 1991
45   Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe et al. - 1998
32   mpi-forum - Forum, www
32   NanoThreads: A User-Level Threads Architecture - Polychronopoulos, Bitar et al. - 1993
26   Scheduler-Conscious Synchronization - Kontothanassis, Wisniewski et al. - 1997
16   Dynamic Processor Allocation with the Solaris Operating Syst.. - Yue, Lilja - 1998
16   Kernel-Level Scheduling for the Nano-Threads Programming Mod.. - Polychronopoulos, Martorell et al. - 1998
8   A Multi-threaded Message Passing Interface - Protopopov, Skjellum - 1998
7   Elimination Forest Guided 2D Sparse LU Factorization - Shen, Jiao et al. - 1998
5   is Thread Partitioning and How - Tang, Gao et al. - 1998
4   An Effective Processor Allocation Strategy for Multiprogramm.. - Yue, Lilja - 1997
3   Efficient Sparse LU Factorization with Lazy Space Allocation - Jiang, Richman et al. - 1999
3   Run-time Support for Threaded MPI Execution on Multiprogramm.. - Tang, Shen et al. - 1999
2   NCSA note on SGI Origin 2000 IRIX - on, IRIX et al. - 2000

Documents on the same site (http://www.cs.ucsb.edu/~tyang/papers/):
The WWW Prototype of the Alexandria Digital Library - Andresen, Carver, Dolin.. (1995)
A Comparison of 1-D and 2-D Data Mapping for Sparse LU.. - Fu, Jiao, Yang
Performance Bounds for Column-Block Partitioning of Parallel .. - Gerasoulis, Yang
