Results 1 - 10 of 23,516

for Shared Memory Machines

by S. M. Goodnick, Carl R. Huster, 1992
"... Abstract approved: ..."

Unifying Data and Control Transformations for Distributed Shared-Memory Machines

by Michal Cierniak, Wei Li , 1994
"... We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Control transformations involve changing the execution order of programs. We have developed new techniques for compiler optimiz ..."
Abstract - Cited by 176 (10 self) - Add to MetaCart
optimizations for distributed shared-memory machines, although the same techniques can be used for sequential machines with a memory hierarchy. Our compiler optimizations are based on an algebraic representation of data mappings and a new data locality model. We present a pure data transformation algorithm
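
As a rough illustration of the two kinds of transformation the abstract contrasts (not code from the paper), the sketch below walks a 2-D array column by column: a data transformation stores the array transposed so the walk becomes contiguous, while a control transformation (loop interchange) reorders the iterations to the same effect.

/* Illustrative only, not the paper's algorithm: a data transformation
   (storing the array transposed so the inner loop walks contiguous memory)
   and a control transformation (loop interchange) on the same access pattern. */
#include <stdio.h>

#define N 512

static double a[N][N];          /* original layout: a[i][j] */
static double at[N][N];         /* transformed layout: at[j][i] holds a[i][j] */

int main(void)
{
    double sum = 0.0;

    /* Original code: a column-wise walk of a[][] makes stride-N accesses. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];

    /* After the data transformation, the same column walk is contiguous. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += at[j][i];

    /* A control transformation (loop interchange) gets the same effect on
       the original layout by reordering the iterations instead. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];

    printf("%f\n", sum);
    return 0;
}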

Parallel sequence mining on shared-memory machines

by Mohammed J. Zaki - Journal of Parallel and Distributed Computing , 2001
"... We present pSPADE, a parallel algorithm for fast discovery of frequent sequences in large databases. pSPADE decomposes the original search space into smaller suffix-based classes. Each class can be solved in main-memory using efficient search techniques, and simple join operations. Further each clas ..."
Abstract - Cited by 24 (0 self) - Add to MetaCart
class can be solved independently on each processor requiring no synchronization. However, dynamic inter-class and intraclass load balancing must be exploited to ensure that each processor gets an equal amount of work. Experiments on a 12 processor SGI Origin 2000 shared memory system show good speedup
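
A minimal sketch of the parallelization idea described above, assuming the suffix-based classes are independent units of work; the class contents and the mine_class routine are placeholders, and OpenMP's dynamic schedule stands in for the paper's inter-class load balancing.

/* Hypothetical sketch: classes are mined in parallel with no synchronization
   inside the mining step; schedule(dynamic) balances uneven class sizes. */
#include <stdio.h>
#include <omp.h>

#define NUM_CLASSES 64

/* Placeholder for mining one suffix-based equivalence class in main memory. */
static long mine_class(int class_id)
{
    long frequent = 0;
    for (int i = 0; i < (class_id % 7 + 1) * 100000; i++)   /* uneven work */
        frequent += (i % 3 == 0);
    return frequent;
}

int main(void)
{
    long total = 0;

    /* Classes are independent, so no locks are needed in the loop body;
       only the final count is combined via a reduction. */
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (int c = 0; c < NUM_CLASSES; c++)
        total += mine_class(c);

    printf("frequent sequences found: %ld\n", total);
    return 0;
}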

MPI Performance Comparison on Distributed and Shared Memory Machines

by Tom Loos, Randall Bramley, 1996
"... The widely implemented MPI Standard [10] defines primitives for point-to-point interprocessor communication (IPC), collective IPC, and synchronization based on message passing. The main reason to use a message passing standard is to ease the development, porting, and execution of applications on ..."
Abstract - Add to MetaCart
on the variety of parallel computers that can support the paradigm, including shared memory, distributed memory, and shared memory array multiprocessors. This paper compares the SGI Power Challenge, a shared memory multiprocessor, with the Intel Paragon, a distributed memory machine. This paper addresses two
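
The kind of point-to-point measurement such a comparison rests on can be sketched as a simple MPI ping-pong between two ranks; this is a generic illustration (message size, repetition count, and output are arbitrary), not the benchmark used in the paper.

/* Generic MPI ping-pong sketch; run with at least two ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps = 1000, bytes = 1 << 16;
    char *buf = malloc(bytes);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("average round trip for %d bytes: %g us\n", bytes, 1e6 * (t1 - t0) / reps);

    free(buf);
    MPI_Finalize();
    return 0;
}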

Implementation Tradeoffs in Distributed Shared Memory Machines

by Radhika Thekkath, Amit Pal Singh, Jaswinder Pal Singh, Susan John, John Hennessy
"... The construction of a cache-coherent distributed shared memory (DSM) machine involves many organizational and implementation trade-offs. This paper studies the performance implications of these trade-offs as made on some real DSM machines. We focus on characteristics related to communication and exa ..."

Parallel Sequence Mining on Shared-Memory Machines

by Mohammed J. Zaki - Journal of Parallel and Distributed Computing, 2000
"... We present pSPADE, a parallel algorithm for fast discovery of frequent sequences in large databases. pSPADE decomposes the original search space into smaller suffix-based classes. Each class can be solved in main-memory using efficient search techniques, and simple join operations. Further each clas ..."
Abstract - Add to MetaCart
class can be solved independently on each processor requiring no synchronization. However, dynamic inter-class and intraclass load balancing must be exploited to ensure that each processor gets an equal amount of work. Experiments on a 12 processor SGI Origin 2000 shared memory system show good speedup

Comparison of MPI implementations on a shared memory machine

by Brian Vanvoorst, Steven Seidel - In Proceedings of the 15th IPDPS 2000 Workshops on Parallel and Distributed Processing, 2000
"... Abstract. There are several alternative MPI implementations available to parallel application developers. LAM MPI and MPICH are the most common. System vendors also provide their own implementations of MPI. Each version of MPI has options that can be tuned to best t the characteristics of the applic ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Abstract. There are several alternative MPI implementations available to parallel application developers. LAM MPI and MPICH are the most common. System vendors also provide their own implementations of MPI. Each version of MPI has options that can be tuned to best t the characteristics of the application and platform. The parallel application developer needs to know which implementation and options are best suited to the problem and platform at hand. In this study the RTCOMM1 communication benchmark from the Real Time Parallel Benchmark Suite is used to collect performance data on several MPI implementations for a Sun Enterprise 4500. This benchmark provides the data needed to create a re ned cost model for each MPI implementation and to produce visualizations of those models. In addition, this benchmark provides best, worst, and typical message passing performance data which is of particular interest to real-time parallel programmers. 1

ALGORITHMS ON PC CLUSTERS AND SHARED MEMORY MACHINES

by Albert Chan, Frank Dehne, Ryan Taylor
"... In this paper, we present CGMgraph, the first integrated library of parallel graph methods for PC clusters based on Coarse Grained Multicomputer (CGM) algorithms. CGMgraph implements parallel methods for various graph problems. Our implementations of deterministic list ranking, Euler tour, connected ..."
Abstract - Add to MetaCart
In this paper, we present CGMgraph, the first integrated library of parallel graph methods for PC clusters based on Coarse Grained Multicomputer (CGM) algorithms. CGMgraph implements parallel methods for various graph problems. Our implementations of deterministic list ranking, Euler tour, connected components, spanning forest, and bipartite graph detection are, to our knowledge, the first efficient implementations for PC clusters. Our library also includes CGMlib, a library of basic CGM tools such as sorting, prefix sum, one-to-all broadcast, all-to-one gather, h-Relation, all-to-all broadcast, array balancing, and CGM partitioning. Both libraries are available for download at
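
One of the basic tools listed, prefix sum, can be sketched in coarse-grained style: each processor scans its local block, then shifts it by an exclusive scan of the block totals. This is a generic MPI illustration, not CGMlib's actual interface.

/* Coarse-grained parallel prefix sum sketch (not CGMlib code). */
#include <mpi.h>
#include <stdio.h>

#define LOCAL_N 4

int main(int argc, char **argv)
{
    int rank, data[LOCAL_N];
    long local_total = 0, offset = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < LOCAL_N; i++)
        data[i] = rank * LOCAL_N + i + 1;   /* example input: 1, 2, 3, ... */

    /* Local prefix sum over this processor's block. */
    for (int i = 0; i < LOCAL_N; i++) {
        local_total += data[i];
        data[i] = (int)local_total;
    }

    /* Exclusive scan of the block totals gives each block's global offset. */
    MPI_Exscan(&local_total, &offset, 1, MPI_LONG, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0)
        offset = 0;                          /* MPI_Exscan leaves rank 0 undefined */

    for (int i = 0; i < LOCAL_N; i++)
        data[i] += (int)offset;

    printf("rank %d prefix sums: %d %d %d %d\n",
           rank, data[0], data[1], data[2], data[3]);
    MPI_Finalize();
    return 0;
}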

Pipelined Iterative Methods for Shared Memory Machines

by John P. Bonomo, Wayne R. Dyksen, 1987
"... In this paper we describe a new parallel iterative technique to solve a set of linear equations. The technique can be applied to any serial iterative scheme and involves pipelining sllccessive iterations. We give an example of this technique by modifying the classical successive Qver-relaxation meth ..."
Abstract - Add to MetaCart
In this paper we describe a new parallel iterative technique to solve a set of linear equations. The technique can be applied to any serial iterative scheme and involves pipelining sllccessive iterations. We give an example of this technique by modifying the classical successive Qver-relaxation method (SOR). The algorithm is implemented on a Sequent Balance 21000 and the experimental results are presented.
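
For reference, the classical SOR sweep that the paper modifies can be sketched as below for a 1-D Poisson-type system; the pipelined scheme then lets sweep k+1 start on early unknowns while sweep k is still finishing later ones. This is an illustrative serial baseline, not the paper's parallel code.

/* Classical SOR for the tridiagonal system -x[i-1] + 2 x[i] - x[i+1] = b[i]. */
#include <stdio.h>
#include <math.h>

#define N 64

int main(void)
{
    double x[N + 2] = {0.0}, b[N + 2];
    const double omega = 1.5;            /* relaxation factor, 1 < omega < 2 */

    for (int i = 1; i <= N; i++)
        b[i] = 1.0;

    for (int sweep = 0; sweep < 500; sweep++) {
        double change = 0.0;
        for (int i = 1; i <= N; i++) {   /* Gauss-Seidel order: uses new x[i-1] */
            double gs = 0.5 * (x[i - 1] + x[i + 1] + b[i]);
            double new_xi = (1.0 - omega) * x[i] + omega * gs;
            change = fmax(change, fabs(new_xi - x[i]));
            x[i] = new_xi;
        }
        if (change < 1e-10)              /* stop once the sweep barely changes x */
            break;
    }

    printf("x[N/2] = %f\n", x[N / 2]);
    return 0;
}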

Program transformation and runtime support for threaded MPI execution on shared-memory machines

by Hong Tang, Kai Shen, Tao Yang - ACM Transactions on Programming Languages and Systems , 2000
"... Parallel programs written in MPI have been widely used for developing high-performance applications on various platforms. Because of a restriction of the MPI computation model, conventional MPI implementations on shared memory machines map each MPI node to an OS process, which can suffer serious per ..."
Abstract - Cited by 19 (1 self) - Add to MetaCart
Parallel programs written in MPI have been widely used for developing high-performance applications on various platforms. Because of a restriction of the MPI computation model, conventional MPI implementations on shared memory machines map each MPI node to an OS process, which can suffer serious
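
The thread-based alternative the abstract alludes to can be sketched with each logical MPI node running as a thread of one process, so a send becomes an enqueue into the receiver's in-memory mailbox; the mailbox_t and node_main names here are illustrative, not the paper's runtime API.

/* Hypothetical sketch of threaded MPI execution on a shared memory machine:
   two "MPI nodes" are threads, and message passing stays inside the process. */
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             value;
    int             full;
} mailbox_t;

static mailbox_t box = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 };

static void mb_send(mailbox_t *m, int v)   /* stand-in for a send to a peer node */
{
    pthread_mutex_lock(&m->lock);
    m->value = v;
    m->full = 1;
    pthread_cond_signal(&m->nonempty);
    pthread_mutex_unlock(&m->lock);
}

static int mb_recv(mailbox_t *m)           /* stand-in for a blocking receive */
{
    pthread_mutex_lock(&m->lock);
    while (!m->full)
        pthread_cond_wait(&m->nonempty, &m->lock);
    int v = m->value;
    m->full = 0;
    pthread_mutex_unlock(&m->lock);
    return v;
}

static void *node_main(void *arg)          /* one thread per logical MPI node */
{
    long rank = (long)arg;
    if (rank == 0)
        mb_send(&box, 42);                  /* node 0 sends to node 1 */
    else
        printf("node 1 received %d\n", mb_recv(&box));
    return NULL;
}

int main(void)
{
    pthread_t nodes[2];
    for (long r = 0; r < 2; r++)
        pthread_create(&nodes[r], NULL, node_main, (void *)r);
    for (int r = 0; r < 2; r++)
        pthread_join(nodes[r], NULL);
    return 0;
}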