Results 11 - 20 of 1,686

The Stanford FLASH multiprocessor

by Jeffrey Kuskin, David Ofelt, Mark Heinrich, John Heinlein, Richard Simoni, Kourosh Gharachorloo, John Chapin, David Nakahira, Joel Baxter, Mark Horowitz, Anoop Gupta, Mendel Rosenblum, John Hennessy - In Proceedings of the 21st International Symposium on Computer Architecture, 1994
"... The FLASH multiprocessor efficiently integrates support for cache-coherent shared memory and high-performance message passing, while minimizing both hardware and software overhead. Each node in FLASH contains a microprocessor, a portion of the machine’s global memory, a port to the interconnection n ..."
Abstract - Cited by 349 (20 self)

Active Messages: a Mechanism for Integrated Communication and Computation

by Thorsten Von Eicken, David E. Culler, Seth Copen Goldstein, Klaus Erik Schauser, 1992
"... The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high com ..."
Abstract - Cited by 1054 (75 self)
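
As a rough illustration of the mechanism this abstract describes, the toy C program below carries a handler pointer with each message and runs that handler as soon as the message arrives, integrating the data into the computation instead of buffering it. The am_send() name and the loopback delivery are invented for the sketch and are not the paper's actual interface.

#include <stdio.h>

/* Each message names a user-level handler that the receiver runs on arrival. */
typedef void (*am_handler_t)(int src_node, void *arg);

/* Toy "network": a real implementation would inject a packet whose header
 * holds the handler address; here we simply call the handler directly. */
static void am_send(int src_node, int dest_node, am_handler_t handler, void *arg)
{
    (void)dest_node;
    handler(src_node, arg);
}

static double global_sum = 0.0;

/* Handler: fold a received partial sum straight into local state, no buffering. */
static void add_partial_sum(int src_node, void *arg)
{
    global_sum += *(double *)arg;
    printf("applied partial sum from node %d\n", src_node);
}

int main(void)
{
    double partial = 3.5;
    am_send(/*src=*/1, /*dest=*/0, add_partial_sum, &partial);
    printf("global_sum = %f\n", global_sum);
    return 0;
}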

PVM: A Framework for Parallel Distributed Computing

by V. S. Sunderam - Concurrency: Practice and Experience, 1990
"... The PVM system is a programming environment for the development and execution of large concurrent or parallel applications that consist of many interacting, but relatively independent, components. It is intended to operate on a collection of heterogeneous computing elements interconnected by one or ..."
Abstract - Cited by 788 (27 self) - Add to MetaCart
or more networks. The participating processors may be scalar machines, multiprocessors, or special-purpose computers, enabling application components to execute on the architecture most appropriate to the algorithm. PVM provides a straightforward and general interface that permits the description
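
The message-passing style this abstract describes can be sketched as below, written against the later PVM 3 C interface (the 1990 paper predates that exact API, so treat the calls as representative rather than the paper's own); the "worker" executable name is illustrative.

#include <stdio.h>
#include <pvm3.h>

int main(void)
{
    int mytid = pvm_mytid();            /* enroll this process in the virtual machine */
    int worker_tid;
    int work = 42, result;

    /* Start one instance of a (hypothetical) "worker" program anywhere in the
     * heterogeneous virtual machine. */
    if (pvm_spawn("worker", NULL, PvmTaskDefault, "", 1, &worker_tid) != 1) {
        fprintf(stderr, "spawn failed\n");
        pvm_exit();
        return 1;
    }

    /* Pack and send a typed message; PVM converts data representations
     * between heterogeneous hosts. */
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&work, 1, 1);
    pvm_send(worker_tid, /*msgtag=*/1);

    /* Wait for the worker's reply. */
    pvm_recv(worker_tid, /*msgtag=*/2);
    pvm_upkint(&result, 1, 1);
    printf("task %x got %d back from worker %x\n", mytid, result, worker_tid);

    pvm_exit();
    return 0;
}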

Software Transactional Memory

by Nir Shavit, Dan Touitou, 1995
"... As we learn from the literature, flexibility in choosing synchronization operations greatly simplifies the task of designing highly concurrent programs. Unfortunately, existing hardware is inflexible and is at best on the level of a Load Linked/Store Conditional operation on a single word. Building ..."
Abstract - Cited by 695 (14 self) - Add to MetaCart
on the hardware based transactional synchronization methodology of Herlihy and Moss, we offer software transactional memory (STM), a novel software method for supporting flexible transactional programming of synchronization operations. STM is non-blocking, and can be implemented on existing machines using only a
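
A usage-level sketch of the multi-word transactional interface the paper argues for: atomically read and write several independent words, which single-word Load Linked/Store Conditional cannot do directly. The stm_* names are hypothetical, and the coarse lock below only stands in for a real STM so the sketch runs; Shavit and Touitou's construction is non-blocking, built on per-location ownership acquisition and helping rather than a lock.

#include <pthread.h>
#include <stdio.h>

/* Placeholder only: a real STM replaces this lock with a non-blocking protocol. */
static pthread_mutex_t stm_placeholder = PTHREAD_MUTEX_INITIALIZER;

static void stm_begin(void)  { pthread_mutex_lock(&stm_placeholder); }
static void stm_commit(void) { pthread_mutex_unlock(&stm_placeholder); }

static long account_a = 100, account_b = 0;

/* A two-word transaction: debit one account and credit another atomically. */
void transfer(long amount)
{
    stm_begin();
    account_a -= amount;
    account_b += amount;
    stm_commit();
}

int main(void)
{
    transfer(30);
    printf("a=%ld b=%ld\n", account_a, account_b);
    return 0;
}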

A Parallel Programming Model for a Multi-FPGA Multiprocessor Machine

by Manuel Alejandro Saldaña De Fuentes
"... Recent research has shown that FPGAs can execute certain applications significantly faster than state-of-the-art processors. The penalty is the loss of generality, but the reconfigurability of FPGAs allows them to be reprogrammed for other applications. Therefore, an efficient programming model and ..."
... processors. In addition, a Network-on-Chip is designed to enable intra-FPGA and inter-FPGA communications. Together, TMD-MPI, TMD-MPE and the network provide a flexible design flow for Multiprocessor System-on-Chip design.
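
The programming model described here is MPI-style message passing; the snippet below is plain MPI-1 C (rank 0 sends an integer to rank 1). The excerpt does not say which calls fall inside TMD-MPI's implemented subset, so treat this as generic MPI rather than TMD-MPI itself.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 123;
        MPI_Send(&value, 1, MPI_INT, 1, /*tag=*/0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, /*tag=*/0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}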

The SPLASH-2 programs: Characterization and methodological considerations

by Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta - International Symposium on Computer Architecture, 1995
"... The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental propertie ..."
Abstract - Cited by 1420 (12 self)

Literature Study NUMA-Aware Thread Scheduling On Hierarchical Multiprocessor Machines

by unknown authors
"... As the use of computer for massive computation keeps increasing, the applications always request more and more performances. To hold that pace, computer architects have proposed always ..."

Maximizing Multiprocessor Performance with the SUIF Compiler

by Mary Hall, Jennifer M. Anderson, Saman P. Amarasinghe, Brian R. Murphy, Shih-wei Liao, Edouard Bugnion, Monica S. Lam, 1996
"... This paper presents an overview of the SUIF compiler, which automatically parallelizes and optimizes sequential programs for shared-memory multiprocessors. We describe new technology in this system for locating coarse-grain parallelism and for optimizing multiprocessor memory behavior essential to ..."
Abstract - Cited by 280 (22 self)
-memory multiprocessors can potentially deliver supercomputer-like performance to the general public. Today, these machines are mainly used in a multiprogramming mode, increasing system throughput by running several independent applications in parallel. The multiple processors can also be used together to accelerate
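
As an illustration of the kind of loop-level transformation an automatically parallelizing compiler such as SUIF performs, consider the sketch below. The OpenMP pragma is only a hand-written stand-in for what the compiler does internally (SUIF targets its own parallel runtime, not OpenMP); the point is that independent iterations of a sequential loop are distributed across processors.

#include <stdio.h>

#define N 1000000

static double a[N], b[N], c[N];

int main(void)
{
    /* Sequential source loop: every iteration is independent, so a
     * parallelizing compiler can prove it parallel and split the
     * iteration space across processors. */
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        c[i] = a[i] + 2.0 * b[i];

    printf("c[0] = %f\n", c[0]);
    return 0;
}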

Memory access buffering in multiprocessors

by Michel Dubois, Christoph Scheurich, Faye Briggs - In Proceedings of the 13th Annual International Symposium on Computer Architecture, 1986
"... In highly-pipelined machines, instructions and data are prefetched and buffered in both the processor and the cache. This is done to reduce the average memory access la-tency and to take advantage of memory interleaving. Lock-up free caches are designed to avoid processor blocking on a cache miss. W ..."
Abstract - Cited by 254 (4 self) - Add to MetaCart
. Write buffers are often included in a pipelined machine to avoid processor waiting on writes. In a shared memory multiprocessor, there are more advantages in buffering memory requests, since each memory access has to traverse the memory- processor interconnection and has to compete with memory requests
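
The ordering hazard created by buffering memory requests can be shown with the classic "store buffering" litmus test, written here in C11 atomics with pthreads as a standard modern illustration rather than the paper's own formalism: if each processor's store may sit in a buffer while its later load goes ahead, both threads can end with r1 == 0 and r2 == 0, an outcome impossible on a machine that performs accesses in program order.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int x, y;
static int r1, r2;

static void *t0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);  /* may sit in a write buffer */
    r1 = atomic_load_explicit(&y, memory_order_relaxed); /* may bypass the buffer     */
    return NULL;
}

static void *t1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, t0, NULL);
    pthread_create(&b, NULL, t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    /* r1 == 0 && r2 == 0 is permitted here; placing
     * atomic_thread_fence(memory_order_seq_cst) between the store and the
     * load in each thread rules it out. */
    printf("r1=%d r2=%d\n", r1, r2);
    return 0;
}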

Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors

by Todd Mowry, Anoop Gupta - Journal of Parallel and Distributed Computing, 1991
"... The large latency of memory accesses is a major obstacle in obtaining high processor utilization in large scale shared-memory multiprocessors. Although the provision of coherent caches in many recent machines has alleviated the problem somewhat, cache misses still occur frequently enough that they s ..."
Abstract - Cited by 302 (18 self)
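
A hand-written illustration of the software-controlled prefetching idea: issue a non-blocking prefetch for data needed a few iterations ahead, so the miss latency overlaps with useful work. In the paper the compiler inserts these for references it predicts will miss; here they are placed by hand with the GCC/Clang __builtin_prefetch intrinsic, and the prefetch distance of 16 is an arbitrary example value.

#include <stdio.h>

#define N 100000
#define PREFETCH_DISTANCE 16   /* tune to cover the expected miss latency */

static double a[N], b[N];

double dot(void)
{
    double sum = 0.0;
    for (long i = 0; i < N; i++) {
        if (i + PREFETCH_DISTANCE < N) {
            /* Read prefetches (rw=0) with moderate temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 1);
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 1);
        }
        sum += a[i] * b[i];
    }
    return sum;
}

int main(void)
{
    printf("%f\n", dot());
    return 0;
}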