• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 14,033
Next 10 →

Route Packets, Not Wires: On-Chip Interconnection Networks

by William J. Dally, Brian Towles , 2001
"... Using on-chip interconnection networks in place of ad-hoc global wiring structures the top level wires on a chip and facilitates modular design. With this approach, system modules (processors, memories, peripherals, etc...) communicate by sending packets to one another over the network. The structur ..."
Abstract - Cited by 885 (10 self) - Add to MetaCart
. The structured network wiring gives well-controlled electrical parameters that eliminate timing iterations and enable the use of high-performance circuits to reduce latency and increase bandwidth. The area overhead required to implement an on-chip network is modest, we estimate 6.6%. This paper introduces

Parallel database systems: the future of high performance database systems

by David J. Dewitt, Jim Gray - COMMUNICATIONS OF THE ACM , 1992
"... Parallel database machine architectures have evolved from the use of exotic hardware to a software parallel dataflow architecture based on conventional shared-nothing hardware. These new designs provide impressive speedup and scaleup when processing relational database queries. This paper reviews t ..."
Abstract - Cited by 641 (13 self) - Add to MetaCart
Parallel database machine architectures have evolved from the use of exotic hardware to a software parallel dataflow architecture based on conventional shared-nothing hardware. These new designs provide impressive speedup and scaleup when processing relational database queries. This paper reviews

A high-performance, portable implementation of the MPI message passing interface standard

by Ewing Lusk, Nathan Doss, Anthony Skjellum - Parallel Computing , 1996
"... MPI (Message Passing Interface) is a specification for a standard library for message passing that was defined by the MPI Forum, a broadly based group of parallel computer vendors, library writers, and applications specialists. Multiple implementations of MPI have been developed. In this paper, we d ..."
Abstract - Cited by 890 (65 self) - Add to MetaCart
describe MPICH, unique among existing implementations in its design goal of combining portability with high performance. We document its portability and performance and describe the architecture by which these features are simultaneously achieved. We also discuss the set of tools that accompany the free

The SGI Origin: A ccNUMA highly scalable server

by James Laudon, Daniel Lenoski - In Proceedings of the 24th International Symposium on Computer Architecture (ISCA’97 , 1997
"... The SGI Origin 2000 is a cache-coherent non-uniform memory access (ccNUMA) multiprocessor designed and manufactured by Silicon Graphics, Inc. The Origin system was designed from the ground up as a multiprocessor capable of scaling to both small and large processor counts without any bandwidth, laten ..."
Abstract - Cited by 497 (0 self) - Add to MetaCart
, latency, or cost cliffs.The Origin system consists of up to 512 nodes interconnected by a scalable Craylink network. Each node consists of one or two RlOOOO processors, up to 4 GB of coherent memory, and a connection to a portion of the X I0 10 subsystem. This paper discusses the motivation for building

Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors

by John M. Mellor-crummey, Michael L. Scott - ACM Transactions on Computer Systems , 1991
"... Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become marke ..."
Abstract - Cited by 573 (32 self) - Add to MetaCart
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become

Vogels, U-Net: a user-level network interface for parallel and distributed computing, in:

by Anindya Basu , Vineet Buch , Werner Vogels , Thorsten Von Eicken - Proceedings of the 15th ACM Symposium on Operating System Principles, ACM, , 1995
"... Abstract The U-Net communication architecture provides processes with a virtual view of a network device to enable user-level access to high-speed communication devices. The architecture, implemented on standard workstations using off-the-shelf ATM communication hardware, removes the kernel from th ..."
Abstract - Cited by 597 (17 self) - Add to MetaCart
Abstract The U-Net communication architecture provides processes with a virtual view of a network device to enable user-level access to high-speed communication devices. The architecture, implemented on standard workstations using off-the-shelf ATM communication hardware, removes the kernel from

Active Messages: a Mechanism for Integrated Communication and Computation

by Thorsten Von Eicken, David E. Culler, Seth Copen Goldstein, Klaus Erik Schauser , 1992
"... The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high com ..."
Abstract - Cited by 1054 (75 self) - Add to MetaCart
The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high

Multiscalar Processors

by Gurindar S. Sohi, Scott E. Breach, T. N. Vijaykumar - In Proceedings of the 22nd Annual International Symposium on Computer Architecture , 1995
"... Multiscalar processors use a new, aggressive implementation paradigm for extracting large quantities of instruction level parallelism from ordinary high level language programs. A single program is divided into a collection of tasks by a combination of software and hardware. The tasks are distribute ..."
Abstract - Cited by 589 (30 self) - Add to MetaCart
Multiscalar processors use a new, aggressive implementation paradigm for extracting large quantities of instruction level parallelism from ordinary high level language programs. A single program is divided into a collection of tasks by a combination of software and hardware. The tasks

The BSD Packet Filter: A New Architecture for User-level Packet Capture

by Steven Mccanne, Van Jacobson , 1992
"... Many versions of Unix provide facilities for user-level packet capture, making possible the use of general purpose workstations for network monitoring. Because network monitors run as user-level processes, packets must be copied across the kernel/user-space protection boundary. This copying can be m ..."
Abstract - Cited by 568 (2 self) - Add to MetaCart
filter evaluator that is up to 20 times faster than the original design. BPF also uses a straightforward buffering strategy that makes its overall performance up to 100 times faster than Sun's NIT running on the same hardware. 1 Introduction Unix has become synonymous with high quality networking

The Google File System

by Sanjay Ghemawat, Howard Gobioff, Shun-tak Leung - ACM SIGOPS OPERATING SYSTEMS REVIEW , 2003
"... We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. While s ..."
Abstract - Cited by 1501 (3 self) - Add to MetaCart
We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. While
Next 10 →
Results 1 - 10 of 14,033
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University