CiteSeerX

Results 1 - 10 of 387

Memory access buffering in multiprocessors

by Michel Dubois, Christoph Scheurich, Faye Briggs - In Proceedings of the 13th Annual International Symposium on Computer Architecture, 1986
"... In highly-pipelined machines, instructions and data are prefetched and buffered in both the processor and the cache. This is done to reduce the average memory access latency and to take advantage of memory interleaving. Lock-up free caches are designed to avoid processor blocking on a cache miss. W ..."
Cited by 254 (4 self)

Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines

by Norman P. Jouppi, David W. Wall, 1989
"... Superscalar machines can issue several instructions per cycle. Superpipelined machines can issue only one instruction per cycle, but they have cycle times shorter than the latency of any functional unit. In this paper these two techniques are shown to be roughly equivalent ways of exploiting instruc ..."
Cited by 205 (12 self)
"... of superpipelining metric is introduced. Our simulations suggest that this metric is already high for many machines. These machines already exploit all of the instruction-level parallelism available in many non-numeric applications, even without parallel instruction issue or higher degrees of pipelining."

Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud

by Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein, 2012
"... While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill ..."
Cited by 141 (2 self)

Monsoon: an explicit token-store architecture

by Gregory M. Papadopoulos - In Proc. of the 17th Annual Int. Symp. on Comp. Arch., 1990
"... Dataflow architectures tolerate long unpredictable communication delays and support generation and coordination of parallel activities directly in hardware, rather than assuming that program mapping will cause these issues to disappear. However, the proposed mechanisms are complex and introduce n ..."
Cited by 194 (13 self)
"... with storage local to a processor. Low-level storage management is performed by the compiler in assigning nodes to slots in an activation frame, rather than dynamically in hardware. The processor is simple, highly pipelined, and quite general. It may be viewed as a generalization of a fairly primitive von ..."

Tuning Paxos for high-throughput with batching and pipelining

by Nuno Santos, André Schiper, 2011
"... Paxos is probably the most popular state machine replication protocol. Two optimizations that can greatly improve its performance are batching and pipelining. Nevertheless, tuning these two optimizations to achieve high-throughput can be challenging, as their effectiveness depends on many parameters ..."
Cited by 9 (2 self)

Multiple-banked register file architectures

by José-Lorenzo Cruz, Antonio González, Mateo Valero, Nigel P. Topham - In International Symposium on Computer Architecture (ISCA-27), 2000
"... The register file access time is one of the critical delays in current superscalar processors. Its impact on processor performance is likely to increase in future processor generations, as they are expected to increase the issue width (which implies more register ports) and the size of the ..."
Cited by 146 (12 self)
"... of the instruction window (which implies more registers), and to use some kind of multithreading. Under this scenario, the register file access time could be a dominant delay and a pipelined implementation would be desirable to allow for high clock rates. However, a multi-stage register file has severe implications ..."

Experience with a Software-Defined Machine Architecture

by David W. Wall - WRL Research Report 91/10, 1991
"... We built a system in which the compiler back end and the linker work together to present an abstract machine at a considerably higher level than the actual machine. The intermediate language translated by the back end is the target language of all high-level compilers and is also the only assembl ..."
Cited by 57 (7 self)

QUICK PIPING: A Fast, High-Level Model for Describing Processor Pipelines

by unknown authors
"... Responding to marketplace needs, today’s embedded processors must feature a flexible core that allows easy modification with fast time to market. In this environment, embedded processors are increasingly reliant on flexible support tools. This paper presents one such tool, called Quick Piping, a new ..."
new, high-level formalism for modeling processor pipelines. Quick Piping consists of three primary components that together provide an easy-to-build, reusable processor description: Pipeline graphs—a new high-level formalism for modeling processor pipelines, pipe—a companion domain-specific language

High-throughput Execution of Hierarchical Analysis Pipelines on Hybrid Cluster Platforms

by George Teodoro, Tony Pan, Tahsin M. Kurc, Jun Kong, Lee A. D. Cooper, Joel H. Saltz
"... We propose, implement, and experimentally evaluate a runtime middleware to support high-throughput execution on hybrid cluster machines of large-scale analysis applications. A hybrid cluster machine consists of computation nodes which have multiple CPUs and general purpose graphics process ..."
Cited by 1 (1 self)

TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline

by Jeffrey C. Glaubitz, Terry M. Casstevens, Fei Lu, James Harriman, Robert J. Elshire, Qi Sun, Edward S. Buckler - PLoS One, 2014
"... Genotyping by sequencing (GBS) is a next generation sequencing based method that takes advantage of reduced representation to enable high throughput genotyping of large numbers of individuals at a large number of SNP markers. The relatively straightforward, robust, and cost-effective GBS protocol is ..."
Cited by 2 (0 self)