CiteSeerX

Results 11 - 20 of 596

A Study of Different Instantiations of the OpenMP Memory Model and Their Software Cache Implementations

by Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao, Vivek Sarkar, 2009
"... An important open problem for future many-core chip architectures is the development of shared-memory organizations and memory consistency models that are effective for small local memory sizes per core, scalable to a large number of cores, and still productive for software to use. Many multicore pr ..."
processors, such as the Cell Broadband Engine, Tilera, and Cyclops64, include the use of software-managed local memories that avoid the known power and scalability limitations of hardware-managed cache structures. OpenMP is a natural candidate as a programming model for multicore processors with software-managed

Instruction Cache Locking for Improving Embedded Systems Performance

by Kapil Anand
"... Cache memories in embedded systems play an important role in reducing the execution time of the applications. Various kinds of extensions have been added to cache hardware to enable software involvement in replacement decisions, thus improving the run-time over a purely hardware-managed cache. Nov ..."

Heap Data Allocation To Scratch-Pad Memory In Embedded Systems

by Angel Dominguez, 2007
"... This thesis presents the first-ever compile-time method for allocating a portion of a program’s dynamic data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees vs ..."
Cited by 43 (5 self)

Tempest and Typhoon: User-level Shared Memory

by Steven K. Reinhardt, James R. Larus, David A. Wood - In Proceedings of the 21st Annual International Symposium on Computer Architecture, 1994
"... Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today’s machines limit these programs to a single communication paradigm, either message-passing or shared-memory, which results in uneven perf ..."
Cited by 309 (27 self)
-programmable, user-level processor in the network interface. We demonstrate the utility of Tempest with two examples. First, the Stache protocol uses Tempest’s fine-grain access control mechanisms to manage part of a processor’s local memory as a large, fully-associative cache for remote data. We simulated Typhoon

Managing Multi-Configurable Hardware via Dynamic Working Set Analysis

by Ashutosh S. Dhodapkar, James E. Smith - In 29th Annual International Symposium on Computer Architecture, 2002
"... Microprocessors are designed to provide good average performance over a variety of workloads. This can lead to inefficiencies both in power and performance for individual programs and during individual phases within the same program. Microarchitectures with multi-configuration units (e.g. caches, pr ..."
Cited by 192 (3 self)

Virtualizing Local Stores

by Henry M. Cook, Professor K. Asanović
"... Software-managed local stores have proven to be more efficient than hardware-managed caches for some important applications, yet their use has been mostly confined to embedded systems that run a small set of applications in a limited runtime environment. Local stores are problematic in general-purpo ..."

Dynamic Allocation for Scratch-Pad Memory using Compile-Time Decisions

by Sumesh Udayakumaran, Angel Dominguez, Rajeev Barua - ACM Transactions on Embedded Computing Systems (TECS), 2006
"... In this research we propose a highly predictable, low overhead and yet dynamic, memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees v ..."
Cited by 45 (3 self)

Efficient HPC Data Motion via Scratchpad Memory

by Kayla O. Seager, Ananta Tiwari, Michael A. Laurenzano, Joshua Peraza, Pietro Cicotti, Laura Carrington
"... The energy required to move data accounts for a significant portion of the energy consumption of a modern supercomputer. To make systems of today more energy efficient and to bring exascale computing closer to the realm of possibilities, data motion must be made more energy efficient. Bec ..."
the possible benefits of using a software-managed scratchpad memory for HPC applications. Our goal is to observe how data movement (and the associated energy costs) changes when we utilize software-managed scratchpad memory (SPM) instead of the traditional hardware-managed caches. Using an approximate

Managing Wire Delay in Large Chip-Multiprocessor Caches

by Bradford M. Beckmann, David A. Wood - IEEE/ACM International Symposium on Microarchitecture, 2004
"... In response to increasing (relative) wire delay, architects have proposed various technologies to manage the impact of slow wires on large uniprocessor L2 caches. Block migration (e.g., D-NUCA and NuRapid) reduces average hit latency by migrating frequently used blocks towards the lower-latency bank ..."
Cited by 157 (4 self)

The HP AutoRAID hierarchical storage system

by John Wilkes, Richard Golding, Carl Staelin, Tim Sullivan - ACM Transactions on Computer Systems, 1995
"... Configuring redundant disk arrays is a black art. To configure an array properly, a system administrator must understand the details of both the array and the workload it will support. Incorrect understanding of either, or changes in the workload over time, can lead to poor performance. We present a ..."
Cited by 263 (15 self)
excellent storage cost for inactive data, at somewhat lower performance. The technology we describe in this paper, known as HP AutoRAID, automatically and transparently manages migration of data blocks between these two levels as access patterns change. The result is a fully redundant storage system

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University