Results 1 - 3 of 3
Analyzing the Impact of Useless Write-Backs on the Endurance and Energy Consumption of PCM Main Memory
Abstract—Phase Change Memory (PCM) is an emerging technology that has recently been considered as a cost-effective and energy-efficient alternative to traditional DRAM main memory. Because PCM writes consume a lot of energy and the number of write cycles is limited, reducing the number of writes to PCM can result in considerable energy savings and endurance improvement. In this paper, we introduce the concept of useless write-backs, which occur when a dirty cache line that belongs to a dead memory region is evicted from the cache (a dead region is a memory location that is not used again by a program). Since the evicted data is not used again, the write-back can be safely avoided to improve endurance and energy consumption. This paper presents a limit study on the improvement that passing information about useless write-backs to the memory system has on the endurance and energy consumption of systems based on PCM main memory. We developed algorithms to measure the number of useless write-backs to PCM for three different types of memory regions, and we present an energy model to determine the maximum energy savings that could potentially be achieved through such a scheme. Our results show that avoiding useless write-backs can save up to 19.8% of energy and improve endurance by up to 26.2%.
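The idea of a useless write-back can be illustrated with a minimal sketch. This is not the paper's measurement algorithm: the trace format, the event names (`dead`, `live`, `evict_dirty`), the function `count_write_backs`, and the 64-byte line size are all assumptions made for illustration. The sketch counts a dirty eviction as useless only when its cache line lies entirely inside a region currently marked dead.

```python
from typing import Iterable, Set, Tuple

LINE_SIZE = 64  # assumed cache-line size in bytes (hypothetical)

# Hypothetical trace event: (kind, start, end)
#   ("dead", start, end)      bytes in [start, end) will not be used again
#   ("live", start, end)      bytes in [start, end) are in use again
#   ("evict_dirty", addr, _)  the dirty line containing addr is evicted
Event = Tuple[str, int, int]


def count_write_backs(events: Iterable[Event]) -> Tuple[int, int]:
    """Return (total_write_backs, useless_write_backs) for a trace."""
    dead_lines: Set[int] = set()  # indices of lines that are fully dead
    total = 0
    useless = 0
    for kind, start, end in events:
        if kind == "dead":
            # Only lines fully covered by the region count as dead.
            first = -(-start // LINE_SIZE)  # ceiling division
            last = end // LINE_SIZE
            dead_lines.update(range(first, last))
        elif kind == "live":
            # Any line that overlaps the region becomes live again.
            first = start // LINE_SIZE
            last = -(-end // LINE_SIZE)
            dead_lines.difference_update(range(first, last))
        elif kind == "evict_dirty":
            total += 1
            if start // LINE_SIZE in dead_lines:
                useless += 1  # evicted data is never read again
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return total, useless
```

Under these assumptions, the ratio of the two counts gives an upper bound on the fraction of write-backs that could be skipped for a given trace; translating that into energy or endurance figures would need a model such as the one the abstract mentions.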
Staged Reads: Mitigating the Impact of DRAM Writes on DRAM Reads
Main memory latencies have always been a concern for system performance. Given that reads are on the critical path for CPU progress, reads must be prioritized over writes. However, writes must eventually be processed, and they often delay pending reads. In fact, a single channel in the main memory system offers almost no parallelism between reads and writes. This is because a single off-chip memory bus is shared by reads and writes, and the direction of the bus has to be explicitly turned around when switching from writes to reads. This is an expensive operation, and its cost is amortized by carrying out a burst of writes or reads every time the bus direction is switched. As a result, no reads can be processed while a memory channel is busy servicing writes. This paper proposes a
System Impact of 3D Processor-Memory Interconnect: A Limit Study
Abstract—3D integration with through-silicon-vias (TSVs) can provide enormous bandwidth between processor die and memory die. The central goal of our work is to explore the limits of performance improvement that can be achieved with such integration. Towards this end, we propose a model of the impact of 3D TSVs on system performance. The model leads to several key observations: i) increased miss tolerance (smaller caches) and hence improved core scaling for a fixed die size, ii) higher sustained IPC per core, iii) significantly smaller, energy-efficient DRAM banks, iv) redistribution of system power to the cores and on-die interconnect, and v) TSV utilization is a function of the relationship between reference locality and the bandwidth properties of the intra-die network. These observations are repeated in cycle-level simulations of a 64-tile architecture.

