Abstract:
This paper analyzes memory access scheduling and virtual channels as mechanisms to reduce the latency of main memory accesses by the CPU and peripherals in web servers. Despite the address filtering effects of the CPU’s cache hierarchy, there is significant locality and bank parallelism in the DRAM access stream of a web server, which includes traffic from the operating system, application, and peripherals. However, a sequential memory controller leaves much of this locality and parallelism unexploited, as serialization and bank conflicts affect the realizable latency. Aggressive scheduling within the memory controller to exploit the available parallelism and locality can reduce the average read latency of the SDRAM. However, bank conflicts and the limited ability of the SDRAM’s internal row buffers to act as a cache hinder further latency reduction. Virtual channel SDRAM overcomes these limitations by providing a set of channel buffers that can hold segments from rows of any internal SDRAM bank. This paper presents memory controller policies that can make effective use of these channel buffers to further reduce the average read latency of the SDRAM.
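The gap between a sequential controller and a scheduling controller can be illustrated with a toy model. The sketch below is not the paper's controller or its simulation setup: the timing constants, the `Request` type, and the `row_hit_first` policy are all assumptions chosen for illustration, every request is treated as a read that arrives at time zero, and accesses are fully serialized, so bank parallelism and virtual channel buffers are not modeled. It only shows how reordering pending reads so that row-buffer hits go first can lower the average read latency.

```python
from collections import namedtuple

# Hypothetical request type: the target bank and the row within that bank.
Request = namedtuple("Request", "bank row")

# Hypothetical timing parameters in cycles (not taken from the paper).
T_PRECHARGE, T_ACTIVATE, T_CAS = 3, 3, 3


def access_cost(open_rows, req):
    """Cycles to service req given the row currently open in each bank."""
    current = open_rows.get(req.bank)
    if current == req.row:
        return T_CAS                          # row hit: column access only
    if current is None:
        return T_ACTIVATE + T_CAS             # idle bank: activate, then read
    return T_PRECHARGE + T_ACTIVATE + T_CAS   # bank conflict: close row first


def average_latency(requests, pick):
    """Service requests one at a time; pick(pending, open_rows) returns
    the index of the next request to issue. All requests arrive at time 0,
    so a request's latency is simply its completion time."""
    pending = list(requests)
    open_rows, now, total = {}, 0, 0
    while pending:
        req = pending.pop(pick(pending, open_rows))
        now += access_cost(open_rows, req)
        open_rows[req.bank] = req.row
        total += now
    return total / len(requests)


def in_order(pending, open_rows):
    """Sequential controller: always issue the oldest request."""
    return 0


def row_hit_first(pending, open_rows):
    """Assumed scheduling policy: oldest row hit first, else oldest request."""
    for i, req in enumerate(pending):
        if open_rows.get(req.bank) == req.row:
            return i
    return 0


# Two rows of one bank accessed alternately: the worst case for in-order issue.
stream = [Request(0, 1), Request(0, 2), Request(0, 1), Request(0, 2)]
print("in order:     ", average_latency(stream, in_order))
print("row hit first:", average_latency(stream, row_hit_first))
```

With this alternating stream, in-order issue pays a bank conflict on every access after the first, while the row-hit-first policy groups the two accesses to each row and pays only one conflict. The policy also delays the older conflicting request, which is one reason a real controller would need to bound how long a request can be bypassed.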