T. Yamauchi, L. Hammond, and K. Olukotun. The Hierarchical Multi-bank DRAM: a High-performance Architecture for Memory Integrated with Processors. In the Proceedings of the 17th Conference on Advanced Research in VLSI, pages 303--319, Ann Arbor, MI, September 1997.
....banks. Each one consists of a number of sub-banks, connected to the bank interface through a shared data and address bus. While this technique was introduced to reduce memory access latency, it can also be used to overlap accesses to different sub-banks in time, in a pipelined fashion [27]. Random bandwidth is particularly important to applications with strided and indexed loads and stores, where bank conflicts between element accesses can significantly hurt overall performance. This is because DRAM reads and writes take multiple processor cycles to complete. Hence an access ....
T. Yamauchi, L. Hammond, and K. Olukotun, "The hierarchical multibank DRAM: a High-performance architecture for memory integrated with processors," in the Proceedings of the 17th Conference on Advanced Research in VLSI, Ann Arbor, MI, USA, Sept. 1997.
....and temporal locality. These references also do not typically make consecutive column accesses to the same row, severely limiting the sustainable data bandwidth when those references are satisfied in order. Several memory access schedulers have been proposed as part of systems on a chip [WAM 99] [YHO97]. Hitachi has built a test chip of their access optimizer for embedded DRAM that contains the access optimizer and DRAM [WAM 99]. A simple scheduler is implemented which performs accesses for the oldest pending reference that can access the DRAM subject to timing and resource constraints. The ....
Tadaaki Yamauchi, Lance Hammond, and Kunle Olukotun. The Hierarchical Multi-bank DRAM: A High-Performance Architecture for Memory Integrated with Processors. Proceedings of the Conference on Advanced Research in VLSI (September 1997), pp. 303-319.
....only a small number of them can be used at any one time. This restriction appears to be a large waste of SRAM bits, but treating each subarray as an independent memory bank would carry with it an area overhead for additional circuits like Y decoders, read/write datapath, etc. Yamauchi et al. [144] estimate 40% and 80% overhead for 16 and 32 banks respectively, compared to a baseline 4-bank 256 Mbit DRAM design. To take advantage of as many sense amplifiers as possible with only a small area overhead, certain schemes have appeared that treat the individual subarrays as semi-independent ....
....40% and 80% overhead for 16 and 32 banks respectively, compared to a baseline 4-bank 256 Mbit DRAM design. To take advantage of as many sense amplifiers as possible with only a small area overhead, certain schemes have appeared that treat the individual subarrays as semi-independent subbanks [114, 144]. In both schemes, concurrent activation of neighboring subarrays is prohibited because of the shared sense amplifiers, but distant subarrays (subbanks) can be active at the same time. One of the schemes [144], with 32 subbanks in each of 4 banks, is estimated to perform up to 65% better than a ....
T. Yamauchi, L. Hammond, and K. Olukotun. "The Hierarchical Multi-Bank DRAM: A High-Performance Architecture for Memory Integrated with Processors". In Advanced Research in VLSI, Ann Arbor, MI, USA, 15-16 September 1997. pp. 303--319.