Duncan Elliott, Martin Snelgrove, Christian Cojocaru, and Michael Stumm. "A PetaOp/s is Currently Feasible by Computing in RAM." In PetaFLOPS Frontier Workshop, Washington DC, February 1995.
....we must avoid making changes to the architecture of the basic memory arrays and the IC process, and constrain die sizes, signal counts, and power. Our work on Computational RAM (C.RAM) has been in placing simple and medium-complexity SIMD PEs in memory, from 8 Kb of ASIC SRAM to 16 Mb of commodity DRAM [1-3]. These chips have run applications and have been shown to be useful in the fields of signal and image processing, computer graphics, CAD, database, and scientific computing. (Footnote 1: Further information can be found at http://www.ee.ualberta.ca/~elliott/cram; email: elliott@ee.ualberta.ca.) So far ....
....could use such a highly interleaved DRAM to increase memory bandwidth for random accesses, not just sequential ones. Energy Efficiency: When the computing is done in the memory, less energy is consumed in sending data between chips. Energy can also be saved if each memory row access is better utilized [3]. C.RAM PEs consume 10-25% of the chip power. The sensing energy per memory row access dominates. A memory array can have 64 to 2048 times as many sense amps powered as there are bits actually available after a column decode and used in a memory access. If more bits of ....
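The 64x to 2048x figure quoted above can be turned into a quick utilization estimate. The following is a minimal sketch, not taken from the cited paper: the function names are hypothetical, and the energy estimate assumes that every powered sense amp costs the same sensing energy, which the excerpt does not state.

```python
def sensed_bit_utilization(sense_amps_powered: int, bits_used: int) -> float:
    """Fraction of the bits sensed in a row access that are actually used."""
    if bits_used <= 0 or bits_used > sense_amps_powered:
        raise ValueError("bits_used must be between 1 and sense_amps_powered")
    return bits_used / sense_amps_powered


def relative_sensing_energy_per_used_bit(sense_amps_powered: int, bits_used: int) -> float:
    """Sensing energy per used bit, relative to using every sensed bit.

    Assumption (not from the source): each powered sense amp costs the
    same energy, so the cost per used bit scales as powered / used.
    """
    return 1.0 / sensed_bit_utilization(sense_amps_powered, bits_used)


if __name__ == "__main__":
    # The excerpt's range: 64 to 2048 sense amps powered per bit used.
    for ratio in (64, 2048):
        used = sensed_bit_utilization(ratio, 1)
        cost = relative_sensing_energy_per_used_bit(ratio, 1)
        print(f"ratio {ratio}: {used:.5%} of sensed bits used, "
              f"{cost:.0f}x sensing energy per used bit")
```

Under that assumption, a processing element that consumes a whole sensed row drives the relative cost toward 1x, which is the sense in which a row access is "better utilized".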
....(which is typically 1, 8, 16, or 32 bits wide). Therefore, processing elements integrated at the sense amplifiers can utilize this high bandwidth to speed up the execution of parallel applications. Figure 2.1 shows the bandwidths available at different points in a computer system. This comparison [8] is based on a system with 256 MBytes of 16 Mb, 50 ns DRAM chips, and a 100 MHz CPU with a 64-bit bus.
[Figure 2.1: Memory Bandwidth in a Computer System. Points compared: sense amps, off DRAM chips, Rambus, system bus, cache-CPU. Values shown: 2.9 TB/s, 6.2 GB/s, 500 MB/s, 190 MB/s, 800 MB/s. Axis: memory bandwidth (MBytes/s).] ....
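Two numbers in the system description above can be checked with simple arithmetic. This is a hedged sketch, not a reconstruction of the figure: the function names are hypothetical, capacities are assumed to be binary (2^20) units, and the bus is assumed to move one full-width word per clock.

```python
MEBI = 2 ** 20  # assumption: "MBytes" and "Mb" are binary units


def chip_count(total_bytes: int, chip_bits: int) -> int:
    """Number of DRAM chips needed to hold total_bytes."""
    return (total_bytes * 8) // chip_bits


def peak_bus_bandwidth(clock_hz: int, bus_width_bits: int) -> int:
    """Peak bytes per second, assuming one full-width transfer per clock."""
    return clock_hz * bus_width_bits // 8


if __name__ == "__main__":
    chips = chip_count(256 * MEBI, 16 * MEBI)
    bus = peak_bus_bandwidth(100_000_000, 64)
    print(f"{chips} chips of 16 Mb make up 256 MBytes")
    print(f"100 MHz x 64-bit bus peaks at {bus / 1e6:.0f} MB/s")
```

The 800 MB/s result coincides with one of the values listed for the figure. The excerpt does not say how the remaining values were derived, so they are not reproduced here.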
....unsigned integer
cboolef c(d.addr) | c points to bit 0 of d
cboolef e = d.bit(3) | e points to bit 3 (4th bit) of d

Declaration Syntax
cchar a; | 8-bit signed cint
cushort diff = 4; | 16-bit cuint, initialize all elements to 4
clong d[10] | an array of 10 32-bit cints
cnibble rec[8][12] | a 2-dimensional array of 96 4-bit cints

6.2.3 Memory Allocation
Any call to the cvar constructor invokes a CRAM memory allocation function. Since all classes of CRAM variables are descendants of cvar, the cvar() constructor is always called when ....
[Article contains additional citation context not shown here]
D. G. Elliott, W. M. Snelgrove, C. Cojocaru, and M. Stumm, "A PetaOp/s is currently feasible by Computing in RAM", PetaFLOPS Frontier Workshop, February, 1995.
....1. Introduction
Computational RAM [1, 2] is a SIMD-memory hybrid architecture, with 1-bit processing elements (PEs) integrated at the sense amplifiers of a standard DRAM/SRAM (Fig. 1a). This architecture improves the performance of highly parallel and computation-intensive applications [3-7] by utilizing the high bandwidth at the memory sense amplifiers. By making the PEs 1-bit, many of them can be integrated in the pitch of a few sense amplifiers, thus increasing the degree of parallelism. Fig. 1b shows the architecture of a PE in the 64-PE, 1 Kbit/PE CRAM chip [8]. The instruction ....
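To illustrate how 1-bit PEs can operate on multi-bit data, the sketch below simulates bit-serial SIMD addition in Python. It is an assumption-laden illustration, not the PE instruction set of the cited chip: the row-per-bit data layout and the helper names (`to_rows`, `from_rows`, `bit_serial_add`) are hypothetical. Each list position stands for one PE, and every PE applies the same full-adder step to its own column, one memory row at a time.

```python
def to_rows(values, width):
    """Lay out integers so that row i holds bit i (LSB first) of every PE's operand."""
    return [[(v >> i) & 1 for v in values] for i in range(width)]


def from_rows(rows):
    """Reassemble one integer per PE from bit rows (LSB first)."""
    return [sum(bit << i for i, bit in enumerate(column)) for column in zip(*rows)]


def bit_serial_add(a_rows, b_rows):
    """Add two operands per PE, one bit position (one row) per step."""
    num_pes = len(a_rows[0])
    carry = [0] * num_pes  # one 1-bit carry register per PE
    sum_rows = []
    for a_row, b_row in zip(a_rows, b_rows):
        # Same operation in every PE (SIMD): a 1-bit full adder.
        sum_rows.append([a ^ b ^ c for a, b, c in zip(a_row, b_row, carry)])
        carry = [(a & b) | (c & (a ^ b)) for a, b, c in zip(a_row, b_row, carry)]
    sum_rows.append(carry)  # the final carry becomes the top result bit
    return sum_rows


if __name__ == "__main__":
    a = [3, 5, 255, 0]
    b = [1, 10, 1, 0]
    result = from_rows(bit_serial_add(to_rows(a, 8), to_rows(b, 8)))
    print(result)
```

An 8-bit addition takes eight row steps regardless of how many PEs there are, so throughput in this model scales with the number of PEs rather than with the speed of any single one.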
D. Elliott, M. Snelgrove, C. Cojocaru, and M. Stumm, "A PetaOp/s is Currently Feasible by Computing in RAM", PetaFLOPS Frontier Workshop, Feb. 1995.