M. Oskin, F.T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th Annual International Symposium on Computer Architecture, June 1998.
....to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICS '02, June 22-26, 2002, New York, New York, USA. Copyright 2002 ACM 1-58113-483-5/02/0006 ...$5.00. PIM) chips to address the well-known performance gap between processor and memory speeds [2, 7, 8, 12, 15, 16, 17, 20, 21, 22, 25, 27, 28, 30]. Many previous architectural solutions to the processor-memory gap, such as multithreading, prefetching, and speculation, seek to reduce or tolerate memory latency, at the expense of increased memory bandwidth requirements [3]. PIMs instead dramatically improve memory bandwidth, by 10-100X over ....
.... or compiler technology to configure logic [1] or to manage a complex memory, computation and communication hierarchy [15]. DIVA's PIM-to-PIM interconnect improves upon approaches that serialize communication through the host, which decreases bandwidth by adding traffic to the processor-memory bus [8, 21]. With respect to DIVA's WideWord unit, Table 2 compares the features described in Section 3.2 with two commercial multimedia extensions that support superword parallelism, PowerPC AltiVec and Intel SSE2, as well as a previous research design called ASAP [2]. Most other multimedia extensions ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th Annual International Symposium on Computer Architecture, June 1998.
....in the system, enabling optimizations similar to those described here. The primary difference between Impulse and Morph is that Impulse is a simpler design that can be used in current systems. The RADram project at UC Davis is building a memory system that lets the memory perform computation [34]. RADram is a PIM, or processor-in-memory, project similar to IRAM [25]. The RAW project at MIT [46] is an even more radical idea, where each IRAM element is almost entirely reconfigurable. In contrast to these projects, Impulse does not seek to put an entire processor in memory, since DRAM ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, pages 192--203, Barcelona, Spain, June 27--July 1, 1998.
....architectural trends have emerged since Romer et al. performed their study. First, superscalar, out-of-order processor pipelines have replaced single-issue, in-order designs. Second, in response to the growing memory wall problem, architects have proposed a number of smart memory system designs [5, 14, 20]. Our work extends that of Romer et al. by considering the impact of these new architectural features when designing a dynamic superpage promotion mechanism. For example, Swanson et al. demonstrate that applications can create superpages without copying using the Impulse memory controller's ....
M. Oskin, F. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proc. of the 25th ISCA, pp. 192--203, June 1998.
....and fast time to market. This limited popularity is not due to any lack of raw hardware capability, as million-gate devices are readily available [37, 2] and we have seen recent advances in high clock rates [30, 28], rapid reconfiguration [29, 16, 14], and high-bandwidth memory access [24, 19, 25]. Rather, we believe that the limited applicability of reconfigurable technology derives largely from the lack of any unifying compute model to abstract away the fixed resource limits of devices which, otherwise, restrict software expressibility as well as longevity across device ....
Mark Oskin, Frederic T. Chong, and Timothy Sherwood. Active Pages: a Model of Computation for Intelligent Memory. In Proceedings of the 25th International Symposium on Computer Architecture (ISCA'98), June 1998.
.... a cache-coherent distributed shared memory system [Saulsbury96] and a large-scale distributed memory system [Kogge96]. The Active Pages project, which is the most closely related to DIVA, associates configurable logic with each memory page to accelerate performance of an external host [Oskin98]. There are also several other architecture approaches, not based on PIM technology, designed to improve processor-memory bandwidth [Carter99, Burger97, Rixner98]. Impulse augments the memory system to perform application-specified scatter/gather operations on irregular data in the memory ....
Mark Oskin, Frederic T. Chong, and Timothy Sherwood. "Active Pages: A Model of Computation for Intelligent Memory". In Proc. of the 25th International Symposium on Computer Architecture (ISCA), June, 1998.
....remapping physical memory, letting applications control how data is cached on chip. This approach prefetches and buffers data within the memory controller until the CPU requests them. 6. CONCLUSION Several recent architectures propose to migrate more intelligence into the memory system [6, 25, 29] to help bridge the processor-memory performance gap. This paper explores the potential for diverse applications to benefit from hardware-only memory access ordering, and shows how relatively simple mechanisms can realize at least some of that potential. The strided prefetcher provides the memory ....
M. Oskin, F. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proc. of the 25th ISCA, pp. 192--203, June 1998.
....systems. In recent years, a number of hardware mechanisms have been proposed to address the problem of increasing memory system overhead. For example, researchers have evaluated the prospects of making the processor cache configurable [25, 26], adding computational power to the memory system [14, 18, 24], and supporting stream buffers [13, 16]. All of these mechanisms promise significant performance improvements; unfortunately, most require significant changes to processors, caches, or memories, and thus have not been adopted in current systems. Impulse supports similar optimizations, but its ....
....to those that we have described are possible using Morph. The primary difference between Impulse and Morph is that Impulse is a simpler design that current architectures can take advantage of. The RADram project at UC Davis is building a memory system that lets the memory perform computation [18]. RADram is a PIM (processor-in-memory) project similar to IRAM [14], where the goal is to put processors close to memory. The Raw project at MIT [24] is an even more radical idea, where each IRAM element is almost entirely reconfigurable. In contrast to these projects, Impulse does not seek to ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proc. of the 25th ISCA, pp. 192--203, Barcelona, Spain, June 27--July 1, 1998.
....systems. In recent years, a number of hardware mechanisms have been proposed to address the problem of increasing memory system overhead. For example, researchers have evaluated the prospects of making the processor cache configurable [34, 35], adding computational power to the memory system [20, 25, 33], and supporting stream buffers [19]. All of these mechanisms promise significant performance improvements; unfortunately, most require significant changes to processors, caches, or memories, and thus have not been adopted in mainstream systems. Impulse supports similar optimizations, but its ....
....similar to those described here. The primary difference between Impulse and Morph is that Impulse is a simpler design that can be profitably exploited by current processor architectures. The RADram project at UC Davis is building a memory system that lets the memory perform computation [25]. RADram is a PIM, or processor-in-memory, project similar to IRAM [20]. The RAW project at MIT [33] is an even more radical idea, where each IRAM element is almost entirely reconfigurable. In contrast to these projects, Impulse does not seek to put an entire processor in memory, since DRAM ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, pages 192--203, Barcelona, Spain, June 27--July 1, 1998.
.... associative caches [22] and prefetching [5, 7, 12]), (ii) those that increase the processor's ability to tolerate memory latency (e.g., multithreaded processors [1, 8, 25]), and (iii) those that migrate computation to or integrate the processor with the memory system (e.g., IRAM [19] and RADRAM [15]). These approaches all have merit, but they also have limitations. Many trade bandwidth for latency, which places a heavier burden on the memory system by fetching unneeded data. In general, complex cache organizations cannot keep up with aggressive processor cycle times, and thus are poor design ....
....changes to conventional processor and cache designs. Finally, researchers have proposed attacking the memory bottleneck from the opposite direction by selectively moving computation to the memory, rather than moving data to the processor. For example, integrated architectures such as RADram [15] and IRAM [19] integrate some form of processor into the memory system. These architectures are adept at handling dense streaming or vector-style applications when all data reside within a single DRAM chip. However, once the data cross chip boundaries, these systems effectively become a ....
M. Oskin, F. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th Annual International Symposium on Computer Architecture, pages 192--203, June 1998.
....systems. In recent years, a number of hardware mechanisms have been proposed to address the problem of increasing memory system overhead. For example, researchers have evaluated the prospects of making the processor cache configurable [26, 27], adding computational power to the memory system [15, 19, 25], and supporting stream buffers [14, 17]. All of these mechanisms promise significant performance improvements; unfortunately, most require significant changes to processors, caches, or memories, and thus have not been adopted in current systems. Impulse supports similar optimizations, but its ....
....to those that we have described are possible using Morph. The primary difference between Impulse and Morph is that Impulse is a simpler design that current architectures can take advantage of. The RADram project at UC Davis is building a memory system that lets the memory perform computation [19]. RADram is a PIM (processor-in-memory) project similar to IRAM [15], where the goal is to put processors close to memory. The Raw project at MIT [25] is an even more radical idea, where each IRAM element is almost entirely reconfigurable. In contrast to these projects, Impulse does not seek to ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, Barcelona, Spain, June 1998.
....to those that we have described are possible using Morph. The primary difference between Impulse and Morph is that Impulse is a simpler design that current architectures can take advantage of. The RADram project at UC Davis is building a memory system that lets the memory perform computation [12]. RADram is a PIM (processor-in-memory) project similar to IRAM [8], where the goal is to put processors close to memory. In contrast, Impulse does not seek to put a full-blown processor in memory, since DRAM processes are substantially slower than logic processes. Several researchers have ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, June 1998. To appear.
....we have described are possible using Morph. The Impulse project is designing hardware that will be built in an academic environment, which requires that we attack hardware outside of the processor. The RADram project at UC Davis is building a memory system that lets the memory perform computation [10]. RADram is a PIM (processor-in-memory) project similar to IRAM [6], where the goal is to put processors close to memory. In contrast, Impulse does not seek to put a processor in memory; instead, its memory controller is programmable. In summary, the Impulse project is developing a memory system ....
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, June 1998. To appear.
M. Oskin, F. T. Chong, and T. Sherwood. Active pages: A model of computation for intelligent memory. In Proceedings of the 25th International Symposium on Computer Architecture, pages 192-203, Barcelona, Spain, June 1998.
M. Oskin, F. Chong, and T. Sherwood. Active Pages: A Model of Computation for Intelligent Memory. In Proceedings of the 25th International Symposium on Computer Architecture (ISCA), June 1998.