J. B. Brockman, P. M. Kogge, V. Freeh, S. K. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ICS, 1999.
....to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICS'02, June 22-26, 2002, New York, New York, USA. Copyright 2002 ACM 1-58113-483-5/02/0006 ...$5.00. PIM) chips to address the well-known performance gap between processor and memory speeds [2, 7, 8, 12, 15, 16, 17, 20, 21, 22, 25, 27, 28, 30]. Many previous architectural solutions to the processor-memory gap, such as multithreading, prefetching, and speculation, seek to reduce or tolerate memory latency, at the expense of increased memory bandwidth requirements [3]. PIMs instead dramatically improve memory bandwidth, by 10-100X over ....
....traffic to the processor-memory bus [8, 21]. With respect to DIVA's WideWord unit, Table 2 compares the features described in Section 3.2 with two commercial multimedia extensions that support superword parallelism, PowerPC AltiVec and Intel SSE2, as well as a previous research design called ASAP [2]. Most other multimedia extensions support subword parallelism, which performs parallel operations on subfields of a machine word. The ASAP combines WideWord and scalar capabilities in a single unit. This approach eliminates the need for trans.... [Table 2 column headings: Capability, SSE2, AltiVec, ASAP, DIVA; first row begins "Separate ...."]
[Article contains additional citation context not shown here]
J. Brockman et al. Microservers: A new memory semantics for massively parallel computing. In Proceedings of the ACM International Conference on Supercomputing, pages 454--463, June 1999.
....of the information close to them. This information then percolates up and down the hierarchy to other processors that need it. This process reduces the amount of data transferred and therefore increases the scalability of the system. The ongoing works on memories with computation capability [8, 9, 10, 11, 12, 13, 14, 15, 16, 17] and intelligent disks [18, 19, 20, 21, 22, 23, 24] provide components that could potentially be used in such a system, but our current analysis simply assumes commodity processing elements situated in proximity to the memory and disk modules. In comparison to equivalent shared-memory ....
....In addition to database machines, several research projects have looked at the idea of having computation elements close to the data. Intelligent memories have mostly been targeted to regular numeric applications [8, 9, 10], but recent attempts also look at their use in non-regular applications [12, 13, 14, 15, 16, 17]. Likewise, the idea of intelligent disks [18, 19, 20, 21, 22, 23, 24] has also been covered by different research groups. However, it is necessary to emphasize that our proposal is for a smart storage hierarchy, not simply smart disks or smart memories. It involves a smart storage hierarchy with ....
J. Brockman, P. Kogge, S. K. Kuntz, and T. Sterling, "Microservers: A new memory semantics for massively parallel computing," in Proceedings of the International Conference on Supercomputing, (Rhodes, Greece), pp. 454-463, June 20-25, 1999.
....to perform bit-level operations such as simple pattern matching, or higher-order computations such as searches, and associative and commutative reduction operations. The wide-word unit has a large register file, with 32 256-bit registers. Details on a related wide-word unit are discussed elsewhere [Brockman99]. During execution, data is transferred directly from the memory array into the register files; there is no on-chip data cache. Instead, we use the sense amps in the memory array as a small data cache, holding the full 2k-bit row selected from the previous memory access. If two consecutive ....
J. Brockman, P. Kogge, V. Freeh, S. Kuntz, T. Sterling. "Microservers: A New Memory Semantics for Massively Parallel Computing", In Proc. of the ACM International Conference on Supercomputing, June, 1999, pp. 454-463.
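The excerpt above describes using the sense amps as a one-row data cache that holds the 2k-bit row selected by the previous access. The following is a minimal sketch of that idea, not the cited design: it assumes byte addressing and a single open row, and the function name `count_row_hits` is hypothetical. It only counts how many accesses in a trace fall in the currently held row.

```python
ROW_BITS = 2048            # "2k-bit row" from the excerpt
ROW_BYTES = ROW_BITS // 8  # assumption: byte-addressed memory, 256 bytes per row


def count_row_hits(addresses):
    """Return (hits, misses) for a single open-row buffer.

    A hit means the access falls in the row already held by the
    sense amps; a miss selects a new row and replaces the held one.
    """
    open_row = None
    hits = 0
    misses = 0
    for addr in addresses:
        row = addr // ROW_BYTES
        if row == open_row:
            hits += 1
        else:
            misses += 1
            open_row = row
    return hits, misses


if __name__ == "__main__":
    trace = [0, 8, 255, 256, 0]
    print(count_row_hits(trace))
```

In this model a trace with good row locality (for example, a sequential scan) sees mostly hits, while a trace that alternates between rows misses on every access.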
....available in a PIM. The ASAP unit can be used to perform bit-level operations such as simple pattern matching, or higher-order computations such as searches, limited pointer chasing, and associative and commutative reduction operations. Details on a related wide-word unit are discussed elsewhere [Brockman99]. Figure 2: Processor-In-Memory Node Organization. 3.2 PIM Interconnection. We anticipate PIM chips to be physically grouped as conventional memory chips, mounted on DIMM modules, as shown in Figure 3. Bounded by host bus loading constraints, the number of PIM chips in a hosted cluster is Host ....
....Mechanisms. We now present a collection of key mechanisms in DIVA. 4.1 Parcels. A parcel is the general mechanism for coordinating computation in memory, communicating data and performing synchronization across components of the DIVA system, a refinement of the parcel concept described previously [Brockman99]. Similar to an active message [vonEicken92], a parcel incorporates data and an encoded operation to apply to the data; a parcel is directed to a memory object, not a process or processor. A parcel has the following four fields: pid: indicates which process issued the parcel. object: the ....
J. Brockman, P. Kogge, V. Freeh, S. Kuntz, T. Sterling. "Microservers: A New Memory Semantics for Massively Parallel Computing", In Proc. of the ACM International Conference on Supercomputing, June, 1999, pp. 454-463.
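The parcel excerpt above says a parcel carries data plus an encoded operation and is directed to a memory object rather than to a process. The sketch below illustrates only that idea. The excerpt is cut off after the first two of the four fields, so `operation` and `payload` are hypothetical stand-ins taken from the sentence about "data and an encoded operation", not the actual field names; `OPERATIONS` and `deliver` are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Parcel:
    pid: int        # which process issued the parcel (named in the excerpt)
    obj: str        # target memory object ("object" in the excerpt)
    operation: str  # hypothetical: encoded operation to apply
    payload: Any    # hypothetical: data carried by the parcel


# Hypothetical operation table: each handler maps (current value, payload)
# to the new value stored in the target object.
OPERATIONS: Dict[str, Callable[[Any, Any], Any]] = {
    "add": lambda current, payload: current + payload,
    "store": lambda current, payload: payload,
}


def deliver(memory: Dict[str, Any], parcel: Parcel) -> None:
    """Apply the parcel's operation to the memory object it targets."""
    handler = OPERATIONS[parcel.operation]
    memory[parcel.obj] = handler(memory[parcel.obj], parcel.payload)


if __name__ == "__main__":
    memory = {"counter": 10}
    deliver(memory, Parcel(pid=1, obj="counter", operation="add", payload=5))
    print(memory["counter"])
```

The point of the sketch is the addressing choice: the sender names an object in memory, and the computation runs where that object lives.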
No context found.
J. B. Brockman, P. M. Kogge, V. Freeh, S. K. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ICS, 1999.
No context found.
J. B. Brockman, P. M. Kogge, V. Freeh, S. K. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ICS, 1999.
No context found.
Jay B. Brockman, Peter M. Kogge, Vincent W. Freeh, Shannon K. Kuntz, and Thomas L. Sterling. Microservers: A new memory semantics for massively parallel computing. In Conference Proceedings of the 1999 International Conference on Supercomputing, pages 454--463, Rhodes, Greece, June 20--25, 1999.
No context found.
J. B. Brockman, P. M. Kogge, V. Freeh, S. K. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ICS, 1999.
No context found.
J.B.Brockman,P.M.Kogge,V.W.Freeh,S.K.Kuntz, and T.L.Sterling. Microservers: A New Memory Semantics for Massively Parallel Computing. Proceedings ACM International Conference on Supercomputing (ICS'99), June 1999.
No context found.
Jay B. Brockman, Peter M. Kogge, Vincent Freeh, Shannon K. Kuntz, and Thomas Sterling. Microservers: A New Memory Semantics for Massively Parallel Computing. In ICS, 1999.
....chips. Intelligent memory systems exhibit characteristics not found in current computing environments. For this reason, the appropriate model of computation for PIM is unclear. The right solution uses existing mechanisms where applicable, and employs new ones where necessary. The PIM project [4, 13, 12] seeks to address the increasingly problematic memory bottleneck. Increases in memory speed are simply not keeping up with increases in logic speed, making it difficult to provide processors with enough data to achieve high CPU utilization. Integrating logic modules with memory macros on a single ....
....several distributed systems, including operating systems and toolkits. Section 2.3 presents some notable programming models for distributed and parallel computing. Finally, Section 2.4 looks at a handful of other related projects. 2.1 Intelligent Memory Systems. Many projects (e.g., PIM [4, 13, 12], IRAM [22], C-RAM [5], and PPRAM [18]) have explored the possibility of processing in memory. PIM is a response to the growing gap between processor and memory performance. Delays for memory accesses are a limiting factor in system performance. Combining logic and memory helps to overcome the ....
Jay B. Brockman, Peter M. Kogge, Thomas L. Sterling, Vincent W. Freeh, and Shannon K. Kuntz. Microservers: a new memory semantics for massively parallel computing. In International Conference on Supercomputing, pages 454-463, 1999.
....implementing caches will help tolerate the latencies of QCA memory. 5.3. Integrated Logic. In the traditional silicon world, the observation that the memory access latency is a significant bottleneck on computation has led to a number of attempts to merge processing logic with memory storage [7] [3]. In QCA, distance is tied even more closely to latency, so integrating logic into QCA memory structure seems natural. The H memory structure is uniquely suited to this application. Because it is based upon a binary tree, there is space to add logic at interior nodes. One mechanism would be to ....
J. B. Brockman, P. M. Kogge, V. Freeh, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In International Conference on Supercomputing.
....expose substantially greater memory bandwidth while imposing significantly lower latency and requiring less power consumption than conventional systems. A number of projects began exploring this technology over the past few years, including IRAM, HTMT, DIVA, FlexRAM, Blue Gene BG C, and Pim Lite [29, 7, 17, 22, 2, 8]. This paper is an early presentation of the hardware and software strategy being developed for a new PIM-based high-end computer as part of the Gilgamesh Project conducted at the NASA Jet Propulsion Laboratory and the California Institute of Technology. Gilgamesh (billions of Logic Gate ....
....destination. Under specific conditions, parcels can significantly improve efficiency of bandwidth resource usage and contribute to load balancing. But of equal importance is that parcels are a latency-hiding mechanism, supporting decoupled computation. Parcels were proposed in the HTMT project [7]; various forms of parcel computation are being pursued by other PIM projects as well, including the DIVA project, the University of Notre Dame's PIM Lite project [8], and the work at the University of Delaware on percolation. Prior art includes but is not restricted to the MIT J-Machine project ....
[Article contains additional citation context not shown here]
J.B.Brockman,P.M.Kogge,V.W.Freeh,S.K.Kuntz, and T.L.Sterling. Microservers: A New Memory Semantics for Massively Parallel Computing. Proceedings ACM International Conference on Supercomputing (ICS'99), June 1999.
....in advance and placed in memory very close to the SPELLs. Second, gather-scatter and pointer chasing in the memory structure are operations that are necessary to minimize the waiting time for the SPELLs. Third, data structure allocation and initialization and vector array operations [36] are employed whenever the ratio of operations to data references is so low that it is faster to do the operations in memory than to ship the data all the way down to the SPELLs and back. Those operations in the memory require new cache (memory) consistency models [8]. Finally, the ....
....in accessing the holographic cubes and disk farm. 3.1.7 DRAM PIM. The DRAM PIM initializes the data structures, exposes fine-grain parallelism intrinsic to vector and irregular data structures, e.g. pointer chasing through linked data structures, block moves, synchronization, data balancing [36]. It can stride through regular data structures and transfer data to PIMs and SPELLs, and is capable of performing data-parallel basic operations at the row buffer, join-like operations, and manages shared resources. 3.1.8 SRAM PIM. The SRAM PIM is responsible for the execution of the ....
J. Brockman, P. Kogge, V. Freeh, S. Kuntz, T. Sterling, "Microservers: A New Memory Semantics for Massively Parallel Computing," ACM Intl. Conference on Supercomputing, Greece, June 1999.
....execution model that provides the generalized abstractions of both local and global computation in a unified framework. The principal abstract entity of the proposed model is the macroserver, a distributed agent of state and action. It complements the concept of the microserver, a purely local agent [10]. This early work explores one possible model that is object based in a manner highly suitable to PIM structures but of a sufficiently high level with task virtualization that aggregations of PIM nodes can be cooperatively applied to a segment of parallel computation without phase changes in ....
....in the value range. In terms of implementation, a macroserver value is a reference to the home (Section 5.1) of the designated macroserver. Macroserver types are directly supported by the global addressing scheme of the PIM array hardware, in particular the in-memory address translation facilities [10, 46]. The future type: Values of the future type are references to threads. A future value is generated whenever a new thread is spawned; at that time, it can be bound to a future variable. Once this is done, the future variable can be used to access that thread for status inquiries and thread ....
[Article contains additional citation context not shown here]
J.B.Brockman,P.M.Kogge,V.W.Freeh,S.K.Kuntz, and T.L.Sterling. Microservers: A New Memory Semantics for Massively Parallel Computing. Proceedings ACM International Conference on Supercomputing (ICS'99), June 1999.
....model that provides the generalized abstractions of both local and global computation in a unified framework. The principal abstract entity of the proposed model is the macroserver, a distributed agent of state and action. It complements the concept of the microserver, a purely local agent [10]. This early work explores one possible model that is object based in a manner highly suitable to PIM structures but of a sufficiently high level with task virtualization that aggregations of PIM nodes can be cooperatively applied to a segment of parallel computation without phase changes in ....
....management of first-class communication schedules which can be associated with a method or a region of code such as a parallel loop. Work distributions and schedules are discussed in Sections 6.6.1 and 6.6.2. [Footnote 1: This refers to the ASAP ISA, a row-wide ALU developed at Notre Dame University [10].] The macroserver type: The values of the macroserver type are references to macroservers. Such values are generated whenever a macroserver is created from a class specification; they can be assigned to variables which then serve as handles to macroservers. We represent a macroserver type ....
[Article contains additional citation context not shown here]
J.B.Brockman,P.M.Kogge,V.W.Freeh,S.K.Kuntz, and T.L.Sterling. Microservers: A New Memory Semantics for Massively Parallel Computing. Proceedings ACM International Conference on Supercomputing (ICS'99), June 1999.
No context found.
J. B. Brockman, P. M. Kogge, V. Freeh, S. K. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ICS, 1999.
No context found.
J. Brockman, P. Kogge, V. Freeh, S. Kuntz, and T. Sterling, "Microservers: A new memory semantics for massively parallel computing," in ACM International Conference on Supercomputing (ICS'99), June 1999.
No context found.
J. Brockman, P. Kogge, V. Freeh, S. Kuntz, and T. Sterling. Microservers: A new memory semantics for massively parallel computing. In ACM International Conference on Supercomputing (ICS'99), June 1999.