17 citations found.
G. Gao, K. Likharev, P. Messina, and T. Sterling, "Hybrid technology multithreaded architecture", in Proc. Frontiers '96, (Annapolis, MD), pp. 98-105, (1996).


This paper is cited in the following contexts:
A Microserver View of HTMT: New Benchmarks and.. - Yerosheva, Kuntz.. (2000)

....system has opened new questions in computation and organization. For existing architectures some problems are well studied subjects, for other problems it is a challenge to understand and make decisions. Moving memory closer to processors (PIMs) [3], creating multi level memory hierarchy (HTMT) [1], and having multithreading at each level of processors in software and hardware gives additional levels of complexity to support the data and task parallelism. There are several well known systems that have similar features to the HTMT architecture or execution model. For example, the Tera ....

....while parallelism focuses on the number of parallel nodes and levels involved in the computation. In addition to these broad issues are the more concrete issues of load balancing, task migration [17], memory distribution and control distribution. 3 HTMT system architecture The HTMT architecture [1] is a new design for a petaflop system that attempts to solve the memory latency problem on a deep memory hierarchy, provide multithreading at the processor level, and instrument distributed processing using a PIM approach. It consists of the following components: 4096 multithreaded CPUs, called ....

Gao, G., K. Likharev, P. Messina, T. Sterling, "Hybrid Technology Multithreaded Architecture," 6th Symp. on Frontiers of Massively Parallel Computation, pp.98-105, Annapolis, MD, 1996.


A Microserver View of HTMT - Yerosheva, Kuntz, Brockman, Kogge (2001)

....pseudo code for application execution and show memory layouts and execution flows for each benchmark. 2. Related work Development of the massively parallel HTMT system has opened new questions in computation and organization. Moving memory closer to processors [18], creating a memory hierarchy [12], and multithreading give additional levels of complexity. Many systems have similar features to the HTMT architecture or execution model. The Tera machine [8] has a similar underlying architecture and supports multithreading, but does not put the memory and processor on the same chip. The ....

....(MPI) or a Linda system [7]. This implies enormous work in designing, testing, and optimizing the system, including concurrency and parallelism [6], load balancing, task migration [5], memory distribution and control distribution. 3. HTMT system architecture The HTMT architecture [12] is a new design for a petaflop system that attempts to solve the memory latency problem on a deep memory hierarchy, provide multithreading, and instrument distributed processing using a PIM approach. It consists of 4096 multithreaded CPUs, called SPELLs (Superconductive Processor ELLements) ....

G. Gao, K. Likharev, P. Messina, and T. Sterling. Hybrid Technology Multithreaded Architecture. In 6th Symp. on Frontiers of Massively Parallel Computation, pages 98--105, 1996.


High-Level Prototyping for the HTMT Petaflop Machine - Yerosheva (2001)

....some statistics for the early HTMT prototype. Chapter 9 concludes and shows extension to our work for the future. Finally, the thesis also includes references to the literature. CHAPTER 2 THE HTMT SYSTEM STRUCTURE Design and implementation of the HTMT prototype is a part of the HTMT project [6] which attempts to explore and characterize a synthesis of technologies, innovative architectures, and aggressive latency management techniques in a way that could accelerate availability of near petaflops scale computing systems; to develop the architecture, to collect statistics to minimize ....

G. Gao, K. Likharev, P. Messina, T. Sterling, "Hybrid Technology Multithreaded Architecture," 6th Symp. on Frontiers of Massively Parallel Computation, MD, pp. 98-105, Oct. 1996.


Timing of Multi-Gigahertz Rapid Single Flux Quantum.. - Gaj, Friedman, Feldman (1997)

.... applications [16, 17]. In the longer term, RSFQ may also provide the speed and power characteristics required by general purpose petaflop scale computing (petaflop = 10^15 floating point operations per second) which is likely to remain beyond the reach of the fastest semiconductor technologies [18, 19]. The primary immediate application of RSFQ logic is digital signal processing. The current state of RSFQ technology favors the design of circuits with a regular topology, limited control circuitry, a small number of distinct cells, and limited interconnections. The analysis of timing in RSFQ ....

G. Gao, K.K. Likharev, P.C. Messina, and T.L. Sterling, "Hybrid Technology Multithreaded Architecture," in: Proc of PetaFlops Architecture Workshop, to be published; see also the Web site http://www.cesdis.gsfc.nasa.gov/petaflops/peta.html.


Application of Credit-Based Flow Control to RSFQ Micropipelines - Zinoviev, Maezawa (1998)   (1 citation)

....are not influenced by clock skew. This advantage is of key importance for the emerging very high speed digital circuits, including those belonging to the superconductor rapid single flux quantum (RSFQ) logic/memory family [2]. Recent studies of the possibility of petaflops scale computations ([3], [4]) raise the issue of reliable distributed asynchronous on chip and chip-to-chip communication media, and micropipelines may be good candidates to occupy that niche. II. Traditional Micropipelines and Their Drawbacks The operation of a traditional micropipeline (Fig. 1) is based on a simple ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling, "Hybrid technology multithreaded architecture," in Proc. Frontiers `96, (Annapolis, MD), pp. 98--105, Feb. 1996.


Design and Implementation of an RSFQ Switching Node for.. - Shinichi Yorozu (1999)   (3 citations)

....its uniquely high speed and low power consumption. The petaflops computer presently being designed in the HTMT project combines several innovative technologies: RSFQ processors and networks, semiconductor SRAM and DRAM based processors in memory (PIMs), optical networks, and a holographic memory [2]. According to the preliminary design [3], the RSFQ subsystem of the HTMT machine will consist of 4,096 superconductor processing elements (SPELL) and a self routing multistage packet switching network (CNET). The network will enable any SPELL to access remote memory buffers belonging to other ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling, "Hybrid technology multithreaded architecture," in Proc. Frontiers `96, (Annapolis, MD), pp. 98--105, 1996.


High Thruput Nets for Petaflops Computing - Wittie, Sazaklis, Zhou, Zinoviev (1998)

....Thruput Nets for Petaflops Computing L. Wittie G. Sazaklis Y. Zhou D. Zinoviev May 1, 1998 1 Motivation This work was undertaken to estimate the complexity of networks that can be used to connect several thousand processing elements and memory interfaces in a Petaflops cryocomputer [3]. Two different network architectures have been studied in detail: multistage banyan networks and truncated (pruned) multidimensional meshes. For each architecture, we simulated many network shapes to determine the maximal aggregate throughput T and average latency avg . We have learned many ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling. Hybrid Technology Multithreaded Architecture. In Proc. Frontiers`96, pages 98--105, Annapolis, MD, 1996. Available electronically via anonymous FTP from ftp://rsfq1.physics.sunysb.edu/pub/ieee htm.ps.


Design Issues in Ultra-Fast Ultra-Low-Power Superconductor.. - Zinoviev (1997)   (2 citations)

.... that may be used as a component in commercial telecommunication switches; 2) to demonstrate that RSFQ technology is capable of conducting large scale projects, and 3) to identify parts and design strategies that may be used in similar projects, e.g. in multiprocessor interconnecting networks [7]. Each architecture has been considered under the same workload of 5.76 Tbps in two different environments (96 × 60 Gbps bit serial channels and 96 × 1.875 Gbps 32-bit parallel channels with parallel to serial and serial to parallel converters) with self routing and without contention ....

G. Gao, K. Likharev, P. Messina, and T. Sterling, Hybrid technology multithreaded architecture, in Proc. Frontiers`96, Annapolis, MD, 1996, pp. 98--105.


Processing in Memory: Chips to Petaflops - Kogge, Brockman, Sterling, Gao (1997)   (5 citations)  Self-citation (Gao Sterling)

....machines, even with the best of CMOS technologies, will require the programmer to deal with multi million way parallelism. Many real applications may not permit such huge levels of parallelism. To attack such problems, a large scale collaboration among several research groups, the HTMT project [1, 5, 19], is focusing on a mixed technology solution where extremely long latencies are possible, and where preemptive activity in the PIM based memory are essential to reduce or eliminate the latency penalties. This paper takes one such proposed PIM architecture, Shamrock, and discusses a possible ....

....on the outside to view the chip as memory in the conventional sense. Again, both of these connections stem naturally from the topology of the individual nodes, and do not require any expensive additional wiring on the chip. The HTMT System The Hybrid Technology Multi Threaded (HTMT) project [1, 5, 19] is a collaborative project among about half a dozen research groups (Cal Tech JPL, U. Delaware, SUNY Stonybrook, Notre Dame, Princeton, plus an association with many other government and industrial labs) to define a system that can reach a petaflops level of performance in significantly less time ....

Gao, G., K. Likharev, P. Messina, T. Sterling, "Hybrid Technology Multithreaded Architecture," 6th Symp. on Frontiers of Massively Parallel Computation, Annapolis, MD, Oct. 25-31, 1996, pp. 98-105.


RSFQ Subsystem for Petaflops-Scale Computing: "COOL-0" - Paul Bunyk Mikhail   Self-citation (Likharev)

....in a system of such a physical size makes the system prone to stalling for communication intensive programs. Manuscript received May 1, 1999. The HTMT project and the work described in this paper are supported by DARPA, NSA, and NASA, and in part by NSF grant No. ECS 9700313. The HTMT concept [2] assumes a hierarchical organization of the petaflops computing system (Fig. 1) with multiple levels of distributed memory: holographic data storage (HRAM), semiconductor SRAM and DRAM, and cryomemory (CRAM), as well as three types of processors: SRAM and DRAM based processors in memory (PIMs) ....

....high as 1000 processor cycles) is multithreading. This technique reduces the processor idle time by overlapping the execution of separate tasks called threads. Multithreading and context prefetching have been accepted as the key techniques of latency tolerance in the HTMT program execution model [2] and COOL I instruction set architecture [3]. PIMs find ready threads, allocate the context of a ready thread in CRAM, and initiate its execution in a SPELL. When a SPELL finishes the execution of a thread, an SRAM PIM fetches the results from CRAM into SRAM. All of these multilevel activities can ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling, "Hybrid technology multithreaded architecture," in Proc. Frontiers `96, (Annapolis, MD), pp. 98--105, Feb. 1996.


RSFQ Subsystem for Petaflops-Scale Computing: "COOL-0" - Paul Bunyk Mikhail   Self-citation (Likharev)

....[Figure 1. HTMT computer concept. Labels: liquid helium (4 K), Optical Interconnect, SRAM PIM, DRAM PIM, Holographic Memory, CNet, RSFQ Processors.] The HTMT concept [9] assumes a hierarchical organization of the petaflops computing system (Figure 1) with multiple levels of distributed memory: holographic data storage (HRAM), semiconductor SRAM and DRAM, and cryomemory (CRAM), as well as three types of processors: SRAM and DRAM based processors in memory (PIMs) ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling. Hybrid technology multithreaded architecture. In Proc. Frontiers`96, pages 98--105, Annapolis, MD, Feb. 1996.


Superconductor Multithreaded Subsystem for Petaflops Scale.. - Mikhail Dorojevets   Self-citation (Likharev)

....technology, petaflops, asynchronous pipelines. Contact person: Prof. Mikhail Dorojevets, EE Dept. SUNY, Stony Brook, NY 11794-2350 Phone: (516) 632-8611 Fax: (516) 632-8494 E-mail: midor@eegw.ee.sunysb.edu 1 Introduction The goal of the Hybrid Technology MultiThreaded Architecture (HTMT) project [9] is to develop and study a new computer architecture that exploits multiple technologies to achieve petaflops level performance. Fig. 1 shows the top level concept of the HTMT petaflops system. It has a hierarchical organization with multiple levels of memory and three types of processors: SRAM ....

Gao, G., Likharev, K. K., Messina, P. C., and Sterling, T. L. Hybrid technology multithreaded architecture. In Proc. Frontiers`96 (Annapolis, MD, 1996), pp. 98--105. Available via anonymous ftp from ftp://rsfq1.physics.sunysb.edu/pub.


Superconducting Processors for HTMT: Issues and Challenges - Kevin Theobald (1999)   (1 citation)  Self-citation (Gao Sterling)

....(a strand) will lead to unacceptably high latencies, hence poor performance. We propose alternative processor designs which use fine grain synchronizations between individual instructions in order to avoid these bottlenecks. 1. Introduction The Hybrid Technology Multi Threading (HTMT) project [2, 4] is an ambitious, long term study of the feasibility of combining several emerging technologies to produce, within ten years, a computer with a sustained speed of 1 petaFLOPS (10^15 floating point operations per second) and 1 petabyte of memory. HTMT will combine high-speed superconductor ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling. Hybrid technology multi-threaded architecture. In Proceedings of Frontiers '96: The Sixth Symposium on the Frontiers of Massively Parallel Computation, pages 98--105, Annapolis, Maryland, October 1996.


CNET: Design of an RSFQ Switching Network for Petaflops-Scale.. - Wittie Yu (1999)   (3 citations)  Self-citation (Likharev)

....computing module per network clock cycle (in the present design, 32 ps). I. Introduction RSFQ digital technology [1] has created an exciting opportunity for computing on the petaflops scale (1 petaflops = 10^15 floating point operations per second) with acceptable aggregate physical parameters [2]. The major advantage of RSFQ technology for this particular application is not so much its unparalleled speed, but rather its uniquely low power consumption, since multi megawatt power dissipation makes fully semiconductor petaflops scale computers hardly feasible. The system presently designed ....

....is not so much its unparalleled speed, but rather its uniquely low power consumption, since multi megawatt power dissipation makes fully semiconductor petaflops scale computers hardly feasible. The system presently designed in the HTMT (Hybrid Technology MultiThreaded architecture) project [2], [3] combines RSFQ with other innovative technologies: semiconductor SRAM and DRAM based processors in memory (PIMs), an optical switching network, and holographic memory. According to the preliminary design named COOL-0 [3], the RSFQ subsystem for a petaflops scale computer should consist of ....

G. Gao, K. K. Likharev, P. C. Messina, and T. L. Sterling, "Hybrid technology multithreaded architecture," in Proc. Frontiers`96, (Annapolis, MD), pp. 98--105, Feb. 1996.


Design Issues in Ultra-Fast Ultra-Low-Power Superconductor.. - Dmitry Zinoviev (1997)   (2 citations)  Self-citation (Likharev)

.... that may be used as a component in commercial telecommunication switches; 2) to demonstrate that RSFQ technology is capable of conducting large scale projects, and 3) to identify parts and design strategies that may be used in similar projects, e.g. in multiprocessor interconnecting networks [10]. Each architecture has been considered under the same workload of 5.76 Tbps in two different environments (96 × 60 Gbps bit serial channels and 96 × 1.875 Gbps 32-bit parallel channels with parallel to serial and serial to parallel converters) with self routing and without contention ....

G. Gao, K. Likharev, P. Messina, and T. Sterling, "Hybrid technology multithreaded architecture," in Proc. Frontiers`96, (Annapolis, MD), pp. 98--105, 1996.


Full Operation of Three-Node Pipeline-Ring Switching Chip - For Superconducting Network

No context found.

G. Gao, K. Likharev, P. Messina, and T. Sterling, "Hybrid technology multithreaded architecture", in Proc. Frontiers '96, (Annapolis, MD), pp. 98-105, (1996).


High-Level Prototyping for the HTMT Petaflop Machine - Yerosheva (2001)

No context found.

G. Gao, K. Likharev, P. Messina, T. Sterling, "Hybrid Technology Multi-Threaded Architecture: Project Summary. Project Description," HTMT Reports, 1998.


CiteSeer.IST - Copyright Penn State and NEC