M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
....machine environment where dynamic optimizations can be performed without modifying source binaries. Chip multiprocessor Jrpm is based on the Hydra chip multiprocessor (CMP) [32]. Decreasing feature size and increasing transistor counts now allow chip multiprocessors to be a reality [6][14][24][42]. Chip multiprocessors combine several CPUs onto one die with a tightly coupled memory interface. In this configuration, inter-processor sharing and communication costs are significantly less than in traditional multiprocessors. The reduced communication costs make it possible to take advantage of ....
....dynamic and cannot be mitigated by thread synchronization or value prediction. 7. Related Work The Multiscalar paradigm [15] was the first complete description and evaluation of an architecture with TLS support. Several other architectures for TLS using CMPs have been proposed [10][21][29][40][42]. These implementations have mostly targeted coarser grains of granularity than the Multiscalar architecture. In a similar vein, software-based dynamic dependence detection has been proposed for traditional multiprocessor systems as a way to preserve correctness for loops executed in parallel that ....
[Article contains additional citation context not shown here]
Tremblay, M. MAJC: Microprocessor Architecture for Java Computing. In HotChips'99, Stanford, CA, August 1999.
....Simple cache flushing eliminates a previously required slipstream recovery component. And the performance impact of flush-induced compulsory misses is reduced by exploiting preserved data within flushed cache lines as highly accurate value predictions. 1. Introduction Chip multiprocessing (CMP) [12,17,29] and simultaneous multithreading (SMT) [8,30,31] are compelling because they maximize the performance capacity of a single chip by evolutionary rather than revolutionary means. Independent jobs or parallel tasks that otherwise execute on physically separate processors now execute on the same chip. ....
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. 11th Hot Chips Symposium, Aug. 1999.
....under grants EIA 0081307, EIA 0072102, and EIA 0103741; by DARPA under grant F30602 01 C 0078; by the Ministry of Education of Spain under grant TIC 2001 0995 C02 02; and by gifts from IBM and Intel. 21, 23, 24, 26] to software-based (e.g. [7, 11, 17, 18]) and targeting small machines (e.g. [1, 8, 10, 12, 14, 15, 20, 23, 24]) or large ones (e.g. [4, 6, 11, 16, 17, 18, 21, 26]). Each scheme for thread-level speculation has to solve two major problems: detection of violations and, if a violation occurs, state repair. Most schemes detect violations in a similar way: data that is speculatively accessed (e.g. read) is ....
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. Hot Chips, August 1999.
....compiler and hardware techniques, we improve performance under TLS by 6.2 28.5 for 6 of 14 applications, and by at least 2.7 for half of the other applications. 1. INTRODUCTION Multithreading within a chip is becoming increasingly commonplace: examples include the IBM Power4 [17], Sun MAJC [33], Alpha 21464 [9], HP PA 8800, and Sibyte BCM 1250 [4]. While using this multithreaded hardware to improve the throughput of a workload is straightforward, using it to improve the performance of a single application requires parallelization. The ideal solution would be to convert sequential ....
TREMBLAY, M. MAJC: Microprocessor Architecture for Java Computing. HotChips '99 (August 1999).
....pointer-based accesses, indirect accesses to arrays, irregular control flow, accesses to structures across complicated procedure calling patterns, and accesses whose pattern depends on input data. One way to extract parallelism from these codes is to use speculative thread-level parallelization [1, 5, 7, 8, 9, 11, 13, 15, 17, 19, 20, 21, 22, 23, 24, 26, 27]. In this technique, the computation in the program is divided into tasks and assigned to different threads. The threads execute in parallel, optimistically assuming that sequential semantics will not be violated. As the threads run, their control flow and the data that they access are tracked. If ....
....to ensure data coherence. State repair may use hardware support to speed up the detection of tasks that need to be re-executed and the destruction of the incorrect state in their caches. In recent years, many schemes with hardware support for speculative parallelization have been proposed [1, 5, 7, 9, 11, 13, 15, 21, 22, 23, 24, 26, 27]. Among other issues, they differ in their target machine size and type of code, as well as in their relative emphasis on hardware and software support. Some of these schemes have focused on architecting a solution for scalable machines [5, 22, 26, 27]. The evaluation of such solutions for up to ....
[Article contains additional citation context not shown here]
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. Hot Chips, August 1999.
....data dependences, such as those that contain pointer accesses, references to arrays with non-linear subscripts, very irregular control flow, or accesses across complicated procedure calling patterns. To extract parallelism in such codes, speculative thread-level parallelization has been proposed [1, 4, 6, 7, 8, 10, 12, 18, 19, 20, 22, 23, 25, 26, 32]. In this approach, potentially dependent threads are speculatively executed in parallel, hoping not to violate dependences. If a cross-thread dependence is violated at run time, a corrective action is triggered to repair the state. Such an action often involves squashing one or several threads. ....
....schemes for speculative parallelization differ in many ways. For example, some schemes rely on support code inserted by the compiler to check for dependence violations and to perform corrective actions [7, 19, 20]. Other schemes rely on special hardware to perform some or all of these operations [1, 4, 6, 8, 10, 12, 18, 22, 23, 25, 26, 32]. This work was supported in part by the National Science Foundation under grants CCR 9970488, EIA 0081307, and EIA 0072102; by DARPA under grant F30602 01 C 0078; and by gifts from IBM and Intel. Work conducted in part while the author was with the Department of Computer Science at the ....
[Article contains additional citation context not shown here]
M. Tremblay. "MAJC: Microprocessor Architecture for Java Computing." Presentation at Hot Chips, August 1999.
....instruction prioritization is not. Overall, we find that these techniques have great potential for improving the performance of TLS. 1 Introduction Microprocessors which can simultaneously execute multiple parallel threads are becoming increasingly commonplace. Processors such as the Sun MAJC [34], IBM Power4 [18] and the Sibyte SB 1250 [8] are single-chip multiprocessors (CMPs) while the Alpha 21464 was designed to support simultaneous multithreading [36]. Using this multithreaded hardware to improve the throughput of a workload is straightforward, but improving the performance of a ....
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
....and hardware techniques. The latter TLP is currently achieved either via explicit parallel programming, or with the aid of parallelizing compilers [8, 7]. Future high-performance computers will be able to leverage plentiful on-chip resources to support various granularities of parallelism [1, 6, 22]. While implicit parallelism has been traditionally studied at the instruction level, several implicit techniques that exploit the large transistor budget of next-generation processors have recently been pursued in designs that exploit speculative TLP [11, 12, 5, 20, 21, 24]. This work was ....
Tremblay, M. MAJC: Microprocessor Architecture for Java Computing. In Hot Chips, August 1999.
....Introduction Machines which can simultaneously execute multiple parallel threads are becoming increasingly commonplace on a wide variety of scales. For example, techniques such as simultaneous multithreading [23] (e.g. the Alpha 21464) and single-chip multiprocessing [16] (e.g. the Sun MAJC [21] and the IBM Power4 [10]) suggest that thread-level parallelism may become increasingly important even within a single chip. Beyond chip boundaries, even personal computers are often sold these days in two- or four-processor configurations. Finally, high-end machines (e.g. the SGI Origin [14]) have ....
....interface between TLS hardware and software, can be found in an earlier publication [19]. [Figure: (a) Example pseudo-code: while (continue condition) { ... x = hash[index1]; ... hash[index2] = y; ... } (b) Execution using thread-level speculation: Epochs 1 to 4 run on Processors 1 to 4, each ending with an attempt to commit; after a violation, Epoch 4 is redone.] ....
[Article contains additional citation context not shown here]
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
....Contract DABT63 95 C 0097, and gifts from IBM and Intel. 2000 ACM Intl. Symp. on Computer Architecture. corrective action is taken that involves thread squash and parallel execution resumption. Already companies like Intel and Sun are seriously involved in on-chip speculative parallelization [1, 19]; the latter has even announced a chip multiprocessor (CMP) with such support (MAJC). Most of these small-scale designs are conceived for standalone operation, and are thus not tailored at being integrated into large system configurations. Only [17] is designed for multichip configurations. One ....
....to many different types of nodes including single processors, CMPs, or SMP clusters, in this work we focus on using speculative CMPs as building blocks. With this approach, we hope to leverage emerging speculative CMP technology like the one commercialized by Sun Microsystems in the MAJC chip [19]. Consequently, in our design, we try to minimize the modifications required to a speculative CMP like the one presented in Section 2.2 to incorporate it into a scalable system. If we use CMPs as nodes and want the GMDT to keep state on a per node basis, it is simplest if the threads are assigned ....
[Article contains additional citation context not shown here]
M. Tremblay. "MAJC: Microprocessor Architecture for Java Computing." Presentation at Hot Chips, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor architecture for Java computing. HotChips '99, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
No context found.
TREMBLAY, M. 1999. MAJC: Microprocessor Architecture for Java Computing. Hot Chips.
No context found.
M. Tremblay. MAJC: Microprocessor architecture for Java computing. In Hot Chips, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor architecture for Java computing. Presentation at Hot Chips, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
No context found.
M. Tremblay. MAJC: Microprocessor architecture for Java computing. Presentation at Hot Chips, August 1999.