by Kevin Scott, Kevin Skadron
In Proceedings of the 2nd Annual Workshop on Hardware Support for Objects and Microarchitectures for Java
ftp://ftp.cs.virginia.edu/pub/techreports/CS-2000-05.ps.Z
Abstract:
The popularity of Java has resulted in a flurry of engineering and research activity to improve the performance of Java Virtual Machine (JVM) implementations. This paper introduces the concept of bytecode-level parallelism (BLP), meaning data- and control-independent bytecodes that can be executed concurrently, as a vehicle for achieving substantial performance improvements in JVM implementations, and describes a JVM architecture, JVM-BLP, that uses threads to exploit BLP. Measurements for several large Java programs show that levels of BLP can be as high as 14.564 independent instructions, with an average of 6.768.