27 citations found.
Sun Microsystems Corporation. The SPARC Architecture Manual, Version 7, 1987.


This paper is cited in the following contexts:


Using Information from the Programmer to Implement Shared-Memory.. - Adve (1998)   (3 citations)

....performance. Researchers have proposed several approaches to relax the program order and atomicity constraints of sequential consistency. One common technique is to simply relax the consistency model by explicitly allowing out-of-order and non-atomic execution of certain memory operations [14, 16, 17, 22, 35, 36]. Such models provide significant performance gains but require programmers to forgo the simple interface of sequential consistency. Furthermore, several relaxed models exist and often differ from each other in subtle but significant ways [3, 19]. The variety and complexity of these models ....

....A more comprehensive coverage appears in [1]. 7.1 Work Motivated by Hardware and Runtime System Optimizations. Several researchers have proposed relaxed memory consistency models that explicitly specify relaxations of the program order and atomicity requirements of sequential consistency (e.g. [17, 22, 25, 36, 35, 16, 11, 14]). A disadvantage of these models is that their system-centric nature results in several different and complex interfaces for the programmer [3, 19], thereby complicating programmability and portability. To address the above disadvantages of the system-centric relaxed models, researchers have ....

Sun Microsystems Inc. The SPARC Architecture Manual, January 1991. No. 800-199-12, Version 8.


Applying Programming Language Implementation Techniques to.. - Schnarr (2000)   (2 citations)

....such as MIPS R3000, and SPARC are specified with one token per instruction, but CISC ISAs need multiple tokens to encode variable width instructions. Patterns for several instructions can be specified simultaneously in tabular format, similar to the tables found in many architecture manuals [72], compacting the descriptions and reducing the potential for programmer error. Finally, constructors map between assembly language and machine language representations. This toolkit produces encoding procedures for each assembly instruction that represents a corresponding machine code ....

....Instead, SimpleScalar is used as a surrogate, as it simulates a comparable processor at an equivalent level of detail. 3.1. The Structure of FastSim v.1. FastSim v.1 is a cycle-accurate, direct-execution simulator of an out-of-order uniprocessor. Like RSIM, it models a SPARC v.8 [72] instruction set running on a MIPS R10000-like [81] micro-architecture (Figure 3.1), although, unlike RSIM, FastSim only simulates a single processor. FastSim's processor model supports out-of-order instruction execution, speculative execution, and an aggressive non-blocking cache. Table 3.1 ....

Sun Microsystems, The SPARC Architecture Manual (Version 8), December 1990.


Specialized Caches To Improve Data Access Performance - Bray (1993)   (2 citations)

....in the RISC I [PS81]. Overlapping of the register windows improves parameter passing performance. For the SPARC architecture, the overlapping register windows scheme was modified to have callee save windows instead of caller save windows (see [Mic87] for SPARC register file organization). In the called procedure, a new register window is allocated (deallocated) with SAVE (RESTORE) instructions, instead of having the CALL (RET) instruction allocating (deallocating) a window. Callee save windows allow the compiler to determine if a new ....

....save and restore between the second level and the first level register file does not cause exception handling problems, and does not cause extra traffic to the rest of the memory hierarchy. The rest of the memory hierarchy does not see any extra memory requests. In the SPARC architecture [Mic87], conditional branches do not use either source operand; therefore, 2 first level register reads are available for dribbling inactive sets to the second level (if the second level is not full). Since approximately 20 percent of instructions are conditional branches, there is considerable dribble ....

[Article contains additional citation context not shown here]

Sun Microsystems, Inc. The SPARC Architecture Manual, Version 7. Sun Microsystems, Inc., 1987.


The Design and Implementation of the SELF Compiler, an.. - Chambers (1992)   (3 citations)

.... provide special instructions such as link, jsr, and movem for managing linked stacks of activation records and stack pointers [Mot85] and the Sun SPARC architecture provides hardware register windows to support fast procedure calls and returns with little register saving and restoring overhead [Sun91]. Garbage collection places some requirements on the design and implementation of the run time system and the compiler. The garbage collector must be able to locate all object references stored in registers or on the stack. In the current SELF implementation, the compiler places a saved locations ....

Sun Microsystems. The SPARC Architecture Manual, Version 8. January, 1991.


An Evaluation of Memory Consistency Models for.. - Parthasarathy..

....require the programmer to be explicitly aware of the effect of such reorderings and write the program suitably to ensure correctness. A number of relaxed consistency models, differing in the way in which they relax the requirements between various classes of memory operations, have been proposed [23, 8, 15, 9, 14, 19, 41, 7]. We next discuss one of the most relaxed models, release consistency (abbreviated as RC). Release consistency exploits the key observation that most programs use synchronizing memory operations to ensure ordering between concurrent accesses to the same memory location by different processors. ....

Sun Microsystems Inc. The SPARC Architecture Manual, January 1991. No. 800-199-12, Version 8.


Message Passing Support in the Avalanche Widget - Swanson, Kuramkote, Stoller, .. (1996)   (2 citations)

....as a bulk transfer mechanism that can overcome a shortcoming of DSM in moving large quantities of data. Alewife [7] represents one of the earliest hybrid distributed shared memory explicit message passing systems. Its approach was very invasive, requiring fabrication of a custom version of a SPARC [10] CPU. Message handling received little support, other than limited DMA capability, necessitating significant processor involvement in message reception and a high reliance on interrupts. Alewife implemented local cache coherence, as does the Avalanche Widget. Little is reported on the memory ....

SUN Microsystems, Mountain View, CA. SPARC Architecture Manual, 1988.


A Compact Intermediate Format for SIMICS - Peter Magnusson (1994)   (3 citations)

....address space (e.g. bus based multiprocessor), distributed (e.g. message passing architecture), or hybrids; can profile memory usage (working set) as well as traditional code profiling; can have a symbolic front end (GDB 4.11 or Xray) or run in batch mode; is portable (currently runs on SunOS 4.1, Solaris, or HPUX); is completely deterministic; is fast and has low memory overhead. Most of the features can be selected interactively. Thus, SIMICS can simulate a four processor architecture with a shared memory bus and 64K direct mapped first level cache, or a 16 processor distributed ....

....of different combinations of features. Data cache simulates a 16k, 2 way set associative first level cache with 32 byte cache lines. Instruction profiling counts exactly how many times an instruction in a particular memory location was successfully executed. The first example runs a simple Sparc SunOS 4.1 user program, the infamous Dhrystone 2.1 benchmark (Weicker 84). In the measurement, it runs 100,000 iterations, which requires approximately 50 million instructions. Accurate data cache contents are maintained at a 2 performance loss. The second example runs a much larger Sparc program ....

[Article contains additional citation context not shown here]

Sun 1990, The Sparc Architecture Manual, Version 8, Sun Microsystems, USA, December 1990.


The Interaction of Architecture and Operating System.. - Anderson, Levy, Bershad, .. (1991)   (107 citations)

.... implementation, the VAXstation 3200 (11.1 MHz CVAX [Leonard 87]), and four RISC implementations: the Tektronix XD88/01 (20 MHz Motorola 88000 [Mot 88a, Mot 88b]), DECstation 3100 (16.6 MHz MIPS R2000 [Kane 87]), DECstation 5000/200 (25 MHz MIPS R3000), and SPARCstation 1 (25 MHz Sun SPARC [Sun 87, Cyp 90]). For brevity, our tables list the architecture or microprocessor names, rather than the system names, although performance is of course affected not only by instruction set architecture and processor technology, but by attributes specific to particular system level implementation ....

Sun Microsystems, Inc., Mountain View, CA. The SPARC Architecture Manual, 1987.


Instrumentation Tools - Pierce, Smith, Mudge (1995)   (2 citations)

....uses abstract execution [3] to minimize the amount of instrumentation overhead that occurs during the execution of an instrumented application. Second, they implemented their tool on a SPARC architecture where they could take advantage of several unused registers that are reserved by the SPARC ABI [27]. They use one of these reserved registers as the single, global, register-based trace buffer pointer that is shared by all instrumented executables. This decision removes the need for the copying of the per-process trace buffers into the global trace buffer as seen in Chen's system. They also ....

Sun Microsystems, The Sparc Architecture Manual, 1989.


The Peregrine High-Performance RPC System - Johnson, Zwaenepoel (1993)   (34 citations)

....However, many other architectures also require that the translation lookaside buffer (TLB) entries for the remapped pages in the MMU be modified. Page remapping can still be performed efficiently in such systems with modern MMU designs. For example, the new Sun Microsystems SPARC reference MMU [14, 18] uses a TLB but allows individual TLB entries to be flushed by virtual address, saving the expense of reloading the entire TLB after a global flush. Similarly, the MIPS MMU [9] allows the operating system to individually modify any specified TLB entry. 4.2. The Packet Header. The Peregrine RPC ....

....client thread while waiting for the RPC results to be returned must be a complete context switch, saving and restoring all registers. On processors with larger numbers of registers that must be saved and restored on a context switch and a kernel trap, such as the SPARC processor's register windows [18], this optimization will increase further in significance [1]. 5. The arguments are mapped into the server's address space, rather than being copied. The cost of performing memory-to-memory copies was reported above. From Table IV, the cost of remapping the Ethernet receive buffer in the server to ....

Sun Microsystems, Inc. The SPARC architecture manual, version 8, January 1991.


Using Information from the Programmer to Implement System.. - Adve (1996)   (3 citations)

....this paper focuses on the last technique as further elaborated in Section 1.2. 1.1 Techniques to Relax Program Order and Atomicity Requirements. Relaxed memory consistency models. Several relaxed memory consistency models have been proposed that allow out-of-order and non-atomic memory operations [18, 20, 21, 28, 50, 51], and provide significant performance gains [25, 27, 53]. The disadvantage of relaxed models, however, is that they require programmers to forgo the simple interface of sequential consistency; instead, programmers must deal with out-of-order and non-atomic operations. Furthermore, several relaxed ....

....and systems, and then on the allowed optimizations. 7.1.1 The Framework. Several relaxed memory consistency models have been proposed, including weak ordering [21], release consistency (RCsc and RCpc) [28], lazy release consistency [34], processor consistency [28], SPARC V8 total store ordering [51], SPARC V8 partial store ordering [51], SPARC V9 relaxed memory ordering [50], the Alpha model [15, 20], and the PowerPC model [18]. The models describe various relaxations of the program order and atomicity requirements of sequential consistency, thereby improving performance [25, 27, 53]. A ....

[Article contains additional citation context not shown here]

Sun Microsystems Inc. The SPARC Architecture Manual, January 1991. No. 800-199-12, Version 8.


The Peregrine High-Performance RPC System - Johnson (1993)   (34 citations)

....However, many other architectures also require that the translation lookaside buffer (TLB) entries for the remapped pages in the MMU be modified. Page remapping can still be performed efficiently in such systems with modern MMU designs. For example, the new Sun Microsystems SPARC reference MMU [14, 18] uses a TLB but allows individual TLB entries to be flushed by virtual address, saving the expense of reloading the entire TLB after a global flush. Similarly, the MIPS MMU [9] allows the operating system to individually modify any specified TLB entry. 4.2 The Packet Header The Peregrine RPC ....

....client thread while waiting for the RPC results to be returned must be a complete context switch, saving and restoring all registers. On processors with larger numbers of registers that must be saved and restored on a context switch and a kernel trap, such as the SPARC processor's register windows [18], this optimization will increase further in significance [1]. 5. The arguments are mapped into the server's address space, rather than being copied. The cost of performing memory-to-memory copies was reported above. From Table 4, the cost of remapping the Ethernet receive buffer in the server to ....

Sun Microsystems, Inc. The SPARC architecture manual, version 8, January 1991.


Register Windows and User-Space Threads on the SPARC - Keppel (1991)   (5 citations)

....one register set per thread (Appendix B). 2 Terminology, Background, and Conventions. It is assumed that the user is already familiar with the implementation of a basic user space threads package. This document assumes familiarity with the basics of SPARC register windows and assembly language [Sun 91]. The key concepts include: Caller save and callee save registers: Two functions (the caller and the callee) will both use registers. The caller and the callee follow a protocol to ensure that the callee does not clobber (fill with garbage values) registers that have valid caller data. The ....

....outs). Saving cached register sets is discussed in the following section. 5.1 Regular, Leaf, or Inlined Procedure. Different implementations of cswap will need to save different registers. Several choices are discussed below. If the thread context swap routine is not treated as a leaf function [Sun 91], then only the frame pointer (%fp) and function return address (%i7) registers need to be saved. As discussed in §2, globals and floating-point registers do not usually need to be saved, because most compilers and systems assume that they are clobbered across function calls. The in registers ....

Sun Microsystems. The SPARC Architecture Manual, Version 8, 1991.


Some Efficient Techniques for Simulating Memory - Peter Magnusson (1994)   (2 citations)

....data cache simulation with another useful feature in SIMICS, instruction profiling. Instruction profiling counts exactly how many times an instruction in a particular memory location was successfully executed. We also indicate the combined effects. The first example runs a simple Sparc SunOS 4.1 user program, the infamous Dhrystone 2.1 benchmark (Weicker 84). In the measurement, it runs 100,000 iterations, which requires approximately 50 million instructions. The cache performance is here excellent (0.001 miss rate), so the STC performs admirably. Accurate data cache contents are ....

....larger Sparc program from the SPECint92 suite, which requires 1.25 billion instructions to complete. The data cache behavior is worse (a realistic working set) and .... [Footnote 21: We have borrowed the term UMR from the commercial product Purify, which has a similar function.] [Footnote 22: The measurements were done on a Sun SC2000.] [Footnote 23: In system-level simulation this is more complex than just measuring entries into basic blocks, for several reasons: a basic block may be interrupted by an exception and not re-entered, the program may generate code at runtime (such as trap vectors), etc.] ....

[Article contains additional citation context not shown here]

Sun 1990, The Sparc Architecture Manual, Version 8, Sun Microsystems, USA, December 1990.


Fault Interpretation: Fine-Grain Monitoring of Page Accesses - Daniel Edelson (1993)   (1 citation)

....and some caveats, and Sect. 7 concludes the report. 1 Caveat: This technique requires knowing the precise state of the CPU when a protection violation occurs. It may not be possible to implement this functionality on all RISC architectures. We have implemented it on the SPARC processor [Cyp90, Sun87]. 1. Fault Interpretation: Memory Access Monitoring. Fault interpretation allows an application to detect all reads and/or writes to selected pages of its virtual address space. The library uses the mprotect system call to disallow accesses to ....

Sun Microsystems, Inc. The SPARC architecture manual, 1987. Part No. 800-11399-07.


Retargetable Instruction Scheduling for Pipelined Processors - Bradlee (1991)   (15 citations)

....require the user to write a separate instruction directive for each result. This may cause two instructions to be generated where one would suffice; it also prevents the use of auto increment addressing modes. Many RISCs do not have instruction side effects, but both the i860 and the Sun SPARC [Sun87] have condition codes and the i860 has autoincrement addressing. At the expense of occasional inefficient code, this restriction retains simplicity in deriving the patterns from the description and in matching the patterns. Peephole optimization techniques for exploiting instruction side effects ....

Sun Microsystems, Inc., Mountain View, California. The SPARC Architecture Manual, 1987.


RSIM Reference Manual - Pai, Ranganathan, Adve (1997)   (3 citations)

....memory consistency model. RSIM supports three types of multiprocessor memory consistency protocols: relaxed memory ordering (RMO) [23] and release consistency (RC) [6]; sequential consistency (SC) [11]; and processor consistency (PC) [6] and total store ordering (TSO) [26]. Each of these memory models is supported with a straightforward implementation and optimized implementations. We first describe the straightforward implementation and then the more optimized implementations for each of these models. The relaxed memory ....

Sun Microsystems Inc. The SPARC Architecture Manual, January 1991. No. 800-199-12, Version 8.


Tempest and Typhoon: User-Level Shared Memory - Reinhardt, Larus, Wood (1994)   (247 citations)

....processor memory nodes connected by a high bandwidth, low latency point-to-point network (see Figure 1). For economic reasons, commodity components are used for the processor, bus, memory controller, and DRAM. Specifically, each Typhoon node has a SuperSPARC processor connected to a level 2 MBus [31]. The one custom component is the network interface device, the network inter .... [Footnote 1: However, the basic design should work with any coherent bus using an ownership protocol and cache-to-cache transfers.] [FIGURE 1.] ....

....be customized to implement arbitrary, application dependent scatter gather operations. 5.3 Virtual Memory Management. Conventional paged virtual memory hardware is sufficient to provide the needed user level functions. The NP and primary CPU both implement versions of the SPARC 8 reference MMU [31]. While the primary processor and the NP may use separate page tables, they share a single table in our current implementation. The operating system interface is similar to that of [35]. 5.4 Fine Grain Access Control. As described in Section 2.4, the fine grain access control model provides ....

Sun Microsystems. The SPARC Architecture Manual (Version 8), December 1990.


The Cilk System for Parallel Multithreaded Computing - Joerg (1996)   (33 citations)

....hooks make it quite simple and cheap for commercial computer manufacturers to build inexpensive, entry level, multiprocessor machines. This trend towards including multiprocessor support in standard microprocessors occurred first with processors used in workstations (e.g. MIPS R4000 [MWV92], Sparc [Sun89], PowerPC 601 [Mot93]) and more recently with processors for PCs (e.g. Intel's Pentium P54C [Gwe94]). As with any other commodity, as parallel machines drop in price, they become cost effective in new areas, leading to parallel machines being installed at more and more sites. If this trend ....

Sun Microsystems, Inc. Sparc Architecture Manual, Version 8, January 1989.


The Importance of Prepass Code Scheduling for.. - Chang, Lavery.. (1994)   (13 citations)

....to make prescheduling necessary. 2.2 IMPACT-I C Compiler. The IMPACT-I C Compiler [7] is a retargetable, optimizing compiler designed to generate very efficient code for pipelined and multiple instruction issue processors. Code generators have been built for the MIPS R2000 [8], the Sun SPARC [9], the AMD 29K [10], the Intel i860 [11], and the HP PA [12] processors. IMPACT-I is used to study the effectiveness of new code optimization techniques and to study alternative approaches in the design of processors that exploit instruction-level parallelism. The compiler contains a profiler to ....

Sun Microsystems, "The SPARC Architecture Manual," Part No. 800-1399-07, Revision 50, Mountain View, CA, Aug. 1987.


CSDL: Reusable Computing System Descriptions for Retargetable.. - Bailey

No context found.

Sun Microsystems Corporation. The SPARC Architecture Manual, Version 7, 1987.


The Performance Potential of Data Dependence.. - Sazeides, Vassiliadis, .. (1996)   (43 citations)

No context found.

S. MICROSYSTEMS. The SPARC Architecture Manual. Prentice Hall, 1992.


The Peregrine High-performance RPC system - Johnson, Zwaenepoel (1993)   (34 citations)

No context found.

Sun Microsystems, Inc. The SPARC Architecture Manual, version 8, January 1991.


The Oberon System Family - Brandis, Crelier, Franz, Templ (1995)

No context found.

SUN Microsystems, The SPARC Architecture Manual, Revision 50, August 1987.


The Spineless Tagless G-Machine - NOT! - Hammond (1993)   (1 citation)

No context found.

Sun Microsystems Inc. The SPARC Architecture Manual, Version 7, October 22, 1987.


CiteSeer.IST - Copyright Penn State and NEC