26 citations found.
Pentium II Processor Specification Update. Intel Corporation.

This paper is cited in the following contexts:

Modeling the Effect of Technology Trends on the.. - Shivakumar.. (2002)   (10 citations)  (Correct)

....SER per individual logic chain and more than 100 times increase in logic chains per chip. At 50nm with 6 FO4 pipeline, the SER per chip of logic exceeds that of latches, and is within two orders of magnitude of the SER per chip of unprotected memory elements. Mainstream microprocessors from Intel [14] and other vendors [17] have employed ECC to reduce SER of SRAM caches at feature sizes of up to 350nm. For processors that use ECC to protect a large portion of the memory elements on the chip, logic will quickly become the dominant source of soft errors. 6 Discussion The primary focus of our ....

Pentium II Processor Specification Update. Intel Corporation.


Proving the IEEE Correctness of Iterative Floating-Point.. - Cornea-Hasegan   (Correct)

....by the rounding mode) is a direct consequence of the IEEE correctness of the result of the floating point divide operation, as $\mathrm{rem}(a,b) = a - b \cdot \mathrm{near}(a/b)$, where $\mathrm{near}(x)$ is the nearest integer to the real number $x$. Two problems arise. First, we must verify that $\mathrm{near}(a/b)$ fits in an integer ([4] explains this). Second, we must counter the possibility of a double rounding error, because what we calculate is actually $\mathrm{rem}(a,b) = a - b \cdot \mathrm{near}((a/b)_{rn})$. To do this, we just need to compare $(a/b)_{rn}$ and $(a/b)_{rz}$, and apply a correction if needed. Note that this might occur only in the tie ....

Pentium Pro Family Developer's Manual, Intel Corporation, 1996, pp. 11-152 to 11-154.
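
The remainder identity quoted in the excerpt above, rem(a,b) = a - b * near(a/b), can be checked with a small sketch. This is not the cited paper's algorithm: it sidesteps the double rounding problem by evaluating the quotient exactly with rationals, and it assumes ties in near() go to the even integer, as in the IEEE 754 remainder operation. The helper name `ieee_remainder` is hypothetical; `math.remainder` is the standard-library reference it is compared against.

```python
from fractions import Fraction
import math


def ieee_remainder(a: float, b: float) -> float:
    """Evaluate rem(a, b) = a - b * near(a / b) exactly (hypothetical helper).

    Assumes finite a and nonzero finite b. Fractions make the quotient
    exact, so no intermediate rounding can disturb near().
    """
    fa, fb = Fraction(a), Fraction(b)
    n = round(fa / fb)          # round() on a Fraction rounds ties to even
    return float(fa - fb * n)   # convert the exact rational result back to float


# Tie case: 5/2 = 2.5 lies halfway between 2 and 3, and near() picks the even 2.
print(ieee_remainder(5.0, 2.0), math.remainder(5.0, 2.0))
# Tie case: 7/2 = 3.5 lies halfway between 3 and 4, and near() picks the even 4.
print(ieee_remainder(7.0, 2.0), math.remainder(7.0, 2.0))
```

The tie cases are the ones the excerpt singles out: a quotient computed in floating point and then rounded to an integer can land on the wrong side of a tie, which is why the excerpt compares the round-to-nearest and round-toward-zero quotients before correcting.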


Adaptive Line Size Cache - Tang, Veidenbaum, Nicolau, Gupta   (Correct)

.... not make a lot of sense, except possibly as a mechanism to reduce its latency and, as a result, increase the processor clock rate which the cache largely determines, as proposed in [1]. Write policy can be switched between write through and write back; in fact the Intel Pentium TM architecture [9] already allows this on a per page basis but not automatically. However, the parameter that is likely to deliver a significant performance improvement while being feasible to implement adaptively is the cache line size. This paper introduces a cache design with a hardware adaptive line size. To ....

Pentium TM Processor User's Manual, Intel Corporation, 1993.


An Accurate Time-management Unit for Real-time Processors - Kailas, Agrawala (1997)   (Correct)

....high performance processor architectures are not often useful to implement these ideas even though the hardware may support a fast system clock. Some of the embedded microprocessors such as Intel 80x196, Intel386 EX, and commercial high performance processors such as Pentium and Pentium Pro [7] provide on chip timers driven by the processor clock to implement system clocks with better time granularity. Computers that do not use such processors usually use an external hardware timer [6, 3] on the processor bus. But, these timers have been known to lose time during operation [19]. For ....

Pentium Pro Family Developer's Manual, volume 1-3. Intel Corporation, Mt. Prospect, IL, 1996.


A High-Level Language for Programming Complex Temporal.. - Chou, Huang, Fujita (1997)   (1 citation)  (Correct)

....the negation of e as its clock. Furthermore, it can be shown that C is associative, so the temporal projections can be stacked one on top of another to form a hierarchical description of temporal behaviors at different granularities of time. For example, in the Pentium Pro processor bus protocol [15], a memory or I/O operation may consist of multiple transactions, each of which consists of multiple phases, each of which takes multiple clock cycles. A controller for a bus agent might use the following code fragment: operation C transaction C phase where operation (resp. transaction and ....

....translation be one of our main research topics in the future. We are also considering the feasibility of implementing YASL as a macro package for Verilog or VHDL. 5. 2 Examples We are currently investigating the feasibility of formal modeling in YASL of the Pentium Pro processor bus protocol [15]. One problem that has to be solved is the introduction of data (at least bit vectors) into YASL. As we have mentioned earlier, it is not clear how conflicts in data assignments should be handled. Our intuition is that we probably have to live with this difficulty, since it is inherently hard to ....

Pentium Pro Family Developer's Manual, Vol. 1: Specifications, Intel Corporation, 1996.


Adapting Cache Line Size to Application Behavior - Veidenbaum, Tang, Gupta.. (1999)   (16 citations)  (Correct)

.... not make a lot of sense, except possibly as a mechanism to reduce its latency and, as a result, increase the processor clock rate which the cache largely determines, as proposed in [1] Write policy can be switched between write through and write back, in fact the Intel Pentium TM architecture [7] already allows this on a per page basis but not automatically. However, the parameter that is likely to deliver a significant performance improvement while being feasible to implement adaptively is the cache line size. This paper introduces a cache design with a hardware adaptive line size. To ....

Pentium TM Processor User's Manual, Intel Corporation, 1993.


Interleaved Sectored Caches: reconciling low tag volume and low.. - Seznec (1993)   (Correct)

.... An attractive solution is to include in a single chip the tag array and all the logic of the cache controller (see figure 2); this allows to maintain low cycle time for determining cache hit or miss, but on the other hand the integration density automatically limits the size of the tag array [18, 10]. (Footnote 1: The tag array associated with the data cache of the Intel Pentium [10] is even fully triple ported (with random access) while the data cache itself is 8 way interleaved and supports two non conflicting accesses.) ....

.... the logic of the cache controller (see figure 2); this allows to maintain low cycle time for determining cache hit or miss, but on the other hand the integration density automatically limits the size of the tag array [18, 10]. (Footnote 1: The tag array associated with the data cache of the Intel Pentium [10] is even fully triple ported (with random access) while the data cache itself is 8 way interleaved and supports two non conflicting accesses.) ....

Pentium Processor User's Manual, Intel Corporation, 1993


Decoupled Sectored Caches: conciliating low tag implementation.. - Seznec (1994)   (13 citations)  (Correct)

....the cache is off chip, the tag array cannot be built with the same RAM chips as the data array itself. Including the tag array and the whole control logic for the cache in a single companion chip (see figure 2) is an attractive solution that has been adopted for the design of many L2 caches (e.g. [17, 10]). In this case, the maximum size of the tag array is automatically limited by .... (Footnote 2: The tag array associated with the data cache of the Intel Pentium [10] is even triple ported (with random access) while the data cache itself is 8 way interleaved and only supports two non conflicting accesses.)

....itself. Including the tag array and the whole control logic for the cache in a single companion chip (see figure 2) is an attractive solution that has been adopted for the design of many L2 caches (e.g. [17, 10]). In this case, the maximum size of the tag array is automatically limited by integration density. (Footnote 2: The tag array associated with the data cache of the Intel Pentium [10] is even triple ported (with random access) while the data cache itself is 8 way interleaved and only supports two non conflicting accesses.) ....

[Article contains additional citation context not shown here]

Pentium Processor User's Manual, Intel Corporation, 1993


A Framework For Using The Pentium's Performance Monitoring Hardware - Safford (1997)   (2 citations)  (Correct)

....Update [14]. While none of the errata are very serious, most of them are explained along with the event description. For more information on the new events contained in the Pentium Pro processor, consult [15] [17]. For more information on the new events contained in the Pentium with MMX, consult [18]. x# x#data read [0x00] data write [0x01] data read or write [0x28] These events count data read or write accesses, regardless of whether they hit or miss in the internal data cache. I/O reads and writes are not included. In the case of split accesses, each individual component read or write ....

Pentium Processor Family Developer's Manual. Intel Corporation, 1997.


A Framework For Using The Pentium's Performance Monitoring Hardware - Safford (1997)   (2 citations)  (Correct)

....Chapter 26 [6]. Some events have errata, found in the Pentium Processor Specification Update [14]. While none of the errata are very serious, most of them are explained along with the event description. For more information on the new events contained in the Pentium Pro processor, consult [15] [17]. For more information on the new events contained in the Pentium with MMX, consult [18]. x# x#data read [0x00] data write [0x01] data read or write [0x28] These events count data read or write accesses, regardless of whether they hit or miss in the internal data cache. I/O reads and writes ....

Pentium Pro Processor Specification Update, February 1997. Intel Corporation, 1997.


A Framework For Using The Pentium's Performance Monitoring Hardware - Safford (1997)   (2 citations)  (Correct)

....miss in the L2 cache. For more information concerning Pentium performance monitoring counter events, consult the Pentium Processor Family Developer's Manual, Volume 3: Architecture and Programming Manual, Chapter 26 [6]. Some events have errata, found in the Pentium Processor Specification Update [14]. While none of the errata are very serious, most of them are explained along with the event description. For more information on the new events contained in the Pentium Pro processor, consult [15] [17]. For more information on the new events contained in the Pentium with MMX, consult [18] x# ....

Pentium Processor Specification Update, January 1997. Intel Corporation, 1997.


Interleaved Sectored Caches: reconciling low tag volume and low.. - Seznec (1993)   (Correct)

.... An attractive solution is to include in a single chip the tag array and all the logic of the cache controller (see figure 2) this allows to maintain low cycle time for determining cache hit or miss, but on the other hand the integration density automatically limits the size of the tag array [18, 10]. 2.2 Sectored caches versus large line sizes In order to reduce this high cost of address tags, larger line sizes (e.g. 128 bytes as on the IBM Power [8] may be used; but many valuable arguments push designers to prefer small cache lines: A) Experience shows that, for a large number of ....

.... to prefer small cache lines: A) Experience shows that, for a large number of applications and for many cache sizes and organizations, the minimum cache miss ratio is obtained for line sizes in the range of 16 to 64 bytes [7]. (Footnote 1: The tag array associated with the data cache of the Intel Pentium [10] is even fully triple ported (with random access) while the data cache itself is 8 way interleaved and supports two non conflicting accesses.) ....

Pentium Processor User's Manual, Intel Corporation, 1993


Coherence Controller Architectures for SMP-Based.. - Michael, Nanda, Lim.. (1997)   (15 citations)  (Correct)

....tradeoffs between these two alternatives in designing a CC NUMA multiprocessor coherence controller. We consider symmetric multiprocessor (SMP) nodes as well as uniprocessor nodes as the building block for a multiprocessor. The availability of cost effective SMPs based on the Intel Pentium Pro [9] makes SMP nodes an attractive choice for CC NUMA designers [6]. However, the added load presented to the coherence controller by multiple SMP processors may affect the choice between custom hardware FSMs and protocol processors. We base our experimental evaluation of the alternative coherence ....

....to proceed before handling any more network side requests. Figure 2 shows a block diagram of a custom hardware coherence controller design (HWC) The controller runs at 100 MHz, the same frequency as the SMP bus. All the coherence controller components are on the same chip ex (e,g. Pentium Pro [9]) allow users to designate regions of memory to be cached write through. RPE bus side fast directory controller side directory cache directory controller access directory network interface to network bus interface protocol dispatch controller to SMP bus LPE Figure 4: A custom hardware coherence ....

Pentium Pro Family Developer's Manual. Intel Corporation, 1996.


Don't Use the Page Number, But a Pointer on It - Andre Seznec   (Correct)

....sizes in storage bits: • In many microprocessors, the tag array has to service more transactions at a time than the cache array itself. In order to maintain cache coherency, the tag array may also have to support a snooping transaction on the system bus. For instance, on the Intel Pentium [7], the tag array of the data cache is triple ported (with random access) while the data cache itself is 8 way interleaved and supports only two non conflicting accesses per cycle. • The cache hit time is often one of the critical path in a processor. The hit time includes the delays for reading ....

Pentium Processor User's Manual, Intel Corporation, 1993


Whole-Program Optimization of Object-Oriented Languages - Dean (1996)   (20 citations)  (Correct)

....unknown routines) When dynamic dispatches are used infrequently, this does not significantly impact performance. Frequent use of dynamic dispatching, however, can have substantial performance implications. Modern architectures, such as the MIPS R10000 [Martin et al. 95] or the Intel Pentium Pro [Int95] exploit fine grained parallelism by having a large window of ready instructions to issue and they rely on predictable control flow in order to keep this window full of useful instructions. In such systems, the frequent indirect control transfers associated with dynamic dispatches can be a ....

Pentium Pro Family Programmer's Reference Manual. Intel Corporation, Inc., Santa Clara, CA, 1995.


Temporal accuracy and modern high performance processors: A case.. - Kailas (1997)   (Correct)

....with 12 stage pipeline. The processor has an instruction pool coupled with three independent units, viz. the Fetch Decode unit, the Dispatch Execute unit and the Retire unit as shown in Figure 1. A user program is executed by the Pentium Pro processor as follows (for a detailed description see [2, 5]). The user program instruction stream is fetched from the instruction cache and decoded into a series of micro operations (µops) by the Fetch Decode unit. Pre fetching of instructions is speculative, based on a dynamic branch prediction scheme. The Dispatch Execute unit speculatively executes the ....

.... [Figure 1: Pentium Pro schematic (adapted from Pentium Pro Family Developer's Manual [2])] The Pentium Pro architecture offers two interesting timing mechanisms: a pollable 64 bit time register called the Time Stamp Counter (TSC) and a 32 bit programmable timer. The TSC register is incremented at the processor's clock speed and can be accessed with either one of these two ....

[Article contains additional citation context not shown here]

Pentium Pro Family Developer's Manual, volume 1-3. Intel Corporation, Mt. Prospect, IL, 1996.


Coherence Controller Architectures for SMP-Based.. - Michael, Nanda, Lim.. (1997)   (15 citations)  (Correct)

....these two alternatives in designing a CC NUMA multiprocessor coherence controller. We consider symmetric multiprocessor (SMP) nodes as well as uniprocessor nodes as the building block for a multiprocessor. The availability of cost effective SMPs, such as those based on the Intel Pentium Pro [11] makes SMP nodes an attractive choice for CC NUMA designers [7] However, the added load presented to the coherence controller by multiple SMP processors may affect the choice between custom hardware FSMs and protocol processors. We base our experimental evaluation of the alternative coherence ....

....of the controller through loads and stores on the local (coherence controller) bus to memory mapped off chip registers in the other components. The protocol processor access to the protocol dispatch controller 1 Although most processors use write back caches, current processors (e.g. Pentium Pro [11]) allow users to designate regions of memory to be cached write through. RPE bus side fast directory controller side directory cache directory controller access directory network interface to network bus interface protocol dispatch controller to SMP bus LPE Figure 4: A custom hardware coherence ....

Pentium Pro Family Developer's Manual. Intel Corporation, 1996.


Critical Issues Regarding the Trace Cache Fetch Mechanism - Patel, Friendly, Patt (1997)   (29 citations)  (Correct)

....instructions per cycle, the problem is primarily one of encountering a cache line boundary before the full number of instructions are retrieved. While troublesome, this problem is overcome with straightforward techniques such as fetching two adjacent cache lines and realigning the instructions [11]. For processors capable of executing six or more instructions per cycle, the need to fetch beyond control instructions arises. As the focus of our current research in uniprocessor design has concentrated on wide issue machines (e.g. 16 wide) the design of high bandwidth fetch engines is extremely ....

Pentium Processor User's Manual Volume 1: Pentium Processor Data Book, Intel Corporation, 1993.


Modeling the Impact of Device and Pipeline Scaling .. - Shivakumar.. (2002)   Self-citation (Processor)   (Correct)

No context found.

Pentium II Processor Specification Update. Intel Corporation.


Modeling the Impact of Device and Pipeline Scaling .. - Shivakumar.. (2002)   Self-citation (Processor)   (Correct)

....SER per individual logic chain and more than 100 times increase in logic chains per chip. At 50nm with 6 FO4 pipeline, the SER per chip of logic exceeds that of latches, and is within two orders of magnitude of the SER per chip of unprotected memory elements. Mainstream microprocessors from Intel [18] and other vendors [21] have employed ECC to reduce SER of SRAM caches at feature sizes of up to 350nm. For processors that use ECC to protect a large portion of the memory elements on the chip, logic will quickly become the dominant source of soft errors. 6 Discussion The primary focus of our ....

Pentium II Processor Specification Update. Intel Corporation.


Message Passing Efficiency on Shared Memory Architectures - Thomas Radke University   (Correct)

No context found.

Pentium TM Processor User's Manual, Intel Corporation, 1994.


FLECKmarks: Measuring Floating Point Performance using a FulL.. - Darcy, Gay   (Correct)

No context found.

Pentium Pro Family Developer's Manual, Intel Corporation, 1996.


Virtual Memory in Contemporary Microprocessors - Jacob, Mudge (1998)   (11 citations)  (Correct)

No context found.

Pentium Processor User's Manual, Intel Corporation, Mt. Prospect, Ill., 1993.

CiteSeer.IST - Copyright Penn State and NEC