5 citations found.
Greg Henry, Pat Fay, Ben Cole, and Timothy G. Mattson. The performance of the Intel TFLOPS supercomputer. Intel Technical Journal, 98(1), 1998. available online at http://developer.intel.com/technology/itj/.


This paper is cited in the following contexts:
Communication Lower Bounds for Distributed-Memory Matrix.. - Irony, Toledo

....computations, including matrix multiplications. They showed that blocked algorithms transferred fewer words between fast and slow memory than algorithms that operated by row or by column. High quality implementations of I/O-efficient matrix multiplication algorithms are widely available and used [2, 1, 5, 7, 11, 14, 17, 15, 16, 18, 19, 26]. The proof of the next theorem is very similar to the proof of Lemma 3.1. Theorem 7.1. Consider the conventional multiplication of two n by n matrices on a computer with a large slow memory and a fast cache that can contain M words. Arithmetic operations can only be performed on words that are in ....

Greg Henry, Pat Fay, Ben Cole, and Timothy G. Mattson. The performance of the Intel TFLOPS supercomputer. Intel Technical Journal, 98(1), 1998. available online at http://developer.intel.com/technology/itj/.


Selective Cache Ways: On-Demand Cache Resource Allocation - Albonesi (2000)   (85 citations)

....In the Intel Pentium Pro processor, the WBINVD instruction causes modified blocks in the L1 Dcache and the L2 cache to be flushed to memory and all status bits set to Invalid. On the Intel TFLOPS supercomputer, a WBINVD writes back data at a rate of 250 MB/sec over the 66 MHz 64-bit memory bus [19]. Assuming a 64KB L1 Dcache with 50% modified blocks and three ways to flush, a cache flush at this rate would take about 100 µs. However, this assumes that all writebacks must be sent to main memory. Because we only need to flush the L1 Dcache, writebacks may hit to a block in a non-shared state ....

G. Henry, P. Fay, B. Cole, and T. Mattson, "The performance of the Intel TFLOPS supercomputer," Intel Technology Journal, Q1 1998.
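
The flush-time estimate in the Albonesi excerpt above can be checked with a short calculation. This is only a sketch under stated assumptions: it reads the excerpt as saying that half of the blocks are modified, treats "three ways to flush" as three ways of a 4-way set-associative cache (the associativity is not given in the excerpt), and takes 250 MB/sec as 250e6 bytes per second. The function name and parameters are hypothetical, chosen for illustration.

```python
def flush_time_us(cache_bytes, modified_fraction, ways_flushed, total_ways,
                  bandwidth_bytes_per_s):
    """Estimate the time, in microseconds, to write back the modified blocks."""
    # Bytes that must be written back: only modified blocks in the flushed ways.
    dirty_bytes = cache_bytes * modified_fraction * ways_flushed / total_ways
    # Time = bytes / bandwidth, converted from seconds to microseconds.
    return dirty_bytes / bandwidth_bytes_per_s * 1e6


# Figures from the excerpt: 64KB L1 Dcache, 250 MB/sec writeback rate.
# Assumptions (not in the excerpt): 50% modified, 3 of 4 ways flushed.
estimate = flush_time_us(64 * 1024, 0.5, 3, 4, 250e6)
print(f"{estimate:.1f} us")  # prints "98.3 us"
```

Under these assumptions the result is about 98 microseconds, consistent with a flush time on the order of 100 microseconds.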


An Overview of the Intel TFLOPS Supercomputer - Mattson, Henry (1998)   (5 citations)  Self-citation (Henry Mattson)

....running on the machine, significant results have been produced. Some of these results and a detailed discussion of performance issues related to the system are described in another paper in this issue of the Intel Technology Journal entitled The Performance of the Intel TFLOPS Supercomputer [10]. In this paper, we describe the motivation behind this machine, the system hardware and software, and how the system is used by both programmers and the end users. The level of detail varies. When a topic is addressed elsewhere, it is discussed only briefly in this paper. For example, we say very ....

G. Henry, B. Cole, P. Fay, T.G. Mattson, "The Performance of the Intel TFLOPS Supercomputer," Intel Technology Journal, Q1'98 issue, 1998.


Application of a High Performance Parallel Eigensolver to.. - Sears, Stanley, Henry   (1 citation)  Self-citation (Henry)

....is optimal for just benchmarking the eigensolver. If in fact MP Linpack is run with a matrix size comparable to that used in our tests then the performance is also comparable. To test this we made an MP Linpack run, using the same Intel-tuned MPLinpack code used to obtain a Teraflop last June [15, 22], but on a problem size requiring the same space as storing the matrix for Si 3072. This problem had a complex double precision matrix of size 39936, which is roughly equivalent to a real double precision matrix of size 56500. Running MPLinpack on this equivalent size problem we obtained ....

G. Henry, P. Fay, B. Cole, and T.G. Mattson. The performance of the intel tflops supercomputer.


Communication Lower Bounds for Distributed-Memory Matrix.. - Irony, Toledo (2004)

No context found.

Greg Henry, Pat Fay, Ben Cole, and Timothy G. Mattson. The performance of the Intel TFLOPS supercomputer. Intel Technical Journal, 98(1), 1998. available online at http://developer.intel.com/technology/itj/.


CiteSeer.IST - Copyright Penn State and NEC