7 citations found.
McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, TR CS-93-54, November, 1993.


This paper is cited in the following contexts:
Dynamic Access Ordering: Bounds on Memory Bandwidth - McKee (1994)   (1 citation)  Self-citation (Mckee)

.... numerous simulation results demonstrating its effectiveness [McK94a]. The hardware part of this solution is the Stream Memory Controller (SMC). An analytical model to bound asymptotic SMC performance for unit-stride vectors has been developed and extended for non-unit-stride vectors in [McK93b, McK93c]. Here we develop a model to bound SMC performance on short vectors, and we extend the asymptotic model to describe symmetric multiprocessor (SMP) SMC performance. Note that we shall use the terms vector and stream interchangeably when doing so causes no confusion: a read vector is equivalent ....

....overwhelming similarity of the performance trends for most benchmarks and system configurations, we only discuss highlights of our results here. Other results can be found in the Appendix, and a more detailed comparison of uniprocessor simulation and asymptotic model results can be found in [McK93b, McK93c]. Figure 10 depicts results for a uniprocessor SMC system with one bank. The vectors for Figure 10(a) and Figure 10(b) are 100 elements long; those for Figure 10(c) and Figure 10(d) are 10,000 elements. Figure 10(a) and Figure 10(c) illustrate the performance curves for vaxpy, which involves ....

McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, Technical Report CS-93-54, November, 1993.


Maximizing Memory Bandwidth for Streamed Computations - McKee (1995)   (7 citations)  Self-citation (Mckee)

.... Multiprocessor SMC organization introduced in Chapter 4 were first presented in [McK95b]. Parts of the results in Chapter 2 appear in [McK95a]. Complete results for the functional simulations and analytic models presented in Chapter 2 through Chapter 5 can be found in our technical reports [McK93a, McK93b, McK94c, McK94d]. ....

S.A. McKee, "Analytic Models of SMC Performance", Technical Report CS-94-38, Department of Computer Science, University of Virginia, October 1994.


Maximizing Memory Bandwidth for Streamed Computations - McKee (1995)   (7 citations)  Self-citation (Mckee)

.... Multiprocessor SMC organization introduced in Chapter 4 were first presented in [McK95b]. Parts of the results in Chapter 2 appear in [McK95a]. Complete results for the functional simulations and analytic models presented in Chapter 2 through Chapter 5 can be found in our technical reports [McK93a, McK93b, McK94c, McK94d]. ....

S.A. McKee, "An Analytic Model of SMC Performance", Technical Report CS-93-54, Department of Computer Science, University of Virginia, November 1993.


Uniprocessor SMC Performance on Vectors with Non-Unit Strides - McKee (1994)   (1 citation)  Self-citation (Mckee)

....vectors an SMC system with 16 deep FIFOs to perform comparably to a similar system with deeps FIFOs and a vector stride equal to the page size divided by sixteen times the data size. The relationship turns out to be slightly more complicated than this, but is explained by the analytic model of [McK93d]. Let b be the number of interleaved memory banks, f be the depth of the FIFOs, v be the number of distinct vectors in the computation, and s be the number of streams. A single access vector constitutes one stream, whereas a ....

....streams. A single access vector constitutes one stream, whereas a read-modify-write vector counts as two. The model states that the average page miss rate for each FIFO in a stride-1 computation involving at least two vectors is [McK93d]. This formula also applies to vectors with non-unit strides, provided that the number of vector elements residing in a DRAM page is significantly greater than the FIFO depth. The part of this equation comes from taking the limit of a converging series; this series models the effect created by the ....

[Article contains additional citation context not shown here]

McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, TR CS-93-54, November, 1994.


Dynamic Access Ordering for Symmetric Shared-Memory Multiprocessors - McKee (1994)   Self-citation (Mckee)

No context found.

McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, TR CS-93-54, November, 1993.


Dynamic Access Ordering for Symmetric Shared-Memory Multiprocessors - McKee (1994)

No context found.

McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, TR CS-93-54, November, 1993.


Uniprocessor SMC Performance on Vectors with Non-Unit Strides - Computer Science Report

No context found.

McKee, S.A., "An Analytic Model of SMC Performance", University of Virginia, TR CS-93-54, November, 1994.


CiteSeer.IST - Copyright Penn State and NEC