Dongarra, J. 1998. Performance of various computers using standard linear equations software. Report CS-89-85, Department of Computer Science, University of Tennessee, TN.
....source code was linked with the Solaris thread library and was compiled using Sun's C Compiler Version 4.0, with options: xtarget=ultra xarch=v8plus xcache=16 32 1:4096 64 1 O5 75 For a comparison of how this platform performs in relation to other platforms see the Linpack Benchmark Report [Dongarra,92]. An updated version of the report can be found at http://www.netlib.org/benchmark/performance.ps. It should be noted that although the test bed had fourteen processors, we conducted experiments with at most twelve active threads. This is because we did not have access to all the processors of ....
Dongarra, J.J.: Performance of various computers using standard linear equation software, Computer Architecture News, pages 22-44, 1992.
....of memory operations to computation is high in this kernel, making it unfavorable for our algorithms. The overhead due to irregular memory access patterns would decrease with more computation in the loop. All experiments reported in this section were done on a Sparcstation 2, for which Dongarra [3] reports a 100 × 100 double precision LINPACK performance of 4.0 Mflops. We measured 3.3 Mflops. A repetition of the experiments on a single processor of an iPSC/860 18 showed similar characteristics. We used the GNU C compiler, with optimization at the O2 level. A higher optimization level ....
J. J. Dongarra. Performance of various computers using standard linear equations software. Computer Architecture News, 20(3):22--44, June 1992.
....systems, the most famous of them is the LINPACK benchmark, which involves solving a system of linear equations. It is used by most manufacturers to compare different systems and to advertise their products. The world TOP500 ranking [1] is based on the LINPACK benchmark. Prof. Jack Dongarra [2] of the University of Tennessee, the author of LINPACK, has pointed out that benchmark programs reflect only one small problem area and should not be used to judge the overall performance of a computer system. Therefore, it is important ....
Jack J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software", University of Tennessee, CS-89-85, April 11, 1999
....When running in parallel, codes compiled under pghpf scale from slightly to significantly better than when compiled under xlhpf. The difference is mainly from better performance of communications such as cshift, spread, sum and gather/scatter under pghpf. While numerous benchmarking packages [15, 6, 1, 2, 10] have been developed for measuring supercomputer performance, we are aware of only two that evaluate HPF compilers. The NAS parallel benchmarks [1] were first developed as paper and pencil benchmarks that specify the task to be performed and allow the implementor to choose algorithms as well as ....
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, University of Tennessee, Department of Computer Science, 1989.
....programming: C or Fortran code can usually be translated into the equivalent Lisp code. 3 Numerical efficiency of Common Lisp. In this section, we test Common Lisp for its numerical efficiency. We do this by considering two important operations from the Basic Linear Algebra Subroutines (BLAS); see [8], [5]. Most computations in numerical analysis and scientific computing have to perform at least one of the following operations: Scalar product: Compute the scalar product or dot product between two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$. The formula for this dot ....
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical report, Computer Science Department, University of Tennessee, 1998.
....curve, the ratio between the dual and single processor implementations of the same library are given. 6.5 Impact. One easily measurable impact of the described approach can be summarized by looking at its effect on the performance attained by the Massively Parallel LINPACK Benchmark (MP LINPACK) [10]. The LINPACK benchmark measures the performance attained by a given architecture when solving a linear system of equations in 64-bit arithmetic via an LU factorization with partial pivoting. High-performance implementations cast the LU factorization in terms of matrix multiplication [9]. The HPL ....
J.J. Dongarra. Performance of various computers using standard linear equations software, (LINPACK benchmark report). University of Tennessee Computer Science Technical Report CS-89-85, Oct. 2002.
....375 62.5 MHz, DECstation 5000/240, PC486 33, HP735 125 MHz 256 MB memory. Table 7 shows benchmark values of SPECint95 and Mflop/s. The values of SPECint95 are taken from the WWW site of SPEC (Standard Performance Evaluation Corporation) and Mflop/s are the values of LINPACK benchmark in [9]. Based on these, we give rough estimates on machine speeds in the column estimate, where the speed of Sun Ultra 2 is normalized to one, and a larger value means that the computer is faster. The results of CNS and CFT in Table 6 are exhibited estimating their time limits according to Table 7. ....
J.J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software," Technical Report No. CS-89-85, Computer Science Department, University of Tennessee, July 1999 (available as http://www.netlib.org/benchmark/performance.ps).
....systems. In contrast to performance modelling, which requires an abstracted, analytical characterisation of the load, performance measurements (and some simulation based approaches) require the generation of executable, synthetic workloads. Standardised benchmarks (including for example LINPACK [6] or the NAS kernels [1]) are often used for quoting the peak performance of parallel architectures. From the standpoint of the person engaged in the performance evaluation activity, the use of a standard benchmark program suffers from one significant limitation: the lack of control over the ....
J. J. Dongarra. Performance of Various Computers Using Standard Linear Equations Software. Technical Report CS-89-85, University of Tennessee and Oak Ridge National Laboratory, November 1995.
....and benchmark values of SPECint95, SPECfp95 and Mflop/s, respectively. The values of SPECint95 and SPECfp95 are taken from the WWW site of SPEC (Standard Performance Evaluation Corporation) except for the data of SPECfp95 of Dell XPS D300, which is in [24], and Mflop/s are taken from [6]. In the table, the row TS run shows the inverse of the average computational time needed for algorithm TS to obtain prespecified target solutions, where the time for Sun Ultra 2 is normalized as one. For comparison purposes, we also show the data for Gateway GP6 350 (Pentium II, 350MHz). From ....
J.J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software," Technical report: Computer Science Department, University of Tennessee, Knoxville, TN 37996-1301, and Mathematical Sciences Section, Oak Ridge National Laboratory, Oak Ridge, TN 37831, 1999 (available as http://www.netlib.org/benchmark/performance.ps).
....these rates are only rarely approached. A more realistic approach is to use a common application to measure computer performance. Since computational linear algebra is at the heart of many scientific problems, the de facto standard benchmark has become the linear algebra benchmark, LINPACK [1,7]. The LINPACK benchmark measures the time it takes to solve a dense system of linear equations. Originally, the system size was fixed at 100, and users of the benchmark had to run a specific code. This form of the benchmark, however, tested the quality of compilers, not the relative speeds of ....
....Paragon [5] 3744 143 1994 Intel Paragon [5] 6768 281 1996 Hitachi CPPACS [6] 2048 368 1996 Intel ASCI Option Red Supercomputer 7264 1060 1997 Intel ASCI Option Red Supercomputer 9152 1340 Table 1: MP LINPACK world records in the 1990s. This data was taken from the MP LINPACK benchmark report [7]. Compute Nodes 4,536 Service Nodes 32 Disk I/O Nodes 32 System Nodes (Boot) 2 Network Nodes (Ethernet, ATM) 10 System Footprint 1,600 Square Feet Number of Cabinets 85 System RAM 594 Gbytes Topology 38x32x2 Node to Node bandwidth Bidirectional 800 MB/sec Bi directional Cross section ....
Dongarra, J.J., "Performance of various computers using standard linear equations software in a Fortran environment," Computer Science Technical Report CS-89-85, University of Tennessee, 1989, http://www.netlib.org/benchmark/performance.ps
....order to maximise performance. In order to implement this, the compiler is used to analyse the source program and deduce which program sections would benefit from being held in their own partition. As an example of how this works, consider the implementation of the daxpy kernel, from the Linpack [3] benchmark suite, shown in Figure 2. In this simple example it is unlikely that there will be any intra-thread competition for space in the instruction cache which will impact on performance. However, when placed in a multi-threaded context, performance could depend on often executed program ....
....is invoked, the compiler has no way of knowing exactly how many machine instructions are required to implement a given section of code. 4 Results. In order to demonstrate the effectiveness of our instruction partitioning mechanism, we conducted a number of experiments using kernels from Linpack [3], Livermore Loop Fortran Kernels [14] (LLFK) and multimedia benchmarks from suites such as MediaBench [12]. We used a simple RISC-based simulator to run the compiled examples on a variety of cache architectures. As a simple example of typical results, we present a Fast Fourier Transform or FFT ....
J.J. Dongarra. Performance of Various Computers Using Standard Linear Equation Software. Technical Report CS-89-85, Department of Computer Science, University of Tennessee, June 2000.
....the successor to the CCM. CCM MP 2D is currently used for benchmarking parallel systems. Our kernel codes are the following. 5) Linpack. The High Performance Linpack (HPL) benchmark solves a (random) dense linear system of equations in double precision arithmetic on distributed memory computers [5]. HPL solves the linear system by computing the LU factorization with partial row pivoting of the n-by-(n+1) coefficient matrix [A b]. The data are block-cyclically distributed onto a two-dimensional P-by-Q grid of processors to ensure load balancing. The right-looking variant of the LU factorization ....
J. Dongarra, Performance of various computers using standard linear equations software, Computer Science Technical Report CS-89-85, University of Tennessee, Knoxville TN, 37996, July 2002. (see also http://www.netlib.org/benchmark/hpl).
....CMOS chips have outperformed ECL in numbers of transistors, speed, and reliability. Recently, most vector vendors have introduced CMOS-based vector machines (like the J90 or SX-4). • Also important is the fact that users often have difficulty achieving peak performance on vector supercomputers [13, 14, 15, 16]. Despite high performance processors and high bandwidth memory systems, even programs that are highly vectorized fall short of theoretical peak performance [17]. • Finally, it is important to note that there have been relatively few architectural innovations since the CRAY-1. The top of the ....
J. J. Dongarra. Performance of various computers using standard linear equations software in a Fortran environment. Technical Report CS-89-85, University of Tennessee, 1993.
....mainly for costly acquisition decisions, and has therefore been practiced for a long time. However, such evaluations typically focus on the computational aspects of the system. They typically use a small set of benchmark applications, and measure the performance of these applications in isolation [2, 3, 7, 1]. The results reflect the performance of the processor, the memory hierarchy, the interconnection network, and the relationship between these factors. ESP is different: it targets the system-level performance rather than the hardware [8]. Issues include the efficiency of the scheduling, its ....
J. J. Dongarra, "Performance of various computers using standard linear equations software". Comput. Arch. News 18(1), pp. 17-31, Mar 1990.
....MP LINPACK TFLOPS barrier. Actually, the previous record was 368 GFLOPS so we did not just break the record, we shattered it. While the rules for the LINPACK benchmark require use of the standard benchmark code, MP LINPACK lets you rewrite the program as long as certain ground rules are followed [6]. Our MP LINPACK code used a two-dimensional block scattered data decomposition with a block size 64 [9]. The algorithm is a variant of the right-looking LU factorization with row pivoting and is done in Code PGI 9 96 (Rel 1.1) PGI 12 96 (Rel 1.2 5) PGI 12 97 (Rel 1.6 3) Intel C C Compiler ....
....identical to ScaLAPACK version 1.00 Beta, which is a standard MPP package for Linear Algebra [2]. The benchmark results are maintained in the LINPACK Performance Report: Performance of Various Computers Using Standard Linear Equations Software by Dr. Jack Dongarra at the University of Tennessee [6]. He has accepted our TFLOPS entry into his 12/16/96 report, which is available on the web [6], e-mail, and ftp. RMAX was 1.068 TFLOP, NMAX or N was 215000, and N1/2 was 53400. N1/2 is the minimum problem size (to the nearest 100) such that half the RMAX performance was achieved. That is, over ....
[Article contains additional citation context not shown here]
Dongarra, J.J., "Performance of various computers using standard linear equations software in a Fortran environment," Computer Science Technical Report CS-89-85, University of Tennessee, 1989, http://www.netlib.org/benchmark/performance.ps
....obtain much higher performance results than the actual performance. 4.1.2.2 Kernel Benchmarks and Compact Application Benchmarks. To avoid the problems of synthetic benchmarks, some kernel benchmarks have been proposed. Many benchmarks fall in this category. Some examples are the Linpack benchmark [9], the Livermore benchmark [21] and part of the MITRE benchmark [13]. However, these kernel benchmarks often overstate the performance of the real applications [26]. To obtain results even closer to real applications, compact application benchmarks have also been proposed. In these, small real ....
Jack J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software, (Linpack Benchmark Report)," University of Tennessee Computer Science Technical Report, CS-89-85, 1998
....of the software development it is possible to make initial predictions of the performance, by identifying the principal operations. An example of principal operation characterisation in the performance studies of scientific applications are those based on floating point operations (or flops) [12, 16, 19]. The number of flops that are to be credited for different types of floating point operations can be based on the scheme suggested by McMahon [25]. Using Table 1 one can use flops as the RUV for each node of the software execution graph. Then by using the flop rate (Mflop/s) of a specific system ....
J.J. Dongarra, Performance of various computers using standard linear equations software, Technical Report CS-89-85, University of Tennessee, USA, 1992.
....be viewed as another stage in the hierarchy, we would expect that an algorithm based on the BLAS 3 [4] would perform much better. Indeed, on a single RS/6000 model 520 this method runs at only 9 Mflops/sec on an order 1,000 problem while an implementation based on the BLAS 3 runs at 26 Mflops/sec. [3] The first decision made in parallelizing this code is the data distribution: rows or columns, block or wrapped. The algorithm is column oriented since Fortran stores data in column major order. I decided to distribute complete columns to each processor to avoid communication during the pivot ....
Jack J. Dongarra. Performance of Various Computers Using Standard Linear Equation Software in a Fortran Environment. Technical Report CS-89-85, University of Tennessee, March 1990.
....1 Gamma1 k (t) log j z Gamma i k (t) j dt = z) z 2 Gamma ; N X k=1 Z 1 Gamma1 k (t) dt = 1 ; where (z) log jz Gamma z 0 j if Omega j int Gamma, 0 if Omega j ext Gamma. This system is uniquely solvable for the set of density functions f k g N k=1 , with k 2 L 2 [ 1,1], and the real constant . 2.4 The Boundary Correspondence Functions (BCFs) Associated with each arc Gamma k , k = 1; N , is the boundary correspondence function k : 1,1] R defined implicitly by f ffi i k (t) exp(i k (t) 2) The unique solution of the Symm integral equation ....
....j ext Gamma. This system is uniquely solvable for the set of density functions f k g N k=1 , with k 2 L 2 [ 1,1] and the real constant . 2. 4 The Boundary Correspondence Functions (BCFs) Associated with each arc Gamma k , k = 1; N , is the boundary correspondence function k :[ 1,1] R defined implicitly by f ffi i k (t) exp(i k (t) 2) The unique solution of the Symm integral equation system is given by k (t) 0 k (t) 2 (3) and = 0 if Omega j int Gamma, Gamma log c if Omega j ext Gamma ; 4) where c is the capacity of Gamma. 6 2.5 A Boundary ....
[Article contains additional citation context not shown here]
J.J. Dongarra. Performance of various computers using standard linear equations software in a Fortran environment. Technical Memorandum 23, Argonne National Laboratory, Argonne, Illinois 60439, 1987.
....the authors and the packages CPLEX and MINTO, these are the actual running times measured in experiments carried out by the authors. For the algorithms by other authors, the times of other machines are converted into DECstation 5000/240 CPU seconds according to the performance reported in Dongarra [15]. This leads to some unavoidable approximation when comparing codes running on different computers; anyway, even taking into account this very crude approximation, the running times reported give in most cases clear indications. The OR Library contains 12 classes of SCP instances, namely Classes ....
J.J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software", Technical Report No. CS-89-85, Computer Science Department, University of Tennessee, January 1996.
....is that all machines of the series are air-cooled. All other machines in this class relied at least on water cooling. Unlike the S-820 series, the S3600 series is also marketed worldwide, not only in Japan. This is also the case for the S3800 SM MIMD machines 3.3.3. Measured performances: In [4] a speed of 851 Mflop/s for the solution of a full linear system of order 1000 is reported for the S3600 160. The S3600 180 attains a performance of 1672 Mflop/s on the same problem. 3.1 Shared memory SIMD systems 14 3.2 Distributed memory SIMD systems 15 3.2 Distributed memory SIMD systems ....
....MP 1. Of course, tasks can be scheduled via a multi-user interface on the front end system. The MP 1 features a very nice X window based programming environment, MPPE, which integrates an interactive source debugger, a profiler, and output windows in one environment. Measured Performances: In [4] the solution of a full linear system was reported on a 16384 PE machine with a speed of 440 Mflop/s. The same report estimated the peak performance to be 580 Mflop/s in 64-bit precision. 3.2.4 The MasPar MP 2. Machine type: Processor array. Models: MP2201, MP2202, MP2204, MP2208, MP2216. ....
[Article contains additional citation context not shown here]
J.J. Dongarra, Performance of various computers using standard linear equations software, Computer Science Technical Report CS-89-85, Univ. of Tennessee, December 1996.
....paper, attempt to normalize the old results to their current machines. In the world of scientific computing, where floating point computation and Fortran are the dominant computational paradigms, this approach is widely practiced. The benchmark codes are from the LINPACK suite described in [Don00], a report that lists, for various combinations of machine, operating system, and compiler, the number of megaflops (millions of floating point operations per second) performed on each of three benchmark computations. These benchmark codes have certain drawbacks for our purposes, however. Being ....
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical report, No. CS-89-85, Computer Science Department, University of Tennessee, Knoxville, TN, August 24, 2000. The currently most up-to-date version should be available from http://www.netlib.org/benchmark/performance.ps.
....Fortran codes were later provided to clarify the algorithms, and following the NAS group's involvement with the Parkbench initiative, these have now been made available in both serial and parallel form. Results are collected and published on the NAS web site. LINPACK: is a simple kernel benchmark [15] based on solving a dense system of linear equations. To accommodate improved processors, over time the size of the problem has increased from 100 × 100, and the test can also be used to measure how performance varies as a function of problem size. This code is quite straightforward, ....
J.J. Dongarra, Performance of Various Computers Using Standard Linear Equation Software, Report CS-89-85, Univ. of Tennessee, Knoxville, Nov. 1996.
....In practice the exact speedup S d (p) may not be known except for those programs in standard general purpose parallel computing libraries. Thus the values can only be approximate in those cases. However, good approximations can often be obtained. For example, the results of the Linpack Benchmark [7] can be used as a good approximation for problems of matrix computation. The utilisation ratios of the existing jobs may be decreased whenever a new job enters the system to time share the resources. The problem is how to ensure a sustained ratio of CPU utilisation for each job so that the ....
J. J. Dongarra, Performance of various computers using standard linear equations software, Technical Report CS-89-95, Computer Science Department, University of Tennessee, Nov. 1997.
....with = 0.1, 0.3, 0.5, 0.7, 0.9, and n = 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300. Column N 3 gives the number of instances successfully solved by the Gendreau Laporte Semet code within a time limit of 10,000 SUN Sparc station 1000 CPU seconds. According to Dongarra [9], our computer is about 1.8 times faster than that used by Gendreau, Laporte and Semet, hence their time limit corresponds to about 5,555 HP 9000/720 CPU seconds. Columns N 1 and N 2 give the number of instances successfully solved by our code within a time limit of 18,000 and 5,555 HP 9000/720 ....
J.J. Dongarra, 1996. Performance of Various Computers Using Standard Linear Equations Software. Report CS-89-85, University of Tennessee, Knoxville.
....The fact that we can solve models of this size on our desktops can be attributed to the following factors. First, there have been tremendous improvements in hardware. The desktop PC that was used to run this model is quite a garden variety machine. However, it offers unprecedented computing power. [5] mentions that a 1.2 GHz AMD Athlon can outperform a Cray C90 (16 procs, 4.2 ns), a mighty machine just a decade ago, yielding 558 Mflop/s compared to 479 Mflop/s for the C90 on a standard Linpack benchmark. This comparison even favors the Cray somewhat as that machine was designed to do dense ....
J. J. Dongarra, Performance of Various Computers Using Standard Linear Equations Software, Report CS-89-85, University of Tennessee, January 2001.
....compatibility library. SUNMOS makes it also possible to use the second CPU (and third CPU on a MP node) as a compute processor. It is this feature that has enabled us to get more Gflops out of our Paragon, than what it was rated for. It also allowed us to reclaim the world MP LINPACK record 4 [1]. 4 Q&A, Hints and Tips. In this section I would like to collect answers to commonly asked questions, as well as hints and tips on SUNMOS usage. Something along the lines of David Robboy's article 5 he wrote for Intel On Line 6 . Another example is Ted Barragy's article 7 . 5 ....
....to the Paragon. The first high Gflops number under SUNMOS is described in [17]. The two Gordon Bell award entries that were using SUNMOS are described in [10] and [15]. SUNMOS on the Intel Paragon is compared to an SP1 and the T3D using the shallow water model in [19]. Raw numbers are reported in [1]. Since breaking the world record 28 , SUNMOS leads the list. There are two papers comparing SUNMOS and OSF/1 AD on the Intel Paragon: [12] and [9]. We have been, and will again, look at issues in parallel I/O. Most of this work was done under SUNMOS on the nCUBE 2. We implemented a parallel ....
Jack J. Dongarra. Performance of various computers using standard linear equations software. Technical report CS-89-85, Computer Science Department, University of Tennessee and Mathematical Sciences Section, Oak Ridge National Laboratories, January 1995. Available from: file://netlib.att.com/netlib/benchmark/performance.ps.Z .
....pixels provide excellent tools for pictorially representing data from scientific simulations. In Table 1 there is a short list of high performance workstations with the name of the machine, followed by the name of the manufacturer, followed by the performance in mflops on the Linpack Benchmark [Dongarra 92]. This benchmark is based on the speed of solving a system of 100 simultaneous linear equations. 4 gflop is an abbreviation for a gigaflop, 10^9 operations per second. CUBoulder : HPSC Course Notes 10 Overview (DRAFT: 23 Feb 1993) Machine Number of LINPACK (mflops) SUN SPARCstation 2 SUN ....
....[Dongarra et al. 79] The peak performance of workstations in this table could be as much as five to ten times the Linpack performance number. Table 2 provides a short list of supercomputers, with the name of the machine series, the name of the manufacturer, and the Theoretical Peak Performance [Dongarra 92] 5 As with the workstations, the machines in this list come in various models and configurations; the performance data is for the largest system listed in Dongarra's report. Note that here, in contrast with the previous table, a theoretical peak performance figure is given; in an actual ....
DONGARRA, J. J. [1992]. Performance of various computers using standard linear equations software. Technical Report CS-89-85, Oak Ridge National Laboratory, Oak Ridge, TN 37831.
....(with optimal solution greater than zero) the time performance of our method is better than theirs. These are the instances 4 to 7 with less than 3 runways. This result also holds if we take into account that our computer is about 50% faster (according to the Dongarra Linpack benchmarks [Don01]). It should be noted, however, that our solution times are quite incomparable to those of Beasley's. For some instances our approach is up to 25 times slower, while for others it is up to 50 times faster than the approach in [BKA00]. The cost extended version of Uppaal has additionally been (and ....
Jack J. Dongarra. Performance of Various Computers Using Standard Linear Equations Software. Technical Report CS-89-85, Computer Science Department, University of Tennessee, 2001. An up-to-date version of this report can be found at http://www.netlib.org/benchmark/performance.ps.
....375 62.5 MHz, DECstation 5000/240, PC486 33, HP735 125 MHz 256 MB memory. Table 6 shows benchmark values of SPECint95, SPECfp95 and Mflop/s. The values of SPECint95 and SPECfp95 are taken from the WWW site of SPEC (Standard Performance Evaluation Corporation) 3 , and Mflop/s are taken from [9]. Based on these, we give rough estimates on machine speeds in the column estimate, where the speed of Sun Ultra 2 is normalized to one, and a larger value means that the computer is faster. From Tables 5 and 6, we can observe that our algorithm 3 FNLS obtained the best known solutions for all ....
J.J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software," Technical Report No. CS-89-85, Computer Science Department, University of Tennessee, July 1999 (available as http://www.netlib.org/benchmark/performance.ps).
....fact that we can solve models of this size on our desktops can be attributed to the following factors. First, there have been tremendous improvements in hardware. The desktop PC that was used to run this model is quite a garden variety machine. However, it offers unprecedented computing power. [4] mentions that a 1.2 GHz AMD Athlon can outperform a Cray C90 (16 procs, 4.2 ns), a mighty machine just a decade ago, yielding 558 Mflop/s compared to 479 Mflop/s for the C90 on a standard Linpack benchmark. This comparison even favors the Cray somewhat as that machine was designed to do dense ....
J. J. Dongarra, Performance of Various Computers Using Standard Linear Equations Software, Report CS-89-85, University of Tennessee, January 2001.
....at 66 MHz and containing internal data and instruction cache memories. Moreover, each of these nodes has access to 192 MBytes of RAM (two nodes have 256 MBytes) and 4 GBytes of disk. Each workstation is claimed to have a peak performance of 266 MFlops. A more useful benchmark 7 is reported in Dongarra (1998), where 53 MFlops is found to be the LINPACK benchmark for size 100 dense systems and 181 MFlops for size 1000 with hand optimization. A value of 125 MFlops is reported for this machine by Chopard (1997). The wide node has a larger cache memory, bus size, and disk space; this node operates as the ....
Dongarra, J.J., (1998). Performance of Various Computers Using Standard Linear Equations Software. University of Tennessee Computer Science Technical Report, CS-89-85, http://www.netlib.org/benchmark/performance.ps.
No context found.
Dongarra, J. 1998. Performance of various computers using standard linear equations software. Report CS-89-85, Department of Computer Science, University of Tennessee, TN.
No context found.
Jack J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, University of Tennessee Computer Science, 1999.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, Computer Science Department, University of Tennessee, 2004.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical report, Computer Science Department, University of Tennessee, 1998.
No context found.
Jack J. Dongarra. Performance of various computers using standard linear equations software (linpack benchmark report). Technical Report CS-89-85, University of Tennessee, 2004.
No context found.
J. J. Dongarra, "Performance of various computers using standard linear equations software". Comput. Arch. News 18(1), pp. 17-31, Mar 1990.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, Computer Science Department, University of Tennessee, 2003.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, Computer Science Department, University of Tennessee, 2003.
No context found.
J. J. Dongarra, Performance of Various Computers Using Standard Linear Equations Software, CS-89-85, University of Tennessee and Oak Ridge National Laboratory, Sep 1993
No context found.
J.J. Dongarra. Performance of various computers using standard linear algebra software in a Fortran environment. Technical Report CS-89-85, University of Tennessee, July 2003.
No context found.
J. J. Dongarra. "Performance of Various Computers Using Standard Linear Equations Software". Tech. Rep. CS-89-85, University of Tennessee and Oak Ridge National Laboratory, November 1995.
No context found.
J.J. Dongarra. Performance of various computers using standard linear algebra software in a Fortran environment. Technical Report CS-89-85, University of Tennessee, October 2003.
No context found.
J. J. Dongarra, Performance of various computers using standard linear equations software, Report CS-89-05, Computer Science Department, University of Tennessee, version of November 12, 1991
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Working paper, University of Tennessee, 2004. Continuous updates are available at http://www.netlib.org/benchmarks/performance.ps.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software in a Fortran environment. Technical Report CS-89-85, University of Tennessee, Knoxville, 1990.
No context found.
J. J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS-89-85, Computer Science Department, University of Tennessee, 2003.
No context found.
Dongarra J.J., Performance of Various Computers Using Standard Linear Equations Software in a Fortran Environment, netlib version of Nov. 15 1992, Oak Ridge National Laboratory, Technical Report CS-89-85.
No context found.
Dongarra J.J., Performance of Various Computers Using Standard Linear Equations Software in a Fortran Environment, netlib version of Nov. 15 1992, Oak Ridge National Laboratory, Technical Report CS-89-85.