Results 1 - 10 of 18,796

Table 2: The overall performance of the compiler-based system 9

in Performance of the Compiler-based Andorra-I System
by Rong Yang, Tony Beaumont, Ines Dutra, Vitor Santos Costa, David H D Warren 1993
"... In PAGE 8: ... 4 The Performance Results In this section, we present and discuss the performance results of Andorra-I. The first table (Table 2) shows the relative speeds of sequential and parallel... In PAGE 9: ... 4.2 Overall Performance Table 2 compares the overall performance of sequential and parallel versions of Andorra-I with SICStus and JAM, in terms of relative speed with respect to the parallel version of Andorra-I on a single processor. The column for SICStus Prolog shows that, for the Prolog-style programs, the parallel version of Andorra-I with a single processor is, on average, about... In PAGE 10: ... Thus, running it under Prolog's depth-first left-to-right order is very inefficient, as nearly 200 times more resolutions are performed by Prolog. This explains why the relative speed of SICStus Prolog for this program is very slow in Table 2. In contrast to fly pan, the protein program is written for Prolog, and when executed on Andorra-I, the number of resolutions performed is reduced by about 28%.... In PAGE 12: ... 4.3 Parallel Speedups Table 2 showed the speedups obtained using 10 processors. In general, these speedups are quite good.... In PAGE 13: ... Note that the warplan and protein 1st benchmarks contain significant or-speculative work, and that Andorra-I is able to obtain reasonable speedups through incorporating the latest version of the Bristol scheduler [3]. The last three benchmarks in Table 2 contain both and- and or-parallelism. Since we don't have an equivalent system to compare, we ran them on Andorra-I with a fixed team configuration, forcing Andorra-I to exploit only one form of parallelism.... In PAGE 13: ... 4.4 Sequential Performance Table 2 showed the basic performance of the parallel version of Andorra-I compared with SICStus. However, it is also worth comparing the sequential version of Andorra-I directly with SICStus Prolog, and this is done in Table 7....
In PAGE 15: ... Parallel Overheads Table 2 showed how the sequential implementation of Andorra-I is faster than the parallel implementation; in other words, how much overhead we have paid to provide support for parallelism. For our benchmark suite the parallel overhead is on average about 40%, which is quite reasonable.... ..."
Cited by 17
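The "parallel overhead" the excerpt reports (about 40% on average) is the price of running the parallel engine on one processor instead of the sequential engine. A minimal sketch of that calculation, using made-up timings rather than the paper's actual Table 2 figures:

```python
def parallel_overhead(t_sequential, t_parallel_1proc):
    """Fractional overhead paid for parallel support: how much slower
    the parallel engine is on a single processor than the sequential
    engine on the same benchmark."""
    return t_parallel_1proc / t_sequential - 1.0

# Hypothetical timings in seconds: a 10.0 s sequential run vs. 14.0 s
# for the parallel engine on one processor is a 40% overhead,
# matching the average the excerpt reports.
print(f"{parallel_overhead(10.0, 14.0):.0%}")  # → 40%
```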

Table 1: Differences in hardware and compiler-based cache coherence schemes.

in Improving Memory Utilization in Cache Coherence Directories
by David J. Lilja, Pen-Chung Yew
"... In PAGE 21: ... Table 10: Normalized memory overhead. Scheme Overhead, Ox 1.... In PAGE 25: ... models in Table 10, its memory overhead grows as O(p). The memory overhead of the 2-bit broadcast scheme is fixed, independent of the number of processors, but the additional messages needed for the broadcasts as p increases will seriously degrade its performance.... In PAGE 25: ... Unlike software-only coherence schemes, this compiler-assisted scheme still can use the full power of the directory when the compiler is unable to determine the precise sharing characteristics of a particular block. As summarized in Table 12, the pointer cache directory performs as well as any of the current directory schemes while using only a small fraction of the memory that the other directory schemes need to store the pointer information. The memory overhead of the software-directed version control scheme with imprecise memory disambiguation is less than a factor of 10 times greater than that of the pointer cache, but the pointer cache produces lower memory delays due to its perfect memory disambiguation.... In PAGE 26: ... Table 12: Performance and memory overhead comparisons. Coherence scheme Compared to the pointer cache Average delay Memory overhead 1.... In PAGE 34: ... Table 11: Average memory delay and memory overhead. (a) arc3d Configuration No compiler opts.... In PAGE 35: ... Table 11: (cont.) (c) simple24 Configuration No compiler opts.... In PAGE 36: ... Table 11: (cont.) (e) flo52 Configuration No compiler opts.... ..."
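The O(p) growth the excerpt mentions can be sketched with a toy overhead model. This is not the paper's actual model (its Table 10 is more detailed); it just assumes a full-map directory holding one presence bit per processor per block, a 64-byte block, and normalized overhead defined as directory bits divided by data bits:

```python
def full_map_overhead(p, block_bytes=64):
    """Normalized memory overhead of a full-map directory: one
    presence bit per processor per cache block, so it grows as O(p)."""
    return p / (block_bytes * 8)

def broadcast_2bit_overhead(p, block_bytes=64):
    """The 2-bit broadcast scheme keeps a fixed 2 bits per block,
    independent of the processor count p."""
    return 2 / (block_bytes * 8)

# Full-map overhead scales linearly with p; the 2-bit scheme is flat
# (though, as the excerpt notes, its broadcast traffic grows instead).
for p in (16, 64, 256):
    print(p, full_map_overhead(p), broadcast_2bit_overhead(p))
```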

Table XVIII. Total Compilation Times in Milliseconds of the Omega-Based Approach and the PTD-Based Approach. The OME column and the PTD column give the compilation times obtained using the Omega-based and the PTD-based approaches, respectively. The INC column shows the percentage increase when going from PTD to OME.

in A global communication optimization technique based on data-flow analysis and linear algebra
by M. Kandemir, P. Banerjee, A. Choudhary, J. Ramanujam, N. Shenoy 1999
Cited by 15

Table 1. Runtime Overhead for Compiler-based Split Control and Data Stack (All time in seconds)

in Architecture Support for Defending Against Buffer Overflow Attacks
by Jun Xu, Zbigniew Kalbarczyk, Sanjay Patel, Ravishankar K. Iyer 2002
Cited by 31

Table 2.1: Front-ends and compilers based on the GNU C compiler

in Design and Implementation of the GNU INSEL-Compiler gic
by Markus Pizka 1997
Cited by 19

Table 1. Runtime Overhead for Compiler-based Split Control and Data Stack (All time in seconds)

in Programming Languages and Operating Systems (ASPLOS-X)
by Kimberly Keeton, George Candea, Phil Koopman (CMU), Subhasish Mitra (Intel), Steven S. Lumetta 2002
"... In PAGE 5: ... The major failure event categories identified in the dependability literature to date include Hardware, Software, Human Error, Process, Environment, Security, External, Planned Downtime, and Design. Table 1 includes two sources' estimated failure-distribution allocations among the categories of causes. More detailed studies have been done and published among IT management journals, but their results are somewhat questionable as they ... ..."

Table 1: Generated Specializations

in Identifying Profitable Specialization in Object-Oriented Languages
by Jeffrey Dean, Craig Chambers, David Grove 1994
"... In PAGE 9: ... The speed of executing the algorithm itself is good, taking a few seconds for small programs and under 5 minutes for computing specializations for the large Cecil compiler program. Table 1 compares the number of specializations that would be generated for this program by a static approach (as used in Sather and Trellis), by a dynamic compilation-based approach (as used in SELF), and by our selective specialization algorithm, for both singly- and multiply-dispatched systems. The static, per-class column reflects the number of specialized methods that would be generated if each source method were specialized for each of the possible classes of its first argument, for the single-dispatching row, or for all possible combinations of subclasses of the dispatched arguments of the method, for the multiple-dispatching row.... ..."
Cited by 22
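The static scheme described in the excerpt amounts to a cross product over dispatched argument positions. A small counting sketch, with invented class counts (not the paper's Cecil data):

```python
def static_specializations(classes_per_dispatched_arg):
    """Specializations the static scheme would generate for one
    method: the cross product of the possible classes at every
    dispatched argument position. A one-element outer list models
    single dispatching on the first argument."""
    n = 1
    for classes in classes_per_dispatched_arg:
        n *= len(classes)
    return n

# Single dispatching: 3 possible receiver classes → 3 specializations.
print(static_specializations([["A", "B", "C"]]))  # → 3
# Multiple dispatching on two arguments with 3 and 4 possible
# subclasses → 12 specializations, with no selectivity applied.
print(static_specializations([["A", "B", "C"], ["W", "X", "Y", "Z"]]))  # → 12
```

The selective algorithm in the paper generates far fewer by specializing only where profitable; this sketch only reproduces the static upper bound.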

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University