
Results 1 - 10 of 2,754

Table 3. The simulation parameters for the multi-core system.

in An effective hybrid transactional memory system with strong isolation guarantees
by Chi Cao Minh, Martin Trautmann, Jaewoong Chung, Austen McDonald, Nathan Bronson, Jared Casper, Christos Kozyrakis, Kunle Olukotun 2007
"... In PAGE 8: ... 5.1 Simulation and Code Generation Table 3 presents the main parameters of the simulated multi-core system. Insertions and lookups in the SigTM signatures use the four hash functions listed.... ..."
Cited by 10

Table 3. Parameters for the simulated multi-core system.

in The OpenTM Transactional Application Programming Interface
by unknown authors
"... In PAGE 7: ...1 Environment We use an execution-driven simulator that models multi-core systems with MESI coherence and support for hardware or hybrid TM systems. Table 3 summarizes the parameters for the simulated CMP architecture. All operations, except loads and stores, have a CPI of 1.... ..."

Table 1: Multi-Core Spanning Tree Problems and Their Complexity Status

in Issues in Multicast Protocols
by unknown authors

Table 10: LAMMPS benchmark: Multi-core speedup (no numactl)

in Characterization of scientific workloads on systems with multi-core processors
by Sadaf R. Alam, Richard F. Barrett, Jeffery A. Kuehn, Philip C. Roth, Jeffrey S. Vetter
"... In PAGE 9: ... We ran the experiments on the DMZ, Longs and Tiger systems. These results are listed in Table 10. The LAMMPS benchmarks scale linearly on multiple cores and the scaling behavior is increasingly different for different classes of computation.... ..."

Table 1. a.) A classification of core selection algorithms. M: multi-core (vs. single-core), C: constrained (vs. unconstrained), A: asymmetric (vs. symmetric), D: distributed (vs. centralized). b.) Computational and message exchange complexity of core selection algorithms. M, C and V as set handles denote respectively the sets of multicast group members, candidate cores, and all nodes in the domain. h_diam and e are respectively the maximum hop-distance between any two domain nodes and the maximum node degree in the domain.

in Core Selection Algorithms for Group Communications
by unknown authors
"... In PAGE 1: ... Later research shifted to multi-core selection which, allowing multiple shared trees for distinct receiver partitions, further broadened the solution space not only for potential improvements on the efficiency, but also for the range of successful solutions for the delay-constrained case. In Table 1-a, we present a classification of core selection algorithms along with the complexity of each algorithm in the distributed, distance-vector routing environment. We primarily distinguish whether the algorithm is a single-core or multi-core algorithm.... In PAGE 2: ... We classify the algorithm as distributed if its deployment with no reliance on an external message exchange protocol in the distributed platform is feasible, centralized otherwise. The entry of column Coordinator in Table 1-b is the computational complexity of the process operating on the designated node coordinating the core selection process. For distributed algorithms, we also analyze the proportional growth of message exchange throughout the entire algorithm and present our results in respective columns of the table.... ..."

Table 1: Floating point performance characteristics of individual cores of modern, multi-core processor architectures. DGESV and SGESV are the LAPACK subroutines for dense system solution in double precision and single precision respectively. Columns: Architecture, Clock, DP Peak, SP Peak, time(DGESV)/time(SGESV).

in Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems
by Alfredo Buttari, Jack Dongarra, Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak 2007
"... In PAGE 2: ... When combined with the size of the register file of 128 registers, it is capable of delivering close to peak performance on many common computationally intensive workloads. Table 1 shows the difference in peak performance between single precision (SP) and double precision (DP) of four modern processor architectures; the last column also reports the ratio between the time needed to solve a dense linear system in double and single precision by means of the LAPACK DGESV and SGESV respectively. Following the recent trend in chip design, all of the presented processors are multi-core architectures.... In PAGE 6: ... For the Cell processor (see Figures 7 and 8), parallel implementations of Algorithms 2 and 3 have been produced in order to exploit the full computational power of the processor. Due to the large difference between the single precision and double precision floating point units (see Table 1), the mixed precision solver performs up to 7 and 11 times faster than the double precision peak in the unsymmetric and symmetric, positive definite cases respectively. Implementation details for this case can be found in [7, 8].... ..."
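The excerpt above describes mixed precision iterative refinement: the expensive factorization/solve runs in fast single precision, then cheap double-precision residual corrections recover full accuracy. A minimal sketch of the idea (not the authors' LAPACK/Cell implementation; `mixed_precision_solve` and the iteration count are illustrative choices, using NumPy in place of DGESV/SGESV):

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve Ax = b by single-precision solves plus double-precision refinement."""
    # Fast path: solve in single precision (stands in for SGESV).
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        # Residual is computed in double precision to capture the error of x.
        r = b - A @ x
        # The correction is again solved cheaply in single precision.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x
```

For well-conditioned systems the refinement loop drives the residual down to double-precision levels while nearly all flops stay in the faster single-precision unit, which is the trade-off Table 1 quantifies.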

Table 1. Configuration and area of the cores. architecture is also ideally suited to manage varied priority levels, but that advantage is not explored here. An additional issue with heterogeneous multi-core architectures supporting multiple concurrently executing programs is cache coherence. In this paper, we study multi-programmed workloads with disjoint address spaces, so the particular cache coherence protocol is not an issue (even though we do model the writeback of dirty cache data during core-switching). However, when there are differences in cache line sizes and/or per-core protocols, the cache coherence protocol might need some redesign. We believe that even in those cases, cache coherence can be accomplished with minimal additional overhead.

in Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance
by Rakesh Kumar, Dean M. Tullsen, Parthasarathy Ranganathan, Norman P. Jouppi, Keith I. Farkas
"... In PAGE 4: ... 3.1 Hardware assumptions Table 1 summarizes the configurations used for the cores in the study. As discussed earlier, we mainly focus on the EV5 (Alpha 21164) and the EV6 (Alpha 21264).... In PAGE 4: ...to be 10 cycles. Memory latency was set to be 150 ns. We assume a snoopy bus-based MESI coherence protocol and model the writeback of dirty cache lines for every core-switch. Table 1 also presents the area occupied by each core. These were computed using a methodology similar to that used in our earlier work [14].... ..."

Table 1 presents combinations of these modules and their purpose. By modularising parallelisation concerns into multiple aspects it is possible to manage multiple configurations of a parallel application and to deploy the one that best matches the target platform. When the target platform is a single-processor machine, only core functionality is deployed. On multiprocessor machines, the concurrency module is included as well. However, the choice of whether to include the partition module depends on the type of parallel application (e.g., in branch-and-bound applications the partition module is usually not required).

in Aspect-oriented support for modular parallel computing
by João L. Sobral 2006
"... In PAGE 3: ... Table 1. Deployable parallel applications

  Partition module  Concurrency module  Distribution module  Purpose
  No                No                  No                   Tidy up core functionality, debugging, single processor machines
  Yes               No                  No                   Tidy up partition strategy, debugging
  No / Yes          Yes                 No                   Shared memory parallel machines (SMP/Multi-core)
  Yes               Yes                 Yes                  Distributed memory machines/Grids
  No                No / Yes            Yes                  Distributed application

Aspect precedence is of particular importance in this approach.... ..."
Cited by 2

Table 1. Operating speed for detecting hand postures in a single image

in A Real-Time Hand Gesture Interface Implemented on a Multi-Core Processor
by Tsukasa Ike, Nobuhisa Kishikawa, Björn Stenger
"... In PAGE 4: ... To compare the performance between a multi-core system and a single-core system, we disabled the hyper-threading function of a Xeon processor. The result is shown in Table 1. We measured the average, the best case, and the worst case of the operating speed for recognizing the three hand postures shown in Figure 2.... ..."

Table 1. An overview of ACO applications.

in VERY STRONGLY CONSTRAINED PROBLEMS: AN ANT COLONY OPTIMIZATION APPROACH
by Vittorio Maniezzo, Matteo Roffilli
"... In PAGE 8: ... However, in the following we will present some of the most relevant problems from both a historical perspective and current trends. As reported in Table 1, current work on ACO is devoted to exploiting the implicit parallelism of the algorithm in order to speed up the computation on modern multi-core processors. While these studies are focused on particular problems, we can recognize common strategies aimed at splitting the problem (Ouyang and Yan, 2004), exchanging information among colonies (Ellabib et al.... ..."

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University