Results 1 - 10 of 279
Evaluation of Release Consistent Software Distributed Shared Memory on Emerging Network Technology
"... We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidat ..."
Cited by 467 (43 self)
independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications
A medium-grain reconfigurable cell array for DSP application
- in Proc. 3rd IASTED International Conference on Circuits, Signals, and Systems, Cancun
, 2003
"... Digital signal processing (DSP) is an essential component of many applications, including multimedia and communications systems. The recent surge in wireless and mobile computing underscores the need for high-performance low power DSP hardware. Reconfigurable hardware balances these requirements wit ..."
Cited by 9 (4 self)
with development costs by providing system designers a viable alternative to custom integrated circuits. This paper describes a novel reconfigurable architecture for DSP applications. The device contains an array of medium-grain cells that can perform arithmetic, memory, and control operations. The main features
Delgado-Frias, “Fault avoidance in medium-grain reconfigurable hardware architectures”
- in International Conference on Engineering of Reconfigurable Systems and Algorithms
"... Medium-grain reconfigurable hardware (MGRH) architectures represent a hybrid between the versatility of a field programmable gate array (FPGA) and the computational power of a custom application-specific integrated circuit (ASIC) [1]. Recent research has shown that they are particularly app ..."
Cited by 1 (1 self)
appropriate for systems focused on digital signal processing (DSP) [2] and power efficiency [3]. This means that medium-grain reconfigurable architectures have a high potential for implementation in mobile computing platforms, wireless sensor networks, military technology, and aerospace applications
Providing Structure for Medium-grained Distributed Object-based Computation
"... Pipelined Processor Farms (PPF) are a generic parallel-processing structure suitable for continuous-flow, multialgorithm applications. PPF has been redesigned for a multimedia workload which is multifunctional and multimodal. A dynamic object-based processing structure is implemented which will map o ..."
Pipelined Processor Farms (PPF) are a generic parallel-processing structure suitable for continuous-flow, multialgorithm applications. PPF has been redesigned for a multimedia workload which is multifunctional and multimodal. A dynamic object-based processing structure is implemented which will map
Optimizing Irregular Adaptive Applications on Multi-threaded Processors: The Case of Medium-Grain Parallel Delaunay Mesh Generation
, 2006
"... The importance of parallel mesh generation and emerging growth of SMT architectures raise an important question of adapting parallel mesh generation software to the SMT architecture. In this work we focus on Parallel Constrained Delaunay Mesh Generation. We explore medium grain parallelism at the s ..."
parallel mesh generation software (PCDT), combining the coarse-grain approach with a medium-grain approach. The second contribution is that we significantly improved the performance of the PCDT software using the optimizations developed for the MPCDT code and the SMT architecture. These changes made PCDT
Optimizing Parallel Applications for Wide-Area Clusters
- In IPPS-98 International Parallel Processing Symposium
, 1998
"... Recent developments in networking technology cause a growing interest in connecting local-area clusters of workstations over wide-area links, creating multilevel clusters, or meta computers. Often, latency and bandwidth of local-area and wide-area networks differ by two orders of magnitude or more. ..."
Cited by 36 (13 self)
. One would expect only very coarse grain applications to achieve good performance. To test this intuition, we analyze the behavior of several existing medium-grain applications on a wide-area multicluster. We find that high performance can be obtained if the programs are optimized to take
Adaptive Sampling-Based Profiling Techniques for Optimizing the Distributed JVM Runtime
"... Extending the standard Java virtual machine (JVM) for cluster-awareness is a transparent approach to scaling out multithreaded Java applications. While this clustering solution is gaining momentum in recent years, efficient runtime support for fine-grained object sharing over the distribute ..."
Cited by 1 (0 self)
of execution time increase for fine- to medium-grained applications. Keywords: profiling; sampling; correlation tracking; access locality; thread affinity; thread stack; thread migration; dynamic load balancing; object sharing; distributed Java virtual machine; distributed shared memory systems
The Multicomputer Toolbox: Scalable Parallel Libraries for Large-Scale Concurrent Applications
, 1994
"... In this paper, we consider what is required to develop parallel algorithms for engineering applications on message-passing concurrent computers (multicomputers). At Caltech, the first author studied the concurrent dynamic simulation of distillation column networks [19, 21, 20, 14]. This research was ..."
Cited by 20 (11 self)
set of portable, reusable numerical algorithms constituting a "Multicomputer Toolbox," suitable for use on both existing and future medium-grain concurrent computers; third, a working prototype simulation system, Cdyn, for distillation problems, that can be enhanced (with additional work
Improving the Performance of Speculatively Parallel Applications on the Hydra CMP
, 1999
"... Hydra is a chip multiprocessor (CMP) with integrated support for thread-level speculation. Thread-level speculation provides a way to parallelize sequential programs without the need for data dependence analysis or synchronization. This makes it possible to parallelize applications for which static ..."
Cited by 39 (3 self)
memory dependence analysis is difficult or impossible. While performance of the baseline Hydra system on applications with significant amounts of medium to large grain parallelism is good, the performance on integer applications with fine-grained parallelism can be poor. In this paper, we describe a
Impact of Sharing-Based Thread Placement on Multithreaded Architectures
- In Proceedings of the 21st Annual International Symposium on Computer Architecture
, 1994
"... Multithreaded architectures context switch to another instruction stream to hide the latency of memory operations. Although the technique improves processor utilization, it can increase cache interference and degrade overall performance. One technique to reduce the interconnect traffic is to co-loca ..."
Cited by 44 (2 self)
-locate on the same processor threads that share data. The multi-thread sharing in the cache should reduce compulsory and invalidation misses, benefiting execution time. To test this hypothesis, we compared a variety of thread placement algorithms via trace-driven simulation of fourteen coarse- and medium-grain