Results 1 - 10 of 120
Integrated Framework For Rapid System Prototyping Automatic Distribution
- In Proceedings of the 5th International Workshop on Rapid System Prototyping, 1994
"... Rapid prototyping of parallel systems is of interest to quickly produce a parallel prototype. The emergence of distributed systems technology has made it possible to develop software systems distributed over large networks. Rapid prototyping must deal with real parallelism over a set of processors, either c ..."
Cited by 3 (0 self)
closely or loosely coupled. We describe in this paper an extension of the CPN/TAGADA project to manage distributed code generation over a set of CPUs. To achieve a mapping of components onto the target architecture, both hardware and software have to be described. We present our technique and apply it to a
Transparent CPU-GPU Collaboration for Data-Parallel Kernels on Heterogeneous Systems
"... Heterogeneous computing on CPUs and GPUs has traditionally used fixed roles for each device: the GPU handles data-parallel work by taking advantage of its massive number of cores, while the CPU handles non-data-parallel work, such as the sequential code or data transfer management. Unfortu ..."
Cited by 3 (1 self)
for developing a single data-parallel kernel in OpenCL, while the system automatically partitions the workload across an arbitrary set of devices, generates kernels to execute the partial workloads, and efficiently merges the partial outputs together. The goal is performance improvement by maximally utilizing all
Automatic Design Validation Framework for HDL Descriptions via RTL ATPG
- In Proceedings of the Asian Test Symposium, 2003
"... We present a framework for high-level design validation using an efficient register-transfer level (RTL) automatic test pattern generator (ATPG). The RTL ATPG generates the test environments for validation targets, which include variable assignments, conditional statements, and arithmetic expression ..."
Cited by 1 (0 self)
Automatic mapping of nested loops to FPGAs
- In ACM SIGPLAN PPoPP’07, 2007
"... This paper presents a framework for automatic mapping of perfectly nested loops with constant dependences onto regular processor arrays, suitable for direct implementation on Field Programmable Gate Arrays (FPGAs). The problem is modeled as that of finding a suitable completion procedure for a full-r ..."
Cited by 3 (1 self)
FFT program generation for shared memory: SMP and multicore
- In Proc. Supercomputing, 2006
"... The chip makers’ response to the approaching end of CPU frequency scaling is multicore systems, which offer the same programming paradigm as traditional shared memory platforms but different performance characteristics. This situation considerably increases the burden on library developers and stre ..."
Cited by 28 (14 self)
and strengthens the case for automatic performance tuning frameworks such as Spiral, a program generator and optimizer for linear transforms such as the discrete Fourier transform (DFT). We present a shared memory extension of Spiral. The extension within Spiral consists of a rewriting system that manipulates
Automatic Construction and Evaluation of Performance Skeletons
"... The performance skeleton of an application is a short running program whose execution time in any scenario reflects the estimated execution time of the application it represents. Such a skeleton can be employed to quickly estimate the performance of a large application under existing network and nod ..."
and node sharing. This paper presents a framework for automatic construction of performance skeletons of a specified execution time and evaluates their use in performance prediction with CPU and network sharing. The approach is based on capturing the execution behavior of an application and automatically
Developers can implement domain-specific operations by extending the DSL framework, which provides static optimizations and code generation for heterogeneous hardware. The Delite runtime automatically schedules and executes DSL operations on heterogeneous
"... Power constraints have limited the ability of microprocessor vendors to scale single-core performance with each new generation. Instead, vendors are increasing the number of processor cores and incorporating specialized hardware to improve performance. For example, GPUs have become essential ..."
FFT Program Generation for Shared Memory: SMP and Multicore
- SC2006, 2006
"... The chip makers’ response to the approaching end of CPU frequency scaling is multicore systems, which offer the same programming paradigm as traditional shared memory platforms but have different performance characteristics. This situation considerably increases the burden on library developers and ..."
Cited by 7 (5 self)
and strengthens the case for automatic performance tuning frameworks like Spiral, a program generator and optimizer for linear transforms such as the discrete Fourier transform (DFT). We present a shared memory extension of Spiral. The extension within Spiral consists of a rewriting system that manipulates
Towards a compiler framework for thread-level speculation
"... Speculative parallelization techniques make it possible to extract parallelism from fragments of code that cannot be analyzed at compile time. However, research on software-based, thread-level speculation will greatly benefit from an appropriate compiler framework for easy prototyping and further devel ..."
Cited by 2 (1 self)
analysis and transformation solutions, with a reduction of around 83% in the number of lines of code needed compared with the direct use of Cetus for the same purpose. To show the possibilities of this framework, we present an automatically generated classification of loops for several SPEC CPU2006 C
Active Mask Framework for Segmentation of Fluorescence Microscope Images
"... I always bow to Śrī Śāradāmbā, the limitless ocean of the nectar of compassion, who bears a rosary, a vessel of nectar, the symbol of ..."