An efficient implementation of a 3D wavelet transform based encoder on hyper-threading technology (2006)
Citations
3531 | A Theory for Multiresolution Signal Decomposition: The Wavelet Representation
- Mallat
- 1989
Citation Context ...uadrature mirror filters (QMF), G = {g(n)} and H = {h(n)}, n ∈ Z. H corresponds to a low-pass filter and G is a high-pass filter. For a more detailed analysis of the relationship between wavelets and QMF see [26]. The filters H and G correspond to one step in the wavelet decomposition. Given a discrete signal, s, with a length of 2^n, at each stage of the wavelet transformation, the G and H filters are appli...
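As a rough illustration of the decomposition step quoted in the context above, the following sketch applies a low-pass filter H and a high-pass filter G to a signal of length 2^n and keeps one output per input pair. The Haar filter coefficients and the function names `decompose_step` and `decompose` are assumptions chosen for brevity; the excerpt does not say which wavelet the cited work uses.

```python
import math

# Haar analysis filters: an assumed choice, not taken from the cited paper.
H = (1 / math.sqrt(2), 1 / math.sqrt(2))    # low-pass h(n)
G = (1 / math.sqrt(2), -1 / math.sqrt(2))   # high-pass g(n)


def decompose_step(signal):
    """One stage: filter consecutive pairs with H and G, keeping one output per pair."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    approx = [H[0] * signal[i] + H[1] * signal[i + 1] for i in pairs]
    detail = [G[0] * signal[i] + G[1] * signal[i + 1] for i in pairs]
    return approx, detail


def decompose(signal):
    """Repeat the stage on the low-pass output until a single coefficient remains."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = decompose_step(approx)
        details.append(detail)
    return approx, details
```

For a signal of length 2^n, `decompose` runs n stages and halves the low-pass output each time. A constant signal yields detail coefficients of zero at every stage, since the high-pass filter only responds to differences between neighbours.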
1530 | Embedded image coding using zerotrees of wavelet coefficients
- Shapiro
- 1993
Citation Context ...$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.parco.2006.11.011 www.elsevier.com/locate/parco ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and s...
879 | Image coding using wavelet transform
- Antonini, Barlaud, et al.
- 1992
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and s...
188 | Image Compression Using the 2-D Wavelet Transform
- Lewis, Knowles
- 1992
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and s...
182 | Hyper-threading technology architecture and microarchitecture.
- Marr, Binns, et al.
- 2002
Citation Context ...ires several threads or processes, which forces us to rethink the implementation of many algorithms. Current examples of these architectures are the Intel® processors with hyper-threading technology [28]. This technology makes it feasible for a single processor to execute simultaneously two processes or threads, and adds relatively little complexity to the processor design. In the case of parallelizi...
180 | Speculative precomputation: Long-range prefetching of delinquent loads.
- Collins, Wang, et al.
- 2001
Citation Context ...ependent two tasks that are statically assigned to each thread to run concurrently. One thread executes the 3D Wavelet Transform Encoder, the master thread, while the second thread, the helper thread [16,33], carries out the prefetch instructions to put the necessary pixels for computing the wavelet transform into the cache. The helper thread must be activated by the master thread in order to perform the...
173 | Execution-based Prediction Using Speculative Slices.
- Zilles, Sohi
- 2001
Citation Context ...ependent two tasks that are statically assigned to each thread to run concurrently. One thread executes the 3D Wavelet Transform Encoder, the master thread, while the second thread, the helper thread [16,33], carries out the prefetch instructions to put the necessary pixels for computing the wavelet transform into the cache. The helper thread must be activated by the master thread in order to perform the...
155 | Simultaneous Multithreading: A Platform for Next-Generation Processors
- Eggers, Emer, et al.
- 1997
Citation Context ...y level. Nowadays, an increasing number of computers are shared memory multiprocessors or they have processors able to execute several threads at the same time using Simultaneous Multi-threading (SMT [19,24,27]). However, to exploit these architectures successfully requires several threads or processes, which forces us to rethink the implementation of many algorithms. Current examples of these architectures...
147 | Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading
- Lo, Eggers, et al.
- 1997
Citation Context ...y level. Nowadays, an increasing number of computers are shared memory multiprocessors or they have processors able to execute several threads at the same time using Simultaneous Multi-threading (SMT [19,24,27]). However, to exploit these architectures successfully requires several threads or processes, which forces us to rethink the implementation of many algorithms. Current examples of these architectures...
126 | An embedded wavelet video coder using three-dimensional set partitioning in hierarchical trees (SPIHT)
- Kim, Pearlman
- 1997
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and still images based on a wavelet algorithm. In previous works [6,8–11], we have developed and improved an encoder...
124 | Automatic program parallelization
- Banerjee, Eigenmann, et al.
- 1993
Citation Context ...rocessor to execute simultaneously two processes or threads, and adds relatively little complexity to the processor design. In the case of parallelizing a video encoder, the automatic parallelization [3,13] methods available to us do not yield any benefit. It is necessary to use manual parallelization, especially to take advantage of the benefits that hyper-threading technology provides [7]. Manual para...
95 | An Overview of JPEG-2000.
- Marcellin, Gormish, et al.
- 2000
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and s...
58 | Compressing Still and Moving Images with Wavelets
- Hilton, Jawerth, et al.
- 1994
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and s...
46 | MPEG-4: a multimedia standard for the third millennium, part 1
- Battista, Casalino, et al.
- 1999
Citation Context ...been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and still images based on a wavelet algorithm. In previous works [6,8–11], we have developed and improved an encoder for medical video based on the 3D Wa...
45 | Three-Dimensional Subband Coding of Video Using the Zerotree Method
- Chen, Pearlman
- 1996
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and still images based on a wavelet algorithm. In previous works [6,8–11], we have developed and improved an encoder...
44 | Automatic detection of parallelism: A grand challenge for high-performance computing
- Blume
- 1994
Citation Context ...rocessor to execute simultaneously two processes or threads, and adds relatively little complexity to the processor design. In the case of parallelizing a video encoder, the automatic parallelization [3,13] methods available to us do not yield any benefit. It is necessary to use manual parallelization, especially to take advantage of the benefits that hyper-threading technology provides [7]. Manual para...
26 | Information Technology–Portable Operating System Interface (POSIX)–Part 1: System Application
- IEEE
- 1996
Citation Context ... applications can be developed taking into account the resources that are used by one thread and the resources that could be used by the other one. 2.3. Pthreads and OpenMP Pthreads (POSIX threads, [1]) is a portable API commonly used for programming shared memory multiprocessors. This API is lower level than OpenMP. Hence, it allows greater control over how to exploit concurrency at the expense ...
25 | Multiscale volume representation by a DoG wavelet.
- Muraki
- 1995
Citation Context ...compression. Several coders have been developed using 2D wavelet transform [2,21,23,30,32], or 3D wavelet transform to efficiently approximate 3D volumetric data [29] or to code video sequences [15,22]. Nowadays, the standard MPEG-4 [4,5] supports an ad-hoc tool for encoding textures and still images based on a wavelet algorithm. In previous works [6,8–11], we hav...
20 | OpenMP Application Program Interface Version 3.0. http://www.openmp.org/mp-documents/spec30.pdf
- OpenMP Architecture Review Board
- 2008
Citation Context ...care of all the necessary synchronization. Pthreads has a rich set of synchronization primitives which include locks (mutexes), semaphores, barriers and condition variables. On the other hand, OpenMP [14] is a specification which allows the implementation of portable parallel programs in shared memory multiprocessor architectures using C, C++ or FORTRAN. Programming using OpenMP is based on the use of...
16 | Efficient exploitation of parallelism on Pentium III and Pentium 4 processor-based systems.
- Bik, Girkar, et al.
- 2001
Citation Context ...lar blocking with overlapping and operation reuse, and it will be the strategy assumed for the rest of this work. Moreover, we attempted to take advantage of the Streaming SIMD Extensions efficiently [12] by performing a manual vectorization using the compiler intrinsic instructions available in the Intel® C/C++ Compiler [17]. SSE extensions allow us to exploit fine grain parallelism vectorizing loo...
12 | Exploiting Speculative Thread-Level Parallelism on a
- Marcuello, González
- 1999
Citation Context ...y level. Nowadays, an increasing number of computers are shared memory multiprocessors or they have processors able to execute several threads at the same time using Simultaneous Multi-threading (SMT [19,24,27]). However, to exploit these architectures successfully requires several threads or processes, which forces us to rethink the implementation of many algorithms. Current examples of these architectures...
10 | Hyper-Threading technology: Impact on compute-intensive workloads
- Magro, Petersen, et al.
- 2002
Citation Context ...has introduced one implementation of SMT called hyper-threading [28] in its high-performance x86 processors like Xeon and Pentium IV, obtaining improvements in execution time around 30% in some cases [25]. Hyper-threading technology allows one physical processor to appear as two logical processors. From an architectural point of view, the introduction of hyper-threading technology means that the opera...
8 | A new lossy 3-D wavelet transform for high-quality compression of medical video
- Bernabe, Gonzalez, et al.
- 2000
5 | Reducing 3D Wavelet transform execution time through the streaming SIMD extensions
- Bernabé, García, et al.
- 2003
Citation Context ... high-pixels of the 3D-FWT requires 32 floating point multiplications, 24 floating point additions and 56 instructions. Using SSE, it is possible to perform those operations with only 15 instructions [6,8]. We also employed other classical methods like data prefetching and loop unrolling (in the time dimension) [6,8]. Finally, we examined the source code in order to exploit the temporal and spatial loc...
5 | Memory conscious 3-D wavelet transform
- Bernabé, González, et al.
- 2002
Citation Context ...s, the working set becomes huge and the algorithm is limited by memory (memory bound). Hence, we developed a memory conscious 3D FWT that exploits the memory hierarchy by means of blocking algorithms [11,8], thus reducing the final execution time. In particular, we proposed and evaluated two blocking approaches that differ in the way that the original working set is divided. In the first approach, we pr...
4 | Exploiting multilevel parallelism within modern microprocessors: DWT as a case study
- Tenllado, Garcia, et al.
- 2004
Citation Context ...plementation using OpenMP which is easier to implement, more readable and nearly as efficient as the original implementation using Pthreads. 1 Recently, and independently of our work, Tenllado et al. [31] have developed a related work parallelizing the discrete Wavelet transform (DWT) for its execution using hyper-threading technology and SIMD. Some of their conclusions are similar to ours: they also ...
2 | An efficient 3D Wavelet transform on Hyper-Threading technology
- Bernabé, García, et al.
- 2004
Citation Context ...elization [3,13] methods available to us do not yield any benefit. It is necessary to use manual parallelization, especially to take advantage of the benefits that hyper-threading technology provides [7]. Manual parallelization poses a considerable burden in software development and it would be desirable to minimize this increase in complexity. There are a number of alternatives for implementing a pa...
2 | Enhancing the entropy encoder of a 3D-FWT for high-quality compression of medical video
- Bernabé, González, et al.
- 2001
Citation Context ...D Wavelet Transform Encoder to compress medical video. Then, we describe the main characteristics of hyper-threading technology. 2.1. The proposed 3D Wavelet transform based encoder In previous works [9,10] we presented an implementation of a lossy encoder for medical video based on the 3D Fast Wavelet Transform (FWT). This encoder achieves high-compression ratios with excellent quality (PSNR around 41 ...
2 | Optimizing a 3DFWT Video Encoder for SMPs and HyperThreading Architectures
- Fernandez, Garcia, et al.
- 2005
Citation Context ... various methods for the 3D Wavelet Transform Encoder will be discussed. Finally, Section 5 summarizes the work and draws the main conclusions. 1 A preliminary implementation was already presented in [20]. G. Bernabé et al. / Parallel Computing 33 (2007) 54–72. 2. Background In this section, we review the framework on which our enhancements h...
1 | Intel C/C++ compiler for Linux. Available from
- Intel Corporation
Citation Context ...e attempted to take advantage of the Streaming SIMD Extensions efficiently [12] by performing a manual vectorization using the compiler intrinsic instructions available in the Intel® C/C++ Compiler [17]. SSE extensions allow us to exploit fine grain parallelism vectorizing loops which perform a simple operation over data streams. It is possible to exploit these instructions to reduce the number of f...