Results 1 - 3 of 3
Reconfigurable DWT unit based on lifting
 in Proceedings of ProRISC 2002
, 2002
Abstract

Cited by 3 (1 self)
At the algorithmic level, the so-called lifting scheme represents the fastest implementation of the Discrete Wavelet Transform (DWT). In this paper, a hardware accelerator for the lifting scheme is described. A lifting-based DWT unit was implemented in reconfigurable hardware, namely the Xilinx Virtex-II FPGA. The hardware module achieves its acceleration through techniques such as pipelining, data reuse, parallel operating subunits, and some specific features of Xilinx FPGAs. A VHDL model was developed and synthesized with the implementation tools of the FPGA vendor. Synthesis results demonstrate the feasibility of a 50 MHz FPGA implementation, allowing processing rates between 85 and 1087 pictures per second for a range of standard picture dimensions. To estimate the performance gains from the hardware module, we compare these results against a pure software implementation of the algorithm from the LIFTPACK software package. For a picture size of 720 x 560 pixels, assuming a clock frequency of 50 MHz for the hardware module, simulation results indicate a speedup of over 5 times versus a pure software realization on a 1 GHz general-purpose MIPS processor. Moreover, the speedup grows for larger images and for filters of higher degree. The hardware area costs are estimated at 985 Virtex-II CLB slices, 669 flip-flops, 22 Block RAMs, and 8 multiplier blocks for a baseline structure. The design is generic and scalable, which allows better performance when more parallel subunits are implemented.
MIPS Augmented with Wavelet Transform:
Abstract
The wavelet transform provides a natural basis for building both reversible and irreversible image compression systems. The lifting scheme, a key feature of the JPEG 2000 standard, is the fastest implementation of the wavelet transform. However, a pure software implementation of the lifting wavelet transform is considered a substantial bottleneck for any system using it. In this thesis, we employed and compared three software approaches that implement the Fast Lifting Wavelet Transform (FLWT): a modified version of Liftpack, the reversible wavelet transform (Le Gall 5/3 filter), and the irreversible wavelet transform (Daubechies 9/7). All of these approaches are based on integer wavelet transforms using the lifting scheme on two-dimensional images, but they differ in their filtering methods and in their boundary treatment of the data. In this comparison, the modified version of Liftpack was found to be the fastest FLWT implementation. We further compare the software-only implementation of the FLWT with a hybrid software/hardware implementation. A reconfigurable hardware implementation of the FLWT algorithm was simulated as a functional unit in a MIPS-based processor. A new instruction was introduced as an ISA extension to the MIPS architecture for the FLWT reconfigurable hardware unit. Simulations were carried out in the cycle-accurate simulator sim-outorder of the SimpleScalar toolset (V 3.0). For an image size of 352 x 288, the results indicate a speedup of over 4 times for Liftpack, 3.7 times for the reversible wavelet transform, and 11 times for the irreversible wavelet transform versus the pure software implementation. It was also noted that the speedup rises as the picture dimensions or the filter length increase.
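To make the reversible transform mentioned above concrete, here is a minimal sketch of one level of the one-dimensional integer 5/3 lifting transform in the form commonly given for JPEG 2000. It is an illustration only, not the code evaluated in the thesis: the function names, the restriction to even-length input, and the whole-sample symmetric boundary rule are assumptions made for this example.

```python
def _mirror(i, n):
    """Reflect an out-of-range index (assumed whole-sample symmetric extension)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def forward_53(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (s, d): the low-pass (approximation) and high-pass (detail) halves.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length input only"
    half = n // 2
    # Predict step: each odd sample minus the floored mean of its even neighbours.
    d = [x[2 * k + 1] - ((x[2 * k] + x[_mirror(2 * k + 2, n)]) >> 1)
         for k in range(half)]
    # Update step: each even sample plus a rounded quarter of the adjacent details.
    # At the left edge the mirrored detail d[-1] equals d[0].
    s = [x[2 * k] + ((d[max(k - 1, 0)] + d[k] + 2) >> 2)
         for k in range(half)]
    return s, d


def inverse_53(s, d):
    """Undo forward_53 by running the lifting steps in reverse order."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    # Undo the update step to recover the even samples.
    for k in range(half):
        x[2 * k] = s[k] - ((d[max(k - 1, 0)] + d[k] + 2) >> 2)
    # Undo the predict step to recover the odd samples.
    for k in range(half):
        x[2 * k + 1] = d[k] + ((x[2 * k] + x[_mirror(2 * k + 2, n)]) >> 1)
    return x


if __name__ == "__main__":
    signal = [10, 12, 14, 200, 3, 3, 7, 0]
    low, high = forward_53(signal)
    print(low, high)
    print(inverse_53(low, high) == signal)
```

Because each lifting step only adds or subtracts an integer computed from samples that the step leaves unchanged, the inverse subtracts or adds exactly the same integer, which is what makes the transform reversible without any floating-point arithmetic.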
Microarchitectural Extension for Lifting-based DWT
, 2002
Abstract
At the algorithmic level, the so-called lifting scheme represents the fastest implementation of the Discrete Wavelet Transform (DWT). In this paper, we investigate a novel microarchitectural extension for the DWT based on the lifting scheme. A new fast lifting DWT (FLWT) instruction is introduced as an ISA extension of a MIPS architecture. Simulations have been carried out with a cycle-accurate simulator (SimpleScalar). As benchmark software, we have used a modified version of the LIFTPACK package, optimized for integer arithmetic. A microcodable DWT unit has been implemented on a reconfigurable hardware platform, namely the Xilinx Virtex-II FPGA. To accelerate the transform process, the hardware module utilizes pipelining, data reuse, data parallelism, and some specific features of Xilinx FPGAs. We have used the original synthesis tools of the FPGA vendor to obtain realistic data on the performance of the unit. Simulation results indicate a speedup of over 4 times versus the pure software implementation, assuming clock frequencies of 50 MHz for the DWT module and 1 GHz for the general-purpose MIPS. An important advantage of the module is that its speedup over the pure software implementation would be even higher for higher degrees of the filter polynomial. The speedup also increases as the dimensions of the processed pictures grow. More dramatic improvements in performance can be achieved, since the design can be scaled up to utilize higher degrees of parallelism. The introduced microarchitecture can be utilized by wavelet-based encoding tools and standards such as JPEG 2000 and MPEG-4.
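As a rough reading of the clock figures quoted above, the sketch below converts a wall-clock speedup into the implied ratio of clock cycles. It assumes, purely for illustration, that the software run is timed entirely at the processor clock and the accelerated run entirely at the DWT module clock; the abstract does not state this, and the helper name cycle_ratio is hypothetical.

```python
def cycle_ratio(speedup, f_sw_hz, f_hw_hz):
    """Implied (software cycles) / (hardware cycles) for a wall-clock speedup.

    Derivation under the stated assumption: time = cycles / frequency, so
    speedup = (c_sw / f_sw) / (c_hw / f_hw), hence
    c_sw / c_hw = speedup * f_sw / f_hw.
    """
    return speedup * f_sw_hz / f_hw_hz


if __name__ == "__main__":
    # Figures from the abstract: speedup of 4, 1 GHz MIPS, 50 MHz DWT module.
    print(cycle_ratio(4, 1_000_000_000, 50_000_000))
```

Under that simplification, a 4x speedup at a 20:1 clock disadvantage would correspond to the hardware path needing roughly 80 times fewer cycles than the software path.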