Results 1 - 8 of 8
Fast GPU-Based CT Reconstruction Using the Common Unified Device Architecture (CUDA)
 Nuclear Science Symposium Conference Record, 2007. NSS ’07. IEEE 6
Cited by 29 (5 self)
The Common Unified Device Architecture (CUDA) is a fundamentally new programming approach that makes use of the unified shader design of the most current Graphics Processing Units (GPUs) from NVIDIA. The programming interface allows an algorithm to be implemented using standard C and a few extensions, without any knowledge of graphics programming with OpenGL, DirectX, or shading languages. We apply this revolutionary new technology to the FDK method, which solves the three-dimensional reconstruction task in cone-beam CT. The computational complexity of this algorithm prohibits its use for many medical applications without hardware acceleration. Today’s GPUs, with their high level of parallelism, are cost-efficient processors for performing the FDK reconstruction according to medical requirements. In this paper, we present an innovative implementation of the most time-consuming parts of the FDK algorithm: filtering and backprojection. We also explain the transformations required to parallelize the algorithm for the CUDA architecture. Our implementation approach further allows an on-the-fly reconstruction, which means that the reconstruction is completed right after the end of data acquisition. This enables us to present the reconstructed volume to the physician in real time, immediately after the last projection image has been acquired by the scanning device. Finally, we compare our results to our highly optimized FDK implementation on the Cell Broadband Engine Architecture (CBEA), both with respect to reconstruction speed and implementation effort.
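The backprojection step and the on-the-fly idea named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' CUDA implementation: the 3x4 projection matrix `P`, the nearest-neighbour detector sampling, the 1/w² factor used as a stand-in for the FDK distance weight, and all function and variable names are assumptions made for illustration. The point it shows is that each filtered projection updates the volume independently, so the volume can be accumulated as projections arrive.

```python
import numpy as np

def backproject_one(volume, projection, P):
    """Accumulate one (already filtered) projection into `volume` in place.

    volume     : float array of shape (nz, ny, nx)
    projection : float array of shape (rows, cols)
    P          : hypothetical 3x4 matrix mapping voxel (x, y, z, 1) to (u*w, v*w, w)
    """
    nz, ny, nx = volume.shape
    z, y, x = np.meshgrid(np.arange(nz), np.arange(ny), np.arange(nx), indexing="ij")
    # Homogeneous voxel coordinates, shape (4, N), in the same order as volume.ravel()
    pts = np.stack([x.ravel(), y.ravel(), z.ravel(), np.ones(x.size)])
    u, v, w = P @ pts

    # Perspective divide, only where the voxel lies in front of the source
    valid = w > 0
    ui = np.full(w.shape, -1)
    vi = np.full(w.shape, -1)
    ui[valid] = np.rint(u[valid] / w[valid]).astype(int)
    vi[valid] = np.rint(v[valid] / w[valid]).astype(int)

    rows, cols = projection.shape
    inside = valid & (ui >= 0) & (ui < cols) & (vi >= 0) & (vi < rows)

    # Nearest-neighbour sample with an assumed 1/w^2 distance weight
    contrib = np.zeros(x.size)
    contrib[inside] = projection[vi[inside], ui[inside]] / w[inside] ** 2
    volume += contrib.reshape(volume.shape)

# Toy geometry (assumption): u = x, v = y, w = 1 for every voxel
P_demo = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
vol = np.zeros((2, 2, 2))
proj = np.array([[1.0, 2.0],
                 [3.0, 4.0]])

# Projections are consumed one at a time, as they would arrive from a scanner
for p in [proj, proj]:
    backproject_one(vol, p, P_demo)
```

In a GPU version, the per-voxel work inside `backproject_one` is what would typically be distributed across threads; the loop over incoming projections stays sequential.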
Statistical Cone-Beam CT Image Reconstruction Using the Cell Broadband Engine
Cited by 4 (0 self)
CT images can be reconstructed analytically or iteratively. The analytic methods, e.g. filtered backprojection, are known to be computationally inexpensive and highly accurate. Iterative reconstruction is of high interest since better dose usage is expected. However, iterative methods are computationally extremely expensive and have therefore been applied only to modalities with low amounts of data (e.g. PET). A promising algorithm for CT is the ordered subset convex (OSC) algorithm, whose initial design has recently been significantly improved and now achieves high image quality. Recently, a novel general purpose architecture optimized for distributed computing became available: the Cell Broadband Engine (CBE). Its eight synergistic processing elements (SPEs) currently allow for a theoretical performance of 192 GFlops. We aim at maximizing the OSC image reconstruction speed for flat-panel-based cone-beam CT such as micro-CT or C-arm CT. For this geometry, highly optimized perspective CBE-based cone-beam forward and backprojection algorithms were designed and implemented. Performance was assessed by reconstructing a 512³ volume from 512 cone-beam projections of size 1024². In combination with a preceding Feldkamp-type initialization, four OSC iterations turned out to be sufficient to achieve high image quality. Using both CBEs of our dual Cell-based blade (Mercury Computer Systems) allows the whole volume to be reconstructed in about one minute.
Tomographic Image Reconstruction using the Cell Broadband Engine (CBE) General Purpose Hardware
Tomographic image reconstruction, such as the reconstruction of CT projection values, of tomosynthesis data, or of PET or SPECT events, is computationally very demanding. In filtered backprojection as well as in iterative reconstruction schemes, the most time-consuming steps are forward and backprojection, which are often limited by the memory bandwidth. Recently, a novel general purpose architecture optimized for distributed computing became available: the Cell Broadband Engine (CBE). Its eight synergistic processing elements (SPEs) currently allow for a theoretical performance of 192 GFlops (3 GHz, 8 units, 4 floats per vector, 2 instructions, multiply and add, per clock). To maximize image reconstruction speed, we modified our parallel-beam and perspective backprojection algorithms, which are highly optimized for standard PCs, and optimized the code for the CBE processor [1-3]. In addition, we implemented an optimized perspective forward projection on the CBE, which allows us to perform statistical image reconstructions such as the ordered subset convex (OSC) algorithm [4]. Performance was measured using simulated data with 512 projections per rotation and 512² detector elements. The data were backprojected into an image of 512³ voxels using our PC-based approaches and the new CBE-based algorithms. Both the PC and the CBE timings were scaled to a 3 GHz clock frequency. On the CBE, we obtain total reconstruction times of 4.04 s for the parallel backprojection, 13.6 s for the perspective backprojection, and 192 s for a complete OSC reconstruction, consisting of one initial Feldkamp reconstruction followed by 4 OSC iterations.
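The 192 GFlops figure quoted above follows directly from the factors listed in the parentheses. As a small worked check (the function and parameter names are illustrative, not taken from the paper):

```python
def peak_gflops(clock_ghz: float, units: int, floats_per_vector: int, ops_per_clock: int) -> float:
    """Theoretical peak in GFlops = clock (GHz) * units * vector width * ops per clock."""
    return clock_ghz * units * floats_per_vector * ops_per_clock

# Values as stated in the abstract: 3 GHz, 8 SPEs, 4 floats per vector,
# 2 operations (multiply and add) per clock
cell_peak = peak_gflops(clock_ghz=3.0, units=8, floats_per_vector=4, ops_per_clock=2)
print(cell_peak)  # 192.0
```

This is a theoretical upper bound only; the abstract itself notes that forward and backprojection are often limited by memory bandwidth rather than by arithmetic throughput.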
A Hardware Projector/Backprojector Pair for 3D PET Reconstruction
Forward and backward projection are two computationally costly steps in tomographic image reconstruction, for example in Positron Emission Tomography (PET). To reduce reconstruction time, a hardware projection/backprojection pair has been built following algorithm-architecture adequacy principles. Thanks to an original memory access strategy based on a 3D adaptive and predictive memory cache, the external memory wall has been overcome. Thus, for both projector architectures, several units run efficiently. Each unit reaches a computational throughput close to 1 operation per cycle. In this paper, we present how an analytic (3DRP) and an iterative (3DEM) reconstruction algorithm can be implemented on a System on Programmable Chip (SoPC) from our hardware projection/backprojection pair. First, a hardware/software partitioning is done based on the different steps of each algorithm. The reconstruction system is then composed of two hardware configurations of the programmable logic resources (FPGA), each corresponding mainly to the projection or the backprojection step. Our projector/backprojector has been validated with a software 3DRP and 3DEM reconstruction on simulated PET-SORTEO data. The reconstruction time of these systems is evaluated from the measured performance of our projector IPs and the estimated performance of the additional simple hardware IPs. The expected reconstruction time is compared with the software tomography distribution STIR. A speedup of 7 can be expected for the 3DRP algorithm and a speedup of 3.5 for the 3DEM algorithm. For both algorithms, the expected architecture cycle efficiency is much greater than that of the software implementation: 120 times for 3DRP and 60 times for 3DEM.
Accelerated cone-beam backprojection using GPU-CPU hardware
The three-dimensional image reconstruction process used in interventional CT imaging is computationally demanding. Implementation on general-purpose computational platforms requires substantial processing time, which is undesirable during time-critical surgical and minimally invasive procedures. Central and Graphics Processing Units (CPUs and GPUs, respectively) have been studied as a platform to accelerate 3D imaging. GPU devices offer a programmable hardware architecture, suitable for pipelining and high levels of parallel processing to increase computational throughput, as well as the benefits of being off-the-shelf and effectively scalable solutions. The focus of this paper is on the backprojection step of the image reconstruction process, since it is the most computationally intensive part. Using the modified Feldkamp-Davis-Kress (FDK) cone-beam algorithm, our feasibility studies indicate that the entire 512³ image reconstruction on a mobile X-ray C-arm can be accelerated to real time (i.e. completed immediately after an exposure scan of 15-30 seconds duration). Accelerated cone-beam backprojection using GPU-CPU hardware. Index Terms: X-ray tomography, image reconstruction
Why do Commodity Graphics Hardware Boards (GPUs) work so well for acceleration of Computed Tomography?
 SPIE Electronic Imaging 2007, Computational Imaging V Keynote
Commodity graphics hardware boards (GPUs) have achieved remarkable speedups in various sub-areas of Computed Tomography (CT). This paper takes a close look at the GPU architecture and its programming model and describes a successful acceleration of Feldkamp’s cone-beam CT reconstruction algorithm. Further, we will also take a comparative look at the newly emerging Cell architecture in this regard, which, similar to GPUs, has also seen its first deployment in gaming and entertainment. To complete the discussion of high-performance PC-based computing platforms, we will also compare GPUs with FPGA (Field Programmable Gate Array) based medical imaging solutions.
Implementation of the FDK Algorithm for Cone-Beam CT on the Cell Broadband Engine Architecture
In most of today’s commercially available cone-beam CT scanners, the well-known FDK method is used for solving the 3D reconstruction task. The computational complexity of this algorithm prohibits its use for many medical applications without hardware acceleration. The brand-new Cell Broadband Engine Architecture (CBEA), with its high level of parallelism, is a cost-efficient processor for performing the FDK reconstruction according to medical requirements. The programming scheme, however, is quite different from that of standard personal computer hardware. In this paper, we present an innovative implementation of the most time-consuming parts of the FDK algorithm: filtering and backprojection. We also explain the transformations required to parallelize the algorithm for the CBEA. Our software framework allows filtering and backprojection to be computed in parallel, making an on-the-fly reconstruction possible. The results demonstrate that a complete FDK reconstruction is computed with the CBEA in less than seven seconds for a standard clinical scenario. Given that scan times are usually much longer, we conclude that reconstruction is finished right after the end of data acquisition. This enables us to present the reconstructed volume to the physician in real time, immediately after the last projection image has been acquired by the scanning device.