Efficient Compression Of Arbitrary Multi-View Video Signals (1996)
Citations: 5 (1 self)
BibTeX
@MISC{McVeigh96efficientcompression,
author = {Jeffrey Scott McVeigh},
title = {Efficient Compression Of Arbitrary Multi-View Video Signals},
year = {1996}
}
Abstract
Multiple views of a scene, obtained from cameras positioned at distinct viewpoints, can provide a viewer with the benefits of added realism, selective viewing, and improved scene understanding. The importance of these signals is evidenced by the recently proposed Multi-View Profile (MVP) extension to the MPEG-2 video compression standard, and their explicit incorporation into the future MPEG-4 standard. However, multi-view compression implementations typically rely on single-view image sequence model assumptions. We hypothesize (and demonstrate) that impressive system bandwidth reduction can be achieved by utilizing displacement vector field and image intensity models tuned to the special characteristics of multi-view video signals.

This thesis focuses on the predictive coding of non-periodic, i.e., arbitrary, multi-view video signals for the applications of simulated motion parallax and viewer-specified degree of stereoscopy. To facilitate their practical use, we desire algorithms that are applicable to the common waveform-based, hybrid encoder framework, which consists of a frame-based prediction followed by residual encoding. Three novel techniques are developed, which respectively improve the processes of frame-based prediction, residual encoding, and viewpoint interpolation. These are:
• a simple method to adaptively select the best possible reference frame, based on estimated occlusion percentage with the frame to be encoded;
• a low bit rate residual encoding technique that compensates for pixel intensity nonstationarities along a displacement trajectory and for the practical limitations of the prediction process; and
• an algorithm that correctly handles displacement estimation errors, occlusions and ambiguously-referenced image regions for the interpolation of subjectively-pleasing “virtual” viewpoints from a noisy displacement vector field.
We demonstrate the superiority of each of these algorithms on numerous multi-view video signals through comparisons with conventional techniques, and we analyze their cost/benefit ratio in terms of increases in system complexity and storage, offset by rate-distortion improvements. Finally, we indicate the relative significance of these algorithms, and provide insight into how and when they should be combined into a complete, efficient multi-view encoder/decoder system.
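
The first technique listed in the abstract, choosing a reference frame by estimated occlusion percentage, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not say how occlusion is estimated, so the sketch assumes an occlusion estimate is already available for each candidate and simply picks the smallest. The names `Candidate`, `occluded_fraction`, and `select_reference` are hypothetical, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """A candidate reference frame (all fields are illustrative assumptions)."""
    view: int                  # camera / viewpoint index
    time: int                  # frame index in time
    occluded_fraction: float   # assumed-given estimate in [0, 1] of target pixels
                               # that have no match in this candidate


def select_reference(candidates: Sequence[Candidate]) -> Candidate:
    """Return the candidate with the smallest estimated occlusion."""
    if not candidates:
        raise ValueError("at least one candidate reference frame is required")
    return min(candidates, key=lambda c: c.occluded_fraction)


# Made-up example: frame (view 1, time 5) is to be encoded.
candidates = [
    Candidate(view=1, time=4, occluded_fraction=0.12),  # same view, previous instant
    Candidate(view=0, time=5, occluded_fraction=0.05),  # neighbouring view, same instant
    Candidate(view=2, time=5, occluded_fraction=0.21),  # other neighbouring view
]
best = select_reference(candidates)
```

In this sketch the choice is a plain minimum over the estimates; a real encoder would also have to weigh the cost of signalling the chosen reference and of storing the candidate frames, which the abstract mentions only in general terms as system complexity and storage.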







