Multiview video compression -- Exploiting Inter-image Similarities
- IEEE SIGNAL PROCESSING MAGAZINE, 2007
Cited by 4 (0 self)
Advances in display and camera technology enable new applications for three-dimensional (3-D) scene communication. Among the most important of these applications is 3-D TV; it strives to create realistic 3-D impressions of natural 3-D scenes [1]. Usually, multiple video cameras are used to simultaneously acquire various viewpoints of a scene. The resulting data are often referred to as multiview video. As the potential degree of 3-D realism improves with the camera density around the scene, a vast amount of multiview video data needs to be stored or transmitted for 3-D TV. Multiview video data is also expected to consume a large portion of the bandwidth available in the Internet of the future. This will include point-to-point communication as well as multicast scenarios. Multimedia distribution via sophisticated content delivery networks and flexible peer-to-peer networks enables multiview video on demand as well as live broadcasts. Due to the vast raw bit rate of multiview video, efficient compression techniques are essential for 3-D scene communication [2]. As the video data originate from the same scene, the inherent similarities of the multiview imagery are exploited for efficient compression. These similarities can be classified into two types, inter-view similarity between adjacent camera views
GENERATION OF REDUNDANT FRAME STRUCTURE FOR INTERACTIVE MULTIVIEW STREAMING
Cited by 4 (2 self)
While multiview video coding focuses on the rate-distortion performance of compressing all frames of all views, we address the problem of designing a frame structure to enable interactive multiview streaming, where clients can interactively switch views during video playback. Thus, as a client is playing back successive frames (in time) for a given view, it can send a request to the server to switch to a different view while continuing uninterrupted temporal playback. Noting that standard tools for random access (i.e., I-frame insertion) can be inefficient for this application, we propose a technique where redundant representations of some frames can be stored to facilitate view switching. We first present an optimal algorithm with exponential running time that generates such a frame structure so that the expected transmission rate is optimally traded off with total storage. We then present methods to reduce the algorithm complexity for practical use. We show in our experiments that we can generate redundant frame structures offering a range of tradeoff points between transmission rate and storage, including ones that outperform simple I-frame insertion structures by up to 48% in terms of bandwidth efficiency for similar storage costs.
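As a rough illustration of what "a range of tradeoff points" means here, the sketch below filters a set of candidate frame structures down to those that are not dominated in both storage and expected transmission rate. This is not the algorithm from the paper; the function name `pareto_front` and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (storage_cost, expected_transmission_rate).
# Lower is better for both quantities.
Candidate = Tuple[float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate beats on both axes.

    The result is sorted by increasing storage, so the expected rate
    strictly decreases along the returned list.
    """
    front: List[Candidate] = []
    best_rate = float("inf")
    # Sorting by (storage, rate) lets a single pass detect dominated points.
    for storage, rate in sorted(set(candidates)):
        if rate < best_rate:
            front.append((storage, rate))
            best_rate = rate
    return front


# Illustrative numbers only, not taken from the paper.
candidates = [(1.0, 10.0), (2.0, 6.0), (2.5, 7.0), (3.0, 6.0), (4.0, 3.0)]
front = pareto_front(candidates)
print(front)
```

Each point that survives the filter is a structure where spending more storage buys a strictly lower expected transmission rate.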
Joint Tracking and Multiview Video Compression
Cited by 3 (2 self)
In immersive communication applications, knowing the user’s viewing position can help improve the efficiency of multiview compression and streaming significantly, since often only a subset of the views is needed to synthesize the desired view(s). However, uncertainty regarding the viewer location can have negative impacts on the rendering quality. In this paper, we propose an algorithm to improve the robustness of view-dependent compression schemes by jointly performing user tracking and compression. A face tracker tracks the user’s head location and sends the probability distribution of the face locations as one or many particles. The server then applies a motion model to the particles and compresses the multiview video accordingly in order to improve the expected rendering quality for the viewer. Experimental results show significantly improved robustness against tracking errors.
Anchor View Allocation for Collaborative Free Viewpoint Video Streaming
Cited by 1 (1 self)
In free viewpoint video, a viewer can choose at will any camera angle or the so-called “virtual view” to observe a dynamic 3-D scene, enhancing his/her depth perception. The virtual view is synthesized using texture and depth videos of two anchor camera views via depth-image-based rendering (DIBR). We consider, for the first time, collaborative live streaming of a free viewpoint video, where a group of users may interactively pull and cooperatively share streams of different anchor views. There is a cost to access the anchor views from the live source, a cost to “reconfigure” the peer network due to a change in selected anchors during view switching, and a distortion cost due to the distance of the virtual views to the received anchor views at users. We optimize the anchor views allocated to users so as to minimize the overall streaming cost given by the access cost, reconfiguration cost, and view distortion cost. We first show that, if the reconfiguration cost due to view switching is negligible, the view allocation problem can be optimally and efficiently solved in polynomial time using dynamic programming. For the case of non-negligible reconfiguration cost, the problem becomes NP-hard. We thus present a locally optimal and centralized algorithm inspired by Lloyd’s algorithm used in non-uniform scalar quantization. We further propose a distributed algorithm with convergence guarantee, where each peer group independently makes merge-and-split decisions with a well-defined fairness criterion. Simulation results show that our algorithms achieve low streaming cost due to their excellent anchor view allocation. Index Terms: Digital video broadcasting, multimedia computing.
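For readers unfamiliar with the Lloyd iteration mentioned in this abstract, the sketch below shows the classic version for non-uniform scalar quantization: alternate between assigning samples to their nearest reconstruction level and moving each level to the centroid of its cell. It illustrates only the textbook technique that the paper says inspired its method, not the anchor view allocation algorithm itself; the function name `lloyd_scalar` and the sample data are assumptions made for this example.

```python
from typing import List


def lloyd_scalar(samples: List[float], levels: List[float], iters: int = 50) -> List[float]:
    """Classic Lloyd iteration for a scalar quantizer (squared-error distortion)."""
    levels = sorted(levels)
    for _ in range(iters):
        # Assignment step: put each sample in the cell of its nearest level.
        cells: List[List[float]] = [[] for _ in levels]
        for x in samples:
            k = min(range(len(levels)), key=lambda i: (x - levels[i]) ** 2)
            cells[k].append(x)
        # Update step: move each level to the centroid of its cell.
        # An empty cell keeps its previous level.
        new_levels = [sum(c) / len(c) if c else lv for c, lv in zip(cells, levels)]
        if new_levels == levels:
            break  # Converged to a local optimum.
        levels = new_levels
    return levels


# Illustrative data: two well-separated clusters.
result = lloyd_scalar([0.0, 1.0, 2.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(result)
```

Like the algorithm described in the abstract, this iteration only guarantees a local optimum, so the outcome depends on the initial levels.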
RDTC Optimized Streaming for Remote Browsing in Image-Based Scene Representations
Remote navigation in compressed image-based scene representations requires random access to arbitrary parts of the reference image data to recompose virtual views. The degree of inter-frame dependencies exploited during compression has an impact on the effort needed to access reference images and delimits the rate-distortion (RD) trade-off that can be achieved. If, additionally, a given receiver hardware and a maximum available channel bitrate are taken into account, the traditional rate-distortion optimization is extended to an RDTC trade-off between rate (R), distortion (D), transmission data rate (T), and decoding complexity (C). In this work we introduce our RDTC optimized compression framework. In addition, an experimental testbed for streaming of these RDTC optimally compressed image-based scene representations is described and the impact of client side caching is investigated. Our results show a significant reduction in user perceived delay for RDTC optimized streams in such a remote browsing environment compared to RD optimized or independently encoded scene representations.
OPTIMAL FRAME STRUCTURE DESIGN USING LANDMARKS FOR INTERACTIVE LIGHT FIELD STREAMING
A light field is a large set of spatially correlated images of the same static scene captured using a 2D array of closely spaced cameras. Interactive light field streaming is the application where a client continuously requests successive light field images along a view trajectory of his choosing, and in response the server transmits appropriate data for the client to correctly reconstruct desired images. The technical challenge is how to encode captured light field images into a reasonably sized frame structure a priori (without knowing eventual clients’ view trajectories), so that at stream time, the expected server transmission rate can be minimized, while satisfying the client’s view-switch requests. In this paper, using I-frames, redundant P-frames and distributed source coding (DSC) frames as building blocks, we design coding structures to optimally trade off storage size of the frame structure with expected server transmission rate. The key novelty is to facilitate the use of “landmarks” in the structure (popular reference frames cached in the decoder buffer) so that the probability of having at least one useful predictor frame available in the buffer for disparity compensation is greatly increased. We first derive recursive equations to find the optimal caching strategy for a given coding structure. We then formulate the structure design problem as a Lagrangian minimization, and propose fast heuristics to find near-optimal solutions. Experimental results show that the expected server streaming rate can be reduced by up to 93.6% compared to an I-frame-only structure, at twice the storage required. Index Terms: light field, interactive streaming, optimization
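The Lagrangian formulation mentioned in this abstract can be illustrated with a minimal sketch: given candidate structures with a known storage size and expected transmission rate, choose the one minimizing rate plus a multiplier times storage. The candidate names, the numbers, and the helper `best_structure` are all hypothetical; the abstract does not give the paper's actual cost terms or heuristics, so this shows only the general idea of the trade-off.

```python
from typing import List, Tuple

# Hypothetical representation: (name, storage_size, expected_transmission_rate).
Structure = Tuple[str, float, float]


def best_structure(structures: List[Structure], lam: float) -> Structure:
    """Pick the structure minimizing the Lagrangian cost: rate + lam * storage."""
    return min(structures, key=lambda s: s[2] + lam * s[1])


# Illustrative numbers only, not taken from the paper.
structures = [
    ("I-only", 1.0, 100.0),
    ("mixed", 2.0, 20.0),
    ("redundant", 5.0, 10.0),
]

# Sweeping the multiplier moves the choice from rate-focused to storage-focused.
choices = [best_structure(structures, lam)[0] for lam in (0.0, 5.0, 100.0)]
print(choices)
```

A small multiplier treats storage as nearly free and favors the lowest rate, while a large one penalizes storage and falls back to the smallest structure.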
OPTIMIZED FRAME STRUCTURE FOR INTERACTIVE LIGHT FIELD STREAMING WITH COOPERATIVE CACHING
A light field is a large set of spatially correlated images of the same static scene captured using a 2D array of closely spaced cameras. Interactive light field streaming is the application where a client continuously requests successive light field images along a view trajectory of her choosing, and in response the server transmits appropriate data for the client to correctly reconstruct desired images. The technical challenge is how to encode captured light field images into a reasonably sized frame structure a priori (without knowing eventual clients’ view trajectories), so that during a streaming session, the expected server transmission rate can be minimized, while satisfying the client’s view requests. In this paper, we design efficient frame structures, using I-frames, redundant P-frames and distributed source coding (DSC) frames as building blocks, to optimally trade off storage size of the frame structure with expected server transmission rate. The key novelty is to optimize structures in such a way that decoded images in caches of neighboring cooperative peers, connected together via a secondary network such as ad hoc WLAN for content sharing, can be reused to further decrease the server-to-client transmission rate. We formulate the structure design problem as a Lagrangian minimization, and propose fast heuristics to find near-optimal solutions. Experimental results show that the expected server streaming rate can be reduced by up to 83% compared to an I-frame-only structure, at less than twice the storage required. Index Terms: light field, interactive streaming, cooperative caching
Dynamic Node Join Algorithm with Rate-Distortion for P2P Live Multipath Networks
The delivery of multimedia that efficiently maximizes its quality in changing network conditions is one of the most challenging tasks in the design of live streaming systems. This study attempts to improve current P2P (peer-to-peer) live streaming systems by allowing users to enjoy high-quality service under the limitations of network resources. The proposed improvement method involves summarizing and analyzing the factors and constraints that affect live stream quality during system operation. The proposed R-D (Rate-Distortion) optimized dynamic node join algorithm is based on the multipath streaming concept and a receiver-driven approach. This distributed algorithm enables the system to evaluate the current network status in order to optimize the end-to-end distortion of P2P networks. Results of this study demonstrate the effectiveness of the proposed approach. Key Words: P2P live streaming, Rate-Distortion, multipath streaming, receiver-driven, end-to-end distortion