Region-based representations of image and video: Segmentation tools for multimedia services
, 1999
Cited by 57 (3 self)
This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. The review concentrates on tools useful in the context of the MPEG-4 and MPEG-7 standards. The review is structured around the strategies used by the algorithms (transition-based or homogeneity-based) and the decision spaces (spatial, spatio-temporal and temporal). The second part of the paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in...
Binary Partition Tree as an Efficient Representation for Image Processing, Segmentation, and Information Retrieval
, 2000
Cited by 48 (7 self)
This paper discusses the interest of Binary Partition Trees as a region-oriented image representation. Binary Partition Trees concentrate in a compact and structured representation a set of meaningful regions that can be extracted from an image. They offer a multi-scale representation of the image and define a translation-invariant 2-connectivity rule among regions. As shown in the paper, this representation can be used for a large number of processing goals such as filtering, segmentation, information retrieval and visual browsing. Furthermore, the processing of the tree representation leads to very efficient algorithms. Finally, for some applications, it may be interesting to compute the Binary Partition Tree once and to store it for subsequent use for various applications. In this context, the last section of the paper will show that the amount of bits necessary to encode a Binary Partition Tree remains moderate. Keywords: Nonlinear filtering, Connected operators, Mathematical mo...
Image Sequence Analysis for Emerging Interactive Multimedia Services-The European COST 211 Framework
- IEEE Trans. Circuits Syst. Video Technol
, 1998
Cited by 40 (4 self)
Flexibility and efficiency of coding, content extraction, and content-based search are key research topics in the field of interactive multimedia. Ongoing ISO MPEG-4 and MPEG-7 activities are targeting standardization to facilitate such services. European COST Telecommunications activities provide a framework for research collaboration. COST 211 bis and COST 211 ter activities have been instrumental in the definition and development of the ITU-T H.261 and H.263 standards for videoconferencing over ISDN and videophony over regular phone lines, respectively. The group has also contributed significantly to the ISO MPEG-4 activities. At present a significant effort of the COST 211 ter group activities is dedicated to image and video sequence analysis and segmentation, an important technological aspect for the success of emerging object-based MPEG-4 and MPEG-7 multimedia applications. The current work of COST 211 is centered around the test model, called the Analysis Model (AM). ...
A Stochastic Framework For Optimal Key Frame . . .
- MPEG VIDEO DATABASES,” COMPUTER VISION AND IMAGE UNDERSTANDING
, 1999
Cited by 39 (28 self)
A framework for video content representation is proposed in this paper for extracting limited, but meaningful, information of video data directly from the MPEG compressed domain. First, the traditional frame-based representation is transformed to a feature-based one. Then, all features are gathered together using a fuzzy formulation, and several key frames are extracted for each shot in a content-based rate sampling framework. In particular, our approach is based on minimization of a cross-correlation criterion among video frames of a given shot so as to locate a set of minimally correlated feature vectors. Experimental results indicating the good performance of the proposed scheme are also presented.
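Illustrative note: the key-frame step described in this abstract, choosing frames whose feature vectors are minimally correlated, can be sketched as below. This is only a sketch under stated assumptions and not the authors' method: the names `correlation` and `select_key_frames` are hypothetical, the criterion is assumed to be the sum of squared Pearson correlation coefficients over all selected pairs, and an exhaustive search stands in for whatever optimization the paper uses.

```python
from itertools import combinations
from math import sqrt


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a constant vector counts as uncorrelated
    return cov / (norm_u * norm_v)


def select_key_frames(features, k):
    """Return the k frame indices with the smallest total squared correlation.

    Exhaustive search over all k-subsets; practical only for short shots.
    """
    def cost(indices):
        return sum(
            correlation(features[i], features[j]) ** 2
            for i, j in combinations(indices, 2)
        )

    return min(combinations(range(len(features)), k), key=cost)


# One feature vector per frame of a (hypothetical) shot.
shot_features = [[1, 2, 3, 4], [2, 4, 6, 9], [1, -1, -1, 1]]
key_frames = select_key_frames(shot_features, 2)
```

The exhaustive search grows combinatorially with shot length, so a real system would need a cheaper search strategy.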
Efficient Summarization of Stereoscopic Video Sequences
- IEEE TRANS. ON CSVT
, 2000
Cited by 23 (22 self)
An efficient technique for summarization of stereoscopic video sequences is presented in this paper, which extracts a small but meaningful set of video frames using a content-based sampling algorithm. The proposed video-content representation provides the capability of browsing digital stereoscopic video sequences and performing more efficient content-based queries and indexing. Each stereoscopic video sequence is first partitioned into shots by applying a shot-cut detection algorithm so that frames (or stereo pairs) of similar visual characteristics are gathered together. Each shot is then analyzed using stereo-imaging techniques, and the disparity field, occluded areas, and depth map are estimated. A multiresolution implementation of the Recursive Shortest Spanning Tree (RSST) algorithm is applied for color and depth segmentation, while fusion of color and depth segments is employed for reliable video object extraction. In particular, color segments are projected onto depth segments so that video objects on the same depth plane are retained, while at the same time accurate object boundaries are extracted. Feature vectors are then constructed using multidimensional fuzzy classification of segment features including size, location, color, and depth. Shot selection is accomplished by clustering similar shots based on the generalized Lloyd-Max algorithm, while for a given shot, key frames are extracted using an optimization method for locating frames of minimally correlated feature vectors. For efficient implementation of the latter method, a genetic algorithm is used. Experimental results are presented, which indicate the reliable performance of the proposed scheme on real-life stereoscopic video sequences.
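Illustrative note: the projection of color segments onto depth segments mentioned in this abstract can be sketched as follows. The abstract does not state the projection rule, so this sketch assumes a majority-overlap rule: each color segment is assigned, as a whole, to the depth segment that covers most of its pixels. The function name `fuse_color_onto_depth` and the label-map layout (2D lists of integer segment labels) are hypothetical.

```python
from collections import Counter, defaultdict


def fuse_color_onto_depth(color_labels, depth_labels):
    """Relabel every color segment with the depth segment it overlaps most.

    Both inputs are 2D lists of segment labels with identical shape.
    The output keeps color-segment boundaries but uses depth-segment labels.
    """
    # overlap[c][d] = number of pixels shared by color segment c and depth segment d
    overlap = defaultdict(Counter)
    for color_row, depth_row in zip(color_labels, depth_labels):
        for c, d in zip(color_row, depth_row):
            overlap[c][d] += 1

    # Assumed rule: a color segment goes to its majority depth segment.
    assignment = {c: counts.most_common(1)[0][0] for c, counts in overlap.items()}
    return [[assignment[c] for c in row] for row in color_labels]


color = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 2, 1]]
depth = [[0, 0, 0, 1],
         [0, 0, 1, 1],
         [0, 1, 1, 1]]
fused = fuse_color_onto_depth(color, depth)
```

In this toy input the depth map has a ragged boundary, while the fused map follows the color-segment boundaries, which is the effect the abstract describes.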
Interactive Content-Based Retrieval in Video Databases Using Fuzzy Classification and Relevance Feedback
, 1999
Cited by 15 (9 self)
This paper presents an integrated framework for interactive content-based retrieval in video databases by means of visual queries. The proposed system incorporates algorithms for video shot detection, keyframe and shot selection, automated video object segmentation and tracking, and construction of multidimensional feature vectors using fuzzy classification of color, motion or texture segment properties. Retrieval is then performed in an interactive way by employing a parametric distance between feature vectors and updating distance parameters according to user requirements using relevance feedback. Experimental results demonstrate increased performance and flexibility according to user information needs.
Efficient Unsupervised Content-Based Segmentation In Stereoscopic Video Sequences
, 1999
Cited by 12 (11 self)
This paper presents an efficient technique for unsupervised content-based segmentation in stereoscopic video sequences by appropriately combining different content descriptors in a hierarchical framework. Three main modules are involved in the proposed scheme: extraction of reliable depth information, image partition into color and depth regions, and a constrained fusion algorithm of color segments using information derived from the depth map. In the first module, each stereo pair is analyzed and the disparity field and depth map are estimated. Occlusion detection and compensation are also applied for improving the depth map estimation. In the following phase, color and depth regions are created using a novel complexity-reducing multiresolution implementation of the Recursive Shortest Spanning Tree algorithm (M-RSST). While depth segments provide a coarse representation of the image content, color regions describe object boundaries very accurately. For this reason, in the final phase, a new segmentation fusion algorithm is employed which projects color segments onto depth segments. Experimental results are presented which exhibit the efficiency of the proposed scheme as a content-based descriptor, even in the case of images with complicated visual content.
The Image Foresting Transformation
- IEEE Trans. on Pattern Analysis and Machine Intelligence
, 2000
Cited by 9 (2 self)
In this paper, we introduce an image processing operator called Image Foresting Transformation (IFT). The image foresting transformation maps an input image into a graph, computes a shortest-path forest in this graph, and outputs an annotated image, which is basically an image and its associated forest. We describe the application of IFT to region growing, edge detection, Euclidean distance transform, geodesic distance computation, and watershed transformation. All the operators are efficiently computed using the same IFT algorithm based on the same set of parameters by changing only their meaning and values. We also present a new interactive image segmentation paradigm based on the region growing operator and discuss other applications of the IFT for boundary-based object definition and shape-based interpolation. 1 Introduction The use of graphs in computer vision and image processing has been investigated for many years now. Its motivation stems from a solid theory with many e...
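Illustrative note: the pipeline named in the IFT abstract (image to graph, shortest-path forest, annotated image) can be sketched with a Dijkstra-style search. This is a minimal sketch under assumptions not taken from the abstract: pixels are nodes on a 4-connected grid, the cost of a path is the maximum pixel value along it (a watershed-like choice), seeds are supplied by the caller, and the annotation consists of cost, predecessor and root-label maps. The function name `image_foresting_transform` is hypothetical.

```python
import heapq


def image_foresting_transform(image, seeds):
    """Compute a shortest-path forest rooted at the seed pixels.

    image: 2D list of numbers; seeds: dict mapping (row, col) to a label.
    Returns (cost, pred, label) maps, each with the shape of the image.
    """
    rows, cols = len(image), len(image[0])
    cost = [[float("inf")] * cols for _ in range(rows)]
    pred = [[None] * cols for _ in range(rows)]   # predecessor in the forest
    label = [[None] * cols for _ in range(rows)]  # label of the root seed
    heap = []
    for (r, c), seed_label in seeds.items():
        cost[r][c] = image[r][c]
        label[r][c] = seed_label
        heapq.heappush(heap, (cost[r][c], r, c))

    while heap:
        current, r, c = heapq.heappop(heap)
        if current > cost[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed path cost: maximum pixel value along the path.
                new_cost = max(current, image[nr][nc])
                if new_cost < cost[nr][nc]:
                    cost[nr][nc] = new_cost
                    pred[nr][nc] = (r, c)
                    label[nr][nc] = label[r][c]
                    heapq.heappush(heap, (new_cost, nr, nc))
    return cost, pred, label


# Two flat basins separated by a ridge of value 9, one seed in each basin.
image = [[0, 0, 9, 0, 0]]
cost, pred, label = image_foresting_transform(image, {(0, 0): "A", (0, 4): "B"})
```

Swapping the path-cost rule (for example, to a sum of edge weights) changes which operator the same search computes, which matches the abstract's remark that only the meaning and values of the parameters change.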
Unsupervised Semantic Object Segmentation of Stereoscopic Video Sequences
- PROC. OF IEEE INT. CONF. ON INTELLIGENCE, INFORMATION AND SYSTEMS
, 1999
Cited by 9 (8 self)
In this paper, we present an efficient technique for unsupervised, semantically meaningful object segmentation of stereoscopic video sequences. The technique extracts semantic objects using the additional information that a stereoscopic pair of frames provides. Each pair is analyzed and the disparity field, occluded areas and depth map are estimated. The key algorithm, which is applied to the stereo pair of images and performs the segmentation, is a powerful low-complexity multiresolution implementation of the RSST algorithm. Color segment fusion is employed using the depth segments as constraints. Finally, experimental results are presented which demonstrate the high quality of the semantic object segmentation this technique achieves.
An Efficient Fully Unsupervised Video Object Segmentation Scheme Using an Adaptive Neural-Network Classifier Architecture
- IEEE Trans. Neural Netw
, 2003
Cited by 8 (0 self)
In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed based on an adaptable neural-network architecture. The proposed scheme comprises: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results compared to conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost-effective weight updating algorithm, providing a minimum degradation of the previous network knowledge and taking into account the current content conditions. A retraining set is constructed and used for this purpose based on initial VO estimation results. Two different scenarios are investigated. The first concerns extraction of human entities in video conferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. Human face/body detection based on Gaussian distributions is accomplished in the first scenario, while segmentation fusion is obtained using color and depth information in the second scenario. A decision mechanism is also incorporated to detect time instances for weight updating. Experimental results and comparisons indicate the good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).

