Efficient Summarization of Stereoscopic Video Sequences
- IEEE TRANS. ON CSVT, 2000
Cited by 23 (22 self)
An efficient technique for summarization of stereoscopic video sequences is presented in this paper, which extracts a small but meaningful set of video frames using a content-based sampling algorithm. The proposed video-content representation provides the capability of browsing digital stereoscopic video sequences and performing more efficient content-based queries and indexing. Each stereoscopic video sequence is first partitioned into shots by applying a shot-cut detection algorithm so that frames (or stereo pairs) of similar visual characteristics are gathered together. Each shot is then analyzed using stereo-imaging techniques, and the disparity field, occluded areas, and depth map are estimated. A multiresolution implementation of the Recursive Shortest Spanning Tree (RSST) algorithm is applied for color and depth segmentation, while fusion of color and depth segments is employed for reliable video object extraction. In particular, color segments are projected onto depth segments so that video objects on the same depth plane are retained, while at the same time accurate object boundaries are extracted. Feature vectors are then constructed using multidimensional fuzzy classification of segment features including size, location, color, and depth. Shot selection is accomplished by clustering similar shots based on the generalized Lloyd-Max algorithm, while for a given shot, key frames are extracted using an optimization method for locating frames of minimally correlated feature vectors. For efficient implementation of the latter method, a genetic algorithm is used. Experimental results are presented, which indicate the reliable performance of the proposed scheme on real-life stereoscopic video sequences.
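The key-frame step in this abstract (choosing frames whose feature vectors are minimally correlated) can be illustrated with a small sketch. This is not the authors' implementation: the cost function (sum of squared pairwise Pearson correlations), the function names, and the toy feature vectors are all assumptions, and an exhaustive search stands in for the genetic algorithm the abstract mentions, which would only be needed when the number of candidate subsets is too large to enumerate.

```python
from itertools import combinations
from math import sqrt


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if std_u == 0 or std_v == 0:
        return 0.0  # assumption: a constant vector counts as uncorrelated
    return cov / (std_u * std_v)


def select_key_frames(features, k):
    """Return the indices of the k frames with the least mutual correlation.

    Exhaustive search over all k-subsets; hypothetical stand-in for the
    genetic algorithm named in the abstract.
    """
    def cost(indices):
        return sum(correlation(features[i], features[j]) ** 2
                   for i, j in combinations(indices, 2))

    return min(combinations(range(len(features)), k), key=cost)


# Toy per-frame feature vectors (made up for illustration).
frames = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.0, 4.2],  # nearly a copy of frame 0
    [4.0, 1.0, 3.0, 2.0],
    [2.0, 4.0, 1.0, 3.0],
]
key_frames = select_key_frames(frames, 2)
```

With these toy vectors, frames 0 and 1 are almost perfectly correlated, so they are never selected together.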
A Fuzzy Video Content Representation For Video Summarization And Content-Based Retrieval
- Signal Processing, 1997
Cited by 23 (19 self)
In this paper, a fuzzy representation of visual content is proposed, which is useful for emerging multimedia applications, such as content-based image indexing and retrieval, video browsing and summarization. In particular, a multidimensional fuzzy histogram is constructed for each video frame based on a collection of appropriate features, extracted using video sequence analysis techniques. This approach is then applied both for video summarization, in the context of a content-based sampling algorithm, and for content-based indexing and retrieval. In the first case, video summarization is accomplished by discarding shots or frames of similar visual content so that only a small but meaningful amount of information is retained (key-frames). In the second case, a content-based retrieval scheme is investigated, so that the most similar images to a query are extracted. Experimental results and comparison with other known methods are presented to indicate the good performance of the proposed scheme on real-life video recordings. © 2000 Elsevier Science B.V. All rights reserved.
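To make the idea of a fuzzy histogram concrete, here is a minimal one-dimensional sketch. The abstract describes a multidimensional histogram and does not specify the membership functions; the triangular memberships, the evenly spaced class centres, and all names below are assumptions chosen for illustration.

```python
def triangular_membership(x, center, width):
    """Degree in [0, 1] to which x belongs to the fuzzy class at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)


def fuzzy_histogram(values, num_classes):
    """Fuzzy histogram of feature values in [0, 1].

    Each value contributes to every class in proportion to its membership
    degree, instead of incrementing a single crisp bin.
    Requires num_classes >= 2.
    """
    width = 1.0 / (num_classes - 1)
    centers = [i * width for i in range(num_classes)]
    hist = [0.0] * num_classes
    for x in values:
        for i, center in enumerate(centers):
            hist[i] += triangular_membership(x, center, width)
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# A value of 0.25 lies halfway between the classes centred at 0.0 and 0.5,
# so it contributes 0.5 to each of them.
hist = fuzzy_histogram([0.0, 0.25, 0.5, 1.0], 3)
```

A value near a class boundary shifts the histogram only slightly when it moves, which is the usual motivation for fuzzy over crisp binning.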
Interactive Content-Based Retrieval in Video Databases Using Fuzzy Classification and Relevance Feedback
- 1999
Cited by 15 (9 self)
This paper presents an integrated framework for interactive content-based retrieval in video databases by means of visual queries. The proposed system incorporates algorithms for video shot detection, keyframe and shot selection, automated video object segmentation and tracking, and construction of multidimensional feature vectors using fuzzy classification of color, motion or texture segment properties. Retrieval is then performed in an interactive way by employing a parametric distance between feature vectors and updating distance parameters according to user requirements using relevance feedback. Experimental results demonstrate increased performance and flexibility according to user information needs.
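The retrieval loop described here (a parametric distance whose parameters are updated from user feedback) can be sketched as follows. The abstract does not give the distance or the update rule, so this uses a weighted Euclidean distance and a common inverse-standard-deviation reweighting heuristic as assumptions; the function names and the toy database are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def weighted_distance(query, item, weights):
    """Weighted Euclidean distance; `weights` are the tunable parameters."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, query, item)))


def update_weights(relevant, eps=1e-6):
    """Reweight features from the items the user marked as relevant.

    Assumed heuristic: a feature on which the relevant items agree (low
    spread) gets a high weight. Weights are normalised to sum to 1.
    """
    dims = len(relevant[0])
    raw = [1.0 / (pstdev([item[d] for item in relevant]) + eps)
           for d in range(dims)]
    total = sum(raw)
    return [r / total for r in raw]


def rank(query, database, weights):
    """Database indices sorted from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: weighted_distance(query, database[i], weights))


# Toy two-feature database (made up for illustration).
database = [[0.9, 0.1], [0.8, 0.9], [0.6, 0.5], [0.1, 0.2]]
query = [0.85, 0.5]

weights = [0.5, 0.5]
first_round = rank(query, database, weights)

# The user marks items 0 and 1 as relevant: they agree on feature 0 only.
weights = update_weights([database[0], database[1]])
second_round = rank(query, database, weights)
```

In this toy run, item 2 ranks first under equal weights, but after feedback the weight shifts to feature 0 and the two items marked relevant move to the top.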
Unsupervised Semantic Object Segmentation of Stereoscopic Video Sequences
- PROC. OF IEEE INT. CONF. ON INTELLIGENCE, INFORMATION AND SYSTEMS, 1999
Cited by 9 (8 self)
In this paper, we present an efficient technique for unsupervised, semantically meaningful object segmentation of stereoscopic video sequences. The technique extracts semantic objects using the additional information that a stereoscopic pair of frames provides. Each pair is analyzed and the disparity field, occluded areas and depth map are estimated. The key algorithm, which is applied to the stereo pair of images and performs the segmentation, is a powerful low-complexity multiresolution implementation of the RSST algorithm. Color segment fusion is employed using the depth segments as constraints. Finally, experimental results are presented which demonstrate the high quality of the semantic object segmentation this technique achieves.
Content-based Video Retrieval: An overview
- 2000
Cited by 8 (0 self)
Content-based Image Retrieval systems (CBIRS) are starting to flourish on the Web. Their performance is continuously improving and their base principles span a wide range of diversity. Content-based Video Retrieval systems (CBVRS) are less common and seem at first glance to be a natural extension of CBIRS. In this document, we summarise advances made in the development of CBVRS and analyse their relationship to CBIRS. While doing so, we show that CBVRS are actually not so obvious extensions of CBIRS.
A Probabilistic Framework of Selecting Effective Key-Frames for Video Browsing and Indexing
- 2000
Cited by 6 (0 self)
To represent video content effectively for browsing, indexing and video skimming, the most characteristic frames (called key-frames) should be extracted from given shots. This paper briefly reviews and evaluates the existing approaches to key-frame extraction, and then introduces a framework for selecting effective key-frames using an unsupervised clustering method. A mixture of Gaussians is used to model the temporal variation of the feature vectors of all frames in the shot. As a result, the feature-based representation of the shot is partitioned into several clusters. From each obtained cluster, the frame closest to the median of its frames is first selected as a reference key-frame. Then, depending on the variation in time and appearance of the cluster content against the reference key-frame, multiple frames can be extracted to represent the cluster effectively. The number of clusters is determined automatically by the Bayes Information Criterion. Experimental results on tracked objects in a real-world video stream are presented which illustrate the performance of the proposed technique.
Non-Sequential Video Structuring Based on Video Object Linking: An Efficient Tool for Video Browsing and Indexing
- in Proc. IEEE Int. Conf. Image Processing, Thessaloniki, 2001
Cited by 5 (1 self)
An efficient system for unsupervised structuring of stereoscopic sequences is presented in this paper, which generates links between similar VOPs of different shots. In particular, after shot-cut detection, a fast, unsupervised VOP detection and tracking algorithm is applied to each shot. Then, for each of the foreground VOPs of a frame, a feature vector is constructed using low-level features of the VOP such as color and size. Afterwards, for a given shot, key-VOP poses are extracted for each VOP, using an optimization method for locating VOP poses of minimally correlated feature vectors. Finally, an iterative process is performed to link each key VOP of a shot to another, according to a correlation measure. Experimental results indicate the promising performance of the proposed system on real-life stereoscopic video sequences.
Extraction Of Key Frames From Videos By Optimal Color Composition Matching And Polygon Simplification
- 2001
Cited by 3 (1 self)
A video sequence is first mapped to a sequence of points in a semi-metric space that forms a polyline. We require only that a semi-distance between pairs of points be defined that need not satisfy the triangle inequality. By simplifying the polyline, we obtain a small set of the most relevant key frames that is representative of the whole video sequence. The degree of the simplification is either determined automatically or selected by the user. Using our technique, a viewer can browse a video at the level of summarization that suits his patience level. Applications include the creation of a smart fast-forward function for digital VCRs, and the automatic creation of short summaries or trailers that can be used as previews before videos are downloaded from the web.
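The polyline simplification described above can be sketched as a greedy vertex-removal loop. The relevance measure used here (how much the path shortens when a vertex is skipped) and all names are assumptions, since the abstract does not state the measure. The distance is passed in as a parameter because the loop never relies on the triangle inequality; plain Euclidean distance is used only to keep the demo easy to check by hand.

```python
from math import dist


def simplify_polyline(points, keep, distance=dist):
    """Drop the least relevant interior vertex until `keep` vertices remain.

    Returns the indices (into `points`) of the surviving vertices, which play
    the role of key frames. The two endpoints are always kept.
    """
    idx = list(range(len(points)))

    def relevance(pos):
        # Assumed measure: detour caused by visiting v between u and w.
        u, v, w = (points[idx[pos + d]] for d in (-1, 0, 1))
        return distance(u, v) + distance(v, w) - distance(u, w)

    while len(idx) > max(keep, 2):
        drop = min(range(1, len(idx) - 1), key=relevance)
        del idx[drop]
    return idx


# Toy frame trajectory in a 2-D feature space (made up for illustration):
# two straight runs joined by corners at indices 2 and 4.
trajectory = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 2)]
key_frames = simplify_polyline(trajectory, keep=4)
```

In this toy trajectory, the vertices in the middle of straight runs are removed first and the corner vertices survive. Lowering `keep` gives a coarser summary, which corresponds to the user-selectable degree of simplification mentioned in the abstract.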
Non-Linear Relevance Feedback: Improving the Performance of Content-Based Retrieval Systems
- IEEE International Conference on Multimedia and Expo (ICME 2000), 2000
Cited by 3 (0 self)
In this paper, a non-linear relevance feedback mechanism is proposed for increasing the performance and the reliability of content-based retrieval systems. In particular, the human is considered part of the retrieval process in an interactive framework: the user evaluates the results provided by the system, and the system automatically updates its performance based on the users' feedback. An adaptively trained neural network architecture is used for implementing the non-linear feedback. The weight adaptation is performed in such a way that the network output satisfies the users' selection as much as possible, while simultaneously providing a minimal degradation over all previous data. Experimental results indicate that the proposed method yields better performance compared to a linear relevance feedback mechanism.
High-level concept detection based on mid-level semantic information and contextual adaptation
- Proceedings of the Second International Workshop on Semantic Media Adaptation and Personalization
Cited by 3 (1 self)
In this paper we propose the use of enhanced mid-level information, such as information obtained from the application of supervised or unsupervised learning methodologies on low-level characteristics, in order to improve semantic multimedia analysis. High-level, a priori contextual knowledge about the semantic meaning of objects and their low-level visual descriptions are combined in an integrated approach that handles in a uniform way the gap between semantics and low-level features. Prior work on low-level feature extraction is extended and a region thesaurus containing all mid-level features is constructed using a hierarchical clustering method. A model vector that contains the distances from each mid-level element is formed and a neural network-based detector is trained for each semantic concept. Contextual adaptation improves the quality of the produced results, by utilizing fuzzy algebra, fuzzy sets and relations. The novelty of the presented work is the context-driven mid-level manipulation of region types, utilizing a domain-independent ontology infrastructure to handle the knowledge. Early experimental results are presented using data derived from the beach domain.

