Comparison of automatic shot boundary detection algorithms
, 1999
Cited by 86 (2 self)
Various methods of automatic shot boundary detection have been proposed and claimed to perform reliably. Although the detection of edits is fundamental to any kind of video analysis, since it segments a video into its basic components, the shots, only a few comparative investigations of early shot boundary detection algorithms have been published. These investigations concentrate mainly on measuring edit detection performance but do not consider the algorithms' ability to classify the types and to locate the boundaries of the edits correctly. This paper extends these comparative investigations. More recent algorithms designed explicitly to detect specific complex editing operations such as fades and dissolves are taken into account, and their ability to classify the types and locate the boundaries of such edits is examined. The algorithms' performance is measured in terms of hit rate, number of false hits, and miss rate for hard cuts, fades, and dissolves over a large and diverse set of video sequences. The experiments show that while hard cuts and fades can be detected reliably, dissolves are still an open research issue. The false hit rate for dissolves is usually unacceptably high, ranging from 50% up to over 400%. Moreover, all algorithms seem to fail under roughly the same conditions.
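The three measures named in this abstract can be illustrated with a small sketch. It assumes that the false hit rate is normalized by the number of true edits, which is one reading consistent with reported values above 100%; the abstract itself does not give the formulas. The function name, the frame-index representation, and the `tolerance` parameter are hypothetical and chosen only for illustration.

```python
def detection_rates(true_edits, detected_edits, tolerance=0):
    """Return (hit_rate, miss_rate, false_hit_rate) for frame-indexed edits.

    Assumption: all three rates are normalized by the number of true edits.
    Each detection is greedily matched to at most one true edit that lies
    within `tolerance` frames of it.
    """
    if not true_edits:
        raise ValueError("at least one true edit is required")
    unmatched = sorted(true_edits)
    hits = 0
    false_hits = 0
    for detected in sorted(detected_edits):
        match = next(
            (t for t in unmatched if abs(t - detected) <= tolerance), None
        )
        if match is None:
            false_hits += 1          # detection with no true edit nearby
        else:
            unmatched.remove(match)  # each true edit can be hit only once
            hits += 1
    n = len(true_edits)
    return hits / n, (n - hits) / n, false_hits / n


# Two real dissolves, ten detections: both are found, but eight
# detections are spurious, giving a false hit rate of 8 / 2 = 400%.
rates = detection_rates(
    [100, 250],
    [100, 130, 170, 200, 251, 300, 340, 400, 420, 480],
    tolerance=1,
)
print(rates)  # (1.0, 0.0, 4.0)
```

Under this normalization a detector can reach a perfect hit rate and still be unusable, which is the situation the abstract describes for dissolves.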
Region-based representations of image and video: Segmentation tools for multimedia services
, 1999
Cited by 57 (3 self)
This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. The review focuses mainly on tools useful in the context of the MPEG-4 and MPEG-7 standards. It is structured around the strategies used by the algorithms (transition-based or homogeneity-based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of the paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in...
Spatial Color Indexing and Applications
, 1998
Cited by 57 (3 self)
We suggest the use of the color correlogram as a generic indexing tool to tackle various computer vision problems. Correlograms were shown to be very effective for content-based image retrieval [4]. We adapt the correlogram to handle the problems of image subregion querying, object localization, object tracking, and cut detection. Experimental results suggest that the color correlogram is much more effective than the histogram for these applications, with insignificant additional computational, storage, or processing cost. We also provide a technique to cut down the storage requirement of correlograms so that it is the same as that of histograms, with only negligible performance penalty compared to the original correlogram.
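To make the contrast with histograms concrete, the following is a minimal sketch of a simplified variant usually called the autocorrelogram: for each color c and distance k, it estimates the probability that a pixel at chessboard distance exactly k from a pixel of color c also has color c. It is a brute-force illustration under assumed conventions (an image as a list of rows of already-quantized color indices, the L-infinity distance, pixels outside the image ignored), not the paper's implementation.

```python
from collections import defaultdict


def autocorrelogram(image, distances):
    """Map (color, k) to the fraction of same-colored pixels at distance k.

    `image` is a list of rows of quantized color indices. Distance is the
    chessboard (L-infinity) distance; neighbors outside the image are skipped.
    """
    height, width = len(image), len(image[0])
    same = defaultdict(int)
    total = defaultdict(int)
    for y in range(height):
        for x in range(width):
            color = image[y][x]
            for k in distances:
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        if max(abs(dx), abs(dy)) != k:
                            continue  # keep only the ring at distance k
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            total[(color, k)] += 1
                            if image[ny][nx] == color:
                                same[(color, k)] += 1
    return {key: same[key] / total[key] for key in total}


# Two 4x4 images with identical histograms (eight pixels per color)
# but different spatial layouts.
checkerboard = [[(x + y) % 2 for x in range(4)] for y in range(4)]
split = [[0 if x < 2 else 1 for x in range(4)] for y in range(4)]

print(autocorrelogram(checkerboard, [1])[(0, 1)])
print(autocorrelogram(split, [1])[(0, 1)])
```

A color histogram cannot tell these two images apart, while the autocorrelogram value for color 0 at distance 1 is clearly higher for the split image, because its same-colored pixels are spatially clustered.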
Analysis And Presentation Of Soccer Highlights From Digital Video
, 1995
Cited by 44 (1 self)
In many sports such as soccer, a major portion of the essence is captured in relatively short periods of intense action. These highlights are summaries of the games. The capture and effective presentation of these action highlights serve as an important browsing mechanism in a video library of soccer games, and require special techniques for the analysis of digital video. In this paper we present techniques to automatically detect and extract the soccer highlights by analyzing the image contents, and to present these shots of action by the panoramic reconstruction of selected events. The analyses include the recognition of prominent features of the game, tracking of the ball, camera movement compensation for effective recognition, and construction of the panoramic views.
1. INTRODUCTION The length of a typical soccer game is more than an hour. In tournaments like the World Cup, 52 games are played. In a video library of soccer games, one may find hundreds of hours of game recordings. ...
Rapid Estimation of Camera Motion from Compressed Video with Application to Video Annotation
- IEEE Trans. on Circuits and Systems for Video Technology
, 1998
Cited by 41 (1 self)
As digital video becomes more pervasive, efficient ways of searching and annotating video according to content will be increasingly important. Such tasks arise, for example, in the management of digital video libraries for content-based retrieval and browsing. In this paper, we develop tools based on camera motion for analyzing and annotating a class of structured video using the low-level information available directly from MPEG compressed video. In particular, we show that in certain structured settings it is possible to obtain reliable estimates of camera motion by directly processing data easily obtained from the MPEG format. Working directly with the compressed video greatly reduces the processing time and enhances storage efficiency.
A Critical Evaluation of Image and Video Indexing Techniques in the Compressed Domain
- IMAGE AND VISION COMPUTING
, 1999
Cited by 39 (0 self)
Image and video indexing techniques are crucial in multimedia applications. A number of the indexing techniques that operate in the pixel domain have been reported in the literature. The advent of compression standards has led to the proliferation of indexing techniques in the compressed domain. In this paper, we present a critical review of the compressed domain indexing techniques proposed in the literature. These include transform domain techniques using Fourier transform, Cosine transform, Karhunen-Loeve transform, Subbands and Wavelets; and spatial domain techniques using Vector Quantization and Fractals. In addition, temporal indexing techniques using motion vectors are also discussed.
A Stochastic Framework For Optimal Key Frame . . .
- MPEG VIDEO DATABASES,” COMPUTER VISION AND IMAGE UNDERSTANDING
, 1999
Cited by 39 (28 self)
A framework for video content representation is proposed in this paper for extracting limited, but meaningful, information from video data directly in the MPEG compressed domain. First, the traditional frame-based representation is transformed to a feature-based one. Then, all features are gathered together using a fuzzy formulation, and several key frames are extracted for each shot in a content-based rate sampling framework. In particular, our approach is based on minimization of a cross-correlation criterion among the video frames of a given shot, so as to locate a set of minimally correlated feature vectors. Experimental results indicating the good performance of the proposed scheme are also presented.
Statistical models of video structure for content analysis and characterization
- IEEE Trans. on Image Processing
, 2000
Cited by 38 (1 self)
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
Temporal Video Segmentation Using Unsupervised Clustering and Semantic Object Tracking
- Journal of Electronic Imaging
, 1998
Cited by 36 (0 self)
This paper proposes a content-based temporal video segmentation system that integrates syntactic (domain-independent) and semantic (domain-dependent) features for automatic management of video data. Temporal video segmentation includes scene change detection and shot classification. The proposed scene change detection method consists of two steps: detection and tracking of semantic objects of interest specified by the user, and an unsupervised method for detection of cuts and edit effects. Object detection and tracking are achieved using a region matching scheme, where the region of interest is defined by the boundary of the object. A new unsupervised scene change detection method based on 2-class clustering is introduced to eliminate the data dependency of threshold selection. The proposed shot classification approach relies on semantic image features and exploits domain-dependent visual properties such as shape, color, and spatial configuration of tracked semantic objects. The syste...
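The idea of replacing a hand-tuned threshold with 2-class clustering can be sketched as follows. The abstract does not say which clustering method or which frame dissimilarity measure the authors use, so this example assumes a one-dimensional two-means clustering over precomputed frame-to-frame difference values; the function name and the sample values are hypothetical.

```python
def two_class_split(differences, iterations=50):
    """Label each frame difference as a scene change (True) or not (False).

    Assumption: a 1-D two-means clustering stands in for the unspecified
    2-class clustering. The decision boundary is the midpoint between the
    two cluster means, so no threshold has to be chosen by hand.
    """
    low, high = min(differences), max(differences)
    if low == high:
        return [False] * len(differences)  # no variation, nothing to split
    for _ in range(iterations):
        boundary = (low + high) / 2
        small = [d for d in differences if d <= boundary]
        large = [d for d in differences if d > boundary]
        new_low = sum(small) / len(small)
        new_high = sum(large) / len(large)
        if (new_low, new_high) == (low, high):
            break  # cluster means no longer move
        low, high = new_low, new_high
    boundary = (low + high) / 2
    return [d > boundary for d in differences]


# Hypothetical normalized frame-to-frame differences for nine frames.
diffs = [0.02, 0.03, 0.01, 0.85, 0.02, 0.04, 0.90, 0.03]
labels = two_class_split(diffs)
print([i for i, is_change in enumerate(labels) if is_change])  # [3, 6]
```

Because the boundary is derived from the data itself, the same code adapts to sequences whose difference values sit on very different scales, which is the data dependency the abstract aims to remove.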

