Results 1 - 10 of 52
Fast and robust short video clip search using an index structure
In 6th ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR ’04), 2004
"... Abstract. Query by video clip (QVC) has attracted wide research interests in multimedia information retrieval. In general, QVC may include feature extraction, similarity measure, database organization, and search or query scheme. Towards an effective and efficient solution, diverse applications have ..."
Abstract
-
Cited by 42 (6 self)
- Add to MetaCart
(Show Context)
Abstract. Query by video clip (QVC) has attracted wide research interest in multimedia information retrieval. In general, QVC may include feature extraction, similarity measure, database organization, and the search or query scheme. Towards an effective and efficient solution, diverse applications bring different considerations and challenges to the abovementioned phases. In this paper, we first attempt to broadly categorize most existing QVC work into three levels: concept-based video retrieval, video title identification, and video copy detection. This three-level categorization is expected to explicitly identify typical applications, robustness requirements, likely features, and the main challenges between mature techniques and hard performance requirements. A brief survey is presented to concretize the QVC categorization. Under this categorization, we focus in this paper on the copy detection task, wherein the challenges are mainly due to the design of compact and robust low-level features (i.e., an effective signature) and a fast searching mechanism. In order to effectively and robustly characterize video segments of variable lengths, we design a novel global visual feature (a fixed-size 144-d signature) …
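The abstract stops before defining the 144-d signature itself, so the sketch below only illustrates the general idea of a fixed-size global signature for a variable-length clip: per-frame colour statistics pooled over a spatial grid and averaged across all frames. The 4x4 grid and 9 hue bins (4*4*9 = 144 dimensions) are an illustrative choice, not the paper's actual feature design.

```python
import numpy as np

def clip_signature(frames_hsv, grid=(4, 4), hue_bins=9):
    """Fixed-size global signature for a variable-length clip.

    frames_hsv: iterable of HxWx3 uint8 HSV frames (hue in channel 0).
    Each frame is split into a 4x4 grid; a 9-bin hue histogram is taken
    per cell (4*4*9 = 144 dims) and averaged over all frames, so clips
    of any length map to the same 144-d vector.
    """
    acc, n = np.zeros(grid[0] * grid[1] * hue_bins), 0
    for frame in frames_hsv:
        h, w = frame.shape[:2]
        cells = []
        for i in range(grid[0]):
            for j in range(grid[1]):
                cell = frame[i*h//grid[0]:(i+1)*h//grid[0],
                             j*w//grid[1]:(j+1)*w//grid[1], 0]
                hist, _ = np.histogram(cell, bins=hue_bins, range=(0, 256))
                cells.append(hist / max(cell.size, 1))
        acc += np.concatenate(cells)
        n += 1
    return acc / max(n, 1)
```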
Towards effective indexing for very large video sequence database
In Proc. of SIGMOD Conf., 2005
"... With rapid advances in video processing technologies and ever fast increments in network bandwidth, the popularity of video content publishing and sharing has made similar-ity search an indispensable operation to retrieve videos of user interests. The video similarity is usually measured by the perc ..."
Abstract
-
Cited by 36 (14 self)
- Add to MetaCart
(Show Context)
With rapid advances in video processing technologies and ever-faster growth in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation for retrieving videos of user interest. Video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, the high complexity of video content has posed the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for a very large video database. First, each video sequence …
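As a minimal illustration of the "percentage of similar frames" notion of video similarity mentioned above, the sketch below matches each frame of one sequence to its nearest frame in the other and counts matches within a distance threshold. The threshold value and the symmetric averaging are assumptions for the sketch, not the paper's definition or its indexing scheme.

```python
import numpy as np
from scipy.spatial import cKDTree

def video_similarity(frames_a, frames_b, eps=0.3):
    """Fraction of frames in A with a near-duplicate frame in B (and
    vice versa), averaged -- a simple reading of the 'percentage of
    similar frames' similarity described in the abstract.

    frames_a, frames_b: (n, d) arrays of per-frame feature vectors.
    eps: distance below which two frames count as similar
         (an illustrative value, not one from the paper).
    """
    tree_a, tree_b = cKDTree(frames_a), cKDTree(frames_b)
    d_ab, _ = tree_b.query(frames_a, k=1)   # nearest B-frame for each A-frame
    d_ba, _ = tree_a.query(frames_b, k=1)   # nearest A-frame for each B-frame
    return 0.5 * ((d_ab < eps).mean() + (d_ba < eps).mean())
```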
Real-time near-duplicate elimination for web video search with content and context
IEEE Trans. on Multimedia, 2009
"... Abstract—With the exponential growth of social media, there exist huge numbers of near-duplicate web videos, ranging from simple formatting to complex mixture of different editing effects. In addition to the abundant video content, the social web provides rich sets of context information associated ..."
Abstract
-
Cited by 23 (1 self)
- Add to MetaCart
(Show Context)
Abstract—With the exponential growth of social media, there exist huge numbers of near-duplicate web videos, ranging from simple formatting changes to complex mixtures of different editing effects. In addition to the abundant video content, the social web provides rich sets of context information associated with web videos, such as thumbnail image, time duration, and so on. At the same time, the popularity of Web 2.0 demands timely responses to user queries. To balance speed and accuracy, in this paper we combine the contextual information from time duration, number of views, and thumbnail images with the content analysis derived from color and local points to achieve real-time near-duplicate elimination. The results of 24 popular queries retrieved from YouTube show that the proposed approach, integrating content and context, can reach real-time novelty re-ranking of web videos with extremely high efficiency, where the majority of duplicates can be rapidly detected and removed from the top rankings. The proposed approach can reach a speedup of 164 times over the effective hierarchical method proposed in [31], with just a slight loss of performance. Index Terms—Content, context, copy detection, filtering, near-duplicates, novelty and redundancy detection, similarity measure, web video.
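A minimal sketch of the content-plus-context idea described above: cheap context cues (duration gap, thumbnail colour histogram) prune obvious non-duplicates before any expensive content comparison is run. The thresholds and the `compare_content` callable are placeholders, not the fusion actually used in the paper.

```python
import numpy as np

def likely_duplicate(video_a, video_b, compare_content,
                     max_duration_gap=2.0, thumb_sim_thresh=0.8):
    """Two-step duplicate test in the spirit of content+context fusion:
    context checks (duration, thumbnail colour histogram) rule out the
    clear non-duplicates before the costly content comparison.

    video_*: dicts with 'duration' (seconds) and 'thumb_hist'
             (L1-normalised colour histogram of the thumbnail).
    compare_content: expensive callable(video_a, video_b) -> bool.
    Thresholds are illustrative, not taken from the paper.
    """
    if abs(video_a['duration'] - video_b['duration']) > max_duration_gap:
        return False                       # context alone rules it out
    # histogram intersection of the two thumbnails (1.0 = identical)
    thumb_sim = np.minimum(video_a['thumb_hist'], video_b['thumb_hist']).sum()
    if thumb_sim < thumb_sim_thresh:
        return False
    return compare_content(video_a, video_b)   # only now pay the full cost
```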
Fast similarity search and clustering of video sequences on the world-wide-web
2004
UQLIPS: A Real-time Near-duplicate Video Clip Detection System
2007
"... Near-duplicate video clip (NDVC) detection is an important problem with a wide range of applications such as TV broadcast monitoring, video copyright enforcement, content-based video clustering and annotation, etc. For a large database with tens of thousands of video clips, each with thousands of fr ..."
Abstract
-
Cited by 13 (3 self)
- Add to MetaCart
Near-duplicate video clip (NDVC) detection is an important problem with a wide range of applications, such as TV broadcast monitoring, video copyright enforcement, and content-based video clustering and annotation. For a large database with tens of thousands of video clips, each with thousands of frames, can NDVC search be performed in real-time? In addition to considering inter-frame similarity (i.e., spatial information), what is the impact of frame sequence similarity (i.e., temporal information) on search speed and accuracy? UQLIPS is a prototype system for online NDVC detection. The core of UQLIPS comprises two novel complementary schemes for detecting NDVCs. Bounded Coordinate System (BCS), a compact representation model that ignores temporal information, globally summarizes each video into a single vector capturing the dominating content and content-change trends of each clip. The other proposal, named FRAme Symbolization (FRAS), maps each clip to a sequence of symbols, taking temporal order and sequence context information into consideration. Using a large collection of TV commercials, UQLIPS clearly demonstrates that it is feasible to perform real-time NDVC detection with high accuracy.
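One rough reading of collapsing a clip into a single compact vector, as the abstract describes for BCS, is sketched below: the mean frame feature stands in for the dominating content, and a few scaled principal directions stand in for the content-change trend. This is an assumption-laden illustration, not the paper's Bounded Coordinate System or FRAS.

```python
import numpy as np

def clip_summary(frame_features, n_axes=2):
    """Summarise a whole clip as one vector: the mean frame feature
    (dominant content) concatenated with the top principal directions
    scaled by the data's spread along them (content-change trends).

    A loose sketch of the 'single compact vector' idea only; not the
    paper's actual BCS construction.
    frame_features: (n_frames, d) array of per-frame features.
    """
    X = np.asarray(frame_features, dtype=float)
    mean = X.mean(axis=0)
    # principal directions of the centred frame features
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    axes = vt[:n_axes] * (s[:n_axes, None] / np.sqrt(len(X)))
    return np.concatenate([mean, axes.ravel()])   # length: d * (1 + n_axes)
```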
Clip-based similarity measure for hierarchical video retrieval
In MIR ’04: Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2004
"... This paper proposes a new approach and algorithm for the similarity measure of video clips. The similarity is mainly based on two bipartite graph matching algorithms: max-imum matching (MM) and optimal matching (OM). MM is able to rapidly filter irrelevant video clips, while OM is capable of ranking ..."
Abstract
-
Cited by 11 (1 self)
- Add to MetaCart
(Show Context)
This paper proposes a new approach and algorithm for measuring the similarity of video clips. The similarity is mainly based on two bipartite graph matching algorithms: maximum matching (MM) and optimal matching (OM). MM is able to rapidly filter irrelevant video clips, while OM is capable of ranking the similarity of clips according to visual and granularity factors. Based on MM and OM, a hierarchical video retrieval framework is constructed for the approximate matching of video clips. To allow matching between a query and a long video, an online clip segmentation algorithm is also proposed to rapidly locate candidate clips for similarity measurement. The validity of the retrieval framework is theoretically proved and empirically verified on a video database of 21 hours.
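Both matching steps are standard bipartite-graph problems, so they can be sketched with an off-the-shelf assignment solver: a 0/1-weighted assignment yields the maximum matching (MM) used for filtering, and a similarity-weighted assignment yields the optimal matching (OM) used for ranking. The similarity threshold and the normalisation are illustrative choices, not the paper's parameters.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mm_filter(sim, thresh=0.7):
    """Maximum matching (MM) as a fast filter: with 0/1 weights, a
    maximum-weight assignment counts how many query shots can be paired
    with a sufficiently similar database shot.
    sim: (n_query_shots, n_db_shots) similarity matrix in [0, 1].
    thresh is an illustrative cut-off, not the paper's value."""
    binary = (sim >= thresh).astype(float)
    r, c = linear_sum_assignment(binary, maximize=True)
    return int(binary[r, c].sum())           # number of matched shot pairs

def om_score(sim):
    """Optimal matching (OM): one-to-one assignment maximising total
    shot similarity, used to rank clips that survive the MM filter."""
    r, c = linear_sum_assignment(sim, maximize=True)
    # divide by the smaller clip length as a simple granularity
    # normalisation (the paper's OM weighting is more elaborate)
    return sim[r, c].sum() / min(sim.shape)
```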
Novelty detection for cross-lingual news stories with visual duplicates and speech transcripts
In Proceedings of ACM Multimedia, 2007
"... An overwhelming volume of news videos from different channels and languages is available today, which demands automatic management of this abundant information. To effectively search, retrieve, browse and track cross-lingual news stories, a news story similarity measure plays a critical role in asse ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
(Show Context)
An overwhelming volume of news videos from different channels and languages is available today, which demands automatic management of this abundant information. To effectively search, retrieve, browse, and track cross-lingual news stories, a news story similarity measure plays a critical role in assessing the novelty and redundancy among them. In this paper, we explore novelty and redundancy detection with visual duplicates and speech transcripts for cross-lingual news stories. News stories are represented by a sequence of keyframes in the visual track and a set of words extracted from the speech transcript in the audio track. A major difference from pure text documents is that the number of keyframes in one story is relatively small compared to the number of words, and there exist a large number of non-near-duplicate keyframes. These properties make similarity measures behave differently than they do on traditional textual collections. Furthermore, the textual and visual features complement each other for news stories and can be combined to boost performance. Experiments on the TRECVID-2005 cross-lingual news video corpus show that approaches based on textual features and on visual features perform differently, and that measures on visual features are quite effective. Overall, the cosine distance on keyframes is still a robust measure. Language models built on visual features demonstrate promising performance. The fusion of textual and visual features improves overall performance.
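The measures the abstract singles out (cosine distance on keyframes, fusion with transcript terms) reduce to a few lines. The sketch below assumes each story has already been encoded as a keyframe-cluster vector and a term-count vector; the 50/50 fusion weight is a placeholder rather than the combination evaluated in the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-negative feature vectors."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def story_similarity(story_a, story_b, visual_weight=0.5):
    """Cross-lingual story similarity from the two tracks the abstract
    describes: a visual vector (e.g. counts over near-duplicate keyframe
    clusters) and a textual vector (term counts from the speech
    transcript), fused by a simple weighted sum.  The keys and the
    fusion weight are illustrative assumptions."""
    vis = cosine(story_a['keyframe_vec'], story_b['keyframe_vec'])
    txt = cosine(story_a['term_vec'], story_b['term_vec'])
    return visual_weight * vis + (1.0 - visual_weight) * txt
```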
Practical elimination of near-duplicates from web video search
"... Current web video search results rely exclusively on text keywords or user-supplied tags. A search on typical popular video often returns many duplicate and near-duplicate videos in the top results. This paper outlines ways to cluster and filter out the nearduplicate video using a hierarchical appro ..."
Abstract
-
Cited by 11 (0 self)
- Add to MetaCart
(Show Context)
Current web video search results rely exclusively on text keywords or user-supplied tags. A search on a typical popular video often returns many duplicate and near-duplicate videos in the top results. This paper outlines ways to cluster and filter out near-duplicate videos using a hierarchical approach. Initial triage is performed using fast signatures derived from color histograms. Only when a video cannot be clearly classified as novel or near-duplicate using global signatures do we apply a more expensive local-feature-based near-duplicate detection, which provides very accurate duplicate analysis through more costly computation. The results of 24 queries on a data set of 12,790 videos retrieved from Google, Yahoo! and YouTube show that this hierarchical approach can dramatically reduce the redundant videos displayed to the user in the top result set, at relatively small computational cost.
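The hierarchical triage described above can be sketched as a two-threshold test on the cheap global signature, with only the ambiguous middle band escalated to local-feature matching. The L1 distance and both thresholds are placeholders, not values from the paper.

```python
import numpy as np

def near_duplicate(sig_query, sig_candidate, local_check,
                   dup_thresh=0.1, novel_thresh=0.4):
    """Hierarchical triage in the spirit of the abstract: a cheap global
    colour-signature distance settles the clear cases, and only the
    ambiguous middle band falls through to the expensive local-feature
    comparison.  Thresholds here are illustrative placeholders.

    sig_*: global colour-histogram signatures (1-D arrays).
    local_check: costly callable returning True for near-duplicates.
    """
    d = np.abs(sig_query - sig_candidate).sum()   # L1 distance on histograms
    if d <= dup_thresh:
        return True        # confidently near-duplicate
    if d >= novel_thresh:
        return False       # confidently novel
    return local_check()   # ambiguous: pay for local-feature matching
```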
EMD-Based Video Clip Retrieval by Many-to-Many Matching
"... Abstract. This paper presents a new approach for video clip retrieval based on Earth Mover’s Distance (EMD). Instead of imposing one-to-one matching constraint as in [11, 14], our approach allows many-to-many matching methodology and is capable of tolerating errors due to video partitioning and vari ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
(Show Context)
Abstract. This paper presents a new approach for video clip retrieval based on Earth Mover’s Distance (EMD). Instead of imposing a one-to-one matching constraint as in [11, 14], our approach allows many-to-many matching and is capable of tolerating errors due to video partitioning and various video editing effects. We formulate clip-based retrieval as a graph matching problem in two stages. In the first stage, to allow matching between a query and a long video, an online clip segmentation algorithm is employed to rapidly locate candidate clips for similarity measurement. In the second stage, a weighted graph is constructed to model the similarity between two clips. EMD is proposed to compute the minimum cost of the weighted graph as the similarity between the two clips. Experimental results show that the proposed approach is better than some existing methods in terms of ranking capability.
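EMD between two clips is the optimum of a transportation linear program over shot-to-shot ground distances, and the generic sketch below solves that LP directly. The shot weights (e.g. normalised shot durations) and the formulation are standard EMD, not the paper's specific weighted-graph construction.

```python
import numpy as np
from scipy.optimize import linprog

def emd(weights_a, weights_b, ground_dist):
    """Earth Mover's Distance between two clips, each modelled as a set
    of shots with weights (e.g. normalised shot durations) and a
    pairwise ground-distance matrix between shot features.  Solved as
    the classic transportation LP; a generic EMD sketch only.

    weights_a: (n,) non-negative, summing to 1; weights_b: (m,) likewise.
    ground_dist: (n, m) matrix of shot-to-shot feature distances.
    """
    n, m = ground_dist.shape
    c = np.asarray(ground_dist, dtype=float).ravel()  # flow variables f[i, j]
    A_eq, b_eq = [], []
    for i in range(n):                     # each source shot ships all its weight
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row); b_eq.append(weights_a[i])
    for j in range(m - 1):                 # each sink shot receives its weight
        row = np.zeros(n * m)              # (last constraint is redundant)
        row[j::m] = 1.0
        A_eq.append(row); b_eq.append(weights_b[j])
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun                         # minimum total transport cost
```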