C. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5--35, 2005.
....the script to be used as indices to the video content. We illustrate the framework with The Wizard of Oz, a well-known masterpiece released in 1939, whose continuity script was carefully edited and published on the Internet [1]. Alignment of script to video was mentioned by other researchers [2, 3, 4] as a means to provide training data for learning models of objects, scenes and actors. But contrary to the similar problem of aligning bilingual translations of the same text [5, 6], it was never formalized properly. With this work, we would like to contribute to such a formalization. 2. SCRIPT ....
C.G.M. Snoek and M. Worring, "Multimodal video indexing: A review of the state-of-the-art," Multimedia Tools and Applications, 2003, Accepted for publication.
No context found.
C. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Technical Report 20, Intelligent Sensory Information Systems, University of Amsterdam, 2001.
....1 Introduction Effective and efficient extraction of semantic indexes from video documents requires simultaneous analysis of visual, auditory, and textual information sources. In literature several of such methods have been proposed, addressing different types of semantic indexes, see [16] for an extensive overview. Multimodal methods for detection of semantic events are still rare, notable exceptions are [4, 10, 12, 13, 14]. For the integration of the heterogeneous data sources a statistical classifier gives the best results [16] compared to heuristic methods, e.g. [4]. In ....
....addressing different types of semantic indexes, see [16] for an extensive overview. Multimodal methods for detection of semantic events are still rare, notable exceptions are [4, 10, 12, 13, 14]. For the integration of the heterogeneous data sources a statistical classifier gives the best results [16], compared to heuristic methods, e.g. [4]. In particular, instances of the Dynamic Bayesian Network (DBN) framework, e.g. [13, 14]. Drawbacks of the DBN framework are the fact that the model works with fixed common units, e.g. image frames, thereby ignoring differences in layout schemes of the ....
C.G.M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications. To appear.
....1. INTRODUCTION Effective and efficient extraction of semantic indexes from video documents requires simultaneous analysis of visual, auditory, and textual information sources. In literature several of such methods have been proposed, addressing different types of semantic indexes, see [12] for an extensive overview. Multimodal methods for detection of semantic events are still rare, notable exceptions are [3, 7, 8, 10]. For the integration of the heterogeneous data sources a statistical classifier gives the best results [12] compared to heuristic methods, e.g. [3]. In particular, ....
....addressing different types of semantic indexes, see [12] for an extensive overview. Multimodal methods for detection of semantic events are still rare, notable exceptions are [3, 7, 8, 10]. For the integration of the heterogeneous data sources a statistical classifier gives the best results [12], compared to heuristic methods, e.g. [3]. In particular, instances of the Dynamic Bayesian Network (DBN) framework, e.g. [8, 10]. Drawbacks of the DBN framework are the fact that the model works with fixed common units, e.g. image frames, thereby ignoring differences in layout schemes of the ....
C. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications. To appear.
....fact has already been acknowledged by the multimedia research community more than a decade ago, and has resulted in numerous methodologies for automatic analysis and indexing of video assets. For an extensive overview of the general field of video indexing, including the sports domain, we refer to [2]. After analysis and indexing of sport video assets, tools are necessary to browse and retrieve video segments of interest. The Goalgle demonstrator is such a tool, tailored for the domain of soccer. For this specific domain, a user would typically like to find highlight events such as goals, ....
C.G.M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications. To appear.
....various groups covering most of the above disciplines. Here, we focus on how parts of this project have contributed in limiting the semantic gap. For work of others, good starting points are the two reviews we have published on content based image retrieval [1] and multimodal video indexing [2] respectively. Together they cover some 300 references. Furthermore, a very good review on pattern recognition, the basic tools on which many indexing methods are based, can be found in [3]. In accessing multimedia data there are two related, but still distinct tasks. The first task is ....
....an audio signal, and possibly text in the form of scripts for films and closed captions for news broadcasts. Clearly, using all modalities in conjunction yields better indexes as the different information channels give complementary information. For video we have written an elaborate review [2] where we aimed at a general framework fitting the methods in literature. This framework is shown in figure 5. Here we briefly recall the most important aspects of the review. [Figure 5 labels: Video data; Visual layout; Visual content; Textual layout; OCR; Textual content; Auditory layout] ....
Snoek, C., Worring, M.: Multimodal video indexing: a review of the state-of-the-art. Multimedia Tools and Applications (2002) to appear.
....i.e. the sequence of frames, is not always a good representation for the visual content. There are two reasons for employing an alternative representation. Firstly, a shot is not necessarily visually and semantically coherent. In [14] such fractions of a shot are called a shot let, while in [12] this division is referred to as named events, which are short segments with a meaning that does not change in time. Division of the shot into smaller coherent fractions allows for better representation of the shot's content. In the context of TREC the further division is especially important, as ....
C.G.M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, To appear.
....techniques are used, for example methods exploiting the learning capabilities of statistical classifiers. 3.2 Content segmentation In section 2 we introduced the elements of content. Here we will discuss how to detect them automatically. For a more detailed review of the algorithms we refer to [40]. 3.2.1 People detection Detection of people in video documents can be done in several ways. Most approaches simplify the problem to detection of a human face. Over the years various methods for the detection of faces in images and image sequences are reported, see [49] for a comprehensive and ....
....been applied to extract a variety of the different video indexes described in section 2. Figure 3 presents an overview of all semantic indexes that we found in literature. For an extensive overview of all those methods, including the low level information from which they are derived, we refer to [40]. 5 Conclusion Viewing a video document from the perspective of its author, enabled us to present a framework for multimodal video indexing. This framework formed the starting point for our review on different state-of-the-art video indexing techniques. At the end of this review we stress that ....
C. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Technical Report 2001-20, Intelligent Sensory Information Systems Group, University of Amsterdam, 2001.
No context found.
C. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5--35, 2005.
No context found.
C. G. M. Snoek and M. Worring, "Multimodal video indexing: a review of the state-of-the-art," Multimedia Tools and Applications, vol. 25, no. 1, pp. 5--35, 2005.
No context found.
C. M. Snoek and M. Worring, Multimodal video indexing: A review of the state-of-the-art, in ISIS Tech. Rep. Series, Univ. Amsterdam, Amsterdam, The Netherlands, vol. 2001-20, Dec. 2001.
No context found.
C. G. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5--35, 2005.
No context found.
C.G.M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5--35, 2005.
No context found.
C.G.M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5--35, 2005.
No context found.
C. G. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. In Proc. Multimedia Tools and Applications, 2003.
No context found.
Snoek, C., and Worring, M. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications (2004). In press.
No context found.
C. Snoek and M. Worring, "Multimodal Video Indexing: A Review of the State-of-the-art," Multimedia Tools and Applications, 2004 (in press)
No context found.
C.G.M. Snoek and M. Worring. Multimodal Video Indexing: A Review of the State-of-the-art. Multimedia Tools and Applications, 2004. (in press).