Large-scale multimodal semantic concept detection for consumer video
- In MIR Workshop, ACM Multimedia, 2007
Cited by 18 (9 self)
In this paper we present a systematic study of automatic classification of consumer videos into a large set of diverse semantic concept classes, which have been carefully selected based on user studies and extensively annotated across 1300+ videos from real users. Our goals are to assess the state of the art of multimedia analytics (including both audio and visual analysis) in consumer video classification and to discover new research opportunities. We investigated several statistical approaches built upon global/local visual features, audio features, and audio-visual combinations. Three multimodal fusion frameworks (ensemble, context fusion, and joint boosting) are also evaluated. Experimental results show that visual and audio models perform best for different sets of concepts. Both provide significant contributions to multimodal fusion, via expansion of the classifier pool for context fusion and the feature bases for feature sharing. The fused multimodal models are shown to significantly reduce the detection errors (compared to single-modality models), resulting in a promising accuracy of 83% over diverse concepts. To the best of our knowledge, this is the first work on systematic investigation of multimodal classification using a large-scale ontology and realistic video corpus.
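The abstract names ensemble fusion as one of the three frameworks but does not give its details. As a rough illustration only, the sketch below shows one common form of late fusion: a weighted average of per-concept scores from a visual model and an audio model. The function name `fuse_scores`, the `visual_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
def fuse_scores(visual, audio, visual_weight=0.5):
    """Weighted average of per-concept scores from two modalities.

    `visual` and `audio` map concept name -> detection score in [0, 1].
    This is an assumed, generic late-fusion rule, not the paper's method.
    """
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    fused = {}
    for concept in visual.keys() | audio.keys():
        v = visual.get(concept)
        a = audio.get(concept)
        if v is None:
            fused[concept] = a      # only the audio model covers this concept
        elif a is None:
            fused[concept] = v      # only the visual model covers this concept
        else:
            fused[concept] = visual_weight * v + (1.0 - visual_weight) * a
    return fused


# Hypothetical scores for two concepts.
fused = fuse_scores({"beach": 0.8, "music": 0.2}, {"beach": 0.4, "music": 0.9})
```

With equal weights, a concept that one modality detects strongly (here "music" from audio) keeps a moderate fused score even when the other modality is weak, which is the intuition behind combining models that perform best on different concepts.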
A Reranking Approach for Context-based Concept Fusion in Video Indexing and Retrieval
- In Conference on Image and Video Retrieval, 2007
Cited by 11 (2 self)
We propose to incorporate hundreds of pre-trained concept detectors to provide contextual information for improving the performance of multimodal video search. The approach takes initial search results from established video search methods (which typically are conservative in usage of concept detectors) and mines these results to discover and leverage co-occurrence patterns with detection results for hundreds of other concepts, thereby refining and reranking the initial video search result. We test the method on TRECVID 2005 and 2006 automatic video search tasks and find improvements in mean average precision (MAP) of 15%-30%. We also find that the method is adept at discovering contextual relationships that are unique to news stories occurring in the search set, which would be difficult or impossible to discover even if external training data were available.
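The reranking idea described above can be sketched under strong simplifying assumptions. In this hypothetical version, the top `top_k` initial results are treated as pseudo-positives and the rest as pseudo-negatives, each concept is weighted by the difference in its mean detector score between the two groups, and the resulting context score is blended with the initial score. The names `rerank`, `top_k`, and `alpha`, and the weighting rule itself, are assumptions for illustration; the abstract does not specify how co-occurrence patterns are mined. `average_precision` is the standard per-query measure whose mean over queries gives MAP.

```python
def average_precision(ranked_ids, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def rerank(initial_scores, concept_scores, top_k, alpha=0.5):
    """Rerank shots using concept detector scores as context (assumed scheme).

    initial_scores: shot id -> score from the initial search
    concept_scores: shot id -> {concept name -> detector score}
    """
    ranked = sorted(initial_scores, key=initial_scores.get, reverse=True)
    pseudo_pos, pseudo_neg = ranked[:top_k], ranked[top_k:]
    concepts = sorted({c for s in concept_scores.values() for c in s})

    def mean_score(shots, concept):
        if not shots:
            return 0.0
        return sum(concept_scores.get(s, {}).get(concept, 0.0) for s in shots) / len(shots)

    # Concepts that score higher among top results than among the rest get positive weight.
    weights = {c: mean_score(pseudo_pos, c) - mean_score(pseudo_neg, c) for c in concepts}
    combined = {}
    for s in ranked:
        context = sum(weights[c] * concept_scores.get(s, {}).get(c, 0.0) for c in concepts)
        combined[s] = (1.0 - alpha) * initial_scores[s] + alpha * context
    return sorted(ranked, key=combined.get, reverse=True)


# Hypothetical toy data: shot "d" is relevant but ranked last by the initial search.
initial = {"a": 0.9, "b": 0.8, "c": 0.5, "d": 0.4}
detectors = {"a": {"anchor": 0.9}, "b": {"anchor": 0.8},
             "c": {"anchor": 0.1}, "d": {"anchor": 0.9}}
new_order = rerank(initial, detectors, top_k=2)
```

In this toy case the "anchor" concept is strong among the top two results, so shot "d", which also scores highly on it, moves ahead of shot "c". If "a", "b", and "d" are the relevant shots, average precision rises from about 0.92 to 1.0.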
A Hybrid Approach to Improving Semantic Extraction of News Video
Cited by 4 (0 self)
In this paper we describe a hybrid approach to improving semantic extraction from news video. Experiments show the value of careful parameter tuning, exploiting multiple feature sets and multilingual linguistic resources, applying text retrieval approaches for image features, and establishing synergy between multiple concepts through undirected graphical models. No single approach provides a consistently better result for every concept detection, which suggests that extracting video semantics should exploit multiple resources and techniques rather than a single approach.
Searching Visual Semantic Spaces with Concept Filters
Cited by 1 (1 self)
Semantic concepts cement the ability to correlate visual information to higher-level semantic concepts. Traditional image search leverages text associated with images, low-level content-based matching, or a combination of the two. We propose a new system that uses 374 semantic concepts (derived from the LSCOM lexicon [6]) to semantically facilitate fast exploration of a large set of video data. This new system, when coupled with traditional image search techniques, produces a very intuitive and fruitful design for targeted user interaction.
Kodak consumer video benchmark data set: concept definition and annotation
- 2008
Semantic indexing of images and videos in the consumer domain has become a very important issue for both research and actual application. In this work we developed Kodak’s consumer video benchmark data set, which includes (1) a significant number of videos from actual users, (2) a rich lexicon that accommodates consumers’ needs, and (3) the annotation of a subset of concepts over the entire video data set. To the best of our knowledge, this is the first systematic work in the consumer domain aimed at the definition of a large lexicon, construction of a large benchmark data set, and annotation of videos in a rigorous fashion. Such effort will have significant impact by providing a sound foundation for developing and evaluating large-scale learning-based semantic indexing/annotation techniques in the consumer domain. This report includes information about the concept definitions, the
BBC rush summarization and High-Level Feature extraction In TRECVID2008
In this paper, we first describe the rushes summarization system that we built this year for the TRECVID2008 BBC rushes task. Our aim this year was to build a baseline system using minimal information and to see how well it works. Second, we briefly describe our high-level feature extraction system.

