Results 1 - 6 of 6
Video Summarization by Learning Submodular Mixtures of Objectives
IEEE Conf. Comput. Vis. Pattern Recognit., 2015
Abstract (Cited by 4, 2 self):
We present a novel method for summarizing raw, casually captured videos. The objective is to create a short summary that still conveys the story; it should thus be both interesting and representative of the input video. Previous methods often used simplified assumptions and only optimized for one of these goals, or they used hand-defined objectives that were optimized sequentially by making consecutive hard decisions, which limits their use to a particular setting. Instead, we introduce a new method that (i) uses a supervised approach to learn the importance of global characteristics of a summary and (ii) jointly optimizes for multiple objectives, thus creating summaries that possess multiple properties of a good summary. Experiments on two challenging and very diverse datasets demonstrate the effectiveness of our method, where we outperform or match the current state of the art.
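The joint optimization the abstract describes can be pictured with a standard greedy scheme for maximizing a weighted mixture of submodular objectives under a length budget. The sketch below is hypothetical: the objective functions, weights, and names are placeholders, not the authors' formulation.

# Hypothetical sketch: greedily maximize a learned mixture of submodular
# objectives, sum_i w_i * f_i, under a summary-length budget. The objective
# functions and weights are placeholders, not the paper's.

def greedy_summary(segments, objectives, weights, budget):
    """Select `budget` segments maximizing a weighted sum of submodular scores."""
    summary = set()
    for _ in range(budget):
        def marginal_gain(seg):
            # Mixture gain of adding `seg` to the current summary.
            return sum(w * (f(summary | {seg}) - f(summary))
                       for w, f in zip(weights, objectives))
        candidates = [s for s in segments if s not in summary]
        if not candidates:
            break
        summary.add(max(candidates, key=marginal_gain))
    return summary

For monotone submodular objectives with non-negative weights, this greedy scheme carries the classic (1 - 1/e) approximation guarantee, which is what makes jointly optimizing several objectives tractable.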
FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem
An Efficient Method for Video Summarization using Moving Object Information
Abstract:
Video surveillance systems capture continuous video for security, monitoring, investigation, and similar purposes. Storing this high volume of video requires huge memory space, and manually retrieving important information from it takes enormous time. In this paper, we propose a novel video summarization scheme based on moving-object information, considering the area of moving objects extracted by dynamic background modeling together with frame-to-frame object motion. The scheme ranks all frames by their importance as key frames, combining the moving-object features through a fusion method, so that users can select the desired summary length. Experimental results on the publicly available BL-7F video surveillance benchmark show that the proposed method produces better video summaries than the state-of-the-art method.
Keywords: background modelling; frame difference; video summarization
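As a rough sketch of the ranking idea (not the paper's exact pipeline), the per-frame score can combine foreground area from a dynamic background model with a frame-difference motion cue. Below, OpenCV's MOG2 subtractor and a simple weighted-sum fusion stand in for the paper's background model and fusion method.

import cv2
import numpy as np

def frame_importance(video_path, alpha=0.5):
    """Score frames by moving-object evidence; higher = better key frame.
    MOG2 and the weighted-sum fusion are illustrative stand-ins."""
    cap = cv2.VideoCapture(video_path)
    bg_model = cv2.createBackgroundSubtractorMOG2()
    scores, prev_gray = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Fraction of pixels the dynamic background model marks as foreground.
        fg_area = np.count_nonzero(bg_model.apply(frame)) / gray.size
        # Frame-to-frame object motion via mean absolute difference.
        motion = 0.0
        if prev_gray is not None:
            motion = float(np.mean(cv2.absdiff(gray, prev_gray))) / 255.0
        scores.append(alpha * fg_area + (1.0 - alpha) * motion)
        prev_gray = gray
    cap.release()
    # Ranking lets users cut a summary of any desired length from the top.
    return np.argsort(scores)[::-1]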
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
Abstract:
We address the problem of fine-grained action localization in temporally untrimmed web videos, assuming that only weak video-level annotations are available for training. The goal is to use these weak labels to identify temporal segments corresponding to the actions and to learn models that generalize to unconstrained web videos. We find that web images queried by action names serve as well-localized highlights for many actions but are noisily labeled. To solve this problem, we propose a simple yet effective method that takes weak video labels and noisy image labels as input and generates localized action frames as output. This is achieved by cross-domain transfer between video frames and web images, using pre-trained deep convolutional neural networks. We then use the localized action frames to train action recognition models with long short-term memory networks. We collect FGA-240, a fine-grained sports action dataset of more than 130,000 YouTube videos covering 240 fine-grained actions under 85 sports activities. Convincing results are shown on the FGA-240 dataset, as well as on the THUMOS 2014 localization dataset with untrimmed training videos.
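The cross-domain selection step can be pictured as scoring video frames by their similarity, in a pretrained CNN's feature space, to web-image exemplars of the action. The following sketch assumes precomputed features and a simple max-similarity rule; it is an illustration, not the paper's specific transfer method.

import numpy as np

def localize_action_frames(frame_feats, image_feats, keep_ratio=0.1):
    """frame_feats: (n_frames, d) CNN features of video frames.
    image_feats: (n_images, d) CNN features of web images for one action.
    Returns indices of frames most similar to the (noisy) image exemplars."""
    f = frame_feats / (np.linalg.norm(frame_feats, axis=1, keepdims=True) + 1e-8)
    g = image_feats / (np.linalg.norm(image_feats, axis=1, keepdims=True) + 1e-8)
    # Max cosine similarity to any exemplar tolerates noisily labeled images.
    sim = (f @ g.T).max(axis=1)
    k = max(1, int(keep_ratio * sim.size))
    return np.argsort(sim)[::-1][:k]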
Beat-Event Detection in Action Movie Franchises, 2015
Video Co-summarization: Video Summarization by Visual Co-occurrence
Abstract:
We present video co-summarization, a novel perspective on video summarization that exploits visual co-occurrence across multiple videos. Motivated by the observation that important visual concepts tend to appear repeatedly across videos of the same topic, we propose to summarize a video by finding shots that co-occur most frequently across videos collected using a topic keyword. The main technical challenge is the sparsity of co-occurring patterns among the hundreds to possibly thousands of irrelevant shots in the videos being considered. To deal with this challenge, we develop a Maximal Biclique Finding (MBF) algorithm that is optimized to find sparsely co-occurring patterns, discarding patterns that co-occur less even if they are dominant in one video. Our algorithm is parallelizable with closed-form updates and can thus easily scale up to handle a large number of videos simultaneously. We demonstrate the effectiveness of our approach on motion capture and self-compiled YouTube datasets. Our results suggest that summaries generated by visual co-occurrence match human-generated summaries more closely than several popular unsupervised techniques.
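To give a flavor of what "parallelizable with closed-form updates" can look like, one can alternate threshold updates on two shot-indicator vectors over a cross-video similarity matrix. This alternating scheme and its update rule are illustrative assumptions, not necessarily the paper's MBF objective.

import numpy as np

def find_biclique(A, thresh=0.5, iters=50):
    """A: cross-video shot-similarity matrix (shots of video 1 x video 2).
    Alternates closed-form threshold updates on two indicator vectors;
    each update is a single matrix-vector product, so it parallelizes."""
    m, n = A.shape
    u, v = np.ones(m), np.ones(n)
    for _ in range(iters):
        # Keep a shot iff its mean similarity to the other side's picks is high.
        v = ((A.T @ u) / max(u.sum(), 1.0) > thresh).astype(float)
        u = ((A @ v) / max(v.sum(), 1.0) > thresh).astype(float)
    return np.flatnonzero(u), np.flatnonzero(v)

Because each side's update depends only on a fixed matrix-vector product, many such bicliques (or many video pairs) can be processed independently, matching the scalability claim in the abstract.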