Results 1 - 10 of 41
Exploiting facial expressions for affective video summarisation
- In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR ’09), 2009
"... This paper presents an approach to affective video summari-sation based on the facial expressions (FX) of viewers. A fa-cial expression recognition system was deployed to capture a viewer’s face and his/her expressions. The user’s facial ex-pressions were analysed to infer personalised affective sce ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
(Show Context)
This paper presents an approach to affective video summarisation based on the facial expressions (FX) of viewers. A facial expression recognition system was deployed to capture a viewer’s face and his/her expressions. The user’s facial expressions were analysed to infer personalised affective scenes from videos. We proposed two models, pronounced level and expression change rate, to generate affective summaries using the FX data. Our results suggested that FX can be a promising source to exploit for affective video summaries that can be tailored to individual preferences.
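A minimal sketch of how the two proposed models might operate, assuming per-frame expression probabilities from an off-the-shelf recognizer; the array layout, window length, and equal weighting of the two cues are illustrative assumptions, not the authors' formulation:

```python
import numpy as np

def affective_segments(expr_probs, fps=25, win_s=2.0, top_k=5):
    """Score fixed windows by a 'pronounced level' proxy (confidence of the
    dominant expression) plus an 'expression change rate' proxy (how often
    the dominant expression switches), then keep the top windows."""
    labels = expr_probs.argmax(axis=1)           # dominant expression per frame
    pronounced = expr_probs.max(axis=1)          # pronounced-level proxy
    change = np.r_[0, np.diff(labels) != 0]      # 1 where the expression switches
    win = int(win_s * fps)
    scored = []
    for start in range(0, len(labels) - win + 1, win):
        s = slice(start, start + win)
        scored.append((pronounced[s].mean() + change[s].mean(), start / fps))
    scored.sort(reverse=True)
    return sorted(t for _, t in scored[:top_k])  # segment start times (s)
```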
Video Précis: Highlighting Diverse Aspects of Videos
"... Abstract—Summarizing long unconstrained videos is gaining importance in surveillance, web-based video browsing, and video-archival applications. Summarizing a video requires one to identify key aspects that contain the essence of the video. In this paper, we propose an approach that optimizes two cr ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
Summarizing long unconstrained videos is gaining importance in surveillance, web-based video browsing, and video-archival applications. Summarizing a video requires one to identify key aspects that contain the essence of the video. In this paper, we propose an approach that optimizes two criteria that a video summary should embody. The first criterion, “coverage,” requires that the summary represent the original video well. The second criterion, “diversity,” requires that the elements of the summary be as distinct from each other as possible. Given a user-specified summary length, we propose a cost function to measure the quality of a summary. The problem of generating a précis is then reduced to the combinatorial optimization problem of minimizing the proposed cost function. We propose an efficient method to solve the optimization problem. We demonstrate through experiments (on KTH data, an unconstrained skating video, a surveillance video, and a YouTube home video) that optimizing the proposed cost function results in meaningful video summaries over a wide range of scenarios. The summaries thus generated are evaluated using both quantitative measures and user studies.
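The coverage/diversity trade-off can be illustrated with a greedy sketch; the paper's actual cost function and optimizer differ, so the squared-distance measure, the `lam` weighting, and the greedy search below are assumptions:

```python
import numpy as np

def precis(features, k, lam=1.0):
    """Greedily grow a k-element summary over an (n_frames, d) feature array,
    trading coverage error against redundancy among chosen elements."""
    n = len(features)
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    chosen = []
    for _ in range(k):
        best, best_cost = None, np.inf
        for i in range(n):
            if i in chosen:
                continue
            cand = chosen + [i]
            coverage = d2[:, cand].min(axis=1).mean()   # mean dist. to nearest summary frame
            redundancy = 0.0 if len(cand) == 1 else np.mean(
                [d2[a, b] for a in cand for b in cand if a != b])
            cost = coverage - lam * redundancy          # diverse (far-apart) picks lower cost
            if cost < best_cost:
                best, best_cost = i, cost
        chosen.append(best)
    return sorted(chosen)                               # indices of summary frames
```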
Editing by viewing: Automatic home video summarization by viewing behavior analysis
- IEEE Transactions on Multimedia, 2011
"... Abstract—In this paper, we propose the Interest Meter (IM), a system making the computer conscious of user’s reactions to mea-sure user’s interest and thus use it to conduct video summarization. The IM takes account of users ’ spontaneous reactions when they view videos. To estimate user’s viewing i ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
(Show Context)
In this paper, we propose the Interest Meter (IM), a system that makes the computer conscious of a user’s reactions in order to measure the user’s interest and use it to conduct video summarization. The IM takes account of users’ spontaneous reactions as they view videos. To estimate a user’s viewing interest, quantitative interest measures are devised from the perspectives of attention and emotion. For estimating attention states, variations in the user’s eye movement, blinking, and head motion are considered. For estimating emotion states, facial expression is recognized as positive or neutral emotion. By combining characteristics of attention and emotion through a fuzzy fusion scheme, we transform users’ viewing behaviors into quantitative interest scores, determine interesting parts of videos, and finally concatenate them into video summaries. Experimental results show that the proposed “editing by viewing” concept works well and may provide a promising direction for considering the human factor in video summarization.
Index Terms: attention detection, editing by viewing, emotion recognition, Interest Meter (IM), video summarization.
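A toy stand-in for the fuzzy fusion step, assuming blink rate, head motion, and a positive-emotion probability as inputs; the membership functions and the 0.6/0.4 weighting are invented for illustration, not the paper's scheme:

```python
import numpy as np

def interest_score(blink_rate, head_motion, pos_emotion):
    """Map attention cues (fewer blinks, less head motion => more attentive)
    and a positive-emotion probability onto [0, 1], then fuse them."""
    def low(x, lo, hi):                        # 1 when x <= lo, fades to 0 at hi
        return np.clip((hi - x) / (hi - lo), 0.0, 1.0)
    attention = min(low(blink_rate, 10, 30),   # blinks per minute
                    low(head_motion, 2, 15))   # degrees per second; fuzzy AND = min
    return 0.6 * attention + 0.4 * pos_emotion

# Frames whose score clears a threshold would be concatenated into the summary.
print(interest_score(blink_rate=12, head_motion=3.0, pos_emotion=0.8))
```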
Videoskip: event detection in social web videos with an implicit user heuristic
- Multimedia Tools and Applications, 2012
"... www.ionio.gr/~choko Abstract. In this paper, we present a user-based event detection method for social web videos. Previous research in event detection has focused on content-based techniques, such as pattern recognition algorithms that attempt to understand the contents of a video. There are few us ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
In this paper, we present a user-based event detection method for social web videos. Previous research in event detection has focused on content-based techniques, such as pattern recognition algorithms that attempt to understand the contents of a video. Few user-centric approaches have considered either search keywords or external data such as comments, tags, and annotations. Moreover, some of the user-centric approaches imposed extra effort on users in order to capture the required information. In this research, we describe a method for analysing users’ implicit interactions with a web video player, such as pause, play, and thirty-second skip or rewind. The results of our experiments indicate that even the simple user heuristic of local maxima can effectively detect the same video events as those indicated manually. Notably, the proposed technique was more accurate in detecting events of short duration, because those events motivated increased user interaction at video hot-spots. The findings of this research provide evidence that we may be able to infer semantics about a piece of unstructured data simply from the way people actually use it.
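The local-maxima heuristic is concrete enough to sketch; the 30-second bin size and the light smoothing are assumptions, not the paper's exact parameters:

```python
import numpy as np

def detect_events(interaction_times, video_len_s, bin_s=30):
    """Bin the timestamps of implicit player interactions (pause, play,
    skip/rewind) and report bins that are local maxima of activity."""
    bins = np.arange(0, video_len_s + bin_s, bin_s)
    counts, _ = np.histogram(interaction_times, bins=bins)
    counts = np.convolve(counts, [0.25, 0.5, 0.25], mode="same")  # light smoothing
    return [bins[i] for i in range(1, len(counts) - 1)
            if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]]

# Two interaction hot-spots around 90 s and 210 s:
print(detect_events([33, 35, 40, 95, 96, 99, 100, 210], video_len_s=300))
```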
Visual Storylines: Semantic Visualization of Movie Sequence
"... This paper presents a video summarization approach that automatically extracts and visualizes movie storylines in a static image for the purposes of efficient representation and quick overview. A new type of video visualization, Visual Storylines, is designed to summarize video storylines in a succi ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
This paper presents a video summarization approach that automatically extracts and visualizes movie storylines in a static image for the purposes of efficient representation and quick overview. A new type of video visualization, Visual Storylines, is designed to summarize video storylines in a succinct visual format while preserving the elegance of the original videos. This is achieved with a series of video analysis, image synthesis, relationship quantification, and geometric layout optimization techniques. Specifically, we analyze video content and quantify the relationships between video story units automatically by clustering video shots according to both visual and audio data. A multi-level storyline visualization method then organizes and synthesizes a suitable amount of representative information, including locations as well as objects and characters of interest, with the assistance of special visual languages, according to the relationships between video story units and the temporal structure of the video sequence. Several results demonstrate that our approach is able to abstract the storylines of professionally edited video such as commercial movies and TV series. Preliminary user studies have been performed to evaluate our approach, and the results show that it can help viewers grasp video content efficiently, especially when they are familiar with the context of the video or a text synopsis is provided.
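The first analysis stage, clustering shots into story units from visual and audio data, might look roughly like this; the feature scaling, the k-means choice, and the audio weighting are assumptions, and the paper's relationship quantification and layout stages are omitted:

```python
import numpy as np
from sklearn.cluster import KMeans

def story_units(visual_feats, audio_feats, n_units=6, w_audio=0.5):
    """Cluster shots into story units from concatenated per-shot visual and
    audio feature vectors (one row per shot)."""
    def zscore(x):
        return (x - x.mean(0)) / (x.std(0) + 1e-8)
    feats = np.hstack([zscore(visual_feats), w_audio * zscore(audio_feats)])
    labels = KMeans(n_clusters=n_units, n_init=10, random_state=0).fit_predict(feats)
    return labels  # story-unit id per shot; unit relationships follow from these
```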
First-person Hyper-lapse Videos
"... Our system converts first-person videos into hyper-lapse summaries using a set of processing stages. (a) 3D camera and point cloud recovery, followed by smooth path planning; (b) 3D per-camera proxy estimation; (c) source frame selection, seam selection using a MRF, and Poisson blending. We present ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Our system converts first-person videos into hyper-lapse summaries using a set of processing stages: (a) 3D camera and point-cloud recovery, followed by smooth path planning; (b) per-camera 3D proxy estimation; (c) source frame selection, seam selection using an MRF, and Poisson blending. We present a method for converting first-person videos, for example those captured with a helmet camera during activities such as rock climbing or bicycling, into hyper-lapse videos, i.e., time-lapse videos with a smoothly moving camera. At high speed-up rates, simple frame sub-sampling coupled with existing video stabilization methods does not work, because the erratic camera shake present in first-person videos is amplified by the speed-up. Our algorithm first reconstructs the 3D input camera path as well as dense, per-frame proxy geometries. We then optimize a novel camera path for the output video that passes near the input cameras while ensuring that the virtual camera looks in directions that can be rendered well from the input. Finally, we generate the novel smoothed, time-lapse video by rendering, stitching, and blending appropriately selected source frames for each output frame. We present a number of results for challenging videos that cannot be processed using traditional techniques.
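A deliberately reduced sketch of the "smoothly moving camera" goal: the paper solves a full 3D path optimization and re-renders frames, whereas this only sub-samples and smooths recovered camera positions (the window size and moving-average filter are assumptions):

```python
import numpy as np

def smooth_path(cam_positions, speedup=10, win=15):
    """Sub-sample an (n, 3) array of recovered 3D camera positions by the
    speed-up factor, then smooth the path with a moving average."""
    sub = cam_positions[::speedup]               # naive frame skip
    kernel = np.ones(win) / win
    pad = win // 2
    padded = np.pad(sub, ((pad, pad), (0, 0)), mode="edge")
    smoothed = np.stack([np.convolve(padded[:, d], kernel, mode="valid")
                         for d in range(sub.shape[1])], axis=1)
    return smoothed  # one smoothed 3D position per output frame
```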
Video summarization from spatio-temporal features
- In Proceedings of the ACM TRECVID Video Summarization Workshop, 2008
"... In this paper we present a video summarization method based on the study of spatio-temporal activity within the video. The visual activity is estimated by measuring the number of interest points, jointly obtained in the spatial and temporal domains. The proposed approach is composed of five steps. F ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
In this paper we present a video summarization method based on the study of spatio-temporal activity within the video. The visual activity is estimated by measuring the number of interest points obtained jointly in the spatial and temporal domains. The proposed approach is composed of five steps. First, image features are collected using the spatio-temporal Hessian matrix. These features are then processed to retrieve the candidate video segments for the summary (denoted clips). Two further steps detect redundant clips and eliminate clapperboard images. The final step constructs the summary by retaining the clips showing the highest level of activity. The proposed approach was tested on the BBC Rushes Summarization task within the TRECVID 2008 campaign.
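The activity measure lends itself to a short sketch, assuming interest-point counts per frame are already available (the paper detects them with a spatio-temporal Hessian); the fixed clip length and the omission of the redundancy and clapperboard filtering steps are simplifications:

```python
def select_active_clips(points_per_frame, fps=25, clip_s=5, n_clips=4):
    """Score fixed-length clips by total interest-point count and keep the
    most active ones, returning their start times in temporal order."""
    clip_len = int(clip_s * fps)
    n = len(points_per_frame) // clip_len
    activity = [(sum(points_per_frame[i * clip_len:(i + 1) * clip_len]), i * clip_s)
                for i in range(n)]
    activity.sort(reverse=True)
    return sorted(t for _, t in activity[:n_clips])
```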
A novel tool for quick video summarization using keyframe extraction techniques
- In Proceedings of the 9th Workshop on Multimedia Metadata (WMM ’09), CEUR Workshop Proceedings, 2009
"... Abstract: The increasing availability of short, unstructured video clips on the Web has generated an unprecedented need to organize, index, annotate and retrieve video contents to make them useful to potential viewers. This paper presents a novel, simple, and easy-to-use tool to benchmark different ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
The increasing availability of short, unstructured video clips on the Web has generated an unprecedented need to organize, index, annotate, and retrieve video content to make it useful to potential viewers. This paper presents a novel, simple, and easy-to-use tool for benchmarking different low-level features for video summarization based on keyframe extraction. Moreover, it shows the usefulness of the benchmarking tool by developing hypotheses for a chosen domain through an exploratory study. It discusses the results of exploratory studies involving users and their judgment of what makes a summary generated by the tool a good one.
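One plausible low-level feature such a benchmarking tool might expose is colour-histogram difference for keyframe extraction; the feature choice, bin counts, and threshold below are assumptions, not the tool's actual configuration:

```python
import cv2
import numpy as np

def keyframes_by_histogram(video_path, threshold=0.4):
    """Emit a keyframe whenever the colour-histogram distance to the last
    keyframe exceeds a threshold; returns keyframe indices."""
    cap = cv2.VideoCapture(video_path)
    keyframes, last_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        if last_hist is None or cv2.norm(hist, last_hist, cv2.NORM_L2) > threshold:
            keyframes.append(idx)
            last_hist = hist
        idx += 1
    cap.release()
    return keyframes
```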
Global vs. Local Feature in Video Summarization: Experimental Results.
"... Abstract. We investigate the usefulness of local features in generating static video summaries. The proposed approach is based on bag of visual words using SIFT features. In an explorative experiment we compare this approach to summaries generated with the help of global features. As a resume we con ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
(Show Context)
We investigate the usefulness of local features in generating static video summaries. The proposed approach is based on a bag of visual words using SIFT features. In an exploratory experiment we compare this approach to summaries generated with the help of global features. In summary, we conclude that the local-feature-based approach does not outperform the global ones; however, it appears to be more stable.
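A compact sketch of the bag-of-visual-words representation the abstract builds on, using OpenCV's SIFT and k-means; the vocabulary size and normalization are assumptions:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bow_histograms(frames, vocab_size=100):
    """Pool SIFT descriptors from all frames, cluster them into a visual
    vocabulary, then describe each frame by its normalized word histogram."""
    sift = cv2.SIFT_create()
    per_frame = []
    for f in frames:
        gray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
        _, desc = sift.detectAndCompute(gray, None)
        per_frame.append(desc if desc is not None else np.empty((0, 128), np.float32))
    vocab = KMeans(n_clusters=vocab_size, n_init=4, random_state=0).fit(
        np.vstack(per_frame))
    hists = [np.bincount(vocab.predict(d), minlength=vocab_size) / len(d)
             if len(d) else np.zeros(vocab_size) for d in per_frame]
    return np.array(hists)  # cluster these to pick representative keyframes
```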
Affective video summarization and story board generation using Pupillary dilation and Eye gaze
"... Abstract—We propose a semi-automated, eye-gaze based method for affective analysis of videos. Pupillary Dilation (PD) is introduced as a valuable behavioural signal for assessment of subject arousal and engagement. We use PD information for computationally inexpensive, arousal based composition of v ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
We propose a semi-automated, eye-gaze-based method for affective analysis of videos. Pupillary Dilation (PD) is introduced as a valuable behavioural signal for assessing subject arousal and engagement. We use PD information for computationally inexpensive, arousal-based composition of video summaries and descriptive storyboards. Video summarization and storyboard generation are done offline, after a subject views the video. The method also includes novel eye-gaze analysis and fusion with content-based features to discover affective segments of videos and the Regions of Interest (ROIs) contained therein. The effectiveness of the framework is evaluated through experiments over a diverse set of clips, a significant pool of subjects, and comparison with a fully automated, state-of-the-art affective video summarization algorithm. Acquisition and analysis of PD information is demonstrated and used as a proxy for human visual attention in arousal-based video summarization and storyboard generation. An important contribution is to demonstrate the usefulness of PD semantics for affective elements of discourse and storytelling that are likely to be missed by automated methods. Another contribution is the use of eye fixations in close temporal proximity to PD-based events for keyframe extraction and subsequent storyboard generation. We also show how PD-based video summarization can be used either to generate a personalized video summary or to represent a consensus over the affective preferences of a larger group or community.
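The PD-based arousal cue can be sketched as a thresholded, windowed z-score of the pupil-diameter trace; the window length and threshold are assumptions, and the paper additionally fuses eye-gaze and content features:

```python
import numpy as np

def arousal_segments(pd_signal, fps=60, win_s=1.0, z_thresh=1.0):
    """Z-score the pupillary-dilation trace, average it over short windows,
    and mark windows exceeding a threshold as affective segments."""
    z = (pd_signal - np.mean(pd_signal)) / (np.std(pd_signal) + 1e-8)
    win = int(win_s * fps)
    n = len(z) // win
    means = z[:n * win].reshape(n, win).mean(axis=1)
    return [i * win_s for i in np.flatnonzero(means > z_thresh)]  # start times (s)
```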