Results 1 - 10 of 30
Learning Similarity Metrics for Event Identification in Social Media
"... Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of ..."
Abstract
-
Cited by 74 (9 self)
- Add to MetaCart
(Show Context)
Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of real-world events of different types and scales. By automatically identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can enable event browsing and search in state-of-the-art search engines. To address this problem, we exploit the rich “context” associated with social media content, including user-provided annotations (e.g., title, tags) and automatically generated information (e.g., content creation time). Using this rich context, which includes both textual and non-textual features, we can define appropriate document similarity metrics to enable online clustering of media to events. As a key contribution of this paper, we explore a variety of techniques for learning multi-feature similarity metrics for social media documents in a principled manner. We evaluate our techniques on large-scale, real-world datasets of event images from Flickr. Our evaluation results suggest that our approach identifies events, and their associated social media documents, more effectively than the state-of-the-art strategies on which we build.
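As a rough illustration of the kind of approach this abstract describes, the following Python sketch combines per-feature similarities (title, tags, creation time) with fixed weights and clusters a document stream in a single pass. The feature set, weights, time-decay scale, and threshold are all illustrative assumptions, not the paper's learned model.

```python
# Hedged sketch: combining per-feature similarities with weights for
# online clustering of social media documents into events. In the paper
# the weights are learned from labeled data; here they are fixed.
from dataclasses import dataclass

@dataclass
class Doc:
    title_tokens: set
    tag_tokens: set
    timestamp: float  # seconds since epoch (assumed representation)

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def time_sim(t1: float, t2: float, scale: float = 86400.0) -> float:
    # Decays with the gap between creation times (one-day scale, assumed).
    return 1.0 / (1.0 + abs(t1 - t2) / scale)

WEIGHTS = {"title": 0.4, "tags": 0.4, "time": 0.2}  # illustrative only

def similarity(d1: Doc, d2: Doc) -> float:
    return (WEIGHTS["title"] * jaccard(d1.title_tokens, d2.title_tokens)
            + WEIGHTS["tags"] * jaccard(d1.tag_tokens, d2.tag_tokens)
            + WEIGHTS["time"] * time_sim(d1.timestamp, d2.timestamp))

def online_cluster(stream, threshold: float = 0.5):
    # Single-pass incremental clustering: assign each document to the
    # cluster containing its most similar member (single-link), or
    # start a new cluster when no similarity exceeds the threshold.
    clusters: list[list[Doc]] = []
    for doc in stream:
        best, best_sim = None, threshold
        for c in clusters:
            s = max(similarity(doc, member) for member in c)
            if s > best_sim:
                best, best_sim = c, s
        if best is None:
            clusters.append([doc])
        else:
            best.append(doc)
    return clusters
```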
Event Identification in Social Media
"... Social media sites such as Flickr, YouTube, and Facebook host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of real-world events. These range from widely known events, such as the presidential inauguration, to smaller, community ..."
Abstract
-
Cited by 14 (1 self)
- Add to MetaCart
(Show Context)
Social media sites such as Flickr, YouTube, and Facebook host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of real-world events. These range from widely known events, such as the presidential inauguration, to smaller, community-specific events, such as annual conventions and local gatherings. By identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can greatly improve local event browsing and search in state-of-the-art search engines. To address this problem, we exploit the rich “context” associated with social media content, including user-provided annotations (e.g., title, tags) and automatically generated information (e.g., content creation time). We form a variety of representations of social media documents using different context dimensions, and combine these dimensions in a principled way into a single clustering solution—where each document cluster ideally corresponds to one event—using a weighted ensemble approach. We evaluate our approach on a large-scale, real-world dataset of event images, and report promising performance with respect to several baseline approaches. Our preliminary experiments suggest that our ensemble approach identifies events, and their associated images, more effectively than the state-of-the-art strategies on which we build.
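The weighted-ensemble idea can be pictured with a small sketch: each context dimension produces its own partition of the documents, and the partitions are fused through a weighted co-association score. The dimension names, weights, and fusion threshold below are assumptions for illustration, not the paper's actual method.

```python
# Hedged sketch of a weighted clustering ensemble: fuse per-dimension
# partitions by merging documents whose weighted co-association is high.
from itertools import combinations

def ensemble(partitions: dict, weights: dict, n_docs: int,
             threshold: float = 0.5) -> list:
    """partitions maps a dimension name to one cluster label per doc."""
    total = sum(weights.values())
    parent = list(range(n_docs))  # union-find over documents

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n_docs), 2):
        # Weighted fraction of dimensions that put i and j together.
        co = sum(w for dim, w in weights.items()
                 if partitions[dim][i] == partitions[dim][j]) / total
        if co >= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n_docs)]

# Example: three documents, clustered independently by tags and by time.
labels = ensemble({"tags": [0, 0, 1], "time": [0, 1, 1]},
                  {"tags": 0.6, "time": 0.4}, n_docs=3)
# -> docs 0 and 1 merge (tag agreement outweighs time disagreement)
```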
Finding Media Illustrating Events
- In Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR ’11, 2011
"... We present a method combining semantic inferencing and visual analysis for finding automatically media (photos and videos) illustrating events. We report on experiments vali-dating our heuristic for mining media sharing platforms and large event directories in order to mutually enrich the de-scripti ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
(Show Context)
We present a method combining semantic inferencing and visual analysis for automatically finding media (photos and videos) illustrating events. We report on experiments validating our heuristic for mining media sharing platforms and large event directories in order to mutually enrich the descriptions of the content they host. Our overall goal is to design a web-based environment that allows users to explore and select events, to inspect associated media, and to discover meaningful, surprising, or entertaining connections between events, media, and people participating in events. We present a large dataset composed of semantic descriptions of events, photos, and videos interlinked with the larger Linked Open Data cloud, and we show the benefits of using semantic web technologies for integrating multimedia metadata.
Videoscapes: Exploring sparse, unstructured video collections
- In ACM SIGGRAPH, 2012
"... Figure 1: A Videoscape formed from casually captured videos and an interactively-formed path through it, consisting of individual videos and automatically generated transitions. A video frame from one such transition is shown here: a 3D reconstruction of Big Ben automatically formed from the frames ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
Figure 1 (caption): A Videoscape formed from casually captured videos and an interactively formed path through it, consisting of individual videos and automatically generated transitions. A video frame from one such transition is shown: a 3D reconstruction of Big Ben automatically formed from the frames across videos, viewed from a point in space between cameras and projected with video frames.
The abundance of mobile devices and digital cameras with video capture makes it easy to obtain large collections of video clips that contain the same location, environment, or event. However, such an unstructured collection is difficult to comprehend and explore. We propose a system that analyzes collections of unstructured but related video data to create a Videoscape: a data structure that enables interactive exploration of video collections by visually navigating, spatially and/or temporally, between different clips. We automatically identify transition opportunities, or portals. From these portals, we construct the Videoscape, a graph whose edges are video clips and whose nodes are portals between clips. Once structured, the videos can be interactively explored by walking the graph or by geographic map. Given this system, we gauge preference for different video transition styles in a user study, and generate heuristics that automatically choose an appropriate transition style. We evaluate our system in three further user studies, which allow us to conclude that Videoscapes provide significant benefits over related methods. Our system leads to previously unseen ways of interactive spatio-temporal exploration of casually captured videos, and we demonstrate this on several video collections.
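The Videoscape itself is described as a graph whose nodes are portals and whose edges are video clips, which suggests a structure along the following lines. This is a minimal sketch under that reading; the class names, fields, and walk interface are assumed for illustration.

```python
# Hedged sketch of a Videoscape-like graph: portals (transition
# opportunities between clips) as nodes, video segments as edges.
from dataclasses import dataclass

@dataclass(frozen=True)
class Portal:
    portal_id: int          # a shared moment/viewpoint matched across clips

@dataclass
class ClipEdge:
    clip_id: str
    start: Portal           # portal where this video segment begins
    end: Portal             # portal where this video segment ends
    duration_s: float

class Videoscape:
    def __init__(self):
        self.out_edges: dict = {}   # Portal -> list[ClipEdge]

    def add_clip(self, edge: ClipEdge):
        self.out_edges.setdefault(edge.start, []).append(edge)

    def walk(self, start: Portal, choose) -> list:
        # Interactive exploration: at each portal, `choose` picks one of
        # the outgoing clip edges (or returns None to stop walking).
        path, here = [], start
        while (edge := choose(self.out_edges.get(here, []))) is not None:
            path.append(edge)
            here = edge.end
        return path
```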
Crowdsourcing Rock N ’ Roll Multimedia Retrieval
"... In this technical demonstration, we showcase a multimedia search engine that facilitates semantic access to archival rock n ’ roll concert video. The key novelty is the crowdsourcing mechanism, which relies on online users to improve, extend, and share, automatically detected results in video fragme ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
(Show Context)
In this technical demonstration, we showcase a multimedia search engine that facilitates semantic access to archival rock 'n' roll concert video. The key novelty is the crowdsourcing mechanism, which relies on online users to improve, extend, and share automatically detected results in video fragments using an advanced timeline-based video player. The user feedback serves as valuable input to further improve automated multimedia retrieval results, such as automatically detected concepts and automatically transcribed interviews. The search engine has been operational online to harvest valuable feedback from rock 'n' roll enthusiasts.
Automatic generation of video narratives from shared UGC
- In Proceedings of the ACM Conference on Hypertext and Hypermedia, 2011
"... This paper introduces an evaluated approach to the automatic generation of video narratives from user generated content gathered in a shared repository. In the context of social events, end-users record video material with their personal cameras and upload the content to a common repository. Video n ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
This paper introduces an evaluated approach to the automatic generation of video narratives from user-generated content gathered in a shared repository. In the context of social events, end-users record video material with their personal cameras and upload the content to a common repository. Video narrative techniques, implemented using Narrative Structure Language (NSL) and ShapeShifting Media [Ursu, 2008a], are employed to automatically generate movies recounting the event. Such movies are personalized according to the preferences expressed by each individual end-user, for each individual viewing. This paper describes our prototype narrative system, MyVideos, deployed as a web application, and reports on its evaluation for one specific use case: assembling stories of a school concert by parents, relatives, and friends. The evaluations, carried out through focus groups, interviews, and field trials in the Netherlands and the UK, validated the approach and provided further insights.
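A minimal sketch of the personalization step alone might look as follows: score the shared clips against one viewer's stated preferences and assemble a time-ordered cut of roughly the requested length. NSL and ShapeShifting Media define far richer narrative structure; the fields and scoring here are illustrative assumptions only.

```python
# Hedged sketch: preference-driven clip selection for one viewing.
from dataclasses import dataclass

@dataclass
class Clip:
    start_s: float          # position within the event timeline
    duration_s: float
    tags: set               # e.g. {"stage", "my-child"} (assumed metadata)

def assemble(clips: list, preferred_tags: set, target_s: float) -> list:
    # Rank clips by overlap with the viewer's interests, then cut a
    # movie of about the requested length in event-time order.
    ranked = sorted(clips, key=lambda c: len(c.tags & preferred_tags),
                    reverse=True)
    chosen, total = [], 0.0
    for clip in ranked:
        if total + clip.duration_s <= target_s:
            chosen.append(clip)
            total += clip.duration_s
    return sorted(chosen, key=lambda c: c.start_s)
```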
Audio Fingerprinting to Identify Multiple Videos of an Event
"... The proliferation of consumer recording devices and video sharing websites makes the possibility of having access to multiple recordings of the same occurrence increasingly likely. These co-synchronous recordings can be identified via their audio tracks, despite local noise and channel variations. W ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
The proliferation of consumer recording devices and video sharing websites makes it increasingly likely that multiple recordings of the same occurrence are available. These co-synchronous recordings can be identified via their audio tracks, despite local noise and channel variations. We explore a robust fingerprinting strategy to do this. Matching pursuit is used to obtain a sparse set of the most prominent elements in a video soundtrack. Pairs of these elements are hashed and stored, to be efficiently compared with one another. This fingerprinting is tested on a corpus of over 700 YouTube videos related to the 2009 U.S. presidential inauguration. Reliable matching of identical events in different recordings is demonstrated, even under difficult conditions.
Index Terms: acoustic signal analysis, multimedia databases, database searching
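The pairing-and-hashing scheme can be sketched as follows, with a crude spectrogram peak picker standing in for matching pursuit. All parameters are illustrative assumptions; the idea is that two recordings of the same event produce many hash collisions at one consistent time offset.

```python
# Hedged sketch of pairwise landmark hashing for matching soundtracks.
import numpy as np
from collections import Counter

def landmarks(signal, n_fft=1024, hop=512, top=5):
    # Crude spectrogram peak picking (a stand-in for matching pursuit).
    frames = range(0, len(signal) - n_fft, hop)
    spec = np.array([np.abs(np.fft.rfft(signal[i:i + n_fft]))
                     for i in frames])
    peaks = []
    for t, frame in enumerate(spec):
        for f in np.argsort(frame)[-top:]:   # keep the strongest bins
            peaks.append((t, int(f)))
    return peaks

def hashes(peaks, fan_out=3, max_dt=64):
    # Pair each peak with a few nearby later peaks; hash (f1, f2, dt).
    out = []
    for i, (t1, f1) in enumerate(peaks):
        for (t2, f2) in peaks[i + 1:i + 1 + fan_out]:
            if 0 < t2 - t1 <= max_dt:
                out.append(((f1, f2, t2 - t1), t1))
    return out

def match_offset(hashes_a, hashes_b):
    # Same-event recordings share many hashes at one time offset.
    index = {}
    for h, t in hashes_a:
        index.setdefault(h, []).append(t)
    offsets = Counter(ta - tb
                      for h, tb in hashes_b for ta in index.get(h, []))
    return offsets.most_common(1)[0] if offsets else None
```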
DC Proposal: Enriching Unstructured Media Content About Events to Enable Semi-Automated Summaries, Compilations, and Improved Search by Leveraging Social Networks
"... Abstract. Mobile devices like smartphones together with social networks enable people to generate, share, and consume enormous amounts of media content. Common search operations, for example searching for a music clip based on artist name and song title on video platforms such as YouTube, can be ach ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Mobile devices like smartphones, together with social networks, enable people to generate, share, and consume enormous amounts of media content. Common search operations, for example searching for a music clip based on artist name and song title on video platforms such as YouTube, can be achieved based on potentially shallow human-generated metadata or on more profound content analysis driven by Optical Character Recognition (OCR) or Automatic Speech Recognition (ASR). However, more advanced use cases, such as summaries or compilations of several pieces of media content covering a certain event, are hard, if not impossible, to fulfill at large scale. One example of such an event is a keynote speech held at a conference, where, given a stable network connection, media content is published on social networks while the event is still going on. In our thesis, we develop a framework for media content processing that leverages social networks, the Web of Data, and fine-grained media content addressing schemes like Media Fragments URIs to provide a scalable and sophisticated solution realizing the above use cases: media content summaries and compilations. We evaluate our approach at the entity level against social media platform APIs in conjunction with Linked (Open) Data sources, comparing current manual approaches against our semi-automated approach. Our proposed framework can be used as an extension for existing video platforms.
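Of the building blocks named above, Media Fragments URIs have a concrete, standardized syntax: a temporal fragment such as #t=start,end addresses a sub-clip of a video. A minimal sketch (the video URL is a placeholder, not from the source):

```python
# Hedged sketch: addressing a sub-clip with a W3C Media Fragments URI
# (temporal dimension, normal play time in seconds).
def temporal_fragment(url: str, start_s: float, end_s: float) -> str:
    # "#t=start,end" selects the interval [start, end) of the resource.
    return f"{url}#t={start_s:g},{end_s:g}"

# e.g. a 90-second segment of a (hypothetical) keynote recording:
uri = temporal_fragment("http://example.org/keynote.mp4", 750, 840)
# -> "http://example.org/keynote.mp4#t=750,840"
```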
CAVVA: Computational Affective Video-in-Video Advertising
"... Abstract—Advertising is ubiquitous in the online commu-nity and more so in the ever-growing and popular online video delivery websites (e.g., YouTube). Video advertising is becoming increasingly popular on these websites. In addition to the existing pre-roll/post-roll advertising and contextual ad-v ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Advertising is ubiquitous in the online community, and more so on the ever-growing and popular online video delivery websites (e.g., YouTube), where video advertising is becoming increasingly popular. In addition to the existing pre-roll/post-roll advertising and contextual advertising, this paper proposes an in-stream video advertising strategy: Computational Affective Video-in-Video Advertising (CAVVA). Humans, being emotional creatures, are driven by emotions as well as rational thought. We believe that emotions play a major role in influencing the buying behavior of users, and hence propose a video advertising strategy which takes into account the emotional impact of the videos as well as the advertisements. Given a video and a set of advertisements, we identify candidate advertisement insertion points (step 1) and the suitable advertisements (step 2) according to theories from marketing and consumer psychology. We formulate this two-part problem as a single optimization function in a non-linear 0-1 integer programming framework and provide a genetic-algorithm-based solution. We evaluate CAVVA using a subjective user study and an eye-tracking experiment. Through these experiments, we demonstrate that CAVVA achieves a good balance between the seemingly conflicting goals of (a) minimizing the user disturbance caused by advertisement insertion and (b) enhancing the user engagement with the advertising content. We compare our method with existing advertising strategies and show that CAVVA can enhance the user's experience and also help increase the monetization potential of the advertising content.
Index Terms: ad insertion, affect, arousal, contextual advertising, marketing and consumer psychology, valence
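As a rough picture of the optimization setup, the sketch below runs a genetic algorithm over 0-1 decision variables x[i][j] (ad j placed at insertion point i). The fitness function is a toy stand-in for CAVVA's affect-based objective and constraints, which the abstract does not spell out.

```python
# Hedged sketch: genetic algorithm over a 0-1 placement matrix.
import random

def ga(n_points, n_ads, fitness, pop=40, gens=100, p_mut=0.02):
    def random_solution():
        return [[random.random() < 0.1 for _ in range(n_ads)]
                for _ in range(n_points)]

    def crossover(a, b):
        cut = random.randrange(n_points)   # one-point crossover on rows
        return a[:cut] + b[cut:]

    def mutate(x):
        return [[g ^ (random.random() < p_mut) for g in row] for row in x]

    population = [random_solution() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop // 2]    # keep the fitter half
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop - len(parents))]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(x):
    # Reward placing each ad exactly once; penalize crowded points.
    per_ad = [sum(row[j] for row in x) for j in range(len(x[0]))]
    return sum(1 for c in per_ad if c == 1) - sum(sum(row) > 1 for row in x)

best = ga(n_points=5, n_ads=3, fitness=toy_fitness)
```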
Knowledge Discovery over Community-Sharing Media: From Signal to Intelligence
"... The explosive growth of photos/videos and the advent of mediasharing services have drastically increased the volume of usercontributed multimedia resources, which bring profound social impacts to the society and pose new challenges for the design of efficient search, mining, and visualization method ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
The explosive growth of photos and videos and the advent of media-sharing services have drastically increased the volume of user-contributed multimedia resources, which have profound social impact and pose new challenges for the design of efficient search, mining, and visualization methods. Beyond plain visual or audio signals, such large-scale media are augmented with rich context such as user-provided tags, geolocations, time, device metadata, and so on, benefiting a wide variety of potential applications such as annotation, automatic training data acquisition, contextual advertising, and visualization. We review the research advances enabling such applications and present a brief outlook on open issues and major opportunities.
Index Terms: social media, multimedia annotation, multimedia search, multimedia advertising, visualization, machine learning, survey