How flickr helps us make sense of the world: context and content in community-contributed media collections
- In Proceedings of the 15th International Conference on Multimedia (MM2007), 2007
Cited by 35 (4 self)
The advent of media-sharing sites like Flickr and YouTube has drastically increased the volume of community-contributed multimedia resources available on the web. These collections have a previously unimagined depth and breadth, and have generated new opportunities – and new challenges – to multimedia research. How do we analyze, understand and extract patterns from these new collections? How can we use these unstructured, unrestricted community contributions of media (and annotation) to generate “knowledge”? As a test case, we study Flickr – a popular photo sharing website. Flickr supports photo, time and location metadata, as well as a light-weight annotation model. We extract information from this dataset using two different approaches. First, we employ a location-driven approach to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. Second, we use a tag-driven approach to automatically extract place and event semantics for Flickr tags, based on each tag’s metadata patterns. With the patterns we extract from tags and metadata, vision algorithms can be employed with greater precision. In particular, we demonstrate a location-tag-vision-based approach to retrieving images of geography-related landmarks and features from the Flickr dataset. The results suggest that community-contributed media and annotation can enhance and improve our access to multimedia resources – and our understanding of the world.
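The abstract above does not say how “representative tags” are scored, so the following is only a minimal sketch of one plausible reading: rank the tags of an area by how often they occur there, weighted by how few other areas use them (a TF-IDF-style heuristic). The function name `representative_tags`, the area identifiers, and the sample photos are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def representative_tags(photos, area, top_k=3):
    """Rank tags for one area with a TF-IDF-style score (illustrative assumption).

    photos: list of (area_id, tags) pairs, where tags is a list of strings.
    """
    # Tag frequency inside the target area, counting each tag once per photo.
    tf = Counter(t for a, tags in photos if a == area for t in set(tags))

    # For every tag, collect the set of areas in which it appears.
    areas_by_tag = {}
    for a, tags in photos:
        for t in set(tags):
            areas_by_tag.setdefault(t, set()).add(a)

    n_areas = len({a for a, _ in photos})

    # Score = frequency in the area * log-scaled rarity across all areas.
    scores = {
        t: count * math.log(1 + n_areas / len(areas_by_tag[t]))
        for t, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical toy data: two areas, a handful of tagged photos.
photos = [
    ("sf", ["goldengate", "bridge", "california"]),
    ("sf", ["goldengate", "fog"]),
    ("la", ["hollywood", "california"]),
    ("la", ["beach", "california"]),
]
print(representative_tags(photos, "sf", top_k=1))
```

On this toy data a tag shared by both areas (such as "california") is pushed down, while a tag concentrated in one area rises to the top.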
Generating Diverse and Representative Image Search Results for Landmarks
Cited by 31 (3 self)
Can we leverage the community-contributed collections of rich media on the web to automatically generate representative and diverse views of the world’s landmarks? We use a combination of context- and content-based tools to generate representative sets of images for location-driven features and landmarks, a common search task. To do that, we use location and other metadata, as well as tags associated with images, and the images’ visual features. We present an approach to extracting tags that represent landmarks. We show how to use unsupervised methods to extract representative views and images for each landmark. This approach can potentially scale to provide better search and representation for landmarks, worldwide. We evaluate the system in the context of image search using a real-life dataset of 110,000 images from the San Francisco area.
Hot‐Paper: Multimedia Interaction with Paper Using Mobile Phones
- Proc. 16th ACM Int’l Conf. Multimedia, ACM, 2008
Cited by 11 (2 self)
The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. This interaction is in the form of reading and writing electronic information, such as images, web urls, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or watermarks on the paper document. Instead, we propose a document recognition algorithm that automatically determines the location of a patch of text in a large collection of document images given a small document image. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and they produce low quality images and video. We
A face annotation framework with partial clustering and interactive labeling
- In International Conf. on Computer Vision and Pattern Recognition, 2007
Cited by 6 (2 self)
Face annotation technology is important for a photo management system. In this paper, we propose a novel interactive face annotation framework combining unsupervised and interactive learning. There are two main contributions in our framework. In the unsupervised stage, a partial clustering algorithm is proposed to find the most evident clusters instead of grouping all instances into clusters, which leads to a good initial labeling for later user interaction. In the interactive stage, an efficient labeling procedure based on minimization of both global system uncertainty and estimated number of user operations is proposed to reduce user interaction as much as possible. Experimental results show that the proposed annotation framework can significantly reduce the face annotation workload and is superior to existing solutions in the literature.
Social Multimedia: Highlighting Opportunities for Search and Mining of Multimedia Data in Social Media Applications
- Published in Multimedia Tools and Applications, 2010
Cited by 2 (0 self)
In recent years, various Web-based sharing and community services such as Flickr and YouTube have made a vast and rapidly growing amount of multimedia content available online. Uploaded by individual participants, content in these immense pools is accompanied by varied types of metadata, such as social network data or descriptive textual information. These collections present, at once, new challenges and exciting opportunities for multimedia research. This article presents an approach for “social multimedia” applications. The approach is based on the experience of building a number of successful applications that mine and analyze multimedia content in a social multimedia context.
Context-Aware Person Identification in Personal Photo Collections
Cited by 2 (1 self)
Identifying the people in photos is an important need for users of photo management systems. We present MediAssist, one such system which facilitates browsing, searching and semi-automatic annotation of personal photos, using analysis of both image content and the context in which the photo is captured. This semi-automatic annotation includes annotation of the identity of people in photos. In this paper, we focus on such person annotation, and propose person identification techniques based on a combination of context and content. We propose language modelling and nearest neighbor approaches to context-based person identification, in addition to novel face color and image color content-based features (used alongside face recognition and body patch features). We conduct a comprehensive empirical study of these techniques using the real private photo collections of a number of users, and show that combining context- and content-based analysis improves performance over content or context alone. Index Terms: Context and content, person identification, personal photo management.
Joint People, Event, and Location Recognition in Personal Photo Collections using Cross-Domain Context
Cited by 1 (0 self)
We present a framework for vision-assisted tagging of personal photo collections using context. Whereas previous efforts mainly focus on tagging people, we develop a unified approach to jointly tag across multiple domains (specifically people, events, and locations). The heart of our approach is a generic probabilistic model of context that couples the domains through a set of cross-domain relations. Each relation models how likely the instances in two domains are to co-occur. Based on this model, we derive an algorithm that simultaneously estimates the cross-domain relations and infers the unknown tags in a semi-supervised manner. We conducted experiments on two well-known datasets and obtained significant performance improvements in both people and location recognition. We also demonstrated the ability to infer event labels with missing timestamps (i.e. with no event features).
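To make the idea of a cross-domain relation concrete, here is a deliberately simplified sketch, not the paper's joint semi-supervised algorithm: it estimates how likely a person and a location are to co-occur from already-labeled photos (with add-alpha smoothing), then uses that estimate to guess a missing location tag. The names `cooccurrence` and `infer_location`, the smoothing choice, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def cooccurrence(labeled, alpha=1.0):
    """Estimate P(location | person) from (person, location) pairs.

    Uses add-alpha smoothing so unseen pairs keep a small non-zero probability.
    Returns a probability function and the sorted list of known locations.
    """
    counts = defaultdict(Counter)
    locations = sorted({loc for _, loc in labeled})
    for person, loc in labeled:
        counts[person][loc] += 1

    def prob(person, loc):
        total = sum(counts[person].values()) + alpha * len(locations)
        return (counts[person][loc] + alpha) / total

    return prob, locations


def infer_location(people, prob, locations):
    """Pick the location maximizing the product of per-person probabilities."""
    def score(loc):
        s = 1.0
        for person in people:
            s *= prob(person, loc)
        return s

    return max(locations, key=score)


# Hypothetical labeled photos: who appears where.
labeled = [
    ("ann", "home"), ("ann", "home"), ("ann", "park"),
    ("bob", "park"), ("bob", "park"),
]
prob, locations = cooccurrence(labeled)
print(infer_location(["ann"], prob, locations))
print(infer_location(["ann", "bob"], prob, locations))
```

The abstract describes estimating the relations and inferring the tags simultaneously; this sketch separates the two steps only to keep the example short.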

