Learning Similarity Metrics for Event Identification in Social Media
Cited by 74 (9 self)
Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of real-world events of different type and scale. By automatically identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can enable event browsing and search in state-of-the-art search engines. To address this problem, we exploit the rich “context” associated with social media content, including user-provided annotations (e.g., title, tags) and automatically generated information (e.g., content creation time). Using this rich context, which includes both textual and non-textual features, we can define appropriate document similarity metrics to enable online clustering of media to events. As a key contribution of this paper, we explore a variety of techniques for learning multi-feature similarity metrics for social media documents in a principled manner. We evaluate our techniques on large-scale, real-world datasets of event images from Flickr. Our evaluation results suggest that our approach identifies events, and their associated social media documents, more effectively than the state-of-the-art strategies on which we build.
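The idea of combining several context features into one similarity score and clustering documents as they arrive can be sketched roughly as follows. This is only an illustration, not the authors' method: the two features (tag overlap and creation-time proximity), the fixed weights `w_tags` and `w_time`, the one-day `window`, and the `threshold` are all assumptions made here, whereas the paper learns how to combine features.

```python
def jaccard(a, b):
    """Tag overlap between two tag collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def time_similarity(t1, t2, window=86400.0):
    """Linear decay to 0 once two timestamps are `window` seconds apart (assumed)."""
    return max(0.0, 1.0 - abs(t1 - t2) / window)


def similarity(doc, cluster, w_tags=0.6, w_time=0.4):
    """Weighted combination of per-feature similarities; weights are hypothetical."""
    return (w_tags * jaccard(doc["tags"], cluster["tags"])
            + w_time * time_similarity(doc["time"], cluster["time"]))


def cluster_online(docs, threshold=0.5):
    """Single pass: join the most similar cluster or start a new one."""
    clusters = []
    for doc in docs:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = similarity(doc, cluster)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(doc["id"])
            best["tags"] |= set(doc["tags"])
            # Running mean of the creation times of the cluster members.
            best["time"] += (doc["time"] - best["time"]) / len(best["members"])
        else:
            clusters.append({"members": [doc["id"]],
                             "tags": set(doc["tags"]),
                             "time": float(doc["time"])})
    return clusters


# Toy documents (invented for illustration).
docs = [
    {"id": "a", "tags": ["concert", "nyc"], "time": 0},
    {"id": "b", "tags": ["concert", "nyc", "live"], "time": 3600},
    {"id": "c", "tags": ["wedding"], "time": 500000},
]
clusters = cluster_online(docs)
members = [c["members"] for c in clusters]
```

With these toy inputs, documents `a` and `b` share tags and are an hour apart, so they end up in one cluster, while `c` starts its own.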
Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation
Cited by 72 (11 self)
Tagging plays an important role in many recent websites. Recommender systems can help by suggesting the tags a user might want to use for a specific item. Factorization models based on the Tucker Decomposition (TD) model have been shown to provide high-quality tag recommendations, outperforming other approaches such as PageRank, FolkRank, and collaborative filtering. The problem with TD models is the cubic core tensor, which results in a runtime that is cubic in the factorization dimension for both prediction and learning. In this paper, we present the factorization model PITF (Pairwise Interaction Tensor Factorization), which is a special case of the TD model with linear runtime for both learning and prediction. PITF explicitly models the pairwise interactions between users, items, and tags. The model is learned with an adaptation of the Bayesian personalized ranking (BPR) criterion, which was originally introduced for item recommendation. Empirically, we show on real-world datasets that this model is much faster than TD and can even achieve better prediction quality. Besides our lab experiments, PITF also won the ECML/PKDD Discovery Challenge 2009 for graph-based tag recommendation.
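A minimal sketch of what a pairwise-interaction score looks like may help make the linear-runtime claim concrete. It assumes each tag has one factor vector for its interaction with users and one for its interaction with items, and it omits the user-item term because that term is the same for every tag of a fixed (user, item) pair and so cannot change the ranking. The factor values and names below are invented toy data, not learned parameters.

```python
def dot(a, b):
    """Inner product of two equal-length factor vectors."""
    return sum(x * y for x, y in zip(a, b))


def pitf_score(user_vec, item_vec, tag_user_vec, tag_item_vec):
    """User-tag plus item-tag interaction; cost is linear in the factor dimension."""
    return dot(user_vec, tag_user_vec) + dot(item_vec, tag_item_vec)


def rank_tags(user_vec, item_vec, tag_factors, top_n=2):
    """Return the top_n tags for one (user, item) pair, best first."""
    scores = {
        tag: pitf_score(user_vec, item_vec, tag_user_vec, tag_item_vec)
        for tag, (tag_user_vec, tag_item_vec) in tag_factors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical two-dimensional factors.
user_vec = [1.0, 0.0]
item_vec = [0.0, 1.0]
tag_factors = {
    "python": ([0.9, 0.1], [0.2, 0.8]),
    "music": ([0.1, 0.9], [0.1, 0.1]),
    "web": ([0.5, 0.5], [0.5, 0.5]),
}
top_tags = rank_tags(user_vec, item_vec, tag_factors)
```

Each score needs two inner products of length k, which is where the linear dependence on the factorization dimension comes from, in contrast to the cubic core tensor of a full Tucker model.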
Evaluating Similarity Measures for Emergent Semantics of Social Tagging
Cited by 71 (8 self)
Social bookmarking systems and their emergent information structures, known as folksonomies, are increasingly important data sources for Semantic Web applications. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as navigation support, semantic search, and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures derived from established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity among tags and resources, considering different ways to aggregate annotations across users. After comparing how tag similarity measures predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.
L.S.: Learning optimal ranking with tensor factorization for tag recommendation
- In: KDD ’09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 2009
Cited by 60 (3 self)
Tag recommendation is the task of predicting a personalized list of tags for a user given an item. This is important for many websites with tagging capabilities like last.fm or delicious. In this paper, we propose a method for tag recommendation based on tensor factorization (TF). In contrast to other TF methods like higher order singular value decomposition (HOSVD), our method RTF (‘ranking with tensor factorization’) directly optimizes the factorization model for the best personalized ranking. RTF handles missing values and learns from pairwise ranking constraints. Our optimization criterion for TF is motivated by a detailed analysis of the problem and of interpretation schemes for the observed data in tagging systems. In all, RTF directly optimizes for the actual problem using a correct interpretation of the data. We provide a gradient descent algorithm to solve our optimization problem. We also provide an improved learning and prediction method with runtime complexity analysis for RTF. The prediction runtime of RTF is independent of the number of observations and only depends on the factorization dimensions. Besides the theoretical analysis, we empirically show that our method outperforms other state-of-the-art tag recommendation methods like FolkRank, PageRank and HOSVD both in quality and prediction runtime.
Human-competitive tagging using automatic keyphrase extraction
Cited by 28 (2 self)
This paper connects two research areas: automatic tagging on the web and statistical keyphrase extraction. First, we analyze the quality of tags in a collaboratively created folksonomy using traditional evaluation techniques. Next, we demonstrate how documents can be tagged automatically with a state-of-the-art keyphrase extraction algorithm, and further improve performance in this new domain using a new algorithm, “Maui”, that utilizes semantic information extracted from Wikipedia. Maui outperforms existing approaches and extracts tags that are competitive with those assigned by the best performing human taggers.
Document recommendation in social tagging services
- In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, 2010
Cited by 22 (3 self)
Social tagging services allow users to annotate various online resources with freely chosen keywords (tags). They not only help users find and organize online resources, but also provide meaningful collaborative semantic data which can potentially be exploited by recommender systems. Traditional studies on recommender systems focused on user rating data, while recently social tagging data is becoming more and more prevalent. How to perform resource recommendation based on tagging data is an emerging research topic. In this paper we consider the problem of document (e.g. Web pages, research papers) recommendation using purely tagging data. That is, we only have data containing users, tags, documents and the relationships among them. We propose a novel graph-based representation learning algorithm for this purpose. The users, tags and documents are represented in the same semantic space in which two related objects are close to each other. For a given user, we recommend those documents that are sufficiently close to him/her. Experimental results on two data sets crawled from Del.icio.us and CiteULike show that our algorithm can generate promising recommendations and outperforms traditional recommendation algorithms.
Geofolk: latent spatial semantics in web 2.0 social media
- In: Proc. of ACM WSDM, 2010
Cited by 18 (1 self)
We describe an approach for multi-modal characterization of social media by combining text features (e.g. tags as a prominent example of short, unstructured text labels) with spatial knowledge (e.g. geotags and coordinates of images and videos). Our model-based framework GeoFolk combines these two aspects in order to construct better algorithms for content management, retrieval, and sharing. The approach is based on multi-modal Bayesian models which allow us to integrate spatial semantics of social media in a well-formed, probabilistic manner. We systematically evaluate the solution on a subset of Flickr data, in characteristic scenarios of tag recommendation, content classification, and clustering. Experimental results show that our method outperforms baseline techniques that are based on one of the aspects alone. The approach described in this contribution can also be used in other domains such as Geoweb retrieval.
Connecting Users and Items with Weighted Tags for Personalized Item Recommendations
- In Proc. of HT’10
Cited by 17 (7 self)
This is the author’s version of a work that was submitted/accepted for publication in the following source:
A Simple Word Trigger Method for Social Tag Suggestion
Cited by 17 (7 self)
It is popular for users in the Web 2.0 era to freely annotate online resources with tags. To ease the annotation process, there has been great interest in automatic tag suggestion. We propose a method to suggest tags according to the text description of a resource. By considering both the description and tags of a given resource as summaries of the resource written in two languages, we adopt word alignment models from statistical machine translation to bridge their vocabulary gap. Based on the translation probabilities between the words in descriptions and the tags, estimated on a large set of description-tag pairs, we build a word trigger method (WTM) to suggest tags according to the words in a resource description. Experiments on real-world datasets show that WTM is effective and robust compared with other methods. Moreover, WTM is relatively simple and efficient, which makes it practical for Web applications.
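As a rough illustration of suggesting tags from description words, the sketch below scores each tag by summing word-to-tag probabilities over the words of a description, weighting each word by its relative frequency. The weighting scheme, the function name `suggest_tags`, and the probability table are assumptions made for this example; in the paper the translation probabilities are estimated with word alignment models, which is not reproduced here.

```python
from collections import Counter


def suggest_tags(description_words, trigger_probs, top_n=2):
    """Rank tags by the sum over words of weight(word) * P(tag | word)."""
    counts = Counter(description_words)
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        weight = count / total  # relative frequency in the description (assumed)
        for tag, prob in trigger_probs.get(word, {}).items():
            scores[tag] = scores.get(tag, 0.0) + weight * prob
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:top_n]


# Hypothetical word-to-tag probabilities, not estimated from any real data.
trigger_probs = {
    "guitar": {"music": 0.7, "rock": 0.3},
    "concert": {"music": 0.5, "live": 0.5},
}
words = ["guitar", "concert", "guitar", "tonight"]
suggested = suggest_tags(words, trigger_probs)
```

Words with no entry in the table, such as "tonight" here, simply contribute nothing to any tag.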
A Probabilistic Model for Personalized Tag Prediction
- Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2010
Cited by 15 (3 self)
Social tagging systems have become increasingly popular for sharing and organizing web resources. Tag recommendation is a common feature of social tagging systems. Social tagging by nature is an incremental process, meaning that once a user has saved a web page with tags, the tagging system can provide more accurate predictions for the user, based on the user’s incremental behavior. However, existing tag prediction methods do not consider this important factor: their training and test datasets are either split by a fixed time stamp or randomly sampled from a larger corpus. In our temporal experiments, we perform a time-sensitive sampling on an existing public dataset, resulting in a new scenario which is much closer to the “real world”. In this paper, we address the problem of tag prediction by proposing a probabilistic model for personalized tag prediction. The model is a Bayesian approach, and integrates three factors: an ego-centric effect, environmental effects, and web page content. Two methods, intuitive calculation and learning optimization, are provided for parameter estimation. Pure graph-based methods, which may have significant constraints (such as requiring every user, every item, and every tag to occur in at least p posts), cannot make a prediction in most “real-world” cases, while our model improves the F-measure by over 30% compared to a leading algorithm on a publicly available real-world dataset.