Results 1 - 10 of 30
Personalized Web Search with Location Preferences
Cited by 26 (3 self)
As the amount of Web information grows rapidly, search engines must be able to retrieve information according to the user’s preference. In this paper, we propose a new web search personalization approach that captures the user’s interests and preferences in the form of concepts by mining search results and their clickthroughs. Due to the important role location information plays in mobile search, we separate concepts into content concepts and location concepts, and organize them into ontologies to create an ontology-based, multi-facet (OMF) profile to precisely capture the user’s content and location interests and hence improve the search accuracy. Moreover, recognizing the fact that different users and queries may have different emphases on content and location information, we introduce the notion of content and location entropies to measure the amount of content and location information associated with a query, and click content and location entropies to measure how much the user is interested in the content and location information in the results. Accordingly, we propose to define personalization effectiveness based on the entropies and use it to balance the weights between the content and location facets. Finally, based on the derived ontologies and personalization effectiveness, we train an SVM to adapt a personalized ranking function for re-ranking of future search. We conduct extensive experiments to compare the precision produced by our OMF profiles and that of a baseline method. Experimental results show that OMF improves the precision significantly compared to the baseline.
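The abstract above does not give the exact definition of its content and location entropies. As a minimal sketch, assuming they follow the standard Shannon form over the concepts extracted for a query, the quantity could be computed as below. The function name `concept_entropy` and the sample concepts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def concept_entropy(concepts):
    """Shannon entropy (in bits) of the empirical concept distribution.

    `concepts` is a list of concept labels extracted for one query,
    e.g. location concepts found in its search results. This is the
    textbook entropy formula, assumed here for illustration only.
    """
    counts = Counter(concepts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical location concepts for a single query.
locations = ["hong kong", "hong kong", "macau", "macau"]
print(concept_entropy(locations))
```

Under this reading, a query whose results mention many different locations has a high location entropy, while a query whose results all mention one location has an entropy of zero.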
Evaluating vector-space and probabilistic models for query to ad matching
- In SIGIR ’08 Workshop on Information Retrieval in Advertising (IRA), 2008
Cited by 6 (3 self)
In this work, we evaluate variants of several information retrieval models from the classic BM25 model to Language Modeling approaches for retrieving relevant textual advertisements for Sponsored Search. Within the language modeling framework, we explore implicit query expansion via translation tables derived from multiple sources and propose a novel method for directly estimating the probability that an advertisement is clicked for a given query. We also investigate explicit query expansion using regular web search results for sponsored search using the vector space framework. We find that web-based expansions result in significant improvement in Mean Average Precision.
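For readers unfamiliar with the BM25 model named in the abstract above, the following is a minimal sketch of Okapi BM25 scoring over a tiny set of ad texts. It is not the variant evaluated in the paper: the whitespace tokenizer, the smoothed IDF form, the defaults `k1=1.2` and `b=0.75`, and the sample ads are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    Uses lowercase whitespace tokenization and the smoothed IDF
    log((N - n + 0.5) / (n + 0.5) + 1), one common variant.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical ad texts.
ads = ["cheap flights to rome", "hotel deals in paris", "cheap car rental"]
print(bm25_scores("cheap flights", ads))
```

The ad matching both query terms scores highest, and the ad sharing no terms with the query scores zero, which is the kind of lexical mismatch that query expansion is meant to address.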
Toward Traffic-Driven Location-Based Web Search
Cited by 6 (1 self)
The emergence of location sharing services is rapidly accelerating the convergence of our online and offline activities. In one direction, Foursquare, Google Latitude, Facebook Places, and related services are enriching real-world venues with the social and semantic connections among online users. In analogy to how clickstreams have been successfully incorporated into traditional web ranking based on content and link analysis, we propose to mine traffic patterns revealed through location sharing services to augment traditional location-based search. Concretely, we study location-based traffic patterns revealed through location sharing services and find that these traffic patterns can identify semantically related locations. Based on this observation, we propose and evaluate a traffic-driven location clustering algorithm that can group semantically related locations with high confidence. Through experimental study of 12 million locations from Foursquare, we extend this result through supervised location categorization, wherein traffic patterns can be used to accurately predict the semantic category of uncategorized locations. Based on these results, we show how traffic-driven semantic organization of locations may be naturally incorporated into location-based web search.
Social tagging for personalized location-based services
- In Proceedings of the 2nd International Workshop on Social Recommender Systems, 2011
Cited by 6 (0 self)
The current generation of location-based services (LBSs) does not provide users with personalized recommendations, but only suggests nearby points of interest (POIs) based on their distance from the user's current location. To overcome such a limitation, we have realized a social recommender system able to identify user preferences and information needs, thus suggesting personalized recommendations related to possible POIs in the surroundings. The proposed approach allows users to leverage and assign freely chosen keywords (tags) to resources through collaborative tagging services (folksonomies). Our LBS employs a user-based tag model that derives correspondences between personal tag vocabularies (personomies) and folksonomies. Experimental tests performed on 15 real users enabled us to assess the benefits in terms of performance that our approach is able to provide.
A Case Study of Using Geographic Cues to Predict Query News Intent
Cited by 4 (1 self)
Geographic information retrieval encompasses important tasks including finding the location of a user, and locations relevant to their search queries. Web-based search engines receive queries from numerous users located in very different parts of the world. A typical way for people to find news is through a general web search engine, which makes it important for search engines to recognize queries with news intent. An important question for geographic information retrieval is how we can benefit from geographic cues to predict the intent of users. This work presents a case study of an application using geographic features to improve the quality of an important web search task, involving predicting which queries have news intent and hence are likely to receive clicks on news search results. Our case study suggests that information derived from geographic features can help the task. The information we consider includes cues derived from the location of the user, obtained from the IP address; the location relevant to the query, automatically extracted from the query string; and the relation between the two locations. We build a classifier that uses geographical cues to predict whether a query will result in a news click or not. We compare our classifier to a strong baseline that uses non-geographic click-based features and we show that our classifier outperforms the baseline for geographic queries.
B: A systematic review of re-identification attacks on health data. PLoS One 2011
Cited by 4 (0 self)
Background: Privacy legislation in most jurisdictions allows the disclosure of health data for secondary purposes without patient consent if it is de-identified. Some recent articles in the medical, legal, and computer science literature have argued that de-identification methods do not provide sufficient protection because they are easy to reverse. Should this be the case, it would have significant and important implications for how health information is disclosed, including: (a) potentially limiting its availability for secondary purposes such as research, and (b) resulting in more identifiable health information being disclosed. Our objectives in this systematic review were to: (a) characterize known re-identification attacks on health data and contrast them with re-identification attacks on other kinds of data, (b) compute the overall proportion of records that have been correctly re-identified in these attacks, and (c) assess whether these demonstrate weaknesses in current de-identification methods.
Local Web Search Examined
2012
Cited by 3 (2 self)
Purpose: To provide a theoretical background to understanding current local search engines as an aspect of specialized search, and understanding the data sources and technologies used. Design/Methodology/Approach: Selected local search engines are examined and compared with respect to their use of GIR (Geographic Information Retrieval) technologies, data sources, available entity information, processing and interfaces. An introduction to the field of GIR is given and its use in the selected systems is discussed. Findings: All selected commercial local search engines utilize GIR technology in varying degrees for information preparation and presentation. It is also starting to be used in regular Web search. However, major differences can be found between the different search engines. Research limitations/implications: This study is not exhaustive and only uses informal comparisons without definitive ranking. Due to the unavailability of hard data, informed guesses were made based on available public interfaces and literature. Practical implications: A source of background information for understanding the results of local search engines, their provenance and their potential. Originality/value: An overview of GIR technology in the context of commercial search engines integrates research efforts and
Real time search on the web: queries, topics, and economic
2011
Cited by 3 (0 self)
Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results show that 30% of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60% of the search traffic comes from the search engine's application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user base. However, 36% of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US $33,000,000 (US $33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the mainstream as an information source.
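The share of unique queries reported in the abstract above can be illustrated with a short query-log computation. This is a minimal sketch, assuming that "unique" means a query string occurring exactly once in the log and that the share is taken over all submitted queries; the abstract does not state the denominator or any normalization, so both are assumptions here, and the sample log is hypothetical.

```python
from collections import Counter


def unique_query_share(queries):
    """Fraction of submitted queries whose text occurs exactly once.

    Queries are compared after trimming whitespace and lowercasing,
    which is an assumed normalization step.
    """
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


# Hypothetical ten-entry log: only "a" and "d" occur once.
log = ["a", "b", "b", "c", "c", "c", "d", "e", "e", "e"]
print(unique_query_share(log))
```

A low share, as the abstract reports, means most traffic consists of repeated query strings, consistent with the polling behavior it attributes to API clients.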
A tale of two (similar) cities: Inferring city similarity through geo-spatial query log analysis
- In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2011
Cited by 2 (0 self)
Understanding the backgrounds and interests of the people who are consuming a piece of content, such as a news story, video, or music, is vital for the content producer as well as the advertisers who rely on the content to provide a channel on which to advertise. We extend traditional search-engine query log analysis, which has primarily concentrated on analyzing either single or small groups of queries or users, to examining the complete query stream of very large groups of users – the inhabitants of 13,377 cities across the United States. Query logs can be a good representation of the interests of the city’s inhabitants and a useful characterization of the city itself. Further, we demonstrate how query logs can be effectively used to gather city-level statistics sufficient for providing insights into the similarities and differences between cities. Cities that are found to be similar through the use of query analysis correspond well to the similar cities as determined through other large-scale and time-consuming direct measurement studies, such as those undertaken by the Census Bureau. Extensive experiments are provided.
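The abstract above does not say which similarity measure is used to compare cities. As a rough sketch of the general idea, the example below represents each city by the frequency of the queries issued from it and compares two cities with cosine similarity. Cosine similarity is a stand-in chosen for illustration, not the paper's stated method, and the city query lists are hypothetical.

```python
import math
from collections import Counter


def query_cosine_similarity(queries_a, queries_b):
    """Cosine similarity between two cities' query-frequency vectors."""
    a, b = Counter(queries_a), Counter(queries_b)
    # Only queries seen in both cities contribute to the dot product.
    dot = sum(a[q] * b[q] for q in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical per-city query streams.
city_x = ["surf report", "surf report", "tacos", "traffic"]
city_y = ["surf report", "tacos", "tacos", "snow forecast"]
print(query_cosine_similarity(city_x, city_y))
```

Two cities with identical query mixes score close to 1 and cities with no queries in common score 0, so the value can serve as a simple proxy for shared interests.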
The Geographical Life of Search
Cited by 2 (1 self)
This article describes a geographical study on the usage of a search engine, focusing on the traffic details at the level of countries and continents. The main objective is to understand, from a geographic point of view, how the needs of the users are satisfied, taking into account the geographic location of the host in which the search originates, and the host that contains the Web page that was selected by the user in the answers. Our results confirm that the Web is a cultural mirror of society and shed light on the implicit social network behind search. These results are also useful as input for the design of distributed search engines.