GroupLens: An Open Architecture for Collaborative Filtering of Netnews
, 1994
Cited by 872 (29 self)
Collaborative filters help people make choices based on the opinions of other people. GroupLens is a system for collaborative filtering of netnews, to help people find articles they will like in the huge stream of available articles. News reader clients display predicted scores and make it easy for users to rate articles after they read them. Rating servers, called Better Bit Bureaus, gather and disseminate the ratings. The rating servers predict scores based on the heuristic that people who agreed in the past will probably agree again. Users can protect their privacy by entering ratings under pseudonyms, without reducing the effectiveness of the score prediction. The entire architecture is open: alternative software for news clients and Better Bit Bureaus can be developed independently and can interoperate with the components we have developed.
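The heuristic that "people who agreed in the past will probably agree again" can be illustrated with a small sketch. The abstract does not say how agreement is measured, so the Pearson correlation weighting below is an assumption, and the names `pearson`, `predict`, and the `ratings` layout are hypothetical rather than part of GroupLens or its Better Bit Bureaus.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(ratings, user, item):
    """Predict `user`'s score for `item` from other users' ratings.

    `ratings` maps a pseudonym to a dict of {article_id: score}.
    Each other user's deviation from their own mean is weighted by how
    strongly they agreed (or disagreed) with `user` in the past.
    """
    mine = ratings[user]
    my_mean = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(mine, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        num += weight * (theirs[item] - their_mean)
        den += abs(weight)
    # With no correlated raters, fall back to the user's own mean score.
    return my_mean if den == 0 else my_mean + num / den
```

Because users are keyed only by an opaque identifier, the same computation works unchanged when ratings are entered under pseudonyms.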
NewsWeeder: Learning to Filter Netnews
- in Proceedings of the 12th International Machine Learning Conference (ML95)
, 1995
Cited by 353 (0 self)
A significant problem in many information filtering systems is the dependence on the user for the creation and maintenance of a user profile, which describes the user's interests. NewsWeeder is a netnews-filtering system that addresses this problem by letting the user rate his or her interest level for each article being read (1-5), and then learning a user profile based on these ratings. This paper describes how NewsWeeder accomplishes this task, and examines the alternative learning methods used. The results show that a learning algorithm based on the Minimum Description Length (MDL) principle was able to raise the percentage of interesting articles shown to users from 14% to 52% on average. Further, this performance significantly exceeded (by 21%) that of one of the most successful techniques in Information Retrieval (IR), term-frequency/inverse-document-frequency (tf-idf) weighting.
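For readers unfamiliar with the tf-idf baseline mentioned above, here is a minimal sketch. The abstract does not state which tf-idf variant NewsWeeder compared against, so the raw term count times log(N/df) weighting and the helper names `tfidf_vectors` and `cosine` are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (count of term in the document) * log(N / df), where df is
    the number of documents that contain the term.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

A profile built this way would typically average the vectors of articles in each rating category and score a new article by its cosine similarity to those averages.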
The SIFT Information Dissemination System
- ACM Transactions on Database Systems
, 2000
Cited by 97 (1 self)
Information dissemination is a powerful mechanism for finding information in wide-area environments. An information dissemination server accepts long-term user queries, collects new documents from information sources, matches the documents against the queries, and continuously updates the users with relevant information. This paper is a retrospective of the Stanford Information Filtering Service (SIFT), a system that as of April 1996 was processing over 40,000 worldwide subscriptions and over 80,000 daily documents. The paper describes some of the indexing mechanisms that were developed for SIFT, as well as the evaluations that were conducted to select a scheme to implement. It also describes the implementation of SIFT and experimental results for the actual system. Finally, it discusses and experimentally evaluates techniques for distributing a service such as SIFT for added performance and availability. Note to Referees: This paper contains material from three earlier...
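The dissemination loop described above inverts the usual search setup: the queries are stored and the documents stream past them. The sketch below indexes subscriptions by term so each incoming document only touches the subscriptions that share a word with it. It is an illustration only; the abstract does not describe SIFT's actual indexing schemes, and the `ProfileIndex` class, its methods, and the all-terms-must-match query model are hypothetical.

```python
from collections import defaultdict


class ProfileIndex:
    """Inverted index over long-term queries (subscriptions)."""

    def __init__(self):
        self.terms = {}                   # subscription id -> required terms
        self.postings = defaultdict(set)  # term -> subscription ids

    def subscribe(self, sub_id, query_terms):
        """Register a conjunctive query: every term must appear."""
        terms = {t.lower() for t in query_terms}
        self.terms[sub_id] = terms
        for term in terms:
            self.postings[term].add(sub_id)

    def match(self, document_text):
        """Return the ids of subscriptions satisfied by one new document."""
        words = set(document_text.lower().split())
        hits = defaultdict(int)
        for word in words:
            for sub_id in self.postings.get(word, ()):
                hits[sub_id] += 1
        # A subscription matches when all of its terms were seen.
        return sorted(s for s, c in hits.items() if c == len(self.terms[s]))
```

Usage follows the server loop in the abstract: call `subscribe` once per user query, then call `match` on each newly collected document and notify the returned subscribers.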
Interface Agents that Learn: An Investigation of Learning Issues in a Mail Agent Interface
, 1995
Cited by 45 (10 self)
In recent years, interface agents have been developed to assist users with various tasks. Some systems employ machine learning techniques to allow the agent to adapt to the user's changing requirements. With the increase in the volume of data on the Internet, agents have emerged which are able to monitor and learn from their users to identify topics of interest. One such agent, described here, has been developed to filter mail messages. We examine the issues involved in constructing an autonomous interface agent which employs a learning component, and explore the use of two different learning techniques in this context. Submitted to Applied Artificial Intelligence Journal, October 26. Introduction: Agents were once seen as anthropomorphic entities which would assist users with daily tasks. They could be used, for example, to locate information of interest to their user (Kay 1984). Ten years later, many definitions of agents have been proposed. The basic concept of ...
Ontology-based personalized search and browsing
- Web Intelligence and Agent Systems
, 2003
Cited by 41 (0 self)
This paper has not been submitted elsewhere in identical or similar form, nor will it be during the first three months after its submission to UMUAI. As the number of Internet users and the number of accessible Web pages grow, it is becoming increasingly difficult for users to find documents that are relevant to their particular needs. Users must either browse through a large hierarchy of concepts to find the information they are looking for, or submit a query to a publicly available search engine and wade through hundreds of results, most of them irrelevant. The core of the problem is that whether the user is browsing or searching, whether they are an eighth grade student or a Nobel prize winner, the identical information is selected and presented in the same way. In this paper, we report on research that adapts information navigation based on a user profile structured as a weighted concept hierarchy. A user may create his or her own concept hierarchy and use it for browsing Web sites. Alternatively, the user profile may be created from a reference ontology by ‘watching over the user’s shoulder’ while they browse. We show that these automatically created profiles reflect the user’s interests quite well and that they produce moderate improvements when applied to search results. Current work is investigating the interaction between the user profiles and conceptual search, wherein documents are indexed by their concepts in addition to their keywords.
The Application of Classical Information Retrieval Techniques to Spoken Documents
, 1995
Cited by 32 (1 self)
Table 4.1: Message classes in classification experiments of Rose et al. (Object Description, General Discussion, Map Reading, Photographic Interpretation, Cartoon Description). Now, an estimate of $I(C_i; w_k)$ can be calculated by a four-way partition of the set of test messages, depending on (a) whether or not a message belongs to topic class $C_i$ and (b) whether or not it contains word $w_k$. If $N$ is the number of messages in the test collection, $R_i$ is the number belonging to topic class $C_i$, $n_k$ is the number of messages containing word $w_k$ and $r_{ik}$ is the number of messages in class $C_i$ containing word $w_k$, then, estimating the probabilities by frequency counts, $I(C_i; w_k) = \log \frac{r_{ik}/R_i}{n_k/N}$. This is actually identical to a form of retrospective term relevance weight, initially proposed in the IR literature by both Barkla [66] and Miller [67], and reviewed by Robertson and Sparck Jones in their classic paper on the subject [42]. Moreover, Rose proposed, but did no...
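The frequency-count estimate above can be worked through in a few lines. This sketch reads the estimate as the log of the ratio between the word's frequency inside the class (r_ik / R_i) and its frequency in the whole collection (n_k / N). The function name `term_class_information` and the choice of the natural logarithm are assumptions, since the excerpt does not state a log base.

```python
import math


def term_class_information(N, R_i, n_k, r_ik):
    """Estimate I(C_i; w_k) = log((r_ik / R_i) / (n_k / N)) from counts.

    N    -- messages in the test collection
    R_i  -- messages belonging to topic class C_i
    n_k  -- messages containing word w_k
    r_ik -- messages in class C_i that contain word w_k
    """
    if min(N, R_i, n_k, r_ik) <= 0:
        # The logarithm is undefined when any count is zero.
        raise ValueError("all counts must be positive")
    return math.log((r_ik / R_i) / (n_k / N))
```

A positive value means the word is more frequent inside the class than in the collection overall; a value of zero means the word carries no information about class membership under this estimate.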
An Adaptive Algorithm for Learning Changes in User Interests
- In Proceedings of the Eighth ACM International Conference on Information and Knowledge Management
, 1999
Cited by 24 (2 self)
In this paper, we describe a new scheme to learn users' dynamic interests in an automated information filtering and gathering system running on the Internet. Our scheme aims to handle multiple domains of a user's long-term and short-term interests simultaneously, learning them from the user's positive and negative relevance feedback. We developed a 3-descriptor approach to represent the user's interest categories. Using a learning algorithm derived for this representation, our scheme adapts quickly to significant changes in user interest, and is also able to learn exceptions to interest categories.
Learning User Interest Dynamics with a Three-Descriptor Representation
, 2000
Cited by 23 (2 self)
Learning users' interest categories is challenging in a dynamic environment like the Web because they change over time. This paper describes a novel scheme to represent a user's interest categories, and an adaptive algorithm to learn the dynamics of the user's interests through positive and negative relevance feedback. We propose a three-descriptor model to represent a user's interests. The proposed model maintains a long-term interest descriptor to capture the user's general interests and a short-term interest descriptor to keep track of the user's more recent, faster-changing interests. An algorithm based on the three-descriptor representation is developed to acquire high accuracy of recognition for long-term interests, and to adapt quickly to changing interests in the short-term. The model is also extended to multiple three-descriptor representations to capture a broader range of interests. Empirical studies confirm the effectiveness of this scheme to accurately model a user's inter...
Social information filtering for music recommendation
, 1994
Cited by 21 (1 self)
Filters which select items for individual users based upon content suffer from several limitations. The items being filtered must be amenable to parsing by a computer. Furthermore, Content-Based Filters possess no inherent method for serendipitous exploration of the information space.
Experience with Rule Induction and k-Nearest Neighbour Methods for Interface Agents that Learn
- In ML95 Workshop on Agents that Learn from Other Agents
, 1995
Cited by 20 (8 self)
this paper use the same feature extraction mechanism, which extracts words according to word frequency. The underlying assumption here is that words which act as good classifiers for identifying message topics appear frequently. Whilst this model appears to work for Magi, where the task is primarily that of grouping together related messages, it is unsuitable for UNA, where articles have already been sorted into topics, or newsgroups. Features identified by the current feature extraction module are not ideal for determining the user's interest in an article. The performance of UNA degrades significantly when multiple narrow classifications are used. We are currently studying this phenomenon; however, as the number of classes increases, there is a greater chance of features appearing in more than one class. Algorithms such as CN2 and MBR consider each classification as distinct from the others; as a result, such features will be considered poor classifiers. An important difference between the two algorithms is the time taken to induce and apply user profiles to new articles. The instance-based approach builds a sub-symbolic representation in the form of weights and distance metrics. Unlike rule induction in CN2, these calculations do not involve searching through a large space of possible solutions. The search performed by CN2 is compounded by the large number of features generated by the article body. It was found that tests involving CN2 took significantly (30 to 40 times) longer than tests involving MBR. Considerations such as speed of profile induction and classification are important. In order to induce a user profile based on observations, many examples are needed, and large log files are generated. As agent technology is applied to commercial tools such as web ...

