Learning Implicit User Interest Hierarchy for Context in Personalization
- In Proc. of International Conference on Intelligent User Interface (IUI
, 2003
Cited by 63 (4 self)
To provide a more robust context for personalization, we desire to extract a continuum of general (long-term) to specific (short-term) interests of a user. Our proposed approach is to learn a user interest hierarchy (UIH) from a set of web pages visited by a user. We devise a divisive hierarchical clustering (DHC) algorithm to group words (topics) into a hierarchy where more general interests are represented by a larger set of words. Each web page can then be assigned to nodes in the hierarchy for further processing in learning and predicting interests. This approach is analogous to building a subject taxonomy for a library catalog system and assigning books to the taxonomy. Our approach does not need user involvement and learns the UIH "implicitly." Furthermore, it allows the original objects, web pages, to be assigned to multiple topics (nodes in the hierarchy). In this paper, we focus on learning the UIH from a set of visited pages. We propose a few similarity functions and dynamic threshold-finding methods, and evaluate the resulting hierarchies according to their meaningfulness and shape.
PageRank, HITS and a Unified Framework for Link Analysis
Cited by 54 (4 self)
Two popular webpage ranking algorithms are HITS and PageRank. HITS emphasizes mutual reinforcement between authority and hub webpages, while PageRank emphasizes hyperlink weight normalization and web surfing based on random walk models. We systematically generalize/combine these concepts into a unified framework. The ranking framework contains a large algorithm space; HITS and PageRank are two extreme ends in this space. We study several normalized ranking algorithms which are intermediate between HITS and PageRank, and obtain closed-form solutions. We show that, to first-order approximation, all ranking algorithms in this framework, including PageRank and HITS, lead to the same ranking, which is highly correlated with ranking by indegree.
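For readers unfamiliar with the random-walk side of this comparison, here is a minimal sketch of the textbook PageRank power iteration. It is not the unified framework from the paper. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are all assumptions chosen for illustration.

```python
def pagerank(edges, n, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration on nodes 0..n-1.

    edges is a list of (source, destination) hyperlinks. The damping
    factor and iteration count are illustrative assumptions.
    """
    out_links = [[] for _ in range(n)]
    for src, dst in edges:
        out_links[src].append(dst)

    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = [(1.0 - damping) / n] * n
        for src in range(n):
            if out_links[src]:
                # Hyperlink weight normalization: split rank over out-links.
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[src] / n
                for dst in range(n):
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: pages 0, 1 and 3 link to page 2; page 2 links to 0.
toy_edges = [(0, 2), (1, 2), (3, 2), (2, 0)]
toy_rank = pagerank(toy_edges, 4)
toy_indegree = [sum(1 for _, dst in toy_edges if dst == v) for v in range(4)]
print(toy_rank, toy_indegree)
```

In this toy graph the page with the largest indegree (page 2) also receives the largest PageRank score, which is the kind of agreement the abstract refers to.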
Clustering relational data using attribute and link information
- In Proceedings of the Text Mining and Link Analysis Workshop, 18th International Joint Conference on Artificial Intelligence
, 2003
Cited by 41 (0 self)
Clustering is a descriptive task that seeks to identify natural groupings in data. Relational data offer a wealth of information for identifying groups of similar items. Both attribute information and the structure of relationships can be used for clustering. Graph partitioning and data clustering techniques can be applied independently to relational data, but a technique that exploits both sources of information simultaneously may produce more meaningful clusters. This paper will describe our work synthesizing data clustering and graph partitioning techniques into improved clustering algorithms for relational data.
Exploiting relational structure to understand publication patterns in high-energy physics
- SIGKDD Explorations
, 2003
Cited by 32 (8 self)
We analyze publication patterns in theoretical high-energy physics using a relational learning approach. We focus on four related areas: understanding and identifying patterns of citations, examining publication patterns at the author level, predicting whether a paper will be accepted by specific journals, and identifying research communities from the citation patterns and paper text. Each of these analyses contributes to an overall understanding of theoretical high-energy physics.
Mobility Performance of
- Macrocell-Assisted Small Cells in Manhattan Model,” Vehicular Technology Conference (VTC Spring), 2014 IEEE 79th
, 2014
Cited by 16 (1 self)
Recent research efforts have made notable progress in improving the performance of (exhaustive) maximal clique enumeration (MCE). However, existing algorithms still suffer from exploring the huge search space of MCE. Furthermore, their results are often undesirable as many of the returned maximal cliques have large overlapping parts. This redundancy leads to problems in both computational efficiency and usefulness of MCE. In this paper, we aim at providing a concise and complete summary of the set of maximal cliques, which is useful to many applications. We propose the notion of τ-visible MCE to achieve this goal and design algorithms to realize the notion. Based on the refined output space, we further consider applications including an efficient computation of the top-k results with diversity and an interactive clique exploration process. Our experimental results demonstrate that our approach is capable of producing output of high usability and our algorithms achieve superior efficiency over classic MCE algorithms.
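To make the redundancy problem concrete, the sketch below implements the classic Bron-Kerbosch algorithm with pivoting, a standard exhaustive MCE method. It is offered as an example of the baseline family only and is not the τ-visible MCE algorithm proposed in the paper. The function name `maximal_cliques` and the toy graph are assumptions for illustration.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch plus pivoting.

    adj maps each vertex to the set of its neighbours (undirected graph).
    Returns a sorted list of cliques, each given as a sorted list.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already-explored vertices.
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pick the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy graph: two triangles sharing the edge 2-3, plus a tail 4-5.
toy_graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
print(maximal_cliques(toy_graph))  # [[1, 2, 3], [2, 3, 4], [4, 5]]
```

The first two cliques share two of their three vertices, a small-scale instance of the large overlaps that the abstract describes as redundant output.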
Business Intelligence Explorer: A knowledge map framework for discovering business intelligence on the Web
- In R.H. Sprague (Ed.), Proceedings of the 36th Hawaii International Conference on System Sciences, Island of Hawaii, HI: IEEE Computer Society
, 2003
Cited by 11 (4 self)
Nowadays, information overload hinders the discovery of business intelligence on the World Wide Web. Existing business intelligence tools suffer from a lack of analysis and visualization capabilities, and the traditional result list display used by search engines often overwhelms business analysts with irrelevant information. Thus, developing tools that enable better analysis while reducing information overload has been a challenge. The literature shows that hierarchical and map displays enable effective access and browsing of information. However, they have not been widely applied to discover business intelligence on the Web. This research proposes Business Intelligence Explorer, a tool implementing the steps in a knowledge map framework for discovering business intelligence on the Web. Two browsing methods, namely, Web community and knowledge map, have been implemented. Web community uses a genetic algorithm to organize different Web sites into a hierarchical format. Knowledge map uses a multidimensional scaling algorithm to place different Web sites as points on a map. Preliminary results of our user study show that Web community helps users locate results quickly and effectively. Users liked the intuitive map display of knowledge map. Our Business Intelligence Explorer contributes to alleviating information overload in business analysis. Future directions on applying document visualization techniques in discovering business intelligence are described.
Implications of the recursive representation problem for automatic concept identification in on-line governmental information
- In Proceedings of the ASIST Special Interest Group on Classification Research (ASIST SIG-CR
, 2003
Cited by 11 (3 self)
This paper describes ongoing research into the application of unsupervised learning techniques for improving access to governmental information on the Web. Under the auspices of the GovStat Project
Link analysis: Hubs and authorities on the world wide web
- SIAM Review
, 2001
Cited by 9 (2 self)
Ranking the tens of thousands of retrieved webpages for a user query on a Web search engine such that the most informative webpages are on the top is a key information retrieval technology. A popular ranking algorithm is the HITS algorithm of Kleinberg. It explores the reinforcing interplay between authority and hub webpages on a particular topic by taking into account the structure of the Web graphs formed by the hyperlinks between the webpages. In this paper, we give a detailed analysis of the HITS algorithm through a unique combination of probabilistic analysis and matrix algebra. In particular, we show that to first-order approximation, the ranking given by the HITS algorithm is the same as the ranking by counting inbound and outbound hyperlinks. Using Web graphs of different sizes, we also provide experimental results to illustrate the analysis.
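The mutual reinforcement between hubs and authorities can be sketched with the standard HITS iteration. The snippet below is a minimal illustration on a made-up graph, not the probabilistic analysis from the paper. The function name `hits`, the iteration count, and the toy edges are assumptions.

```python
def hits(edges, n, iterations=50):
    """Standard HITS iteration on a directed graph with nodes 0..n-1.

    Returns (hub, authority) score lists, each scaled to unit length.
    """
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        # A page is a good authority if good hubs link to it.
        auth = [0.0] * n
        for src, dst in edges:
            auth[dst] += hub[src]
        # A page is a good hub if it links to good authorities.
        hub = [0.0] * n
        for src, dst in edges:
            hub[src] += auth[dst]
        # Rescale so the scores do not grow without bound.
        auth_norm = sum(a * a for a in auth) ** 0.5 or 1.0
        hub_norm = sum(h * h for h in hub) ** 0.5 or 1.0
        auth = [a / auth_norm for a in auth]
        hub = [h / hub_norm for h in hub]
    return hub, auth


# Hypothetical toy graph: pages 0, 1, 3 act as hubs; pages 2, 4, 5 are linked to.
toy_edges = [(0, 2), (1, 2), (3, 2), (0, 4), (1, 4), (3, 5)]
toy_hub, toy_auth = hits(toy_edges, 6)
by_authority = sorted(range(6), key=lambda v: -toy_auth[v])[:3]
indegree = [sum(1 for _, dst in toy_edges if dst == v) for v in range(6)]
by_indegree = sorted(range(6), key=lambda v: -indegree[v])[:3]
print(by_authority, by_indegree)
```

On this toy graph the top three pages by authority score appear in the same order as the top three by inbound link count, consistent with the first-order result stated in the abstract. A single small graph is of course not evidence for the general claim.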
Topic Identification: Framework and Application
- Proc of International Conference on Knowledge Management (I-KNOW
, 2004
Cited by 9 (1 self)
This paper is on topic identification, i.e., the construction of useful labels for sets of documents. Topic identification is essential in connection with categorizing search applications, where several sets of documents are delivered and an expressive description for each category must be constructed on the fly. The contributions of this paper are threefold. (1) It presents a framework to formally specify the topic identification problem along with its desired properties, (2) it introduces a classification scheme for topic identification algorithms and outlines the respective algorithm of the AIsearch meta search engine, (3) it proposes a hybrid approach to topic identification, which relies on classification knowledge of existing ontologies.
Towards a formal concept analysis approach to exploring communities on the world wide web
- ICFCA 2005, volume 3403 of LNAI
, 2005
Cited by 6 (0 self)
An interesting problem associated with the World Wide Web (Web) is the definition and delineation of so-called Web communities. The Web can be characterized as a directed graph whose nodes represent Web pages and whose edges represent hyperlinks. An authority is a page that is linked to by high quality hubs, while a hub is a page that links to high quality authorities. A Web community is a highly interconnected aggregate of hubs and authorities. We define a community core to be a maximally connected bipartite subgraph of the Web graph. We observe that the web subgraph can be viewed as a formal context and that web communities can be modeled by formal concepts. Additionally, the notions of hub and authority are captured by the extent and intent, respectively, of a concept. Though Formal Concept Analysis (FCA) has previously been applied to the Web, none of the FCA-based approaches that we are aware of consider the link structure of the Web pages. We utilize notions from FCA to explore the community structure of the Web graph. We discuss the problem of utilizing this structure to locate and organize communities in the form of a knowledge base built from the resulting concept lattice and discuss methods to reduce the complexity of the knowledge base by coalescing similar Web communities. We present preliminary experimental results obtained from real Web data that demonstrate the usefulness of FCA for improving Web search.
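The mapping from link structure to formal concepts can be sketched as follows, treating each page's set of out-links as its attributes. This is a brute-force illustration that only scales to tiny inputs and is not the authors' method. The function name `formal_concepts` and the toy link data are assumptions.

```python
from itertools import combinations


def formal_concepts(links):
    """Enumerate formal concepts of a small hyperlink context by brute force.

    links maps each page to the set of pages it links to. Each concept is
    a pair (extent, intent): the extent is a set of hub-like pages and the
    intent is the set of authority-like pages that all of them link to.
    Concepts with an empty intent are skipped. Exponential in the number
    of pages, so suitable for toy inputs only.
    """
    pages = sorted(links)
    concepts = set()
    for size in range(1, len(pages) + 1):
        for hubs in combinations(pages, size):
            # Intent: targets shared by every chosen page.
            intent = set.intersection(*(links[h] for h in hubs))
            if not intent:
                continue
            # Extent: every page that links to all targets in the intent.
            extent = frozenset(p for p in pages if intent <= links[p])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Hypothetical toy context: pages a and b both link to x and y; c links to y and z.
toy_links = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"y", "z"}}
for extent, intent in sorted(formal_concepts(toy_links), key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent))
```

Each concept printed here corresponds to a complete bipartite subgraph that cannot be enlarged on either side, which matches the role the abstract assigns to community cores.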