Results 11 - 20 of 25
G.: Information retrieval and filtering over self-organising digital libraries
- In: Proceedings of the 12th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), 2008
Cited by 3 (1 self)
We present iClusterDL, a self-organising overlay network that supports information retrieval and filtering functionality in a digital library environment. iClusterDL is able to handle huge amounts of data provided by digital libraries in a distributed and self-organising way. The two-tier architecture and the use of semantic overlay networks provide an infrastructure for creating large networks of digital libraries that require minimum administration, yet offer a rich set of tools to the end-user. We present the main components of our architecture, the protocols that regulate peer interactions, and an experimental evaluation that shows the efficiency, and the retrieval and filtering effectiveness of our approach.
Provisioning and scheduling resources for worldwide data-sharing services
- in: IEEE Int’l. Conf. on e-Science and Grid Computing (e-Science), IEEE Computer Society
Cited by 3 (2 self)
Grid computing is becoming the natural way to aggregate and share large and heterogeneous sets of resources. However, grid development and acceptance hinge on proving that grids reliably support large communities of users, and their real applications. In this paper we assess the ability of existing grid infrastructures to provision resources for a class of applications with numerous potential users, namely the class of world-wide data-sharing services. For this purpose, we first analyze the requirements of this class of applications, and match them against the existing spare capacity in three existing large-scale grid environments, namely OSG/Grid3, NorduGrid, and CERN LCG. We then address the need to allocate insufficient resources to world-wide data-sharing services by introducing and assessing through trace-based simulation five domain-specific scheduling policies. Our findings support the idea that grid technology could be leveraged with great success for existing and future world-wide data-sharing services, without impacting the level of service for the currently existing load.
A Measure for Cluster Cohesion in Semantic Overlay Networks
Cited by 3 (0 self)
Semantic overlay networks cluster peers that are semantically, thematically or socially close into groups by means of a rewiring procedure that is periodically executed by each peer. Rewiring proceeds by establishing new connections to similar peers, and by discarding connections that are outdated or pointing to dissimilar peers. This process aims at improving cluster quality (how well peers with similar content are clustered together) and, by this, at improving the flow of information in the network by reducing the number of messages that are exchanged. Therefore, measuring the quality of clustering is an important issue by itself. This is exactly the issue this work is dealing with. In this paper, we introduce a new clustering measure that takes into account the whole neighborhood of a peer (rather than only its direct neighbors), thus providing better insight into the quality of the underlying clustered organisation. Our experimental evaluation with real-world data and queries confirms our assumption that the new measure is better suited for measuring clustering quality than other known measures, such as the (generalised) clustering coefficient.
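For orientation, the baseline named at the end of this abstract, the standard local clustering coefficient, can be sketched as follows. This is not the new measure proposed in the paper (the abstract does not define it); the function name and the four-peer overlay are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def local_clustering_coefficient(adj, node):
    """Fraction of pairs of `node`'s direct neighbours that are linked to each other."""
    neighbours = adj[node] - {node}
    k = len(neighbours)
    if k < 2:
        # Fewer than two neighbours: no pair exists, so report 0 by convention.
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected overlay: peers a, b, c form a triangle; d hangs off a.
overlay = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

for peer in sorted(overlay):
    print(peer, local_clustering_coefficient(overlay, peer))
```

Because the coefficient looks only at direct neighbours, peer d scores 0 even though it is attached to a tightly knit group, which is the kind of limitation a neighbourhood-wide measure is meant to address.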
Towards a Managed Extensible Control Plane for Knowledge-Based Networking
- Department of Computer Science, Trinity College Dublin, 2006
Cited by 2 (2 self)
This paper proposes an open, extensible control plane for a global event service, based on semantically rich messages. This is based on the novel application of control plane separation and semantic-based matching to Content-Based Networks. Here we evaluate the performance issues involved in attempting to perform ontology-based reasoning for content-based routing. This provides us with the motivation to explore peer-clustering techniques to achieve efficient aggregation of semantic queries. The clustering of super-peers using decentralized policy engineering will deliver the incremental deployment of new peer-clustering strategies.
Content-based peer-to-peer network overlay for full-text federated search
- In Proceedings of 8th RIAO Conference on Large-Scale Semantic Access to Content. Morpheus, 2007
Cited by 2 (1 self)
Peer-to-peer network overlays have mostly been designed to support search over document names, identifiers, or keywords from a small or controlled vocabulary. In this paper we propose a content-based P2P network overlay for full-text federated search over heterogeneous, open-domain contents. Local algorithms are developed to dynamically construct a network overlay with content-based locality and content-based small-world properties. Experimental results using P2P testbeds of real documents demonstrate the effectiveness of our approach.
Improving ICE Service Selection in a P2P System using the Gradient Topology
Cited by 1 (0 self)
Internet Connectivity Establishment (ICE) is becoming increasingly important for P2P systems on the open Internet, as it enables NAT-bound peers to provide accessible services. A problem for P2P systems that provide ICE services is how peers discover good quality ICE servers for NAT traversal, that is, the TURN and STUN servers that provide relaying and hole-punching services, respectively. Skype provides a P2P-based solution to this problem, where super-peers provide ICE services. However, experimental analysis of Skype indicates that peers perform a random walk of super-peers to find one with an acceptable round-trip latency. In this paper, we discuss a self-organizing approach to discovering good quality ICE servers in a P2P system based on the Gradient Topology. The Gradient Topology uses
Peer-to-Peer Clustering for Semantic Overlay Network Generation
Cited by 1 (1 self)
The peer-to-peer (P2P) paradigm presents an attractive solution for applications that require scalability, fault-tolerance and autonomy. P2P systems in their basic unstructured form suffer high costs when it comes to efficiently locating content, mainly due to the lack of global knowledge. It is therefore crucial to organize content in an unsupervised way by creating groups of peers with similar content, in order to support efficient search mechanisms. In this paper, we discuss the need for content organization in unstructured P2P networks and present the requirements that must be fulfilled by any approach. We propose P2P clustering as a potential solution to Semantic Overlay Network (SON) generation for organizing P2P networks, and we present our unsupervised approach for decentralized SON creation towards this end.
Query workload-aware overlay construction using histograms
- In Proceedings of CIKM ’05, 2005
Cited by 1 (0 self)
Peer-to-peer (p2p) systems offer an efficient means of data sharing among a dynamically changing set of a large number of autonomous nodes. Each node in a p2p system is connected with a small number of other nodes, thus creating an overlay network of nodes. A query posed at a node is routed through the overlay network towards nodes hosting data items that satisfy it. In this paper, we consider building overlays that exploit the query workload so that nodes are clustered based on their results to a given query workload. The motivation is to create overlays where nodes that match a large number of similar queries are a few links apart. Query frequency is also taken into account so that popular queries have a greater effect on the formation of the overlay than unpopular ones. We focus on range selection queries and use histograms to estimate the query results of each node. Then, nodes are clustered based on the similarity of their histograms. To this end, we introduce a workload-aware edit distance metric between histograms that takes into account the query workload. Our experimental results show that workload-aware overlays increase the percentage of query results returned for a given number of nodes visited as compared to both random (i.e., unclustered) overlays and non-workload-aware clustered overlays (i.e., overlays that cluster nodes based solely on the nodes' content).
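The idea of estimating range-query results from histograms and weighting queries by frequency can be illustrated with a minimal sketch. It assumes equi-width buckets with values spread uniformly inside each bucket, and it uses a simple frequency-weighted difference of estimated result sizes as a stand-in; it is not the workload-aware edit distance of the paper, which the abstract does not specify. All names, histograms, and the workload are hypothetical.

```python
def estimate_range_count(hist, lo, hi, width=10.0, origin=0.0):
    """Estimate how many items fall in [lo, hi) from an equi-width histogram.

    Assumption: values are uniformly distributed inside each bucket.
    """
    total = 0.0
    for i, count in enumerate(hist):
        b_lo = origin + i * width
        b_hi = b_lo + width
        overlap = min(hi, b_hi) - max(lo, b_lo)
        if overlap > 0:
            total += count * overlap / width
    return total


def workload_distance(hist_a, hist_b, workload):
    """Frequency-weighted mean absolute difference of estimated result sizes.

    `workload` is a list of (lo, hi, frequency) range queries.
    """
    total_freq = sum(freq for _, _, freq in workload)
    weighted = sum(
        freq * abs(estimate_range_count(hist_a, lo, hi)
                   - estimate_range_count(hist_b, lo, hi))
        for lo, hi, freq in workload
    )
    return weighted / total_freq


# Hypothetical nodes, each summarised by three buckets of width 10.
node_a = [10, 0, 0]
node_b = [0, 0, 10]
node_c = [8, 2, 0]
# The query over [0, 10) is nine times as popular as the one over [20, 30).
workload = [(0.0, 10.0, 9), (20.0, 30.0, 1)]

print(workload_distance(node_a, node_b, workload))
print(workload_distance(node_a, node_c, workload))
```

Under this stand-in, node_c ends up much closer to node_a than node_b does, so a clustering step driven by it would place a and c near each other for this workload.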
A Recall-Based Cluster Formation Game in Peer-to-Peer Systems
Cited by 1 (0 self)
In many large-scale content sharing applications, participants or peers are grouped together forming clusters based on their content or interests. In this paper, we deal with the maintenance of such clusters in the presence of updates. We model the evolution of the system as a strategic game, where peers determine their cluster membership based on a utility function of the query recall. Peers are guided either by selfish or altruistic motives: selfish peers aim at improving the recall of their own queries, whereas altruistic peers aim at improving the recall of the queries of other peers. We study the evolution of such clusters both theoretically and experimentally under a variety of conditions. We show that, in general, local decisions made independently by each peer enable the system to adapt to changes and maintain the overall recall of the query workload.
Semantic Query Routing and Distributed Top-k Query Processing in Peer-to-Peer Networks
- 2006
Cited by 1 (0 self)
Requirements for widely distributed information systems supporting virtual organizations have given rise to a new category of peer-to-peer (p2p) systems called schema-based. In such systems each peer is a database management system in itself, exposing its own schema. In such a setting, a main objective is the efficient search across peer databases by processing each incoming query without overly consuming bandwidth. In this report, we adopt a super-peer-based architecture and suggest a query routing mechanism, upon which we propose a query processing technique for top-k queries. Top-k queries in the context of p2p systems give the opportunity to filter the results and to eliminate network traffic by choosing the k highest ranked results. We introduce HT-p2p and HT-p2p+, two extended versions of the Hybrid Threshold Algorithm adapted to our p2p scenario. For the evaluation of these algorithms we implemented a prototype system upon the JXTA platform. Extensive experiments with different data sets and parameters have shown promising results about the performance of the query processing strategy. Acknowledgement: Special thanks to Evi Dagalaki for her contribution to the evaluation process.
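The traffic-saving effect of top-k filtering at a super-peer can be illustrated with a naive baseline. This sketch is not HT-p2p or the Hybrid Threshold Algorithm; it assumes each result lives on exactly one peer and already carries its final score, so each peer only needs to ship its local top k. Function names, peer ids, and scores are hypothetical.

```python
import heapq


def local_top_k(scored_items, k):
    """Return the k highest-scored (item, score) pairs held by one peer."""
    return heapq.nlargest(k, scored_items, key=lambda item: item[1])


def merge_top_k(peer_results, k):
    """Merge per-peer top-k lists at the super-peer into a global top-k."""
    candidates = [item for results in peer_results for item in results]
    return heapq.nlargest(k, candidates, key=lambda item: item[1])


# Hypothetical peers attached to one super-peer.
peers = {
    "p1": [("d1", 0.9), ("d2", 0.4), ("d3", 0.1)],
    "p2": [("d4", 0.8), ("d5", 0.7)],
}
k = 2

# Each peer sends at most k results instead of its full result list.
shipped = [local_top_k(items, k) for items in peers.values()]
print(merge_top_k(shipped, k))
```

Since no result outside a peer's local top k can appear in the global top k under the stated assumption, the merged answer is exact while at most k items per peer cross the network.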

