Epidemic-style Management of Semantic Overlays for Content-Based Searching
- In EuroPar, 2005
Cited by 40 (8 self)
A lot of recent research on content-based P2P searching for filesharing applications has focused on exploiting semantic relations between peers to facilitate searching. To the best of our knowledge, all methods proposed to date suggest reactive ways to seize peers' semantic relations. That is, they rely on the usage of the underlying search mechanism, and infer semantic relations based on the queries placed and the corresponding replies received. In this paper we follow a different approach, proposing a proactive method to build a semantic overlay. Our method is based on an epidemic protocol that clusters peers with similar content. It is worth noting that this peer clustering is done in a completely implicit way, that is, without requiring the user to specify his preferences or to characterize the content of files he shares.
Clustering in Peer-to-Peer File Sharing Workloads
- In 3rd International Workshop on Peer-to-Peer Systems (IPTPS), 2004
Cited by 22 (1 self)
Peer-to-peer file sharing systems now generate a significant portion of Internet traffic. A good understanding of their workloads is crucial in order to improve their scalability, robustness and performance. Previous measurement studies on Kazaa and Gnutella were based on monitoring peer requests, and mostly concerned with peer and file availability and network traffic. In this paper, we take different measurements: instead of passively recording requests, we actively probe peers to get their cache contents information. This provides us with a map of contents, that we use to evaluate the degree of clustering in the system, and that could be exploited to improve significantly the search process.
Exploiting semantic clustering in the edonkey p2p network
- In 11th ACM SIGOPS European Workshop (SIGOPS), 2004
Cited by 20 (3 self)
Peer-to-peer file sharing now represents a significant portion of the Internet traffic and has generated a lot of interest from the research community. Some recent measurements studies of peer-to-peer workloads have demonstrated the presence of semantic proximity between peers. One way to improve performance of peer-to-peer file sharing systems is to exploit this locality of interest in order to connect semantically related peers so as to improve the search both in flooding- and server-based systems. Creating these additional connections raises interesting challenges and in particular (i) how to capture the semantic relationship between peers, (ii) how to exploit these relationships and (iii) how to evaluate these improvements. In this paper, we evaluate several strategies to exploit the semantic proximity between peers against a real trace collected in November 2003 in the eDonkey 2000 peer-to-peer network. We present the results of this evaluation which confirm the presence of clustering in such networks and the interest to exploit it.
Discovering and exploiting keyword and attribute-value co-occurrences to improve p2p routing indices
- In CIKM, 2006
Cited by 14 (6 self)
Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional search results. These decisions are based on statistical summaries for each peer, which are usually organized on a per-keyword basis and managed in a distributed directory of routing indices. Such architectures disregard the
T-Man: Gossip-based fast overlay topology construction
- Computer Networks, 2009
Cited by 14 (3 self)
Large-scale overlay networks have become crucial ingredients of fully-decentralized applications and peer-to-peer systems. Depending on the task at hand, overlay networks are organized into different topologies, such as rings, trees, semantic and geographic proximity networks. We argue that the central role overlay networks play in decentralized application development requires a more systematic study and effort towards understanding the possibilities and limits of overlay network construction in its generality. Our contribution in this paper is a gossip protocol called T-MAN that can build a wide range of overlay networks from scratch, relying only on minimal assumptions. The protocol is fast, robust, and very simple. It is also highly configurable as the desired topology itself is a parameter in the form of a ranking method that orders nodes according to preference for a base node to select them as neighbors. The paper presents extensive empirical analysis of the protocol along with theoretical analysis of certain aspects of its behavior. We also describe a practical application of T-MAN for building Chord distributed hash table overlays efficiently from scratch. Key words: gossip-based protocols, overlay networks, bootstrapping, self-organizing middleware
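The idea of passing the target topology in as a ranking method can be illustrated with a small simulation. What follows is only a rough sketch under stated assumptions, not the protocol as published: the ring distance ranking, the fixed view size `c`, the choice of gossip partner among the `psi` best-ranked neighbors, and all function names are hypothetical choices made for illustration.

```python
import random

def ring_distance(a, b, n):
    """Distance between node ids a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def rank(base, candidates, n):
    """Ranking method (assumed): order candidates by ring distance to base."""
    return sorted(candidates, key=lambda x: (ring_distance(base, x, n), x))

def merge(base, view, received, n, c):
    """Keep the c best-ranked nodes among the old view and the received ids."""
    candidates = (set(view) | set(received)) - {base}
    return rank(base, candidates, n)[:c]

def random_views(n, c, rng):
    """Bootstrap: every node starts with c random neighbors, ranked."""
    return {p: rank(p, rng.sample([q for q in range(n) if q != p], c), n)
            for p in range(n)}

def run_cycle(views, n, c, rng, psi=2):
    """One gossip cycle: each node swaps views with a well-ranked neighbor."""
    for p in range(n):
        q = rng.choice(views[p][:psi])   # partner among the psi best neighbors
        to_q = views[p] + [p]            # what p sends (its view plus itself)
        to_p = views[q] + [q]            # what q answers with
        views[p] = merge(p, views[p], to_p, n, c)
        views[q] = merge(q, views[q], to_q, n, c)

def total_distance(views, n):
    """Sum of ring distances from every node to its current neighbors."""
    return sum(ring_distance(p, q, n) for p, v in views.items() for q in v)

if __name__ == "__main__":
    rng = random.Random(1)
    n, c = 64, 4
    views = random_views(n, c, rng)
    print("initial total distance:", total_distance(views, n))
    for _ in range(15):
        run_cycle(views, n, c, rng)
    print("final total distance:", total_distance(views, n))
```

Swapping `rank` for a different ordering (for example, one based on content similarity) would steer the same exchange loop toward a different topology, which is the configurability the abstract refers to. Because each merge keeps the best `c` entries of a superset of the old view, the total distance in this sketch can never increase from one cycle to the next.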
Emerging semantic communities in peer web search
- In P2PIR '06: Proceedings of the international workshop on Information retrieval in peer-to-peer networks, 2006
Cited by 11 (2 self)
Peer network systems are becoming an increasingly important development in Web search technology. Many studies show that peer search systems perform better when a query is sent to a group of peers semantically similar to the query. This suggests that semantic communities should form so that a query can quickly propagate to many appropriate peers. For the network to be functional, its dynamic communication topology must match the semantic clustering of peers. We introduce two criteria to evaluate a peer search network based on the concept of semantic locality: first, the “small-world” topology of the network; second, we use topical semantic similarity to monitor the quality of a peer’s neighbors over time by looking at whether a peer chooses semantically appropriate neighbors to route its queries. We present several simulation experiments conducted with different peer search algorithms on our peer Web search system, 6S. The results suggest that 6S, despite its use of an unstructured overlay network, can effectively foster the spontaneous formation of semantic communities through local peer interactions alone.
Two-level semantic caching scheme for super-peer networks
- In IEEE Tenth International Workshop on Web Content Caching and Distribution, Sophia Antipolis, 2005
Cited by 11 (4 self)
Some recent measurement studies of file-sharing peer-to-peer networks have demonstrated the presence of semantic proximity between peers and between shared files. This observation may be used for improving the performance of searching by introducing semantic caches. One type of such caches links peers that are interested in similar files. The query routing mechanism uses this information by forwarding queries first to peers which are semantically close. The second type of semantic caches groups similar content instead of similar nodes. In this paper we show how to combine both methods by introducing a two-level caching infrastructure based on super-peers. The super-peers in our system cache pointers to files recently requested by their client peers. The client peers, on the other hand, constantly look for the super-peers that are most suitable for them. We propose a simple, yet powerful cache management policy that guarantees high cache hit ratios also for the less popular files. Further, we discuss the design choices and optimizations of the presented model. Finally, we evaluate our system versus the symmetric network that uses only one level of semantic caches.
Clustering in p2p exchanges and consequences on performances
- In IPTPS, 2005
Cited by 10 (4 self)
We propose here an analysis of a rich dataset which gives an exhaustive and dynamic view of the exchanges processed in a running eDonkey system. We focus on correlation in terms of data exchanged by peers having provided or queried at least one data item in common. We introduce a method to capture these correlations (namely the data clustering), and study it in detail. We then use it to propose a very simple and efficient way to group data into clusters and show the impact of this underlying structure on search in typical P2P systems. Finally, we use these results to evaluate the relevance and limitations of a model proposed in a previous publication. We indicate some realistic values for the parameters of this model, and discuss some possible improvements.
Optimizing peer relationships in a super-peer network
- In 27th International Conference on Distributed Computing Systems (ICDCS 2007), 2007
Cited by 10 (2 self)
Super-peer architectures exploit the heterogeneity of nodes in a P2P network by assigning additional responsibilities to higher-capacity nodes. In the design of a super-peer network for file sharing, several issues have to be addressed: how client peers are related to super-peers, how super-peers locate files, how the load is balanced among the super-peers, and how the system deals with node failures. In this paper we introduce a self-organizing super-peer network architecture (SOSPNet) that solves these issues in a fully decentralized manner. SOSPNet maintains a super-peer network topology that reflects the semantic similarity of peers sharing content interests. Super-peers maintain semantic caches of pointers to files which are requested by peers with similar interests. Client peers, on the other hand, dynamically select super-peers offering the best search performance. We show how this simple approach can be employed not only to optimize searching, but also to solve generally difficult problems encountered in P2P architectures such as load balancing and fault tolerance. We evaluate SOSPNet using a model of the semantic structure derived from the 8-month traces of two large file-sharing communities. The obtained results indicate that SOSPNet achieves close-to-optimal file search performance, quickly adjusts to changes in the environment (node joins and leaves), survives even catastrophic node failures, and efficiently distributes the system load taking into account peer capacities.
pNear: combining content clustering and distributed hash tables
- In P2PKM, 2005
Cited by 7 (2 self)
Full-text search is a challenging problem in Peer-to-Peer (P2P) systems. Currently two promising directions to solve this problem are (1) distributed indexes like hash-tables (DHTs) and (2) semantic overlay networks (SONs) which can be divided into systems that cluster peers with similar content based on term overlap and systems that map both the content and queries on a shared semantic data structure. In this paper we present the pNear system that combines DHTs with clustering via term overlap and show that we are able to tackle some important disadvantages that hold for the individual approaches. We evaluate our approach via simulations based on a large and realistic data-set that we have constructed for this purpose, and which will be useful for similar experiments by others.

