PlanetP: Using Gossiping to Build Content Addressable Peer-to-Peer Information Sharing Communities (2002)

by F M Cuenca-Acuna, C Peery, R P Martin, T D Nguyen

Results 1 - 10 of 195

Network Applications of Bloom Filters: A Survey

by Andrei Broder, Michael Mitzenmacher - Internet Mathematics, 2002
Cited by 522 (17 self)
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in database applications since the 1970s, but only in recent years have they become popular in the networking literature. The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
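
As a rough illustration of the structure this abstract describes, the following Python sketch keeps an m-bit array and sets k positions per inserted item. The class name, the SHA-256 double-hashing scheme, and the parameter names m, k, and n are assumptions made for this example and are not taken from the survey; the false-positive estimate uses the standard approximation (1 - e^(-kn/m))^k.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter with m bits and k hash positions per item."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k indexes from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def expected_false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation after inserting n items into m bits with k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

An item that was added is always reported as present; an item that was never added is usually, but not always, reported as absent, which is the trade-off the abstract refers to.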

Trickle: A Self-Regulating Algorithm for Code Propagation and Maintenance in Wireless Sensor Networks

by Philip Levis, Neil Patel, David Culler, Scott Shenker - In Proceedings of the First USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI), 2004
Cited by 376 (9 self)
We present Trickle, an algorithm for propagating and maintaining code updates in wireless sensor networks. Borrowing techniques from the epidemic/gossip, scalable multicast, and wireless broadcast literature, Trickle uses a "polite gossip" policy, where motes periodically broadcast a code summary to local neighbors but stay quiet if they have recently heard a summary identical to theirs. When a mote hears an older summary than its own, it broadcasts an update. Instead of flooding a network with packets, the algorithm controls the send rate so each mote hears a small trickle of packets, just enough to stay up to date. We show that with this simple mechanism, Trickle can scale to thousand-fold changes in network density, propagate new code in the order of seconds, and impose a maintenance cost on the order of a few sends an hour.

Citation Context

... controlled, density-aware flooding algorithms for wireless and multicast networks [6, 16, 19]. The second is epidemic and gossiping algorithms for maintaining data consistency in distributed systems [2, 4, 5]. Prior work in network broadcasts has dealt with a different problem than the one Trickle tackles: delivering a piece of data to as many nodes as possible within a certain time period. Early work sho...
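
The "polite gossip" rule in the Trickle abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Mote` class, the `threshold` field, the action strings, and the handling of a newer summary are hypothetical choices for this example, and interval timing is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Mote:
    version: int          # version of the code this mote currently holds
    heard_same: int = 0   # identical summaries heard in the current interval
    threshold: int = 1    # hypothetical suppression threshold (assumption)

    def on_hear(self, other_version: int) -> str:
        """React to a neighbor's code summary."""
        if other_version == self.version:
            # A neighbor already advertised the same summary.
            self.heard_same += 1
            return "stay_quiet"
        if other_version < self.version:
            # The neighbor is out of date, so send it an update.
            return "send_update"
        # Assumption: when behind, advertise our summary so a neighbor notices.
        return "send_summary"

    def end_of_interval(self) -> bool:
        """Return True if the mote should broadcast its own summary now."""
        should_broadcast = self.heard_same < self.threshold
        self.heard_same = 0  # start counting afresh for the next interval
        return should_broadcast
```

A mote that heard its own summary during the interval suppresses its broadcast, which is what keeps the per-mote send rate low in dense networks.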

The Bloomier filter: An efficient data structure for static support lookup tables

by Bernard Chazelle, Joe Kilian, Ronitt Rubinfeld, Ayellet Tal - in Proc. Symposium on Discrete Algorithms, 2004
Cited by 97 (0 self)
“Oh boy, here is another David Nelson”

Citation Context

...ers are widely used in practice when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, an...

Open Problems in Data-Sharing Peer-to-Peer Systems

by Neil Daswani, Hector Garcia-Molina, Beverly Yang - In ICDT 2003, 2003
Cited by 77 (1 self)
In a Peer-To-Peer (P2P) system, autonomous computers pool their resources (e.g., files, storage, compute cycles) in order to inexpensively handle tasks that would normally require large costly servers. The scale of these systems, their "open nature", and the lack of centralized control pose difficult performance and security challenges. Much research has recently focused on tackling some of these challenges

Citation Context

...ch if k is much less than the total number of results (which is generally the case, for example, in web searches). Techniques for ranked search exists for distributed systems of moderate scale (e.g., [7]), but future research must extend these techniques to support much larger systems. - Aggregates: A user may sometimes be interested in knowing aggregate properties of the system or data collection as...

Hybrid Global-Local Indexing for Efficient Peer-To-Peer Information Retrieval

by Chunqiang Tang, Sandhya Dwarkadas, 2004
Cited by 67 (3 self)
Content-based full-text search still remains a particularly challenging problem in peer-to-peer (P2P) systems. Traditionally, there have been two index partitioning structures---partitioning based on the document space or partitioning based on keywords. The former requires search of every node in the system to answer a query whereas the latter transmits a large amount of data when processing multi-term queries. In this paper, we propose eSearch---a P2P keyword search system based on a novel hybrid indexing structure. In eSearch, each node is responsible for certain terms. Given a document, eSearch uses a modern information retrieval algorithm to select a small number of top (important) terms in the document and publishes the complete term list for the document to nodes responsible for those top terms. This selective replication of term lists allows a multi-term query to proceed local to the nodes responsible for query terms. We also propose automatic query expansion to alleviate the degradation of quality of search results due to the selective replication, overlay source multicast to reduce the cost of disseminating term lists, and techniques to balance term list distribution across nodes.

Improving Collection Selection with Overlap Awareness in P2P Search Engines

by Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer - In SIGIR, 2005
Cited by 66 (23 self)
Collection selection has been a research issue for years. Typically, in related work, precomputed statistics are employed in order to estimate the expected result quality of each collection, and subsequently the collections are ranked accordingly. Our thesis is that this simple approach is insufficient for several applications in which the collections typically overlap. This is the case, for example, for the collections built by autonomous peers crawling the web. We argue for the extension of existing quality measures using estimators of mutual overlap among collections and present experiments in which this combination outperforms CORI, a popular approach based on quality estimation. We outline our prototype implementation of a P2P web search engine, coined MINERVA, that allows handling large amounts of data in a distributed and self-organizing manner. We conduct experiments which show that taking overlap into account during collection selection can drastically decrease the number of collections that have to be contacted in order to reach a satisfactory level of recall, which is a great step toward the feasibility of distributed web search.

Citation Context

...e stored only where they originate from. In contrast, our approach leaves it to the peers to what extent they want to crawl interesting fractions of the Web and build their own local indexes. PlanetP [8] is a publish-subscribe service for P2P communities, supporting content ranking search. PlanetP distinguishes local indexes and a global index to describe all peers and their shared information. The g...

Progressive Distributed Top-k Retrieval in Peer-to-Peer Networks

by Wolf-Tilo Balke, Wolfgang Nejdl, Wolf Siberski, Uwe Thaden, 2005
Cited by 58 (10 self)
Query processing in traditional information management systems has moved from an exact match model to more flexible paradigms allowing cooperative retrieval by aggregating the database objects' degree of match for each different query predicate and returning the best matching objects only. In peer-to-peer systems such strategies are even more important, given the potentially large number of peers, which may contribute to the results. Yet current peer-to-peer research has barely started to investigate such approaches. In this paper we will discuss the benefits of best match/top-k queries in the context of distributed peer-to-peer information infrastructures and show how to extend the limited query processing in current peer-to-peer networks by allowing the distributed processing of top-k queries, while maintaining a minimum of data traffic. Relying on a super-peer backbone organized in the HyperCuP topology we will show how to use local indexes for optimizing the necessary query routing and how to process intermediate results in inner network nodes at the earliest possible point in time cutting down the necessary data traffic within the network. Our algorithm is based on dynamically collected query statistics only, no continuous index update processes are necessary, allowing it to scale easily to large numbers of peers, as well as dynamic additions/deletions of peers. We will show our approach to always deliver correct result sets and to be optimal in terms of necessary object accesses and data traffic. Finally, we present simulation results for both static and dynamic network environments.

Citation Context

...cal systems (with real-time constraints) like discussed in [4]. In the context of peer-to-peer networks, only very few authors have explored retrieval algorithms taking rankings into account. PlanetP [11] concentrates on peer-to-peer communities in unstructured peer-to-peer networks with sizes up to ten thousand peers. They introduce two data structures for searching and ranking, which create a replic...

NeuroGrid: Semantically Routing Queries in Peer-to-Peer Networks

by Sam Joseph - In Proc. Intl. Workshop on Peer-to-Peer Computing, 2002
Cited by 58 (1 self)
NeuroGrid is an adaptive decentralized search system. NeuroGrid nodes support distributed search through semantic routing (forwarding of queries based on content), and a learning mechanism that dynamically adjusts metadata describing the contents of nodes and the files that make up those contents. NeuroGrid is an open-source project, and prototype software has been made available at http://www.neurogrid.net/. NeuroGrid presents users with an alternative to hierarchical, folder-based file organization, and in the process offers an alternative approach to distributed search.

Citation Context

... meta-data keys that include the TFIDF 3 rankings of keywords in Freenet documents, as well as the Freenet document key. The TFIDF model is also used in the PlanetP architecture of Cuenca-Acuna et al. [4], which relies on “gossiping” between nodes in order for information about remote node contents to be updated. FASD employs a cosine correlation to determine document-query closeness and Freenet routi...

Theory and network applications of dynamic bloom filters

by Deke Guo, Honghui Chen, Xueshan Luo - In Proceedings of the 25th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), 2006
Cited by 38 (6 self)
A bloom filter is a simple, space-efficient, randomized data structure for concisely representing a static data set, in order to support approximate membership queries. It has great potential for distributed applications where systems need to share information about what resources they have. The space efficiency is achieved at the cost of a small probability of false positive in membership queries. However, for many applications the space savings and short locating time consistently outweigh this drawback. In this paper, we introduce dynamic bloom filters (DBF) to support concise representation and approximate membership queries of dynamic sets, and study the false positive probability and union algebra operations. We prove that DBF can control the false positive probability at a low level by adjusting the number of standard bloom filters used according to the actual size of current dynamic set. The space complexity is also acceptable if the actual size of dynamic set does not deviate too much from the predefined threshold. Furthermore, we present multidimension dynamic bloom filters (MDDBF) to support concise representation and approximate membership queries of dynamic sets in multiple attribute dimensions, and study the false positive probability and union algebra operations through mathematical analysis and experimentation. We also explore the optimization approach and three network applications of bloom filters, namely bloom joins, informed search, and global index implementation. Our simulation shows that informed search based on bloom filters can obtain higher recall and success rate of query than the blind search protocol.

Citation Context

...ions [2] and have received widespread attention in networking literature recently. Bloom filters can be used to summarize contents to aid global collaboration in peer-to-peer (P2P) networks [3], [4], [5], to support probabilistic algorithms for routing and locating resources [6], [7], [8], [9], and to share web cache information [10]. In fact, bloom filters are a better data structure and have great ...
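
The abstract above describes a dynamic bloom filter as adjusting the number of standard bloom filters to the current set size. A minimal Python sketch of that idea follows; it is not the authors' implementation. The class names, the `capacity` threshold parameter, and the SHA-256 double-hashing scheme are assumptions introduced for this example.

```python
import hashlib


class _StandardBloom:
    """Small fixed-size Bloom filter used as a building block (illustrative)."""

    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.bits = 0    # Python int used as a bit vector
        self.count = 0   # number of items inserted so far

    def _positions(self, item: str):
        # Assumption: k indexes derived from one SHA-256 digest via double hashing.
        d = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> p) & 1 for p in self._positions(item))


class DynamicBloomFilter:
    """Grows by appending a new standard filter once the active one is full."""

    def __init__(self, m: int, k: int, capacity: int) -> None:
        self.m, self.k, self.capacity = m, k, capacity
        self.filters = [_StandardBloom(m, k)]

    def add(self, item: str) -> None:
        # Open a fresh filter when the active one reaches the threshold.
        if self.filters[-1].count >= self.capacity:
            self.filters.append(_StandardBloom(self.m, self.k))
        self.filters[-1].add(item)

    def __contains__(self, item: str) -> bool:
        # An item is reported present if any constituent filter matches.
        return any(item in f for f in self.filters)
```

Because a query consults every constituent filter, the overall false-positive probability grows with the number of filters, which is why the per-filter threshold matters.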

Data discovery and dissemination with DIP

by Kaisen Lin, Philip Levis - in International Conference on Information Processing in Sensor Networks (IPSN ’08), 2008
Cited by 37 (3 self)
We present DIP, a data discovery and dissemination protocol for wireless networks. Prior approaches, such as Trickle or SPIN, have overheads that scale linearly with the number of data items. For T items, DIP can identify new items with O(log(T)) packets while maintaining an O(1) detection latency. To achieve this performance in a wide spectrum of network configurations, DIP uses a hybrid approach of randomized scanning and tree-based directed searches. By dynamically selecting which of the two algorithms to use, DIP outperforms both in terms of transmissions and speed. Simulation and testbed experiments show that DIP sends 20-60% fewer packets than existing protocols and can be 200% faster, while only requiring O(log(log(T))) additional state per data item.

Citation Context

... similar, but with an opposite purpose: while they find similarities in filters, DIP seeks to find differences. Bloom filters are commonly used in distributed and replicated IP systems (e.g., PlanetP [4]), but to our knowledge DIP represents their first use in wireless dissemination. The tradeoffs between deterministic and randomized algorithms appear in many domains. At one extreme of data reliabili...
