5 citations found.
Zhihong Lu. Scalable Distributed Architectures for Information Retrieval. PhD thesis, University of Massachusetts Amherst, 1999.

This paper is cited in the following contexts:
Optimized Query Execution in Large Search Engines with Global.. - Long, Suel (2003)   (3 citations)

....5.2. Note that [8] does not give much detail on how fancy hits are used and we do not know what types of pruning schemes are used in the current Google engine. However, to our knowledge several of the major engines still scan the entire inverted lists for most queries. With the exception of [27], we are not aware of any previous large scale study on query throughput in large engines under web query loads. 5 Pruning Techniques We now describe the different pruning techniques that we study in this paper. Recall that we are given a global ordering of the web pages and an associated global ....

Z. Lu. Scalable Distributed Architectures For Information Retrieval. PhD thesis, Univ. of Massachusetts, May 1999.


Partial Collection Replication versus Caching for Information.. - Lu, McKinley (2000)   (2 citations)  Self-citation (Lu)

....used to build it, and additional queries that are a good match. The combination of these and the new trace results presented here demonstrates that replicas can satisfy more queries than caches. We demonstrate that partial replicas can significantly outperform caches using a validated simulator [7, 18] which closely matches our working prototype system with replica selection. The prototype uses InQuery for the basic IR functionality [8]. We compare performance for searching a terabyte of data and find that partial replication begins to outperform caching when its hit rate increases by 3 to 6 . ....

....documents in a hierarchy of proxy servers placed between clients and Web servers. 2.2 Partial Collection Replication Searchable replicas speed up both query processing and document access. We use a replica selector to select a partial replica based on content and load, rather than exact match [19, 18]. In previous work, we showed that the inference network model is very effective at selecting a relevant replica [19]. We implemented the replica selection inference network as a pseudo InQuery database. Each pseudo document corresponds to a replica or text database, and its index stores the ....

[Article contains additional citation context not shown here]

Z. Lu. Scalable Distributed Architectures For Information Retrieval. PhD thesis, University of Massachusetts at Amherst, May 1999.


Searching a Terabyte of Text Using Partial Replication - Lu, McKinley (1999)   Self-citation (Lu)

....a selection function must determine whether replicas contain all, some, or none of the relevant documents for a query. We describe such a function elsewhere [24]. In this paper, using a hierarchy of replicas, we report on performance as a function of locality using a validated simulator [23]. The simulator closely matches our prototype system that uses InQuery for the basic IR functionality on all text databases: original and replicated [9]. We compare the performance of searching a terabyte of text using partial replication to partitioning, and find partial replication is more ....

....Replication (DP=40 ) a) using the simulator Figure 3: Performance validation of simulator with partial replication. Digital Unix V3.2D1 (Rev 41) and servers are connected by a 10 Mbps Ethernet. In previous work, we showed the simulator closely matches a multithreaded implementation of InQuery [8, 23]. In addition, we report on the validation of some of our simulation results below, comparing partitioning and replication with varying degrees of locality for a 16GB text database on a single server, and again our measured times closely match our simulator. Of course, simulation enables us to ....

[Article contains additional citation context not shown here]

Zhihong Lu. Scalable Distributed Architectures For Information Retrieval. PhD thesis, University of Massachusetts at Amherst, May 1999.


Partial Collection Replication For Information Retrieval - Lu, McKinley (2003)   (1 citation)  Self-citation (Lu)

....most relevant of the replicas or original collection, and thus maintains the highest retrieval effectiveness while searching the least amount of data as compared with the ranking functions for collection ranking. We also report on performance as a function of locality using a validated simulator [Lu, 1999]. The simulator closely matches our prototype system that uses InQuery for the basic IR functionality on all collections, original and replicated [Callan et al. 1995a]. We compare the performance of searching a terabyte of text using partial replication to partitioning, and find partial ....

....on DEC Alpha Server 2100 5/250 with 3 CPUs (clocked at 250 MHz) and 1024 MB main memory, running Digital Unix V3.2D1 (Rev 41). Servers are connected by a 10 Mbps Ethernet. In previous work, we showed the simulator closely matches a multithreaded implementation of InQuery [Cahoon et al. 1999, Lu, 1999]. In addition, we report on the validation of some of our simulation results below, comparing partitioning and replication with varying degrees of locality for a 16GB collection on a single server, and again our measured times closely match our simulator. Of course, simulation enables us to ....

[Article contains additional citation context not shown here]

Lu, Z. (1999). Scalable Distributed Architectures For Information Retrieval. PhD thesis, University of Massachusetts at Amherst.


Design of a Parallel and Distributed Web Search Engine - Orlando, Perego, Silvestri (2001)

No context found.

Zhihong Lu. Scalable Distributed Architectures for Information Retrieval. PhD thesis, University of Massachusetts Amherst, 1999.

CiteSeer.IST - Copyright Penn State and NEC