17 citations found.
B. Cahoon, K. S. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems, 18(1): 1--43, January 2000.


This paper is cited in the following contexts:
ODISSEA: A Peer-to-Peer Architecture for Scalable.. - Suel, Mathur, Wu, .. (2003)   (13 citations)  (Correct)

....our approach does in fact not send the entire list, as explained later. The two index organizations are illustrated in Figure 2. There have been a number of performance comparisons between local and global index organizations and several hybrid organizations on parallel architectures; see, e.g., [4, 12, 48], but these studies do not directly apply to widely distributed environments. [Figure 2: Query processing in a local (left) and global index organization.] The main issue with local index organizations is that all or most nodes need to be contacted for most ....

.... execution in IR and search engines, we refer to [3, 5, 51] and for issues in parallel search engine architecture we refer to [7, 8, 28, 41]. Discussions and comparisons of local and global index partitioning schemes and the resulting query performance on parallel architectures are given, e.g., in [4, 12, 25, 31, 32, 48]. There has been a lot of recent interest in the pruning techniques of Fagin et al. [17, 19]; see also [18] for a survey and [13] for early related ideas. Most of the interest has been focused on multimedia and meta search scenarios, and we are not aware of previous applications in a peer-to-peer ....

B. Cahoon, K. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. IEEE Transactions on Information Systems, 18(1):1--43, January 2000.


Optimized Query Execution in Large Search Engines with Global.. - Long, Suel (2003)   (3 citations)  (Correct)

.... For background on indexing and query execution in IR and search engines, we refer to [3, 5, 40] and for basics of parallel search engine architecture we refer to [7, 8, 26, 34]. Discussions and comparisons of local and global index partitioning schemes and their performance are given, e.g., in [4, 12, 23, 28, 37]. A large amount of recent work has focused on link based ranking and analysis schemes; see [6, 22, 24, 25, 31, 33] for a small sample. Previous work on pruning techniques for top can be divided into two fairly disjoint sets of literature. In the IR community, researchers have studied pruning ....

B. Cahoon, K. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. IEEE Transactions on Information Systems, 18(1):1--43, January 2000.


ODISSEA: A Peer-to-Peer Architecture for Scalable.. - Suel, Mathur, Wu, .. (2003)   (13 citations)  (Correct)

....list to the node holding the list for table. We emphasize here that our approach does in fact not send the entire list, as explained later. There have been a number of performance comparisons between local and global index organizations and several hybrid organizations on parallel architectures [3, 9, 34], but these do not directly apply to widely distributed environments. The main issue with local index organizations is that all or most nodes need to be contacted for most queries, and thus such schemes are unlikely to scale beyond a few hundred nodes. There have been attempts to overcome this ....

B. Cahoon, K. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. IEEE Transactions on Information Systems, 18(1):1--43, January 2000.


Searching a Terabyte of Text Using Partial Replication - Zhihong Lu Kathryn   Self-citation (Mckinley Lu)   (Correct)

....using InQuery running on DEC Alpha Server 2100 5/250 with 3 CPUs (clocked at 250 MHz) and 1024 MB main memory, running Digital Unix V3.2D 1 (Rev 41) and servers are connected by a 10 Mbps Ethernet. In previous work, we showed the simulator closely matches a multithreaded implementation of InQuery [8, 23]. In addition, we validated some of our simulation results below comparing partitioning and replication with varying degrees of locality for a 16GB text database on a single server, and again our measured times closely match our simulator [23]. Of course, simulation enables us to explore in a ....

B. Cahoon, K. S. McKinley, and Zhihong Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems (submitted), 1998.


Partial Collection Replication versus Caching for Information.. - Lu, McKinley (2000)   (2 citations)  Self-citation (Mckinley Lu)   (Correct)

....used to build it, and additional queries that are a good match. The combination of these and the new trace results presented here demonstrates that replicas can satisfy more queries than caches. We demonstrate that partial replicas can significantly outperform caches using a validated simulator [7, 18] which closely matches our working prototype system with replica selection. The prototype uses InQuery for the basic IR functionality [8]. We compare performance for searching a terabyte of data and find that partial replication begins to outperform caching when its hit rate increases by 3 to 6 . ....

.... have investigated the performance of distributed IR systems [5, 6, 9, 12, 20, 25]. Most of the previous work experiments with a text database less than 1 GB and focuses on speedup when a text database is distributed over more servers [5, 12, 17, 20]. Only Couvreur et al. [9] and Cahoon et al. [6, 7] use simulation to experiment with more than 100 GB of data. None of these previous studies include partial replication or caching. InQuery, our base system, is not the fastest text retrieval system available today [13]. We model and validate against a 3-processor 250 MHz Alpha which can maintain ....

B. Cahoon, K. S. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems (accepted), 1999.


The Effect Of Collection Organization And Query Locality On.. - Zhihong Lu Att   Self-citation (Mckinley Lu)   (Correct)

....In this section, we discuss architectures for parallel and distributed IR systems. Our research combines and extends previous work in distributed IR (Burkowski, 1990; Harman et al., 1991; Couvreur et al., 1994; Burkowski et al., 1995; Cahoon and McKinley, 1996; Hawking, 1997; Hawking et al., 1998; Cahoon et al., 1999) since we model and analyze a complete system architecture with replicas, replica selection, and collection selection under a variety of workloads and conditions. We base our distributed system on INQUERY (Callan et al., 1992; Turtle, 1991), a proven, effective retrieval engine. We also model ....

....containing a 10.2 GB collection. The experiment uses four shorter queries of 4 to 16 terms. This work focuses on the speedup of a single query, while our work evaluates the performance for a loaded system under a variety of workloads and collection configurations. 6 Cahoon and McKinley, 1996, and Cahoon et al., 1999, report a simulation study on a distributed information retrieval system based on INQUERY. They assume the collections are uniformly distributed, and experiment with collections up to 128 GB using a variety of workloads. They measure performance as a function of system parameters such as client ....

Cahoon, B., McKinley, K. S., and Lu, Z. (1999). Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems. To appear.


Scalable Distributed Architectures for Information Retrieval - Lu (1999)   (2 citations)  Self-citation (Mckinley Lu)   (Correct)

....to calculate the precision recall table. 2.2 Performance of Distributed Information Retrieval. Distributed IR systems enable users to simultaneously access multiple text collections distributed over the network. A number of studies have investigated the performance of distributed IR systems [11, 12, 13, 20, 41, 53, 56, 58, 59, 70, 71]. Most of the previous work experiments with collections less than 1 GB and focuses on speedup of query processing for an unloaded system when a collection is distributed over several servers [11, 41, 53, 56, 58, 59]. Only Couvreur et al. [20] and Cahoon and McKinley [12, 13] use simulation to ....

.... 41, 53, 56, 58, 59, 70, 71]. Most of the previous work experiments with collections less than 1 GB and focuses on speedup of query processing for an unloaded system when a collection is distributed over several servers [11, 41, 53, 56, 58, 59]. Only Couvreur et al. [20] and Cahoon and McKinley [12, 13] use simulation to experiment with more than 100 GB of data. Macleod et al., Burkowski, and Martin et al. built their distributed IR systems on a network of very slow servers [11, 41, 56, 58, 59]. The machines they used are IBM PC XT, IBM PC AT, and Apple Macintosh II, and the link speed is 9600 ....

[Article contains additional citation context not shown here]

Cahoon, B., McKinley, K. S., and Lu, Z. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems (submitted) (1999).


Searching a Terabyte of Text Using Partial Replication - Lu, McKinley (1999)   Self-citation (Mckinley Lu)   (Correct)

....[Figure 3: Performance validation of simulator with partial replication.] Digital Unix V3.2D 1 (Rev 41) and servers are connected by a 10 Mbps Ethernet. In previous work, we showed the simulator closely matches a multithreaded implementation of InQuery [8, 23]. In addition, we report on the validation of some of our simulation results below, comparing partitioning and replication with varying degrees of locality for a 16GB text database on a single server, and again our measured times closely match our simulator. Of course, simulation enables us to ....

B. Cahoon, K. S. McKinley, and Zhihong Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems (submitted), 1998.


Partial Collection Replication For Information Retrieval - Lu, McKinley (2003)   (1 citation)  Self-citation (Mckinley Lu)   (Correct)

....using InQuery running on DEC Alpha Server 2100 5/250 with 3 CPUs (clocked at 250 MHz) and 1024 MB main memory, running Digital Unix V3.2D1 (Rev 41). Servers are connected by a 10 Mbps Ethernet. In previous work, we showed the simulator closely matches a multithreaded implementation of InQuery [Cahoon et al. 1999, Lu, 1999]. In addition, we report on the validation of some of our simulation results below, comparing partitioning and replication with varying degrees of locality for a 16GB collection on a single server, and again our measured times closely match our simulator. Of course, simulation enables ....

Cahoon, B., McKinley, K. S., and Lu, Z. (1999). Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems (accepted).


A Performance Evaluation of Parallel Information Retrieval .. - Lu, McKinley, Cahoon   Self-citation (Cahoon Mckinley)   (Correct)

....response includes the document titles and the first few sentences of the documents. A document command requests a document using its document identifier. The response includes the complete text of the document. 2.1.2 Simulation Model. We use a simulation model we previously built for InQuery work [3, 4]. Because we use a more recent version of InQuery on a DEC AlphaServer 2100 5/250 clocked at 250 MHz instead of a MIPS R3000 clocked at 40 MHz, we validated query response time of our simulator again. We model query response time as a function of query length and term frequency. Our validation ....

....exploits parallelism as follows: (1) It executes multiple IR commands in parallel, by either multitasking or multithreading; or (2) It executes one command against multiple partitions of a collection in parallel. 2.2.1 Multithreading vs. Multitasking. Since we already have distributed servers [3, 4] where each server is a single-thread process built on a uniprocessor machine, we implemented a parallel server via multitasking. This version simply executes a lightweight broker and multiple executables of the single-thread server on the same machine, communicating by message passing [3, 4] ....

[Article contains additional citation context not shown here]

B. Cahoon and K. S. McKinley. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. Submitted for publication, 1997.


The Hardware/Software Balancing Act for Information.. - Lu, McKinley, Cahoon   Self-citation (Cahoon Mckinley)   (Correct)

....the document titles and the first few sentences of the documents. A document command requests a document using its document identifier. The response includes the complete text of the document. 3.1.2 Simulation Model and Validation. We use a simulation model we previously built for InQuery work [5, 6]. The simulation model is driven by empirical timing measurements from the actual system. We model three basic IR operations: query evaluation, obtaining summary information, and retrieving documents. We measure CPU, I/O bus, and disk usage for each operation, but do not measure the memory and ....

....collection is 2.3 KB, which is very close to the average Web page size (around 2 KB according to the figures published by AltaVista [2]). The simulator only accepts natural language queries. Two parameters, query length and query term frequency, determine the characteristics of a query. See [5, 6] for more details. Because we use a more recent version of InQuery on a DEC AlphaServer 2100 5/250 clocked at 250 MHz instead of a MIPS R3000 clocked at 40 MHz, we validate query response time of our simulator again. We model query response time as a function of query length and term frequency. ....

[Article contains additional citation context not shown here]

B. Cahoon and K. S. McKinley. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. Submitted for publication, 1997.


In Search of Reliable Retrieval Experiments - William Webber And (2005)   (Correct)

No context found.

B. Cahoon, K. S. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems, 18(1): 1--43, January 2000.


Efficient Query Evaluation on Large Textual Collections in a.. - Zhang, Suel (2005)   (Correct)

No context found.

B. Cahoon, K. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. IEEE Transactions on Information Systems, 18(1):1--43, Jan. 2000.


Design of a Parallel and Distributed Web Search Engine - Orlando, Perego, Silvestri (2001)   (Correct)

No context found.

B. Cahoon, K.S. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workload. IEEE Transactions on Information Systems, 1999.


ODISSEA: A Peer-to-Peer Architecture for Scalable.. - Suel, Mathur, Wu, .. (2003)   (13 citations)  (Correct)

No context found.

B. Cahoon, K. McKinley, and Z. Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. IEEE Transactions on Information Systems, 18(1):1--43, January 2000.


Stochastic Modeling of Intrusion-Tolerant Server.. - Gupta, Lam..   (Correct)

No context found.

B. Cahoon, K. S. McKinley, and Z. Lu, "Evaluating the Performance of Distributed Architectures for Information Retrieval using a Variety of Workloads," IEEE Trans. on Information Sys., Vol. 18, No. 1, pp. 1--43, 1997.


A Dispatcher-Driven Processing Architecture For - Image Similarity Retrieval (2002)   (Correct)

No context found.

B. Cahoon, K.S. McKinley and Z. Lu, Evaluating the performance of distributed architectures for information retrieval using a variety of workloads, ACM Trans. on


CiteSeer.IST - Copyright Penn State and NEC