Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222, 1997.
....left out due to space limitation. Experimental results will be presented in Section 6. We conclude the paper in Section 7. 2. RELATED WORK In the last several years, a large number of research papers on issues related to metasearch engines or distributed collections have been published (e.g. [1, 5, 8, 9, 10, 17, 18, 19, 26, 27, 31, 33, 30, 34]). Due to space limitation, we compare our approach only with the existing works which are closest to what is presented here. A classification of different approaches can be found in [22]. In [31] it is shown that if databases are ranked in descending order of similarity of the most similar ....
D. Dreilinger and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3):195-222, July 1997.
....for query processing, query planning and query optimization. In IWIZ, the mediator is capable of conventional query evaluation and can learn how to fuse, cleanse, and reconcile information. Meta Search Engines A Meta Search Engine (MSE) is a tool that accesses multiple (local) search engines [22, 84, 85]. It retrieves documents and ranks them according to their relevancy to the search phrase provided by the user. This meta search engine is built on top of multiple local search engines (e.g. AltaVista, Yahoo, Infoseek). The IWIZ, on the other hand, is an information integration system that ....
D. Dreilinger and A. E. Howe, "Experiences with Selecting Search Engines Using Metasearch," ACM Transactions on Information Systems, vol. 15, pp. 195-222, 1997.
....is basically a rank-based, simulated score. In effect, instead of using relevance scores alone, or ranks alone, they use a combination of the two. Finally (phase 3) they assign a final score to a document d as the sum of the scores given d by the input systems. The SavvySearch metasearch engine [34, 20] (now Search.com) combines results similarly, summing the normalized scores for each document. The ProFusion metasearch engine [28, 29], in advance of receiving any queries, evaluates the performance of each search engine based on human made ....
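The result combination described in the excerpt above (summing normalized scores per document) can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not say how scores are normalized, so min-max scaling is assumed here, and the function names `normalize` and `fuse` are hypothetical rather than taken from SavvySearch.

```python
from collections import defaultdict


def normalize(scores):
    """Min-max scale one engine's scores into [0, 1] (assumed scheme)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(results_per_engine):
    """Sum each document's normalized scores across all engines."""
    totals = defaultdict(float)
    for scores in results_per_engine:
        for doc, s in normalize(scores).items():
            totals[doc] += s
    # Highest combined score first; ties broken by document id.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


engine_a = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
engine_b = {"d2": 2.0, "d4": 1.0}
ranking = fuse([engine_a, engine_b])
```

A document returned by several engines accumulates score from each of them, so in this toy input `d2` outranks `d1` even though `d1` is the top hit of one engine.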
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195-222, 1997.
....1. INTRODUCTION A significant amount of valuable information on the web is stored in databases, some of which is hidden behind search interfaces and not crawlable by traditional search engines. An attractive way to allow easy interaction with these databases is through metasearchers (e.g. [20, 7]) which provide users with a single interface to query multiple databases simultaneously. A metasearcher performs three main tasks: after receiving a query, it determines the best databases to evaluate the query (database selection); it translates the query in a suitable form for each database ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222, 1997.
....other gene databases. In order to perfect this project, we will improve the current search methods by using various approximate sequence match algorithms [11] adapted to the citrus features. Also, in case a good match cannot be found in CitrusDB, the system will apply a metasearch technology [3] to search other gene databases. A metasearch searches several databases simultaneously and presents results in some sort of integrated format. On the other hand, the proposed system and other useful resources will be put on the Web for public access, and the authors will ask other gene ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195-222, July 1997.
....that support easier mediation [2, 6]. A number of popular metasearch systems of web based information retrieval engines are in widespread use. Popular metasearch engines include www.inquirus.com, www.mamma.com, www.dogpile.com, www.savvysearch.com, and www.metacrawler.com. Result combination [10, 20, 24, 23] and caching strategies [24, 23] are essential to data integration in metasearch engines. In addition, specialized indexes that develop summaries of source collections may be used to effectively choose which source system is most likely to contain the results [15, 16]. Work has also been ....
Dreilinger, D. and A. E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems. 15(3):195-222, 1997.
....(FAQ) identification. Another Web document clustering algorithm is suggested in [12] 5. METASEARCHES None of the current search engines is able to cover the Web comprehensively. Using an individual search engine may miss some critical information provided by other engines. Metasearch engines [15, 27, 47] conduct a search using several other search engines simultaneously, and present the results in some sort of integrated format. This lets users see at a glance which particular search engine returned the best results for a query without having to search each one individually. They typically do not ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195-222, July 1997.
....and efficient metasearch engine has been accumulated in recent years. One of the main challenging problems is the database selection problem, which is to identify, for a given user query, the local search engines that are likely to contain useful documents (Baumgarten, 1997; Callan et al., 1995; Dreilinger and Howe, 1997; Gravano and Garcia Molina, 1995; Kahle and Medlar, 1991; Koster, 1994; Liu et al., 2001; Manber and Bigot, 1997; Meng et al., 1998; Selberg and Etzioni, 1997; Yu et al., 1999; Yuwono and Lee, 1997). The objective of performing database selection is to improve efficiency as it enables the metasearch ....
Dreilinger D, Howe A (1997) Experiences with selecting search engines using metasearch. ACM TOIS, July 1997, 15(3): 195-222.
....technique based on this framework. Experimental results will be presented in Section 5. We conclude the paper in Section 6. 2 Related Work In the last several years, a large number of research papers on issues related to metasearch engines or distributed collections have been published (e.g. [1, 7, 10, 12, 13, 25, 27, 28, 29, 34, 35, 36, 37, 40, 43, 44]). For database selection, most approaches rank the databases for a given query based on certain usefulness measures. For example, gGlOSS uses the sum of document similarities that are higher than a threshold [12]. CORI Net uses the probability that a database contains relevant documents due to ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....synonyms also exist in the metadata manager to assist with this process. If structured data are found, a SQL query is built from the initial natural language query. Otherwise, a metasearch of unstructured sources is used. Hence, our mediator is no worse than a typical metasearch engine [Bart94, Drei97, Glov99, Glov00]. A typical metasearch engine simply submits the query to a group of unstructured sources; we do this and consult structured sources as well. Following the selection of sources for query processing, the query is sent to the corresponding query modules and results are returned to the mediator ....
Dreilinger, D. and A. E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems. 15(3):195-222, 1997.
.... description or keywords , are only used on the homepages of 34 of sites. Only 0,3 of the sites use the Dublin Core meta data standard [Dublin Core, 1999] Facing these problems, we find various types of search engines currently indexing the World Wide Web. There are also metasearch engines [Dreilinger and Howe 1997]. These place queries in multiple search engines and join the results in a common interface. Appendix A discusses in more detail the behavior and coverage of search tools. III.2. IR IN DIGITAL PUBLISHING ENVIRONMENTS As information publishing gained widespread use in the Internet, many ....
....with several search engines can become a difficult task even for experienced users. Metasearch engines are designed to deal with these problems. They provide an interface to automatically access multiple conventional search engines, adding an additional level of abstraction to Web searching. [Dreilinger and Howe 1997] present the design for a metasearch system and the results of its usage evaluation. Their metasearch system eventually became the SavvySearch service [SavvySearch 1999], a powerful widely used search engine that selects other relevant search engines based on user query terms and submits the user ....
DREILINGER, D. AND HOWE, A. 1997. Experiences with Selecting Search Engines Using Metasearch. In ACM Transactions on Information Systems, 15(3), pages 195-222.
....query results. Various document clustering algorithms have been proposed in [3, 8, 31] 5. METASEARCHES None of the current search engines is able to cover the Web comprehensively. Using an individual search engine may miss some critical information provided by other engines. Metasearch engines [9, 14, 19, 29] search several other search engines simultaneously, and present results in some sort of integrated format. This lets users see at a glance which particular search Figure 2: System structure of a metasearch engine. Table 1: Major commercial search engines. SE: Search Engine, and AS: Answering ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195-222, July 1997.
....be presented in Section 5. We briefly describe our prototype system in Section 6. Finally, we conclude the paper in Section 7. 2. RELATED WORK In the last several years, a large number of research papers on issues related to metasearch engines or distributed collections have been published (e.g. [1, 4, 6, 8, 9, 17, 19, 20, 21, 26, 27, 28, 32, 37]). For database selection, most approaches rank the databases for a given query based on certain usefulness measures. For example, gGlOSS uses the sum of document similarities that are higher than a threshold [8]. CORI Net uses the probability that a database contains relevant documents due to ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....(HTML) viewer would have to be built, but that was beyond the scope of the feasibility study. 6.3 SEARCH ENGINES The original intention was to use a search engine which is capable of not only performing full text queries but also to index documents according to certain attributes marked by tags [21]. Several search engines had been tested (AltaVista, ZyIndex, MS Index Server, Excite, etc.) but all faced the restriction of not being able to define the tags they should look for. Most of them supported their own document syntax but none of them was configurable in a way to support a variety of ....
D. Dreilinger and A. E. Howe, Experiences with Selecting Search Engines Using Metasearch, ACM Transactions on Information Systems, 15(3), 1997, 195-222.
....10000 Web search servers, performing automatic selection over 956 servers as examined later in this paper does not seem unreasonable. Brokers which concurrently query multiple servers and merge their results already exist on the Web. Examples include Inquirus [15], MetaCrawler [18], SavvySearch [6] and ProFusion [7]. For this reason, this paper concentrates on the problem of server selection in such an environment, assuming retrieval and merging are carried out using methods already implemented in Inquirus. This paper has three goals. The first is to evaluate the CORI [1], vGlOSS [11] and ....
....S i # M j#1 cwt i# j (3) This formulation is used in evaluation experiments below. The same value of l was also used in [9] and [8] although under a different evaluation framework. Server ranking methods based on user input were excluded from this study. The Web search broker SavvySearch [6] performs server selection. For a query containing term t, a server's future selection score for t is boosted if the user visits a page returned by that server, and is reduced if the server returns no results. The Web search broker ProFusion also 1. Probe queries over 956 servers 2. Test queries ....
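The feedback rule summarized in the excerpt above (a server's score for a term goes up when the user visits one of its results and down when it returns nothing) can be sketched as below. This is a minimal illustration, not SavvySearch's actual implementation: the class name `TermServerScores`, the fixed `boost` and `penalty` amounts, and ranking by the plain sum over query terms are all assumptions made for the example.

```python
from collections import defaultdict


class TermServerScores:
    """Per-(term, server) selection scores learned from usage feedback."""

    def __init__(self, boost=1.0, penalty=1.0):
        self.boost = boost      # assumed increment for a visited result
        self.penalty = penalty  # assumed decrement for an empty result set
        self.scores = defaultdict(float)  # (term, server) -> score

    def record_visit(self, query_terms, server):
        """The user followed a link returned by `server` for this query."""
        for term in query_terms:
            self.scores[(term, server)] += self.boost

    def record_no_results(self, query_terms, server):
        """`server` returned nothing for this query."""
        for term in query_terms:
            self.scores[(term, server)] -= self.penalty

    def rank(self, query_terms, servers):
        """Order servers by summed score over the query terms (assumed rule)."""
        def total(server):
            return sum(self.scores.get((t, server), 0.0) for t in query_terms)
        return sorted(servers, key=lambda s: (-total(s), s))


model = TermServerScores()
model.record_visit(["jazz"], "A")
model.record_no_results(["jazz"], "B")
order = model.rank(["jazz"], ["B", "C", "A"])
```

Here server `A` has positive evidence for "jazz", `C` has none, and `B` has negative evidence, so the ranking places them in that order.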
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222, July 1997.
....term weight of any query term. But the representatives may not contain certain information desired by a particular database selector. For non cooperative search engines that do not follow any standard, their representatives may be extracted from past retrieval experiences (e.g. SavvySearch [14]) or from sampled documents (e.g. [15]). But sampling may cause inaccuracies. There are two major challenges in developing good database selection algorithms. One is to identify appropriate database representatives. A good representative should permit fast and accurate estimation of database ....
....it may be difficult to identify appropriate training queries and the learned knowledge may become less accurate when the contents of the component databases change. SavvySearch approach SavvySearch (www.search.com) is a metasearch engine employing the dynamic learning approach. In SavvySearch [14], the ranking score of a component search engine with respect to a query is computed based on the past retrieval experience of using the terms in the query. More specifically, for each search engine, a weight vector (w_1, ..., w_m) is maintained by the database selector, where each w_i ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....technique based on this framework. Experimental results will be presented in Section 5. We conclude the paper in Section 6. 2 Related Work In the last several years, a large number of research papers on issues related to metasearch engines or distributed collections have been published (e.g. [1, 4, 6, 8, 9, 17, 19, 20, 21, 26, 27, 28, 31, 36]). For database selection, most approaches rank the databases for a given query based on certain usefulness measures. For example, gGlOSS uses the sum of document similarities that are higher than a threshold [8]. CORI Net uses the probability that a database contains relevant documents due to the ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....research has been done in this area. Several methods of selecting search engines based on user queries have been proposed, for example GlOSS [33, 34] maintains word statistics on available database, in order to estimate which databases are most useful for a given query. Related research includes [19, 24, 26, 32, 46, 49, 61, 62]. It would be of great benefit if the major web search engines attempted to direct users to the best specialized search engine where appropriate, however many of the search engines have incentives not to provide such a service. For example, they may prefer to maximize use of other services that ....
D. Dreilinger and A. Howe. Experiences with selecting search engines using meta-search. ACM Transactions on Information Systems, 15(3):195--222, 1997.
....graphics, audio and video analysis techniques, have been used to locate documents in a large set and interpret the contents of such documents. Approximate reasoning techniques for making the most relevant information available for the information seeker are frequently involved. Web search engines [14] serve as basic instances of such tools. The many IR publications, such as [18], and IR systems present more advanced approaches, frequently employing context information (knowledge bases, thesauri or alternatively grammatical principles of natural language) as well as statistical exploration to ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222 (1997).
....for federating the process of content based retrieval in the case of multiple, large distributed repositories. We propose a framework for developing a VIR meta search engine that is stimulated by the appearance of meta search engines which federate the process for Web document searching [10]. We will describe the system in more detail in a later section. WebSEEk A Case Study of Internet VIR The World Wide Web includes a rich collection of visual information, which is also integrated with a vast variety of non visual information. Although there are many popular search engines for ....
....this may be partly affected by the limitation of the content based search functions that are implemented in the current system. Meta Search Engines for Images The proliferation of text search engines on the Web has motivated the recent research in integrated search or meta search engines [10]. Meta search engines serve as common gateways that link users to multiple cooperative or competitive search engines. They accept query requests from users, sometimes, along with user specified query plans to select target search engines. The meta search engines may also keep track of the past ....
Daniel Dreilinger and Adele E. Howe, "Experiences with Selecting Search Engines Using Meta-Search," to appear in ACM Transactions on Information Systems, 1997.
....should have the retrieval effectiveness close to that as if all documents were in a single database while minimizing the access cost. A substantial body of research work addressing different aspects of building an effective and efficient metasearch engine has been accumulated in recent years [2, 4, 7, 12, 13, 15, 18, 23, 24, 25, 26, 32, 34, 37, 42]. However, most existing work considers only small scale metasearch engines that have no more than a few hundred local search engines. None of these approaches can scale to tens of thousands or more local search engines and at the same time achieve good effectiveness. The reason is as ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....method in gGlOSS in a distributed database environment using documents and queries that are used to evaluate gGlOSS. Learning based Approaches make use of previous queries and their retrieval results with respect to a database to determine the usefulness of the database. SavvySearch [8] and ProFusion [9] are examples of using learning based database selection approaches. The theoretical approaches taken by [2, 12] are very different from ours. No experimental results are reported in [12]. Recent experimental results reported in [3] show that if the number of documents retrieved ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997.
....may then view documents by downloading them from the appropriate document servers. In an environment where a large number of search servers are available, it is possible to employ a special client known as a metasearcher [Lawrence and Giles, 1998, Gauch et al. 1996, Selberg and Etzioni, 1995, Dreilinger and Howe, 1997, Smeaton and Crimmins, 1996]. Metasearchers merge results from multiple search servers into a single ranked list using some results merging strategy. In addition, some form of query translation technology is necessary, to interact with different search servers, and some server selection method ....
Dreilinger, Daniel and Howe, Adele E. (1997). Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222.
....services strongly depend on the service type. Nevertheless, we can attempt to identify them for essential functions such as search, transfer or presentation. Quality of a search service is determined by the quality of the results of the search and the quality of query formulation framework [Dreilinger, 1997]. We can then evaluate and quantify search strategies according to the following dimensions: query formulation, language, ranking algorithms and duplicates removing. The quality of transfer can be evaluated according to reliability, security and performance categories previously presented. ....
Dreilinger, D., & Howe, A. (1997). Experiences with Selecting Search Engines Using Meta-Search. ACM Transactions on Information Systems, 15 No 3 (July 1997), 195-222.
....different aspects of building an effective and efficient metasearch engine has been accumulated in recent years. Among the main challenges, the database selection problem is to identify, for a given user query, the local search engines that are likely to contain useful documents for the query [1, 6, 10, 13, 16, 18, 22, 25, 26, 30, 36, 38]. The objective of performing database selection is to improve efficiency as the metasearch engine can send each query to only potentially useful search engines, cutting down network traffic and the cost of searching useless databases. The document selection problem is to determine what documents ....
.... as retrieving these documents may have several negative effects (higher local cost, higher communication cost for shipping these documents and higher cost to merge them) The result merging problem is to combine the documents returned from multiple search engines into a single ranked list [6, 10, 30, 38]. A good metasearch engine should have the retrieval effectiveness close to that as if all documents were in a single database while minimizing the access cost. Search engines in the Internet are usually designed and implemented independently. As a consequence, substantial heterogeneities exist ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.
....is to find the answer most relevant to a specific information seeking context, and the use of specialized resources helps assure the relevance of the result to that context. Surprisingly little work has been done on source selection. The most notable example, a previous version of SavvySearch (Dreilinger Howe 1997) kept track of how well search engines handled past queries, and used vector space retrieval to match the current query to a search engine that has previously done well with similar queries. ProFusion (Gauch Wang 1996) used a handbuilt knowledge hierarchy to categorize queries and select ....
Dreilinger, D., and Howe, A. 1997. Experiences with selecting search engines using meta-search. ACM Transactions on Information Systems 15(3).
....will be shown to be much less accurate. In addition, the estimation methods employed in [6, 7] are based on two very restrictive assumptions. 5. Learning based Approaches make use of past retrieval experiences with respect to a database to determine the usefulness of the database. SavvySearch [4] is an example of a learning based approach. SavvySearch only ranks local search engines and does not estimate the number of potentially useful documents. The method in [2] proposes to select databases based on the estimated probability of relevance distribution of their documents. Relevance ....
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997.
....are, what their strength is, and how to use them. Consequently, searching the Web for specific information has become a very time consuming and inefficient task for even experienced users. This situation has motivated the recent research and development in integrated search or meta search engines [8]. Meta search engines serve as common gateways, which automatically link users to multiple and maybe competing search engines. They accept requests from users, sometimes, along with user specified query plans to select the target search engines. The meta search engines may also keep track of the ....
....in section 5. Finally, section 6 summarizes the results of this paper. 2. RELATED RESEARCH Meta search engines serve as common gateways linking users to multiple search engines in a transparent manner. Working meta search engines usually include three basic components, as depicted in Figure 1 [8]. The dispatching component selects the target search engines for each query. The query translator translates the user specified query to compatible scripts to each target search engine. The display interface component merges the query results from each search engine, removes duplicates, and ....
Daniel Dreilinger, and Adele E. Howe, "Experiences with Selecting Search Engines Using Meta-search", to appear in ACM Transactions on Information Systems, 1997.
....criteria. First, some systems perform feature extraction using the compressed images, rather than the original uncompressed pixel data [16]. This approach avoids expensive expansion of the coded data and manipulation in the decoded domain. Second, the idea of meta search system for VIR [17] has been stimulated by similar work in the field of information retrieval. We will discuss this issue in more detail in a later section. WebSEEk A Case Study of Internet VIR The World Wide Web includes a rich collection of visual information, which is also inter linked with a vast ....
....query plans to select target search engines. The meta search engine may also keep track of the past performance of each search engine and use it in selecting target search engines for future queries. A working meta search engine includes three basic components, as depicted in Figure 6 [17]. The query interface component accepts the query specification from the user and translates it to compatible query scripts to each target search engine. The dispatching component selects target search engines for each query. The display interface component merges the query results from each ....
Daniel Dreilinger and Adele E. Howe, "Experiences with Selecting Search Engines Using Meta-Search," to appear in ACM Transactions on Information Systems, 1997.
....are designed to retrieve, and how to use them. Consequently, searching the Web for specific information has become a very time consuming and inefficient task for even the most expert users. This situation has motivated the recent research and development in integrated search or meta search engines [1]. Meta search engines serve as common gateways, which automatically link users to multiple or competitive search engines. They accept requests from users, sometimes, along with user specified query plans to select target search engines. The metasearch engines may also keep track of the past ....
....section 5 closes with concluding remarks and open issues for future research. 2. RELATED RESEARCH Meta search engines serve as common gateways, linking users to multiple search engines in a transparent manner. Working meta search engines include three basic components, as depicted in Figure 1 [1]. The dispatching component selects target search engines for each query. The query interface component translates the user specified query to compatible scripts to each target search engine. The display interface component merges the query results from each search engine, removes duplicates and ....
Daniel Dreilinger and Adele E. Howe, "Experiences with Selecting Search Engines Using Meta-search", to appear in ACM Transactions on Information Systems, 1997.
Daniel Dreilinger. The SavvySearch meta-search engine. http://www.savvysearch.com/. D. Dreilinger and A.E. Howe. Experiences with selecting search engines using meta-search. ACM Transactions on Information Systems, 15(3):195--222, 1997.
....Project The SavvySearch project has been quite successful. Currently, SavvySearch processes over 20,000 queries each day and, based on electronic mail, has attracted a large well-satisfied user base. SavvySearch, as described here, is the result of almost two years of work and a series of studies [Dreilinger and Howe, to appear]. The present design of the resource reasoning and learning algorithm resulted, in part, from the results of these studies [Dreilinger, 1996]. We studied the effects of the learning by starting from a minimal metaindex (compiled from 2 days worth of data) and allowing it to accumulate experience ....
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using meta-search. ACM Transactions on Information Systems, to appear.
Daniel Dreilinger and Adele E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222, 1997.
Dreilinger, D. & Howe, A. E. (1997), `Experiences with selecting search engines using metasearch', ACM Transactions on Information Systems 15(3), 195--222.
D. Dreilinger and A.E. Howe. Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3):195--222.
Dreilinger, D., and Howe, A.E., "Experiences with selecting search engines using metasearch," ACM Trans. on Information Systems 15(3), 195-222, 1997. Available at http://citeseer.nj.nec.com/65714.html
Daniel Dreilinger, "Experiences with Selecting Search Engines using Meta-Search," December, 1996.
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997.
D. Dreilinger, and A. Howe. Experiences with Selecting Search Engines Using Metasearch. ACM TOIS, 15(3), July 1997, pp. 195-222.