I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with Fetuccino. Computer Networks, 31(11--16):1653--1665, 1999.
....positive and negative example pages to guide their focused crawlers. Fetuccino [3] and InfoSpiders [18] begin their focused crawling with starting points generated from CLEVER [7] or other search engines. Most crawlers follow fixed strategies, while some can adapt in the course of the crawl by learning to estimate the quality of links [18, 1, 22]. The question of exploration .... [Figure 4: Average recall of examples when the crawls start from the neighbors. Axes: average recall vs. pages crawled; series: BFS1, BFS256, BreadthFirst.]
....a page and the centroid of the seeds. In fact, content-based similarity assessments form the basis of relevance decisions in several examples of research [8, 19]. Others exploit link information to estimate page relevance with methods based on in-degree, out-degree, PageRank, hubs and authorities [2, 3, 4, 8, 9, 20]. For example, Cho et al. [9] consider pages with PageRank score above a threshold as relevant. Najork and Wiener [20] use a crawler that can fetch millions of pages per day; they then calculate the average PageRank of the pages crawled daily, under the assumption that PageRank estimates ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with Fetuccino. Computer Networks, 31(11--16):1653--1665, 1999.
....Research on the design of effective focused crawlers is very vibrant. Many different types of crawling algorithms have been developed. For example, Chakrabarti et al. [8] use classifiers built from training sets of positive and negative example pages to guide their focused crawlers. Fetuccino [3] and InfoSpiders [23] begin their focused crawling with starting points generated from CLEVER [7] or other search engines. Most crawlers follow fixed strategies, while some can adapt in the course of the crawl by learning to estimate the quality of links [23, 1, 27]. The question of exploration ....
....a page and the centroid of the seeds. In fact, content-based similarity assessments form the basis of relevance decisions in several examples of research [8, 24]. Others exploit link information to estimate page relevance with methods based on in-degree, out-degree, PageRank, hubs and authorities [2, 3, 4, 8, 9, 25]. For example, Cho et al. [9] consider pages with PageRank score above a threshold as relevant. Najork and Wiener [25] use a crawler that can fetch millions of pages per day; they then calculate the average PageRank of the pages crawled daily, under the assumption that PageRank estimates ....
I Ben-Shaul, M Herscovici, M Jacovi, YS Maarek, D Pelleg, M Shtalhaim, V Soroka, and S Ur. Adding support for dynamic and focused search with Fetuccino. Computer Networks, 31(11--16):1653--1665, 1999.
....information available through the Web [1, 14, 13, 8, 2]. However, such agents have generally limited autonomy: either they rely completely on search engines, or they must be told where to go by the user, or they follow some fixed heuristics. Other systems, such as Fish Search [4] and Fetuccino [3], crawl at query time but are hindered by a lack of adaptability: all agents are identical clones following fixed search strategies. We suggest a more radical solution to the scalability problem: complementing index-based search engines with intelligent search agents at the user's end. This ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with fetuccino. In Proceedings of 8th International World Wide Web Conference, 1999.
....built using positive and negative example pages to determine page importance [6]. We also explore classifiers for page evaluation, however there are differences as described below. In-degree, out-degree, PageRank, hubs and authorities are the more commonly used link-based page importance measures [1, 2, 3, 6, 7]. For example, Cho et al. consider pages with PageRank above a specified threshold as being relevant to the query [7]. Kleinberg's recursive notion of hubs and authorities [13] has been extended by several others. For example, edge weights are considered important [6] and so are edges that connect ....
.... and PageRank are effective at identifying high-quality pages as judged by human experts [1]. Most of the link-based methods such as PageRank were designed to operate within a neighborhood graph of a query and thus implicitly recognize content-based criteria [13]. For instance, in Fetuccino [2] and InfoSpiders [15] the crawl starting points are obtained via CLEVER (IBM's engine based on HITS) or any search engine, respectively. Various combinations of similarity and link-based criteria have also been suggested to evaluate links and guide crawlers, such as looking at the words ....
[Article contains additional citation context not shown here]
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with fetuccino. In Proceedings of 8th International World Wide Web Conference, 1999.
....are also exploited by a commercial spin-off of Xerox, called Inxight. (Footnote 1: http://www.tomsawyer.com) Earlier in this survey, we referred to NESTOR[1] or WebOFDAV[69], which can be used as web navigation tools. Other examples in this category are the Harmony Browser[1], Mapa[32], or Fetuccino[7] (the latter also combines the results of a web search engine with graph visualization). 6 Journals and Conferences. This survey is based on an extensive literature overview drawn from various conferences and journals. One of the difficulties of the field is that results are spread over a large ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur, "Adding support for dynamic and focused search with Fetuccino", Proceedings of 8th International World Wide Web Conference, Elsevier Science, pp. 575--587, 1999.
....Visualizing Web spaces. There are different reasons for creating a graphical representation of a Web space. The first is to support user navigation, as discussed in the following paragraph. Mapuccino [19] is a dynamic site mapping system developed at IBM Research that now uses SGF metadata [2]. This tool has been integrated in IBM products, including WebSphere and Tivoli. More generally, the visualization of a site provides an interface to better understand both its structure (i.e. the relationships between documents) and the properties of its documents. For example, a diagram can ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur, "Adding support for Dynamic and Focused Search with Fetuccino", in proceedings of WWW'8 International Conference, Toronto, Canada, 1999.
....engine for computer science research papers, based on a crawler trained to extract such papers from a given list of starting points at suitable department and universities. Information filtering agents such as WebWatcher [20], HotList and ColdList [30], Fish Search [16], Shark Search and Fetuccino [2], and clan search [32] have served a similar purpose. These are special cases of the general example- and topic-driven automatic web exploration that we undertake. 2 Architecture. Our problem formulation in the previous section does not in itself suggest a procedure to attain that goal. If pages ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with Fetuccino. In 8th World Wide Web Conference. Toronto, May 1999.
....just URL, title, and description) and the quality of these page descriptors is thus quite important to a post hoc textual ranking or clustering of the pages. 2.4 Focused crawlers. Focused crawlers are web crawlers that follow links that are expected to be relevant to the client's interest (e.g. [11, 4, 26, 24] and the query similarity crawler in [12]). They may use the results of a search engine as a starting point, or they may crawl the web from their own dataset. In either case, they assume that it is possible to find highly relevant pages using local search starting with other relevant pages. Dean ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with Fetuccino. In Proceedings of the Eighth International World Wide Web Conference, Toronto, Canada, May 1999.
....surfers and/or improving the quality (precision) of results. Some systems and services predefine the topics of interest (a.k.a. domains) themselves (see for instance various directory services such as Yahoo [22] and Google [7]), while others allow users to define topics of interest (e.g. Fetuccino [3], Focused Crawler [6], WTMS [13], and knowledge agents [1]). Since the amount of memory available on pervasive devices is very limited, it is crucial that the information stored locally be highly representative of the topic, yet concise. While the focused search techniques mentioned above can be ....
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with fetuccino. In Proceedings of the 8th International World Wide Web Conference (WWW8), pages 575-587, Toronto, Canada, May 1999. Elsevier.
....Section 3 shows how, in practice, the weighted ranking formula can be integrated in typical IR engines while not drastically changing their architecture or affecting their efficiency in terms of response time. In Section 4, we describe the integration of the reranking feature into Fetuccino [Ben-Shaul et al. 1999], and continue our discussion of our running example of U.S.-France tax treaties. Section 5 considers the effects, in the weighted case, of allowing search terms to be prefaced by the plus symbol (+) and the minus symbol (-); in the unweighted case, the plus symbol typically means that the ....
....obtained from the original search, along with the set of search terms and the user-assigned weighting. By using our weighting formula, we can convert any search parasite into one that takes into account user-assigned weightings. Since we have available to us a search parasite, namely Fetuccino [Ben-Shaul et al. 1999], we implemented this approach. We describe this implementation in more detail in the next section. 4 Integration in a Search Parasite for Dynamic Reranking. We integrated the dynamic reranking feature described above in an experimental version of Fetuccino [Ben-Shaul et al. 1999]. Fetuccino ....
[Article contains additional citation context not shown here]
Ben-Shaul, I., Herscovici, M., Jacovi, M., Maarek, Y. S., Pelleg, D., Shtalhaim, M., Soroka, V., and Ur, S. (1999). Adding support for dynamic and focused search with fetuccino. In Proceedings of the WWW8 Conference, Toronto, CA.
No context found.
I. Ben-Shaul, M. Herscovici, M. Jacovi, Y. S. Maarek, D. Pelleg, M. Shtalhaim, V. Soroka, and S. Ur. Adding support for dynamic and focused search with Fetuccino. In Proceedings of the WWW8 Conference, Toronto, CA, May 1999.
....indexing and the clustering algorithm described above. We have integrated Lassi features into our search parasite, Fetuccino, whose purpose is to take search results from various search services, identify links between them, and pursue directed crawling in the most relevant directions (see [Ben-Shaul et al. 1999] for more information). Fetuccino results can be laid out as a graph or simply as a traditional ranked list. Figure 5 shows how the user can invoke the clustering feature provided by Lassi from within Fetuccino. Lassi performs the indexing and clustering of the documents on the fly and generates ....
Ben-Shaul, I., Herscovici, M., Jacovi, M., Maarek, Y., Pelleg, D., Shtalhaim, M., Soroka, V., Ur, S. Adding support for dynamic and focused search with Fetuccino. WWW8 / Computer Networks 31(11-16), 1999, 1653--1665.
No context found.
Israel Ben-Shaul, Michael Herscovici, Michal Jacovi, Yoelle S. Maarek, Dan Pelleg, Menachem Shtalhaim, Vladimir Soroka, and Sigalit Ur. Adding support for dynamic and focused search with Fetuccino. In Proceedings of the Eighth International World Wide Web Conference, Toronto, Canada, May 1999.