S. Chakrabarti, D. Gibson, and K. McCurley. Surfing backwards on the Web. In Proceedings of the 8th International World Wide Web Conference (WWW8), 1999.
....characterization project. I believe that links vary widely in nature, even on a given page, and are not necessarily a good basis for relevance assessments. They are a most convenient structure to use, however. There is a large body of literature on the topic of focused crawling, with Chakrabarti [27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38] probably being the most prolific. It is probably beneficial to start a focused crawl with a hub, in Kleinberg's terminology [69]. The ARC system [30] augments Kleinberg's HITS technology [53] by adding the textual content of the link anchors to the mix, as did Brin and Page [19]. Others [45, ....
S. Chakrabarti, D. Gibson, and K. McCurley. Surfing backwards on the Web. In Proceedings of the 8th International World Wide Web Conference (WWW8), 1999.
....the CS department member list, existing focused crawlers cannot move up the hierarchy to the CS department home page. Our focused crawler utilizes a compact context representation called a Context Graph to model and exploit hierarchies. The crawler also utilizes the limited backward crawling [13, 14] possible using general search engine indices to efficiently focus crawl the web. Unlike Rennie and McCallum's approach [12], our approach does not learn the context within which target documents are located from a small set of web sites, but in principle can back crawl a significant fraction of ....
S. Chakrabarti, D. Gibson, and K. McCurley, "Surfing backwards on the web," in Proc. 8th World Wide Web Conference (WWW8), 1999.