Abstract: The organization of HTML into a tag tree structure, which is rendered by browsers as roughly rectangular regions with embedded text and HREF links, greatly helps surfers locate and click on links that best satisfy their information need. Can an automatic program emulate this human behavior and thereby learn to predict the relevance of an unseen HREF target page w.r.t. an information need, based on information limited to the HREF source page? Such a capability would be of great interest in...
Cited by:
Defining Evaluation Methodologies for Topical Crawlers - Srinivasan, Menczer, Pant
Using URLs and Table Layout for Web Classification Tasks - Lawrence Kai Shih (2004)
Topical Crawling for Business Intelligence - Pant, Menczer
Active bibliography (related documents):
0.5: Kernel Regression Trees - Torgo
0.5: LawBOT: an assistant for legal research - Debnath, Sen, Blackstock
0.4: Intelligent Crawling on the World Wide Web with Arbitrary.. - Aggarwal, Al-Garawi, Yu (2001)
Similar documents based on text:
0.7: Focused Web Crawling: A Generic Framework for Specifying.. - Ester, Gross, Kriegel (2001)
0.6: Focused crawling: a new approach to topic-specific.. - Chakrabarti, van den .. (1999)
0.6: Focused Crawls, Tunneling, and Digital Libraries - Bergmark, Lagoze, Sbityakov (2002)
Related documents from co-citation:
11: Focused crawling: a new approach to topic-specific Web resource discovery - Chakrabarti, van den Berg et al. - 1999
10: The Shark-Search algorithm --- an application: tailored Web site mapping - Hersovici, Jacovi et al. - 1998
10: Adaptive retrieval agents: Internalizing local context and scaling up to the web - Menczer, Belew - 1999
BibTeX entry:
S. Chakrabarti, K. Punera, and M. Subramanyam. Accelerated focused crawling through online relevance feedback. In WWW, Hawaii, May 2002. ACM. http://citeseer.ist.psu.edu/chakrabarti02accelerated.html
@inproceedings{chakrabarti02accelerated,
  author = "S. Chakrabarti and K. Punera and M. Subramanyam",
  title = "Accelerated focused crawling through online relevance feedback",
  booktitle = "{WWW}, Hawaii",
  month = "May",
  publisher = "ACM",
  year = "2002",
  url = "http://citeseer.ist.psu.edu/chakrabarti02accelerated.html"
}
Citations (may not include all citations):
1256: Introduction to Modern Information Retrieval - Salton, McGill - 1983
976: Machine Learning - Mitchell - 1997
180: Combining labeled and unlabeled data with co-training - Blum, Mitchell - 1998
154: Automatic resource compilation by analyzing hyperlink struct.. - Chakrabarti, Dom et al. - 1998
149: Focused crawling: a new approach to topic-specific web resou.. - Chakrabarti, van den Berg et al. - 1999
140: A comparison of event models for naive Bayes text classifica.. - McCallum, Nigam - 1998
94: A method for disambiguating word senses in a large corpus - Gale, Church et al. - 1993
90: Enhanced hypertext categorization using hyperlinks - Chakrabarti, Dom et al. - 1998
72: Bow: A toolkit for statistical language modeling - McCallum - 1998
63: Automated learning of decision rules for text categorization - Apte, Damerau et al. - 1994
59: Stochastic models for the Web graph - RaviKumar, Raghavan et al. - 2000
57: Focused crawling using context graphs - Diligenti, Coetzee et al. - 2000
36: Efficient crawling through URL ordering - Cho, Garcia-Molina et al. - 1998
32: Topical locality in the Web - Davison - 2000
31: Evaluating topic-driven Web crawlers - Menczer, Pant et al. - 2001
27: The shark-search algorithm---an application: Tailored Web si.. - Hersovici, Jacovi et al. - 1998
26: Intelligent crawling on the World Wide Web with arbitrary pr.. - Aggarwal, Al-Garawi et al.
25: classification and signature generation for organizing large.. - Chakrabarti, Dom et al. - 1998
21: Information retrieval in the world-wide web: Making client-b.. - De Bra, Post - 1994
19: Integrating the document object model with hyperlinks for en.. - Chakrabarti - 2001
11: Using reinforcement learning to spider the web efficiently - Rennie, McCallum - 1999
10: WTMS: a system for collecting and analyzing topic-specific W.. - Mukherjea - 2000
6: Links tell us about lexical and semantic Web content - Menczer - 2001
3: Exploring the Web with reconnaissance agents - Lieberman, Fry et al. - 2001
3: WebWatcher: A tour guide for the web - Joachims, Freitag et al. - 1997
3: Searching for arbitrary information in the WWW: The fish sea.. - De Bra, Post - 1994
2: Letizia: An agent that assists Web browsing - Lieberman - 1995
2: Regression by classification - Torgo, Gama - 1996
1: Focused crawling using TFIDF centroid - Subramanyam, Phanindra et al. - 2001
1: Mining the Web - Mitchell - 2001
1: Longer version available as Technical Report CS98-579 - Menczer, Belew et al. - 2000
1: Artificial Intelligence Review - Chakrabarti, Dom et al. - 1999
Documents on the same site (http://http.cs.berkeley.edu/~soumen/):
Keyword Searching and Browsing in Databases using BANKS - Bhalotia, Hulgeri.. (2002)
The Structure of Broad Topics on the Web - Chakrabarti, Joshi, Punera.. (2002)
Fast and Accurate Text Classification Via Multiple.. - Chakrabarti, Roy.. (2002)