
Accelerated Focused Crawling through Online Relevance Feedback (2002)  (15 citations)
Soumen Chakrabarti, Kunal Punera, Mallela Subramanyam
WWW, Hawaii



View or download:
berkeley.edu/~soum...36chakrabarti.pdf
cse.iitb.ac.in/~so...36chakrabarti.pdf

From:  berkeley.edu/~soumen/ (more)

Abstract: The organization of HTML into a tag tree structure, which is rendered by browsers as roughly rectangular regions with embedded text and HREF links, greatly helps surfers locate and click on links that best satisfy their information need. Can an automatic program emulate this human behavior and thereby learn to predict the relevance of an unseen HREF target page w.r.t. an information need, based on information limited to the HREF source page? Such a capability would be of great interest in...

Cited by:
Defining Evaluation Methodologies for Topical Crawlers - Srinivasan, Menczer, Pant
Using URLs and Table Layout for Web Classification Tasks - Lawrence Kai Shih (2004)
Topical Crawling for Business Intelligence - Pant, Menczer

Active bibliography (related documents):
0.5:   Kernel Regression Trees - Torgo
0.5:   LawBOT: an assistant for legal research - Debnath, Sen, Blackstock
0.4:   Intelligent Crawling on the World Wide Web with Arbitrary.. - Aggarwal, Al-Garawi, Yu (2001)

Similar documents based on text:
0.7:   Focused Web Crawling: A Generic Framework for Specifying.. - Ester, Gross, Kriegel (2001)
0.6:   Focused crawling: a new approach to topic-specific.. - Chakrabarti, van den .. (1999)
0.6:   Focused Crawls, Tunneling, and Digital Libraries - Bergmark, Lagoze, Sbityakov (2002)

Related documents from co-citation:
11:   Focused crawling: a new approach to topic-specific Web resource discovery - Chakrabarti, van den Berg et al. - 1999
10:   The Shark-Search algorithm --- an application: tailored Web site mapping - Hersovici, Jacovi et al. - 1998
10:   Adaptive retrieval agents: Internalizing local context and scaling up to the web - Menczer, Belew - 1999

BibTeX entry:

S. Chakrabarti, K. Punera, and M. Subramanyam. Accelerated focused crawling through online relevance feedback. In WWW, Hawaii, May 2002. ACM. http://citeseer.ist.psu.edu/chakrabarti02accelerated.html

@inproceedings{ chakrabarti02accelerated,
  author = "S. Chakrabarti and K. Punera and M. Subramanyam",
  title = "Accelerated focused crawling through online relevance feedback",
  booktitle = "{WWW}, Hawaii", 
  month = "May",
  publisher = "ACM",
  year = "2002",
  url = "citeseer.ist.psu.edu/chakrabarti02accelerated.html" }
Citations (may not include all citations):
1256   Introduction to Modern Information Retrieval - Salton, McGill - 1983
976   Machine Learning - Mitchell - 1997
180   Combining labeled and unlabeled data with co-training - Blum, Mitchell - 1998
154   Automatic resource compilation by analyzing hyperlink struct.. - Chakrabarti, Dom et al. - 1998
149   Focused crawling: a new approach to topic-specific web resou.. - Chakrabarti, van den Berg et al. - 1999
140   A comparison of event models for naive Bayes text classifica.. - McCallum, Nigam - 1998
94   A method for disambiguating word senses in a large corpus - Gale, Church et al. - 1993
90   Enhanced hypertext categorization using hyperlinks - Chakrabarti, Dom et al. - 1998
72   Bow: A toolkit for statistical language modeling - McCallum - 1998
63   Automated learning of decision rules for text categorization - Apte, Damerau et al. - 1994
59   Stochastic models for the Web graph - RaviKumar, Raghavan et al. - 2000
57   Focused crawling using context graphs - Diligenti, Coetzee et al. - 2000
36   Efficient crawling through URL ordering - Cho, Garcia-Molina et al. - 1998
32   Topical locality in the Web - Davison - 2000
31   Evaluating topic-driven Web crawlers - Menczer, Pant et al. - 2001
27   The shark-search algorithm---an application: Tailored Web si.. - Hersovici, Jacovi et al. - 1998
26   Intelligent crawling on the World Wide Web with arbitrary pr.. - Aggarwal, Al-Garawi et al.
25   classification and signature generation for organizing large.. - Chakrabarti, Dom et al. - 1998
21   Information retrieval in the world-wide web: Making client-b.. - De Bra, Post - 1994
19   Integrating the document object model with hyperlinks for en.. - Chakrabarti - 2001
11   Using reinforcement learning to spider the web efficiently - Rennie, McCallum - 1999
10   WTMS: a system for collecting and analyzing topic-specific W.. - Mukherjea - 2000
6   Links tell us about lexical and semantic Web content - Menczer - 2001
3   Exploring the Web with reconnaissance agents - Lieberman, Fry et al. - 2001
3   WebWatcher: A tour guide for the web - Joachims, Freitag et al. - 1997
3   Searching for arbitrary information in the WWW: The fish sea.. - De Bra, Post - 1994
2   Letizia: An agent that assists Web browsing - Lieberman - 1995
2   Regression by classification - Torgo, Gama - 1996
1   Focused crawling using TFIDF centroid - Subramanyam, Phanindra et al. - 2001
1   Mining the Web - Mitchell - 2001
1   Longer version available as Technical Report CS98-579 - Menczer, Belew et al. - 2000
1   Artificial Intelligence Review - Chakrabarti, Dom et al. - 1999



Documents on the same site (http://http.cs.berkeley.edu/~soumen/):
Keyword Searching and Browsing in Databases using BANKS - Bhalotia, Hulgeri.. (2002)
The Structure of Broad Topics on the Web - Chakrabarti, Joshi, Punera.. (2002)
Fast and Accurate Text Classification Via Multiple.. - Chakrabarti, Roy.. (2002)
