
Enhanced Topic Distillation Using Text, Markup Tags, and Hyperlinks (2001)  (1 citation)
Soumen Chakrabarti, Mukul M. Joshi, Vivek B. Tawde
Research and Development in Information Retrieval




Abstract: Topic distillation is the analysis of hyperlink graph structure to identify mutually reinforcing authorities (popular pages) and hubs (comprehensive lists of links to authorities). Topic distillation is becoming common in Web search engines, but the best-known algorithms model the Web graph at a coarse grain, with whole pages as single nodes. Such models may lose vital details in the markup tag structure of the pages, and thus lead to a tightly linked irrelevant subgraph winning over a...
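
The mutual reinforcement between hubs and authorities that the abstract describes can be illustrated with a minimal sketch of the coarse-grained, page-level iteration (in the style of Kleinberg's hubs-and-authorities algorithm, listed in the citations below). This is not the enhanced method of the paper, which the truncated abstract does not spell out. The function names `hits` and `normalize`, the iteration count, and the toy link graph are all assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Page-level hub/authority iteration (illustrative sketch only).

    links: dict mapping each page to the list of pages it links to.
    Returns (hub, auth): two dicts of scores keyed by page.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # A page's hub score is the sum of the authority scores of pages it links to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, ())) for p in pages}

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


if __name__ == "__main__":
    # Hypothetical toy graph: two list-like pages and one page with a single link.
    toy_links = {"hub1": ["a", "b"], "hub2": ["a", "b"], "c": ["a"]}
    hub_scores, auth_scores = hits(toy_links)
    for page in sorted(auth_scores, key=auth_scores.get, reverse=True):
        print(page, round(auth_scores[page], 3), round(hub_scores[page], 3))
```

Because every page is a single node here, one link on a page counts as much as any other; this is the coarse granularity the abstract identifies as a weakness.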

Similar documents based on text:
0.6:   Homepage Finding and Topic Distillation using a Common.. - Anh, Moffat (2002)
0.5:   UIC at TREC-2002: Web Track - Liu, Yu, Wu
0.5:   Topic Distillation with Knowledge Agents - Amitay, Dalrow, Lempel, Soffer (2003)

BibTeX entry:

S. Chakrabarti, M. Joshi, V. Tawde, "Enhanced Topic Distillation Using Text, Markup Tags, and Hyperlinks", SIGIR Conf., 2001. http://citeseer.ist.psu.edu/chakrabarti01enhanced.html

@inproceedings{chakrabarti01enhanced,
    author = "Soumen Chakrabarti and Mukul Joshi and Vivek Tawde",
    title = "Enhanced Topic Distillation Using Text, Markup Tags, and Hyperlinks",
    booktitle = "Research and Development in Information Retrieval",
    pages = "208--216",
    year = "2001",
    url = "citeseer.ist.psu.edu/chakrabarti01enhanced.html"
}

Citations (may not include all citations):
2441   Matrix Computations (Johns Hopkins University Press) - Golub, van Loan - 1989
2319   Elements of Information Theory - Cover, Thomas - 1991
1256   Introduction to Modern Information Retrieval - Salton, McGill - 1983
641   The anatomy of a large-scale hypertextual web search engine - Brin, Page
576   Authoritative sources in a hyperlinked environment - Kleinberg - 1998
416   Information Retrieval - van Rijsbergen - 1979
163   Improved algorithms for topic distillation in a hyperlinked .. - Bharat, Henzinger - 1998
80   Multi-paragraph segmentation of expository text - Hearst - 1994
72   Bow: A toolkit for statistical language modeling - McCallum - 1998
60   Statistical models for text segmentation - Beeferman, Berger et al. - 1999
54   Clustering categorical data: An approach based on dynamical .. - Gibson, Kleinberg et al. - 1998
49   Mining the Web's link structure - Chakrabarti, Dom et al. - 1999
35   Text segmentation by topic - Ponte, Croft - 1997
31   Finding authorities and hubs from link structures on the wor.. - Borodin, Roberts et al. - 2001
31   Using clustering and SuperConcepts within SMART: TREC - Buckley, Mitra et al. - 1998

[Article contains additional citations not shown here]
