Abstract: This article presents a high-level discussion of some problems in information retrieval that
are unique to web search engines. The goal is to raise awareness and stimulate research in these
areas.
Cited by:
Embracing Statistical Challenges in the Information Technology Age - Yu
Propagating Trust and Distrust to Demote Web Spam - Baoning Wu Vinay (2006)
Discovering Large Dense Subgraphs in Massive Graphs - David Gibson Ravi (2005)
Active bibliography (related documents):
0.5: Web Information Retrieval - an Algorithmic Perspective - Henzinger (2000)
0.3: Infodemiology: The Epidemiology of (Mis)information - Eysenbach (2002)
0.3: Focused crawling in depression portal search: A feasibility study - Thanh Tin Tang
Similar documents based on text:
1.2: Query-Free News Search - Henzinger, Chang, Milch, Brin (2003)
0.4: Searching the Web by Voice - Franz, Milch
0.2: Who Links to Whom: Mining Linkage between Web Sites - Bharat, Chang, Henzinger, Ruhl (2003)
Related documents from co-citation:
8: Authoritative sources in a hyperlinked environment - Kleinberg - 1997
6: Identifying link farm spam pages - Wu, Davison - 2005
5: The anatomy of a large-scale hypertextual Web search engine - Brin, Page
BibTeX entry:
Henzinger, M.R., Motwani, R., Silverstein, C.: Challenges in Web Search Engines. In: Proc. of the 18th International Joint Conference on Artificial Intelligence (2003) 1573-1579 http://citeseer.ist.psu.edu/henzinger02challenges.html
@misc{ henzinger03challenges,
author = "M. Henzinger and R. Motwani and C. Silverstein",
title = "Challenges in Web Search Engines",
text = "Henzinger, M.R., Motwani, R., Silverstein, C.: Challenges in Web Search
Engines. In: Proc. of the 18th International Joint Conference on Artificial
Intelligence (2003) 1573-1579",
year = "2003",
url = "citeseer.ist.psu.edu/henzinger02challenges.html" }
Citations (may not include all citations):
641: The Anatomy of a Large-Scale Hypertextual Web Search Engine - Brin, Page - 1998
576: Authoritative sources in a hyperlinked environment - Kleinberg - 1998
67: On the resemblance and containment of documents - Broder - 1997
66: Extracting Schema from Semistructured Data - Nestorov, Abiteboul et al. - 1998
57: Analysis of a very large AltaVista query log - Silverstein, Henzinger et al. - 1999
55: Copy detection mechanisms for digital documents - Brin, Davis et al.
25: What can you do with a Web in your Pocket - Brin, Page et al. - 1998
19: Finding replicated web collections - Cho, Shivakumar et al.
19: Integrating the Document Object Model with hyperlinks for en.. - Chakrabarti - 2001
16: Effective Site Finding using Link Anchor Information - Craswell, Hawking et al. - 2001
12: Trawling emerging cybercommunities automatically - Kumar, Raghavan et al. - 1999
11: Generating grammars for SGML tagged texts lacking DTD - Ahonen, Mannila et al. - 1994
4: Enhanced topic distillation using text - Chakrabarti - 2001
3: Health Information on the Internet Accessibility - Berland, Elliott et al. - 2001
2: A comparison of Techniques to Find Mirrored Hosts on the Wor.. - Bharat, Broder et al. - 2000
1: Evaluation Search Engines using Clickthrough Data - Joachims - 2002
1: Attending to Web Pages - Faraday - 2001
http://www.w3.org/Style/
Documents on the same site (http://www.henzinger.com/monika/mpublications.html):
Analysis of a Very Large AltaVista Query Log - Silverstein (1998)
Web Information Retrieval - an Algorithmic Perspective - Henzinger (2000)
Query-Free News Search - Henzinger, Chang, Milch, Brin (2003)
CiteSeer.IST - Copyright Penn State and NEC