
Web Mining (2004)  (1 citation)
Johannes Fürnkranz



View or download:
ke.informatik.tud...webminingcrc.pdf

From:  faure.isti.cnr....CP2(Google)pdf

Abstract: The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. This chapter provides a brief overview of web mining techniques and research areas, most notably hypertext classification, wrapper induction, recommender systems and web usage...

Cited by:
Conceptual Knowledge Processing with Google - Koester (2005)

Similar documents (at the sentence level):
24.7%:   Web Structure Mining Exploiting the Graph Structure of the.. - Fürnkranz (2002)
5.3%:   Hyperlink Ensembles: A Case Study in Hypertext Classification - Fürnkranz (2001)

Active bibliography (related documents):
0.6:   WebMate: A Personal Agent for Browsing and Searching - Chen, Sycara (1998)
0.5:   Personalising On-Line Information Retrieval Support with a Genetic.. - Er (1996)
0.5:   Using Context to Assist in Personal File Retrieval - Soules (2006)

Similar documents based on text:
0.1:   ParaMEME: A Parallel Implementation and a Web Interface for a.. - Grundy, al. (1996)
0.1:   Integrating External Information Sources to Guide Worldwide.. - Monge, Elkan (1995)
0.1:   The WEBFIND tool for finding scientific papers over the.. - Monge, Elkan (1996)

BibTeX entry:

Johannes Fürnkranz. Web mining. In Oded Maimon and Lior Rokach, editors, The Data Mining and Knowledge Discovery Handbook, pages 899--920. Springer, 2005. http://citeseer.ist.psu.edu/urnkranz04web.html

@misc{ urnkranz05web,
  author = "J. F{\"u}rnkranz",
  title = "Web mining",
  text = "Johannes F{\"u}rnkranz. Web mining. In Oded Maimon and Lior Rokach, editors,
    The Data Mining and Knowledge Discovery Handbook, pages 899--920. Springer,
    2005.",
  year = "2005",
  url = "citeseer.ist.psu.edu/urnkranz04web.html" }
Citations (may not include all citations):
641   The anatomy of a large-scale hypertextual Web search engine - Brin, Page - 1998
576   Authoritative sources in a hyperlinked environment - Kleinberg - 1999
568   Indexing by latent semantic analysis - Deerwester, Dumais et al. - 1990
492   Learning logical definitions from relations (context) - Quinlan - 1990
463   Term-weighting approaches in automatic text retrieval (context) - Salton, Buckley - 1988
432   Automatic Text Processing: The Transformation (context) - Salton - 1989
404   Agents that reduce work and information overload (context) - Maes - 1994
375   On power-law relationships of the internet topology - Faloutsos, Faloutsos et al. - 1999
318   Scientific American (context) - Berners-Lee, Hendler et al. - 2001
225   NewsWeeder: Learning to filter netnews - Lang - 1995
215   A comparative study on feature selection in text categorizat.. - Yang, Pedersen - 1997
207   WebWatcher: A learning apprentice for the world wide web - Armstrong, Freitag et al. - 1995
188   Empirical analysis of predictive algorithms for collaborativ.. - Breese, Heckerman et al. - 1998
178   A softbot-based interface to the internet - Etzioni, Weld - 1994
171   A scalable comparison-shopping agent for the World-Wide Web - Doorenbos, Etzioni et al. - 1997
164   Syskill & Webert: Identifying interesting web sites (context) - Pazzani, Muramatsu et al.
163   Improved algorithms for topic distillation in a hyperlinked .. - Bharat, Henzinger - 1998
155   Grouplens: Applying collaborative filtering to usenet news - Konstan, Miller et al. - 1997
154   Automatic resource compilation by analyzing hyperlink struct.. - Chakrabarti, Dom et al. - 1998
140   Graph structure in the Web (context) - Broder, Kumar et al. - 2000
140   A comparison of event models for naive bayes text classifica.. - McCallum, Nigam - 1998
139   Machine learning in automated text categorization - Sebastiani - 2002
132   Data preparation for mining world wide web browsing patterns - Cooley, Mobasher et al. - 1999
129   Searching the world wide web - Lawrence, Giles - 1998
126   Diameter of the world-wide web (context) - Albert, Jeong et al. - 1999
124   Learning information retrieval agents: Experiments with auto.. - Balabanovic, Shoham - 1995
123   A vector space model for automatic indexing (context) - Salton, Wong et al. - 1975
114   Learning interface agents (context) - Kozierok, Maes - 1993
111   Collaborative interface agents - Lashkari, Metral et al. - 1994
105   Learning information extraction rules for semi-structured an.. - Soderland - 1999
90   Enhanced hypertext categorization using hyperlinks - Chakrabarti, Dom et al. - 1998
90   Ensemble methods in machine learning - Dietterich - 2000
87   Ontologies: Silver Bullet for Knowledge Management and Elect.. - Fensel - 2001
85   Web usage mining: Discovery and applications of usage patter.. - Srivastava, Cooley et al. - 2000
82   Finding related pages in the World Wide Web - Dean, Henzinger - 1999
81   Learning to construct knowledge bases from the World Wide We.. - Craven, DiPasquo et al. - 2000
77   Evolving agents for personalized information filtering (context) - Sheth, Maes - 1993
75   ParaSite: Mining structural information on the Web (context) - Spertus - 1997
73   Information extraction from HTML: Application of a general m.. - Freitag - 1998
73   An evaluation of phrasal and clustered representations on a .. (context) - Lewis - 1992
68   A technique for measuring the relative size and overlap of p.. (context) - Bharat, Broder - 1998
66   Learning rules that classify e-mail - Cohen - 1996
65   GENVL and WWWW: Tools for taming the Web - McBryan - 1994
64   Automatically generating extraction patterns from untagged t.. - Riloff - 1996
62   Automatic personalization based on web usage mining - Mobasher, Cooley et al. - 2000
57   Optimizing search engines using clickthrough data - Joachims - 2002
52   The connectivity server: Fast access to linkage information .. (context) - Bharat, Broder et al. - 1998
48   Generating finite-state transducers for semistructured data .. - Hsu, Dung - 1998
47   Towards adaptive web sites: Conceptual framework and case st.. - Perkowitz, Etzioni - 2000
47   Item-based collaborative filtering recommendation algorithms - Sarwar, Karypis et al. - 2001
45   Moving up the information food chain: Deploying softbots on .. - Etzioni
44   Wrapper induction: Efficiency and expressiveness - Kushmerick - 2000
41   Determinate literals in inductive logic programming (context) - Quinlan - 1991
38   Web mining research: A survey - Kosala, Blockeel - 2000
35   Latent class models for collaborative filtering (context) - Hofmann, Puzicha - 1999
33   An empirical study of automated dictionary construction for .. - Riloff - 1996
31   Clustering methods for collaborative filtering - Ungar, Foster - 1998
31   Interface agents that learn: An investigation of learning is.. - Payne, Edwards - 1997
29   A study of approaches to hypertext categorization - Yang, Slattery et al. - 2002
28   Learning relations by pathfinding - Richards, Mooney - 1992
24   First-order learning for Web mining - Craven, Slattery et al. - 1998
24   Data mining for hypertext: A tutorial survey - Chakrabarti - 2000
22   Probabilistic models for unified collaborative and content-b.. - Popescul, Ungar et al. - 2001
21   Relational learning with statistical predicate invention: Be.. - Craven, Slattery - 2001
20   Knowledge-based navigation of complex information spaces - Burke, Hammond et al. - 1996
17   Discovery and evaluation of aggregate usage profiles for web.. - Mobasher, Dai et al. - 2002
16   A case study in using linguistic phrases for text categoriza.. (context) - Fürnkranz, Mitchell et al. - 1998
16   A practical hypertext categorization method using links and .. (context) - Oh, Myaeng et al. - 2000
16   Special issue on recommender systems (context) - Resnick, Varian - 1997
15   Towards semantic web mining - Berendt, Hotho et al. - 2002
15   Discovering test set regularities in relational domains - Slattery, Mitchell - 2000
15   Knowledge portals --- ontologies at work - Staab, Maedche - 2001
14   Information extraction from world wide web -- a survey - Eikvil - 1999
14   Better bayesian filtering (context) - Graham - 2003
14   A unifying approach to HTML wrapper representation and learn.. - Grieser, Jantke et al. - 2000
14   Feature subset selection in text-learning (context) - Mladenić - 1998
14   Efficient adaptive-support association rule mining for recom.. - Lin, Alvarez et al. - 2002
13   Electronic commerce recommender applications (context) - Schafer, Konstan et al. - 2000
12   Discovery of web robot sessions based on their navigational .. - Tan, Kumar - 2002
10   Wrapper maintenance: A machine learning approach - Lerman, Minton et al. - 2003
10   Content-boosted collaborative filtering for improved recomme.. - Melville, Mooney et al. - 2002
8   Ontology learning part one --- on discovering taxonomic rela.. - Maedche, Pekar et al. - 2003
8   Knowledge and Information Systems (context) - Levene, Borges et al. - 2001
8   Feature engineering for text classification - Scott, Matwin - 1999
7   Bottom-up relational learning of pattern matching rules for .. - Califf - 2003
7   Web-collaborative filtering: Recommending music by crawling .. - Cohen, Fan - 2000
6   The Lixto data extraction project --- back and forth between.. - Gottlob, Koch et al. - 2004
6   Learning ontologies for the semantic web - Maedche, Staab - 2001
6   Department of Intelligent Systems (context) - Mladenić, WebWatcher et al. - 1996
6   Effective web data extraction with standard XML technologies - Myllymaki - 2001
5   Turning Yahoo into an automatic web-page classifier (context) - Mladenić - 1998
5   Using site semantics to analyze (context) - Berendt - 2002
5   Word sequences as features in text learning (context) - Mladenić, Grobelnik - 1998
4   Technical paper recommendation: A study in combining multipl.. - Basu, Hirsh et al. - 2001
4   Learning to filter unsolicited commercial e-mail (context) - Androutsopoulos, Paliouras et al. - 2004
4   Web usage mining as a tool for personalization: A survey (context) - Pierrakos, Paliouras et al. - 2003
4   Relational Data Mining: Inductive Logic Programming for Know.. (context) - Džeroski, Lavrač - 2001
3   IEMS -- the intelligent email sorter (context) - Crawford, Kay et al. - 2002
3   The laborious way from data mining to web log mining - Spiliopoulou - 1999
3   Mining the World Wide Web: An Information Search Approach (context) - Chang, Healy et al. - 2001
2   Communications of the ACM (context) - Berners-Lee, Cailliau et al. - 1994
2   Using collaborative filtering to weave an information tapes.. (context) - Goldberg, Nichols et al. - 1992
2   Hybrid hill-climbing and knowledge-based methods for intelli.. (context) - Mock - 1996
2   Mining the Web: Analysis of Hypertext and Semi Structured Da.. (context) - Chakrabarti - 2002
1   Text-learning and related intelligent agents: A survey (context) - Mladenić - 1999
1   "In vivo" spam filtering: A challenge problem for data mining (context) - Fawcett - 2003
1   Austrian Research Institute for Artificial Intelligence (context) - Fürnkranz, using et al. - 1998
1   Hyperlink ensembles: A case study in hypertext classificatio.. (context) - Fürnkranz - 2002
1   Frequently-asked question files: Experiences with the FAQ fi.. (context) - Burke, Hammond et al. - 1997
1   User profiling for the Melvil knowledge retrieval system (context) - Fürnkranz, Holzbaur et al. - 2002
1   Wiemer-Hastings (context) - Staab, Maedche et al. - 2000
1   Information Extraction in the Web Era: Natural Language Comm.. (context) - Pazienza - 2003
1   Learning to match ontologies (context) - Doan, Madhavan et al. - 2003
1   Machine Learning for Information Extraction: Proceedings of .. (context) - Califf - 1999
1   Email answering assistance by semi-supervised text classific.. - Scheffer - 2004

Documents on the same site (http://faure.isti.cnr.it/~fabrizio/CP2(Google)-pdf.html):
Deliverable Identification Sheet - Project Ref No
How Weak Text Categorizers Can Strengthen Performance.. - Uren, Addis (2001)
Low level information extraction: a Bayesian network based.. - Bouckaert


CiteSeer.IST - Copyright Penn State and NEC