Abstract: This master's thesis investigates the use of the Suffix Tree Clustering (STC) algorithm for clustering search results in Polish obtained from Web search engines. It identifies problems that arise when clustering documents written in a language with complex inflection and syntax. We also report our observations and experiments on how the STC thresholds affect the number of clusters created.
Active bibliography (related documents):
0.5: Computer-Assisted Enlargement of Morphological Dictionaries - Daciuk
0.5: Partitioned Multiagent Systems in Information Oriented Domains - Goldman, Rosenschein
0.5: On Evaluating Information Visualization Techniques - Freitas, Luzzardi, Cava.. (2002)
Similar documents based on text:
0.9: Carrot 2 and Language Properties - Stefanowski, Weiss (2003)
0.8: Web Search Results Clustering in Polish: Experimental.. - Weiss, Stefanowski (2003)
0.7: Web search results clustering in Polish: experimental.. - Weiss, Stefanowski (2003)
BibTeX entry:
@mastersthesis{ weiss-clustering,
author = "Dawid Weiss",
title = "A Clustering Interface For Web Search Results In Polish And English",
school = "Pozna\'n University of Technology",
year = 2001,
url = "citeseer.ist.psu.edu/541626.html" }
Citations (may not include all citations):
910
Fast algorithms for Mining Association Rules
- Agrawal, Srikant - 1994
681
The Unified Modeling Language User Guide (context) - Booch, Rumbaugh et al. - 1999
641
The anatomy of a large-scale hypertextual Web search engine
- Brin, Page - 1998
463
Term-weighting approaches in automatic text retrieval (context) - Salton, Buckley - 1988
344
The PageRank citation ranking: Bringing order to the Web
- Page, Brin et al. - 1998
153
AutoClass: A Bayesian classification system (context) - Cheeseman, Kelly et al. - 1988
123
A vector space model for automatic indexing (context) - Salton, Wong et al. - 1975
116
Multi-service search and comparison using the MetaCrawler
- Selberg, Etzioni - 1999
103
Scatter/Gather: A Cluster-based Approach to Browsing Large Docum..
- at, Pedersen et al. - 1992
68
A Technique for Measuring the Relative Size and Overlap of P.. (context) - Bharat, Broder - 1998
56
Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results
- Hearst, Pedersen - 1996
42
Incremental Clustering and Dynamic Information Retrieval
- Charikar, Chekuri et al. - 1997
39
Fast and Intuitive Clustering of Web Documents
- Zamir, Etzioni et al. - 2001
36
Advances in Knowledge Discovery and Data Mining (context) - Agrawal, Manilia et al. - 1996
29
Mechanical Translation and computational Linguistics (context) - Development, Algorithm - 1968
28
UML Toolkit (context) - Eriksson, Penker - 1998
27
Partitioning-based clustering for web document categorizatio..
- Boley, Gini et al. - 1999
12
Human performance on clustering Web pages: A preliminary stu..
- Macskassy, Banerjee et al. - 1998
11
Information Retrieval: Data Structures and Algorithms (context) - Clustering, chapter et al. - 1992
8
KEA: Practical Automatic Keyphrase Extraction
- Witten, Paynter et al. - 2001
8
Visualization of Search Results in Document Retrieval System.. (context) - Zamir, Etzioni - 1998
7
the home page finder (context) - Shakes, Langheinrich et al. - 1997
4
The Harvest Information Discovery and Access System (context) - Schwartz, Bowman et al. - 1994
2
Amalthea: An Evolving Multi-Agent Information Filtering and .. (context) - Moukas, Maes - 1998
2
Schematyczny indeks a tergo polskich form wyrazowych (context) - in - 1993
2
Visualization of Search Results: Evolution and Evaluation (context) - Cugini, Laskowski et al. - 2001
1
good overview of Polish grammar and morphology (context) - Jagodziński, the et al. - 2001
1
An excellent overview of how search engines work and a good .. (context) - Web, Finding et al. - 2000
1
Efficient algorithms for building so-called Sparse Suffix Tre.. (context) - Karkkainen, Ukkonen - 1996
1
A description of the most widely used English stemmer (context) - An, suffix et al. - 1980
1
Pitkow: Tenth WWW Survey Report (context) - Kehoe - 1999
1
An online translation engine supporting a number of most com.. (context) - engine, babelfish et al. - 2001
1
A nice description of pros and cons of using the most popula.. (context) - Guide, Search et al. - 2001
1
cluster hypothesis (context) - Information, Butterworths et al. - 1979
1
(Polish edition: Wprowadzenie do algorytmów) (context) - at, Charles et al. - 1990
1
An excellent newspaper (context) - online, archives et al. - 2001
1
custom folders (context) - search, http et al. - 2001
1
A little hazy description of the Vivisimo's algorithm (context) - overview, vivisimo et al. - 2001
1
Dogpile meta search engine (context) - search, http et al. - 2001
1
HyperText Markup Language specification from W3 Consortium (context) - Raggett, LeHors et al. - 1997
1
Framsticks: towards a simulation of a nature-like world (context) - Komosiński - 1999
1
Practically-efficient and space-economical implementation of.. (context) - Dorohonceanu, Nevill-Manning - 2000
1
TREC and BASE1 databases (context) - AcSys, University et al. - 2001
1
Yahoo Web search and directory (context) - engine, www et al. - 2001
1
Monographs on Statistics and Probability (context) - Gordon - 1999
1
Keyphrase extraction using a Bayesian classifier (context) - Frank, Paynter et al. - 2001
1
Detailed information about Grouper (context) - Zamir, Etzioni et al. - 1999
1
A search engine specialized in looking for files on FTP server.. (context) - engine, ftpsearch et al. - 2001
1
Juicer - a data mining approach to information extraction fr.. (context) - Masłowska, Weiss - 2000
1
httpwww'inflm'cmwhitepaper'htm' Topic map explained (context) - Inc, Maps et al. - 2001
1
My favorite and beloved (context) - engine, www et al. - 2001
1
Introduces path compression and suffix trees (context) - suffix, algorithm et al. - 1976
1
analysis and modeling standard (context) - Group, Language et al. - 2001
Documents on the same site (http://www.cs.put.poznan.pl/dweiss/index.php/publications/index.xml?lang=en):
Traceability: Taming uncontrolled change in software.. - Kowalczykiewicz, Weiss (2002)
Environments to Support Collaborative Software Engineering - Cornelia Boldyreff Mike
Web Search Results Clustering in Polish: Experimental.. - Weiss, Stefanowski (2003)