Maarek, Y.S., Fagin, R., Ben-Shaul, I.Z., Pelleg, D., Ephemeral document clustering for web applications, IBM Research Report RJ 10186, April, 2000.
....its own. Arguably the first query result visualization algorithm based on the paradigm of clustering was presented in the Scatter/Gather system [1], but it was not until the Suffix Tree Clustering (STC) technique appeared [7] that the field of search results clustering, also called ephemeral clustering [4], had been given a substantial momentum. STC is an algorithm with at least two distinguishing features: its time complexity is linear with respect to the number of clustered snippets, and it operates on phrases present in the text, in contrast to most previous efforts, built on top of standard IR ....
....clusters a and b defined by the formula: similarity(a, b) = 1 if |a ∩ b| / |a| > α and |a ∩ b| / |b| > α, 0 otherwise, where α in the above formula denotes an arbitrarily chosen merge threshold. Unfortunately, due to space constraints, we are not able to give a broader insight into algorithms that followed STC (refer to [4] for a review of existing methods), but they all had one common drawback: they were designed, implemented and evaluated for English only. This puts in question their applicability to other languages, because, as we are about to show, the properties of a language severely affect an algorithm's performance. ....
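A minimal sketch of the binary merge similarity mentioned in the context above, assuming it is the overlap-ratio test usually described for STC base clusters: two clusters count as similar only when their shared documents exceed the threshold as a fraction of each cluster. The function name `merge_similarity`, the representation of clusters as sets of document ids, and the default threshold of 0.5 are illustrative assumptions, not taken from the cited paper.

```python
def merge_similarity(a: set, b: set, threshold: float = 0.5) -> int:
    """Return 1 if base clusters a and b overlap enough to be merged, else 0.

    a, b      -- sets of document ids (hypothetical representation)
    threshold -- merge threshold; 0.5 is an assumed default
    """
    if not a or not b:
        # An empty cluster has no meaningful overlap ratio.
        return 0
    overlap = len(a & b)
    # The overlap must exceed the threshold relative to BOTH clusters.
    both_exceed = overlap / len(a) > threshold and overlap / len(b) > threshold
    return int(both_exceed)


if __name__ == "__main__":
    print(merge_similarity({1, 2, 3}, {2, 3, 4}))        # two of three documents shared
    print(merge_similarity({1, 2, 3, 4}, {4, 5, 6, 7}))  # one of four documents shared
```

Requiring the ratio to hold for both clusters keeps a small cluster from being merged into a much larger one merely because it is a subset of it.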
Maarek Y., Fagin R., Ben-Shaul I., et al. (2000) Ephemeral Document Clustering for Web Applications. IBM research report RJ 10186
....to be reformulated. Because the clustering process is performed dynamically for each query, the discovered set of groups is apt to depict the real structure of results, not some predefined categories (to differentiate this process from off-line clustering or classification, we call it ephemeral [1], or on-line clustering [2]). Our research follows this direction, also taking into account certain characteristic properties of the Polish language: rich inflection (words have different suffixes depending on their role in a sentence) and less strict word order in a sentence (compared to English) ....
....algorithm was used. Undesired and troublesome high dimensionality of term frequency vectors was addressed in [6], where two derivations of graph partitioning were presented. Simple terms were replaced with lexical affinities (pairs of words commonly appearing together in the text) in [1], with a modification of AHC as the clustering algorithm. A different approach to finding a similarity measure between documents was introduced in the Grouper [2] and MSEEC [7] systems. Both of these discover phrases shared by document references in the search results and perform clustering according to ....
Maarek, Y.S., Fagin, R., Ben-Shaul, I.Z., Pelleg, D.: Ephemeral document clustering for web applications. Technical Report RJ 10186, IBM Research (2000)
....on its own. Arguably the first algorithm to deal with clustering and browsing large collections was presented in the Scatter/Gather system [1], but it was not until the Suffix Tree Clustering (STC) technique appeared [7] that the field of search results clustering, also called ephemeral clustering [4], had been given a substantial momentum. STC is an algorithm with at least two distinguishing features: its time complexity is linear with respect to the number of clustered snippets, and it operates on phrases present in the text, in contrast to most previous efforts, built on top of standard IR ....
....to the number of clustered snippets, and it operates on phrases present in the text, in contrast to most previous efforts, built on top of standard IR measures of term frequency distribution. Unfortunately, we are not able to give a broader insight into algorithms that followed STC (refer to [4] for a review of existing methods), but they all had one common drawback: they were designed, implemented and evaluated for English only. This puts in question their applicability to other languages, because, as we are about to show, the properties of a language severely affect an algorithm's ....
Maarek Y., Fagin R., Ben-Shaul I., et al. (2000) Ephemeral Document Clustering for Web Applications. IBM research report RJ 10186
No context found.
Maarek, Y.S., Fagin, R., Ben-Shaul, I.Z., Pelleg, D., Ephemeral document clustering for web applications, IBM Research Report RJ 10186, April, 2000.
No context found.
Y. S. Maarek, R. Fagin, I. Z. Ben-Shaul, and D. Pelleg. Ephemeral document clustering for web applications. Technical Report RJ 10186, IBM Research, 2000.
No context found.
Y. S. Maarek, R. Fagin, I. Z. Ben-Shaul, and D. Pelleg. Ephemeral document clustering for web applications. Technical Report RJ 10186, IBM Research, 2000.
No context found.
R. Fagin, Y. Maarek, I. Ben-Shaul, and D. Pelleg. Ephemeral document clustering for web applications. Technical Report 10186, IBM, 2000.
No context found.
Maarek, Y.S., Fagin, R., Ben-Shaul, I.Z., and Pelleg, D., Ephemeral Document Clustering for Web Applications. Research Report RJ10186, IBM Almaden Res. Ctr., San Jose, CA, 2000. Available at http://www.almaden.ibm.com/cs/people/fagin/cluster.ps
No context found.
Yoelle S. Maarek, Ronald Fagin, Israel Z. Ben-Shaul, and Dan Pelleg. Ephemeral document clustering for web applications. Technical Report RJ 10186, IBM Research, 2000.