A. Gionis, D. Gunopulos, and N. Koudas. Efficient and Tunable Similar Set Retrieval. SIGMOD Conference. 2001.


This paper is cited in the following contexts:
Similarity Search in Sets and Categorical Data Using the.. - Nikos Mamoulis David

....high dimensional datasets of the UCI KDD Archive [22] collected by real application domains, the majority of the attributes are categorical. In addition, set data types (e.g. market basket transactions) are frequently used to describe complex data in object oriented object relational systems [11]. In this paper we show how a hierarchical index can be used to process efficiently similarity search and other related query types on sets and categorical data. In contrast to a previous method [1], the signature tree (SG tree) is suitable for a dynamic environment with frequent updates and ....

....not straightforward. To our knowledge the only method previously proposed for similarity search in set and categorical data spaces is [1]. Due to its high relevance to our approach, we describe it in detail in the following paragraph. The similarity search problem for sets has also been studied in [11], where hash based indexes which provide approximate results are proposed. In this paper, we deal with the problem of finding the exact answers to queries, thus our method is not directly comparable to these indexes. Finally, a similar hierarchical index to the SG tree was proposed in [7] ....

A. Gionis, D. Gunopulos, and N. Koudas. Efficient and Tunable Similar Set Retrieval. SIGMOD Conference. 2001.


Thesus: Organizing Web Document Collections Based.. - Halkidi, Nguyen.. (2003)

....In order to be competitive, our measure should also try to be tractable in polynomial time. In [GH 96] the idea of calculating similarities between sets, using the Jaccard coefficient, is investigated. The indexing issue for distance similarity between sets of values is treated in recent work [GGK01], again using the Jaccard coefficient to calculate the similarity. [BFS02] also investigate this with their mediator approach. The traditional cosine measure from the Information Retrieval literature (see [SM83]) has the same behavior as the Jaccard coefficient. As a matter of fact, it can be ....

A. Gionis, D. Gunopulos, N. Koudas, "Efficient and Tunable Similar Set Retrieval", ACM SIGMOD (2001).

