SETS USING P-TREES
Abstract:
Hierarchical clustering methods have attracted much attention because they give the user maximum flexibility. Rather than requiring parameters to be chosen in advance, they produce a result that represents all possible levels of granularity. In this paper, a hierarchical method is introduced that is fundamentally related to partitioning methods, such as k-medoids and k-means, as well as to a density-based method, center-defined DENCLUE. It is superior to both k-means and k-medoids in its reduction of outlier influence. Nevertheless, it avoids both the time complexity of some partition-based algorithms and the storage requirements of density-based ones. An implementation is presented that uses P-trees for efficient data storage and access and is particularly suited to spatial, stream, and multimedia data.

Many clustering algorithms require choosing parameters that determine the granularity of the result. Partitioning methods such as the k-means and k-medoids [1] algorithms require that the number of clusters, k, be specified. Density-based methods, e.g., DENCLUE [2] and DBScan [3], use input parameters that relate directly to cluster size

