V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering large datasets in arbitrary metric spaces. International Conference on Data Engineering, pages 502--511, 1999.
....of the cluster. 2. Cluster the rest of the points in main memory, and if a cluster is very tight, replace the corresponding set of points by its statistics; this is a compression set. 3. Consider merging compression sets. For arbitrary metric spaces, a single-pass algorithm is developed in [44], which uses a data structure like an R-tree to store clusters and uses summaries of data in a manner similar to [16]. Regarding the question of finding similarity measures according to which to cluster objects, in [33] methods for determining similar regions in tabular data (given in a matrix) ....
....be missed) are avoided by density-biased sampling. The goal is to under-sample dense regions and over-sample sparse regions of the data. A memory-efficient single-pass algorithm is proposed that approximates density-biased sampling. An excellent detailed exposition of the algorithms in [51], [16] and [44] can be found in [84]. An excellent survey of the algorithms in [48, 2, 11, 13] is given in [41]. 7 Mining the web. The challenge in mining the web for useful information is the huge size and unstructured placement of data. Search engines aim to search the web for a specific topic and give the ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. L. Powell, and J. C. French. Clustering large datasets in arbitrary metric spaces. In ICDE, pages 502-511, 1999.
....high-dimensional feature spaces (HDFS), it must be used in conjunction with a dimensionality reduction technique in order to exploit the correlations in data and hence achieve further scalability. This approach is commonly used in both multimedia retrieval ([43, 103, 76, 142]) and data mining ([47, 8, 49]) applications. The idea is to first reduce the dimensionality of the data and then index the reduced space using a multidimensional index structure [43]. Most of the information in the dataset is condensed to a few dimensions (the first few principal components (PCs)) by using principal component ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering large datasets in arbitrary metric spaces. Proc. of ICDE, 1999.
....literature describes various interesting data clustering approaches, including their efficient and refined implementations [5], [8], [11], [12], [16], [17], [24]. Because our main interest lies in visualizing clusters, we focus on the problem of clustering large data sets in coordinate space [7], also referred to as the Euclidean space, in which data objects can be represented as vectors. Unlike data sets in a distance space [7], also referred to as the data domain or the arbitrary metric space, the vector representation gives access to various efficiently implemented vector operations ....
....[11], [12], [16], [17], [24]. Because our main interest lies in visualizing clusters, we focus on the problem of clustering large data sets in coordinate space [7], also referred to as the Euclidean space, in which data objects can be represented as vectors. Unlike data sets in a distance space [7], also referred to as the data domain or the arbitrary metric space, the vector representation gives access to various efficiently implemented vector operations (e.g. addition, multiplication, dot product, etc.), which enables one to calculate simplified representations of complex data subregions ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. "Clustering Large Datasets in Arbitrary Metric Spaces." Technical report, University of Wisconsin-Madison, 1998.
....can be considered as a constraint. iii) A very general solution to the problem of working in non-metric spaces is provided by FastMap [3], where each object is mapped into a point of a Euclidean space, trying to preserve mutual distances. Such a technique is applied, for instance, by the Bubble FM [5] clustering algorithm. The main drawback of this approach is due to the approximations induced by the mapping: in some applications (such as clustering) it can force the coexistence of computations over the original space (to obtain precise results) and over the mapped metric space (because the ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. L. Powell, and J. C. French. Clustering large datasets in arbitrary metric spaces. In Proceedings of ICDE
....data, do not perform equally well on high-dimensional data. Some of the most popular database indexing structures used for similarity querying are the R*-tree [5], X-tree [4], SS-tree [20], SR-tree [12], TV-tree [16], the Pyramid Technique [2], and PK-tree [21]. BUBBLE and BUBBLE-FM [9] are two recently proposed algorithms for clustering datasets in arbitrary distance spaces, based on BIRCH [22], a scalable clustering algorithm. BUBBLE-FM uses FastMap to improve scalability. 3 Embedding Methods. FastMap. The approach taken in FastMap [7] for embedding points into k-dimensional ....
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., French, J.: Clustering large datasets in arbitrary metric spaces. Proc. 15th ICDE Conf. (1999) 502--511
....20-30 dimensions), a simple sequential scan usually performs better at higher dimensionalities [6, 43]. To scale to higher dimensionalities, a commonly used approach is dimensionality reduction [20]. This technique has been proposed for both multimedia retrieval ([17, 36, 27, 42]) and data mining ([18, 4, 21]) applications. The idea is to first reduce the dimensionality of the data and then index the reduced space using a multidimensional index structure [17]. Most of the information in the dataset is condensed to a few dimensions (the first few principal components (PCs)) by using principal component ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering large datasets in arbitrary metric spaces. Proc. of ICDE, 1999.
No context found.
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., and French, J. (1999). Clustering Large Datasets in Arbitrary Metric Spaces. In 15th International Conference on Data Engineering (ICDE'99), pages 502--511, Sydney.
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proceedings of the IEEE International Conference on Data Engineering, Sydney, March 1999.
No context found.
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering large datasets in arbitrary metric spaces. Technical report, University of Wisconsin-Madison, 1998.
....problem that users encounter. The terms in the vocabulary capture semantic nuances and as a result are often related in subtle ways. The figure shows how we are attempting to help users visualize this aspect of the vocabulary. Using techniques for automating the construction of authority files [6, 7, 8, 10] that we developed in collaboration with astronomers studying publication trends [figure: vocabulary terms related to CHEMILUMINESCENCE] ....
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering Large Datasets in Arbitrary Metric Spaces. In 15th International Conference on Data Engineering (ICDE'99), pages 502--511, Sydney, March 1999.
No context found.
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French. Clustering large datasets in arbitrary metric spaces. International Conference on Data Engineering, pages 502--511, 1999.
No context found.
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French, "Clustering large datasets in arbitrary metric spaces," in Proceedings of the Fifteenth International Conference on Data Engineering, pp. 502--511, 1999.
No context found.
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A.L., French, J.C., Clustering Large Datasets in Arbitrary Metric Spaces. Proceedings of the 15th International Conference on Data Engineering (ICDE), pp.502-511, 1999
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering, ICDE, 23-26 March 1999.
No context found.
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French, "Clustering large datasets in arbitrary metric spaces," in Proceedings of the Fifteenth International Conference on Data Engineering, pp. 502--511, 1999.
No context found.
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A.L., French, J.C.: Clustering large datasets in arbitrary metric spaces. In: Proceedings of the 15th International Conference on Data Engineering (ICDE 1999).
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering, ICDE, 23-26 March 1999.
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L. Powell, and James C. French. Clustering large datasets in arbitrary metric spaces. In Proceedings of the 15th International Conference on Data Engineering (ICDE 1999), pages 502--511, Sydney, Australia, March 1999.
No context found.
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., French, J.: Clustering large datasets in arbitrary metric spaces. Proc. 15th ICDE Conf. (1999) 502--511
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. Technical Report 250, University of Wisconsin-Madison, 1998.
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering, ICDE, 23-26 March 1999.
No context found.
V. Ganti, R. Ramakrishnan, J. Gehrke, A. Powell and J. French. Clustering large datasets in arbitrary metric spaces. In the Proceedings of International Conference on Data Engineering, 1999.
No context found.
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., and French, J. Clustering large datasets in arbitrary metric spaces. In Proceedings of the 15th Int'l Conf. on Data Engineering, 1999, 502-511.
No context found.
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L. Powell, and James C. French. Clustering large datasets in arbitrary metric spaces. In ICDE, pages 502-511, 1999.