Results 1 - 10 of 330,663

BIRCH: an efficient data clustering method for very large databases

by Tian Zhang, Raghu Ramakrishnan, Miron Livny - In Proc. of the ACM SIGMOD Intl. Conference on Management of Data (SIGMOD), 1996
"... Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of ..."
Cited by 576 (2 self)
of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. BIRCH incrementally and dynamically clusters incoming

Novel data clustering methods and applications

by Sijia Liu, 2011

Efficient and Effective Clustering Methods for Spatial Data Mining

by Raymond T. Ng, Jiawei Han, 1994
"... Spatial data mining is the discovery of interesting relationships and characteristics that may exist implicitly in spatial databases. In this paper, we explore whether clustering methods have a role to play in spatial data mining. To this end, we develop a new clustering method called CLARANS which ..."
Cited by 709 (37 self)

Clustering by passing messages between data points

by Brendan J. Frey, Delbert Dueck - Science, 2007
"... Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initi ..."
Cited by 696 (8 self)
if that initial choice is close to a good solution. We devised a method called “affinity propagation,” which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges

Consistency of spectral clustering

by Ulrike von Luxburg, Mikhail Belkin, Olivier Bousquet, 2004
"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."
Cited by 572 (15 self)
under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis

A Comparison of Categorical Attribute Data Clustering Methods

by Tomi Kinnunen, Kong Aik Lee, Haizhou Li
"... Clustering data in Euclidean space has a long tradition and there has been considerable attention on analyzing several different cost functions. Unfortunately these results rarely generalize to clustering of categorical attribute data. Instead, a simple heuristic k-modes is the most commonl ..."
on well known data sets. The proposed method provides better clustering quality than the other iterative methods at the cost of higher time complexity.

Distributional Clustering Of English Words

by Fernando Pereira, Naftali Tishby, Lillian Lee - In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993
"... We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the si ..."
Cited by 629 (27 self)
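
The snippet above mentions representing words by relative frequency distributions over contexts and comparing them with relative entropy. A minimal Python sketch of that comparison might look like the following. It is only an illustration, not the authors' method: the function names, the toy noun/verb data, and the choice to return infinity for unseen contexts (rather than smoothing) are all assumptions made here.

```python
import math
from collections import Counter


def context_distribution(contexts):
    """Relative frequency of each context in which a word was observed."""
    counts = Counter(contexts)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def relative_entropy(p, q):
    """Relative entropy (KL divergence) D(p || q) in bits.

    Assumption: returns infinity when q assigns zero mass to a context
    that p uses; a real system would likely smooth the estimates instead.
    """
    total = 0.0
    for context, pc in p.items():
        qc = q.get(context, 0.0)
        if qc == 0.0:
            return math.inf
        total += pc * math.log2(pc / qc)
    return total


# Hypothetical toy data: verbs that each noun appeared as the object of.
wine = context_distribution(["drink", "drink", "pour", "buy"])
beer = context_distribution(["drink", "pour", "pour", "buy"])

print(relative_entropy(wine, beer))
```

Relative entropy is asymmetric, so D(wine || beer) and D(beer || wine) generally differ; a clustering procedure built on it has to fix which argument plays the role of the word and which the cluster.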

Data clustering: A review

by A. K. Jain, et al.
Cited by 1940 (14 self)

Model-Based Clustering, Discriminant Analysis, and Density Estimation

by Chris Fraley, Adrian E. Raftery - Journal of the American Statistical Association, 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Cited by 573 (29 self)

OPTICS: Ordering Points To Identify the Clustering Structure

by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander, 1999
"... Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Cited by 527 (51 self)