Clustering algorithms are exploratory data analysis tools that have proved essential for gaining insight into various aspects and relationships of the underlying systems. In this paper we present gCLUTO, a stand-alone clustering software package that serves as an easy-to-use platform combining clustering algorithms with a number of analysis, reporting, and visualization tools to aid interactive exploration and clustering-driven analysis of large datasets. gCLUTO provides a wide range of algorithms that operate either directly on the original feature-based representation of the objects or on object-to-object similarity graphs, and that are capable of analyzing different types of datasets and finding clusters with different characteristics. In addition, gCLUTO implements a project-oriented workflow that eases the process of data analysis.