  Extension of SIAM paper VizCluster and Its Application on Clustering Gene Expression Data

by Li Zhang, Chun Tang, Yuqing Song, Aidong Zhang, Murali Ramanathan
http://www.cs.buffalo.edu/~ys2/publication/dapd2002.pdf

Abstract:

Visualization enables us to find structures, features, patterns and relationships in a dataset by presenting the data in various graphical forms with possible interactions. A visualization can provide a qualitative overview of large and complex datasets, can summarize data, and can assist in identifying regions of interest and appropriate parameters for more focused quantitative analysis. Recently developed DNA microarray technology can measure the expression levels of thousands of genes simultaneously. It has already had a significant impact on the field of bioinformatics, requiring innovative techniques to efficiently and effectively extract, analyze and visualize these fast-growing data. In this paper, we present a dynamic interactive visualization environment, VizCluster, and its application to clustering gene expression data. VizCluster takes advantage of graphical visualization methods to reveal the underlying data patterns. It combines the merits of both high-dimensional projection scatter plots and parallel coordinate plots. At its core lies a nonlinear projection that maps n-dimensional vectors onto two-dimensional points. To preserve information at different scales while reducing the clutter that overlapping lines typically cause in parallel coordinate plots, a zip zooming viewing method is proposed. Integrated with other features, VizCluster is designed to give a simple, fast, intuitive and yet powerful view of the dataset. Its primary applications are the classification of samples and the evaluation of gene clusters for microarray datasets. Other applications include cluster detection and validation on low-dimensional datasets. We demonstrate that the VizCluster approach is promising for analyzing and visualizing microarray datasets and that further development is worthwhile.
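
The abstract does not give the formula of the projection, so the following is only an illustrative sketch of how an n-dimensional vector can be reduced to a two-dimensional point. It uses a radial (star-coordinate style) mapping as a stand-in: each dimension is assigned a direction on the unit circle and the coordinates are summed as weighted complex numbers. The function name `radial_projection` and the `weights` parameter are hypothetical, and this stand-in is linear, whereas the paper describes its own projection as nonlinear.

```python
import cmath
import math


def radial_projection(vector, weights=None):
    """Map an n-dimensional vector to a 2-D point.

    Hypothetical stand-in, not the VizCluster mapping itself:
    dimension k is assigned the unit direction at angle 2*pi*k/n,
    and the point is the weighted sum of those directions.
    """
    n = len(vector)
    if n == 0:
        raise ValueError("vector must have at least one dimension")
    if weights is None:
        weights = [1.0] * n  # assumption: equal weight per dimension
    if len(weights) != n:
        raise ValueError("weights must match the vector length")

    # Sum the coordinates as complex numbers, one direction per dimension.
    point = sum(
        w * x * cmath.exp(2j * math.pi * k / n)
        for k, (x, w) in enumerate(zip(vector, weights))
    )
    return (point.real, point.imag)


# Usage: project a few 4-dimensional expression-like profiles.
profiles = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
points = [radial_projection(p) for p in profiles]
```

In a mapping of this kind, vectors with equal values in every dimension collapse to the origin, so it emphasizes the shape of a profile over its overall level. Adjusting the per-dimension weights interactively is one way such a view could be made dynamic.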
