Exact Indexing of Dynamic Time Warping
, 2002
Cited by 350 (34 self)
The problem of indexing time series has attracted much research interest in the database community. Most algorithms used to index time series utilize the Euclidean distance or some variation thereof. However, it has been forcefully shown that the Euclidean distance is a very brittle distance measure. Dynamic Time Warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis.
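To make the contrast between the two distance measures concrete, here is a minimal Python sketch of the textbook dynamic-programming form of DTW next to the Euclidean distance. It is an illustration only: it does not implement the indexing technique the paper proposes, and the function names and sample series are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; assumes both series have the same length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with squared point costs."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


# Two copies of the same bump, one shifted by a single time step.
series_a = [0, 0, 1, 2, 1, 0, 0]
series_b = [0, 1, 2, 1, 0, 0, 0]
print(euclidean_distance(series_a, series_b))  # penalizes the phase shift
print(dtw_distance(series_a, series_b))        # warps the shift away
```

On this pair the Euclidean distance is nonzero purely because of the one-step shift, while DTW finds an alignment with zero cost.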
Computational Discovery of Gene Modules, Regulatory Networks and Expression Programs
, 2007
Cited by 236 (17 self)
High-throughput molecular data are revolutionizing biology by providing massive amounts of information about gene expression and regulation. Such information is applicable both to furthering our understanding of fundamental biology and to developing new diagnostic and treatment approaches for diseases. However, novel mathematical methods are needed for extracting biological knowledge from high-dimensional, complex, and noisy data sources. In this thesis, I develop and apply three novel computational approaches for this task. The common theme of these approaches is that they seek to discover meaningful groups of genes, which confer robustness to noise and compress complex information into interpretable models. I first present the GRAM algorithm, which fuses information from genome-wide expression and in vivo transcription factor-DNA binding data to discover regulatory networks of
From patterns to pathways: gene expression data analysis comes of age.
- Nature Genetics
, 2002
Continuous representations of time-series gene expression data
- J COMPUT BIOL
, 2003
Cited by 96 (11 self)
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data, and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10–15% less error than the previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knock-out data that produces biologically meaningful results.
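As a rough illustration of modeling one expression profile as a cubic spline and reading off an unobserved time point, here is a minimal Python sketch. It fits a plain natural cubic spline through a single profile; it does not implement the class-constrained coefficient estimation described in the abstract, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    Assumes xs is a strictly increasing list with at least two points.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; both ends stay 0

    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
        rhs = [
            6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1] - (ys[k + 1] - ys[k]) / h[k])
            for k in range(n - 1)
        ]
        # Forward elimination (Thomas algorithm).
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        # Back substitution.
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(t):
        # Pick the knot interval containing t, clamped to the outermost ones.
        i = min(max(bisect_right(xs, t) - 1, 0), n - 1)
        left, right = xs[i + 1] - t, t - xs[i]
        return (
            m[i] * left ** 3 / (6.0 * h[i])
            + m[i + 1] * right ** 3 / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right
        )

    return evaluate


# Hypothetical, nonuniformly sampled profile with no measurement at t = 3.
times = [0.0, 1.0, 2.0, 4.0]
levels = [0.0, 1.0, 0.0, 1.0]
profile = natural_cubic_spline(times, levels)
estimate_at_3 = profile(3.0)
```

Because the spline is defined for every time value, nonuniform sampling needs no special handling: the estimate at t = 3 comes from the same smooth curve that passes through the observed points.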
A new approach to analyzing gene expression time series data
, 2002
Cited by 91 (5 self)
Principled methods for estimating unobserved time points, clustering, and aligning microarray gene expression time series are needed to make such data useful for detailed analysis. Datasets measuring temporal behavior of thousands of genes offer rich opportunities for computational biologists. For example, Dynamic Bayesian Networks may be used to build models and try to understand how genetic responses unfold. However, such modeling frameworks need a sufficient quantity of data in the appropriate format. Current gene expression time-series data often do not meet these requirements, since they may be missing data points, sampled non-uniformly, and measure biological processes that exhibit temporal variation.
Making Time-series Classification More Accurate Using Learned Constraints
, 2004
Cited by 82 (18 self)
It has long been known that Dynamic Time Warping (DTW) is superior to Euclidean distance for classification and clustering of time series. However, until lately, most research has utilized Euclidean distance because it is more efficiently calculated. A recently introduced technique that greatly mitigates DTW's demanding CPU time has sparked a flurry of research activity. However, the technique and its many extensions still only allow DTW to be applied to moderately large datasets. In addition, almost all of the research on DTW has focused exclusively on speeding up its calculation; there has been little work done on improving its accuracy. In this work, we target the accuracy aspect of DTW performance and introduce a new framework that learns arbitrary constraints on the warping path of the DTW calculation. Apart from improving the accuracy of classification, our technique as a side effect speeds up DTW by a wide margin as well. We show the utility of our approach on datasets from diverse domains and demonstrate significant gains in accuracy and efficiency.
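To show what a constraint on the warping path looks like, and why it also reduces work, here is a minimal Python sketch of DTW restricted to a fixed-width band around the diagonal (a Sakoe-Chiba-style window). This is a stand-in, not the paper's method: the abstract describes learning arbitrary constraints, whereas the band here has one hand-picked width. The function name and the `window` parameter are hypothetical.

```python
import math


def dtw_banded(a, b, window):
    """DTW where cell (i, j) is allowed only if |i - j| <= window."""
    n, m = len(a), len(b)
    # The band must be at least wide enough to reach the final corner cell.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are visited, so a narrow band
        # means fewer cells to fill as well as a restricted path.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


series_a = [0, 0, 1, 2, 1, 0, 0]
series_b = [0, 1, 2, 1, 0, 0, 0]
print(dtw_banded(series_a, series_b, window=0))  # path forced onto the diagonal
print(dtw_banded(series_a, series_b, window=1))  # one step of warping allowed
```

With `window=0` and equal-length series the path cannot leave the diagonal, so the result coincides with the Euclidean distance; widening the band by one step is enough to absorb the one-step shift in this example.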
Analysis Techniques for Microarray Time-Series Data (Extended Abstract)
- J. Comput. Biol
, 2000
Cited by 68 (3 self)
Vladimir Filkov, Steven Skiena, Jizu Zhi. Dept. of Computer Science and Center for Biotechnology, State University of New York, Stony Brook, NY 11794-4400. {vfilkov|skiena|zjizu}@cs.sunysb.edu. September 27, 2000.
Indexing large human-motion databases
- In Proc. 30th VLDB Conf
, 2004
Cited by 64 (6 self)
Data-driven animation has become the industry standard for computer games and many animated movies and special effects. In particular, motion capture data recorded from live actors is the most promising approach offered thus far for animating realistic human characters. However, the manipulation of such data for general use and re-use is not yet a solved problem. Many of the existing techniques for editing motion rely on indexing for annotation, segmentation, and re-ordering of the data. Euclidean distance is inappropriate for solving these indexing problems because of the inherent variability found in human motion. The limitations of Euclidean distance stem from the fact that it is very sensitive to distortions in the time axis. A partial solution to this problem, Dynamic Time Warping (DTW), aligns the time axis
Path similarity skeleton graph matching
- IEEE TRANS. PAMI
, 2008
Cited by 53 (8 self)
This paper proposes a novel graph matching algorithm and applies it to shape recognition based on object silhouettes. The main idea is to match skeleton graphs by comparing the geodesic paths between skeleton endpoints. In contrast to typical tree or graph matching methods, we do not consider the topological graph structure. Our approach is motivated by the fact that visually similar skeleton graphs may have completely different topological structures. The proposed comparison of geodesic paths between endpoints of skeleton graphs yields correct matching results in such cases. The skeletons are pruned by contour partitioning with Discrete Curve Evolution, which implies that the endpoints of skeleton branches correspond to visual parts of the objects. The experimental results demonstrate that our method is able to produce correct results in the presence of articulations, stretching, and contour deformations.