Results 1–10 of 119
On Groupoid C∗-Algebras, Persistent Homology and Time-Frequency Analysis
Abstract

Cited by 44 (1 self)
We study some topological aspects in time-frequency analysis in the context of dimensionality reduction using C∗-algebras and noncommutative topology. Our main objective is to propose and analyze new conceptual and algorithmic strategies for computing topological features of datasets arising in time-frequency analysis. The main result of our work is to illustrate how noncommutative C∗-algebras and the concept of Morita equivalence can be applied as a new type of analysis layer in signal processing. From a conceptual point of view, we use groupoid C∗-algebras constructed with time-frequency data in order to study a given signal. From a computational point of view, we consider persistent homology as an algorithmic tool for estimating topological properties in time-frequency analysis. The usage of C∗-algebras in our environment, together with the problem of designing computational algorithms, naturally leads to our proposal of using AF-algebras in the persistent homology setting. Finally, a computational toy example is presented, illustrating some elementary aspects of our framework. Due to the interdisciplinary nature …
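A concrete starting point for the pipeline this abstract describes is turning time-frequency data into a point cloud whose topology can then be probed. The sketch below is illustrative only (the window length, hop size, and test signal are our choices, not the paper's): it maps a signal to its short-time Fourier magnitude frames, each frame a point in a high-dimensional space.

```python
import numpy as np

def stft_point_cloud(signal, win_len=64, hop=16):
    """Return |STFT| frames as rows of a point cloud in R^(win_len//2+1)."""
    window = np.hanning(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = signal[start:start + win_len] * window
        frames.append(np.abs(np.fft.rfft(seg)))  # magnitude spectrum of one frame
    return np.array(frames)

# A pure tone: every frame has a nearly identical spectrum, so the
# resulting point cloud is tightly concentrated.
t = np.arange(2048)
tone = np.sin(2 * np.pi * 0.05 * t)
cloud = stft_point_cloud(tone)
spread = np.ptp(np.linalg.norm(cloud - cloud.mean(axis=0), axis=1))
```

Topological summaries (e.g. persistence diagrams) would then be computed on `cloud` rather than on the raw samples.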
Persistence-based segmentation of deformable shapes
 Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on
Abstract

Cited by 24 (3 self)
In this paper, we combine two ideas: persistence-based clustering and the Heat Kernel Signature (HKS) function to obtain a multiscale, isometry-invariant mesh segmentation algorithm. The key advantages of this approach are that it is tunable through a few intuitive parameters and is stable under near-isometric deformations. Indeed, the method comes with feedback on the stability of the number of segments in the form of a persistence diagram. There are also spatial guarantees on part of the segments. Finally, we present an extension to the method which first detects regions which are inherently unstable and segments them separately. Both approaches are reasonably scalable and come with strong guarantees. We show numerous examples and a comparison with the segmentation benchmark and the curvature function.
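The persistence-based clustering idea this abstract builds on can be sketched in one dimension: process function values from high to low, start a cluster at each local peak, and merge a peak into a neighbor when its prominence falls below a threshold tau. This is a minimal illustration of that merging rule (the line-graph adjacency, toy function, and tau are our choices, not the paper's HKS pipeline):

```python
def persistence_clusters(f, tau):
    """Cluster indices of f (neighbors i, i+1 adjacent) by 0-dim persistence.

    Peaks whose prominence is below tau are merged into the neighboring
    cluster with the higher peak; labels are the surviving peak indices.
    """
    n = len(f)
    order = sorted(range(n), key=lambda i: -f[i])  # process high-to-low
    parent = [None] * n  # union-find forest; None = not yet processed

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n and parent[j] is not None]
        if not nbrs:
            parent[i] = i  # new local peak: birth of a cluster
            continue
        roots = {find(j) for j in nbrs}
        best = max(roots, key=lambda r: f[r])  # attach to the highest peak
        parent[i] = best
        for r in roots - {best}:
            if f[r] - f[i] < tau:  # low-prominence peak: merge it away
                parent[r] = best

    return [find(i) for i in range(n)]

# Two prominent peaks separated by a dip, plus one tiny bump that gets merged:
f = [0.1, 0.9, 0.3, 0.35, 0.3, 1.0, 0.2]
labels = persistence_clusters(f, tau=0.2)
n_clusters = len(set(labels))
```

Raising tau merges more peaks and yields fewer, more stable segments, which is the tunability the abstract refers to.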
Sample complexity of testing the manifold hypothesis
 Advances in Neural Information Processing Systems 23
, 2010
Abstract

Cited by 16 (1 self)
The hypothesis that high dimensional data tends to lie in the vicinity of a low dimensional manifold is the basis of a collection of methodologies termed Manifold Learning. In this paper, we study statistical aspects of the question of fitting a manifold with a nearly optimal least squared error. Given upper bounds on the dimension, volume, and curvature, we show that Empirical Risk Minimization can produce a nearly optimal manifold using a number of random samples that is independent of the ambient dimension of the space in which data lie. We obtain an upper bound on the required number of samples that depends polynomially on the curvature, exponentially on the intrinsic dimension, and linearly on the intrinsic volume. For constant error, we prove a matching minimax lower bound on the sample complexity that shows that this dependence on intrinsic dimension, volume and curvature is unavoidable. Whether the known lower bound of O(k/ɛ² + log(1/δ)/ɛ²) for the sample complexity of Empirical Risk Minimization on k-means applied to data in a unit ball of arbitrary dimension is tight has been an open question since 1997 [3]. Here ɛ is the desired bound on the error and δ is a bound on the probability of failure. We improve the best currently known upper bound [14] of O(k²/ɛ² + log(1/δ)/ɛ²) to O((k/ɛ²) min(k, log⁴(k/ɛ)) + log(1/δ)/ɛ²). Based on these results, we devise a simple algorithm for k-means and another that uses a family of convex programs to fit a piecewise linear curve of a specified length to high dimensional data, where the sample complexity is independent of the ambient dimension.
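The bounds above concern Empirical Risk Minimization over sets of k centers. As a minimal illustration of the objective being minimized (a Lloyd-style sketch with a deterministic farthest-point initialization, not the algorithm proposed in the paper; the two-cluster test data is our choice):

```python
import numpy as np

def kmeans_risk(X, centers):
    """Empirical k-means risk: mean squared distance to the nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def lloyd(X, k, iters=50):
    # deterministic farthest-point initialization, then Lloyd iterations
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(d2.argmax())])
    centers = np.array(centers)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)  # recenter cluster j
    return centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3.0, 0.3, (100, 2)), rng.normal(3.0, 0.3, (100, 2))])
centers = lloyd(X, k=2)
risk = kmeans_risk(X, centers)
```

The sample-complexity question is how many draws of `X` are needed before the empirical minimizer of `kmeans_risk` is near the population-optimal one, with failure probability at most δ.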
Two-stage framework for a topology-based projection and visualization of classified document collections
 In Proc. IEEE Symposium on Visual Analytics Science and Technology
, 2010
Abstract

Cited by 11 (0 self)
Figure 1: Island-like visualization of a document point cloud’s topological structure. By sharing similar dimensions, documents accumulate in subspaces of the high-dimensional information space. Considering dimensions as words, clusters are assumed to describe topics, i.e., islands, in the final visualization.

During the last decades, electronic textual information has become the world’s largest and most important information source available. People have added a variety of daily newspapers, books, scientific and governmental publications, blogs and private messages to this wellspring of endless information and knowledge. Since neither the existing nor the new information can be read in its entirety, computers are used to extract and visualize meaningful or interesting topics and documents from this huge information clutter. In this paper, we extend, improve and combine existing individual approaches into an overall framework that supports topological analysis of high-dimensional document point clouds given by the well-known tf-idf document-term weighting method. We show that traditional distance-based approaches fail in very high-dimensional spaces, and we describe an improved two-stage method for topology-based projections from the original high-dimensional information space to both two-dimensional (2D) and three-dimensional (3D) visualizations. To show the accuracy and usability of this framework, we compare it to methods introduced recently and apply it to complex document and patent collections.
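The tf-idf document-term weighting the framework starts from is easy to state: a term's weight in a document is its frequency there, discounted by how many documents contain it. A toy sketch (three-document corpus and the unsmoothed idf variant below are our choices; the paper may use a different smoothing convention):

```python
import math
from collections import Counter

docs = [
    "topology of point clouds",
    "visualization of document clouds",
    "topology based visualization",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for toks in tokenized for w in toks})
n_docs = len(docs)
# document frequency: in how many documents does each word occur?
df = {w: sum(w in toks for toks in tokenized) for w in vocab}

def tfidf(toks):
    """One row of the document-term matrix: tf * log(N / df)."""
    tf = Counter(toks)
    return [tf[w] / len(toks) * math.log(n_docs / df[w]) for w in vocab]

vectors = [tfidf(toks) for toks in tokenized]
```

Each document becomes a point in a space with one dimension per vocabulary word; on real corpora this space is very high-dimensional and sparse, which is exactly the regime where the abstract says naive distance-based projections break down.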
The Optimality of the Interleaving Distance on Multidimensional Persistence Modules
 In: arXiv:1106.5305
, 2011
SLIDING WINDOWS AND PERSISTENCE: AN APPLICATION OF TOPOLOGICAL METHODS TO SIGNAL ANALYSIS
Abstract

Cited by 9 (3 self)
Abstract. We develop in this paper a theoretical framework for the topological study of time series data. Broadly speaking, we describe geometrical and topological properties of sliding window embeddings, as seen through the lens of persistent homology. In particular, we show that maximum persistence at the point-cloud level can be used to quantify periodicity at the signal level, prove structural and convergence theorems for the resulting persistence diagrams, and derive estimates for their dependency on window size and embedding dimension. We apply this methodology to quantifying periodicity in synthetic data sets, and compare the results with those obtained using state-of-the-art methods in gene expression analysis. We call this new method SW1PerS, which stands for Sliding Windows and 1-dimensional Persistence Scoring.
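The sliding window (delay) embedding at the heart of this method turns a scalar time series into a point cloud: each point collects d+1 samples spaced tau apart. For a periodic signal the cloud traces out a closed loop, which is what makes the 1-dimensional persistence large. A minimal sketch (dimension, delay, and test signal are illustrative, not the paper's parameter choices):

```python
import numpy as np

def sliding_window(x, d, tau):
    """Map x to points (x[t], x[t+tau], ..., x[t+d*tau]) in R^(d+1)."""
    m = len(x) - d * tau
    return np.array([x[t : t + d * tau + 1 : tau] for t in range(m)])

t = np.linspace(0, 4 * np.pi, 400, endpoint=False)
cloud = sliding_window(np.sin(t), d=1, tau=25)

# With this window the embedding is (sin s, sin(s + phase)), an ellipse:
# every point stays a positive distance from the cloud's center, the
# loop-like geometry that 1-dimensional persistence detects.
center = cloud.mean(axis=0)
radii = np.linalg.norm(cloud - center, axis=1)
```

For a non-periodic signal the same embedding collapses toward a topologically trivial cloud, so maximum persistence separates the two cases.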
Čech Type Approach to Computing Homology of Maps, in preparation
Abstract

Cited by 8 (3 self)
Abstract. A new approach to algorithmic computation of the homology of spaces and maps is presented. The key point of the approach is a change in the representation of sets. The proposed representation is based on a combinatorial variant of the Čech homology and the Nerve Theorem. In many situations this change of the representation of the input may help in bypassing the problems with the complexity of the standard homology algorithms by reducing the size of necessary input. We show that the approach is particularly advantageous in the case of homology map algorithms. 1. Introduction. Effective algorithms for computing homology of spaces and maps are needed in computer assisted proofs in dynamics based on topological tools (see [4, 17, 22, 23] and references therein). Recently, homology algorithms have also been used in robotics [30], material structure analysis [10, 11] and image recognition [3, 36].
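The combinatorial step behind a Čech/nerve representation is simple to state: given a cover of a space by sets, the nerve has one vertex per cover element and one simplex for every collection of elements with nonempty common intersection. A minimal sketch (the discretized circle and four-arc cover below are our illustrative example, not taken from the paper):

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Return simplices (as index tuples) of the nerve of `cover`."""
    simplices = []
    for k in range(1, max_dim + 2):  # simplices with k vertices
        for idx in combinations(range(len(cover)), k):
            if set.intersection(*(cover[i] for i in idx)):  # nonempty overlap
                simplices.append(idx)
    return simplices

# Four arcs covering a discretized circle (points 0..11), with consecutive
# arcs overlapping: the nerve is a 4-cycle, i.e. topologically a circle.
cover = [set(range(0, 4)), set(range(3, 7)), set(range(6, 10)), {9, 10, 11, 0}]
simplices = nerve(cover)
edges = [s for s in simplices if len(s) == 2]
triangles = [s for s in simplices if len(s) == 3]
```

Under the hypotheses of the Nerve Theorem, the homology of this much smaller complex agrees with that of the covered space, which is the input-size reduction the abstract refers to.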
LIMIT THEORY FOR POINT PROCESSES IN MANIFOLDS
, 2013
Abstract

Cited by 6 (2 self)
Let Y_i, i ≥ 1, be i.i.d. random variables having values in an m-dimensional manifold M ⊂ R^d and consider sums ∑_{i=1}^n ξ(n^{1/m} Y_i, {n^{1/m} Y_j}_{j=1}^n), where ξ is a real-valued function defined on pairs (y, Y), with y ∈ R^d and Y ⊂ R^d locally finite. Subject to ξ satisfying a weak spatial dependence and continuity condition, we show that such sums satisfy weak laws of large numbers, variance asymptotics and central limit theorems. We show that the limit behavior is controlled by the value of ξ on homogeneous Poisson point processes on m-dimensional hyperplanes tangent to M. We apply the general results to establish the limit theory of dimension and volume content estimators, Rényi and Shannon entropy estimators and clique counts in the Vietoris–Rips complex on {Y_i}_{i=1}^n.
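One of the statistics this limit theory covers, clique counts in the Vietoris–Rips complex, is easy to compute directly on a small sample. The sketch below (points on the unit circle in R², a 1-dimensional manifold in the sense above; the scale r and sample size are illustrative) counts k-cliques, i.e. k-tuples with all pairwise distances at most r:

```python
import math
import random
from itertools import combinations

def rips_clique_count(points, r, k):
    """Number of k-cliques whose pairwise distances are all <= r."""
    count = 0
    for idx in combinations(range(len(points)), k):
        if all(math.dist(points[i], points[j]) <= r
               for i, j in combinations(idx, 2)):
            count += 1
    return count

random.seed(0)
points = [(math.cos(a), math.sin(a))
          for a in (random.uniform(0, 2 * math.pi) for _ in range(60))]
edges = rips_clique_count(points, r=0.3, k=2)       # 2-cliques = edges
triangles = rips_clique_count(points, r=0.3, k=3)   # 3-cliques = triangles
```

The theorems concern the asymptotics of such counts as n grows with the scale rescaled by n^(1/m); locally, the circle looks like its tangent line, which is why the limits are governed by Poisson processes on tangent hyperplanes.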
Dualities in persistent (co)homology
, 2010
Abstract

Cited by 6 (3 self)
Abstract. We consider sequences of absolute and relative homology and cohomology groups that arise naturally for a filtered cell complex. We establish algebraic relationships between their persistence modules, and show that they contain equivalent information. We explain how one can use the existing algorithm for persistent homology to process any of the four modules, and relate it to a recently introduced persistent cohomology algorithm. We present experimental evidence for the practical efficiency of the latter algorithm.
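The "existing algorithm" referred to is the standard column-reduction of the boundary matrix over Z/2: reduce each column by earlier ones until its lowest 1 is unique, and pair that pivot row (a birth) with the column (its death). A minimal sketch on an illustrative filtration (a triangle whose interior is added last, so a 1-cycle is born and then killed):

```python
# simplices in filtration order (vertex tuples, sorted)
filtration = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2), (0, 1, 2)]
index = {s: i for i, s in enumerate(filtration)}

def boundary(s):
    """Z/2 boundary of a simplex, as a set of filtration indices."""
    if len(s) == 1:
        return set()
    return {index[s[:k] + s[k + 1:]] for k in range(len(s))}

columns = [boundary(s) for s in filtration]  # sparse Z/2 boundary matrix

low_to_col = {}  # pivot row -> column having it as its lowest 1
pairs = []       # (birth index, death index)
for j, col in enumerate(columns):
    while col and max(col) in low_to_col:
        col ^= columns[low_to_col[max(col)]]  # add earlier column mod 2
    if col:
        low_to_col[max(col)] = j
        pairs.append((max(col), j))

# unpaired simplices carry essential (never-dying) classes
essential = [i for i in range(len(filtration))
             if i not in low_to_col.values() and i not in dict(pairs)]
```

Here vertex 0 remains unpaired (the essential connected component), vertices 1 and 2 die when the first two edges merge components, and the loop born with edge (0, 2) is killed by the triangle. The paper's point is that the same reduction, run on suitably transposed or relative matrices, computes all four (co)homology modules.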
Curvature Analysis of Frequency Modulated Manifolds in Dimensionality Reduction
Abstract

Cited by 5 (5 self)
Recent advances in the analysis of high-dimensional signal data have triggered an increasing interest in geometry-based methods for nonlinear dimensionality reduction (NDR). In many applications, high-dimensional datasets typically contain redundant information, and NDR methods are important for an efficient analysis of their properties. During the last few years, concepts from differential geometry were used to create a whole new range of NDR methods. In the construction of such geometry-based strategies, a natural question is to understand their interaction with classical and modern signal processing tools (convolution transforms, Fourier analysis, wavelet functions). In particular, an important task is the analysis of the incurred geometrical deformation when applying signal transforms to the elements of a dataset. In this paper, we propose the concepts of frequency modulation maps and modulation manifolds for the construction of particular datasets relevant in signal processing and NDR. Moreover, we design a numerical algorithm for analyzing geometrical properties of the modulation manifolds, with a particular focus on their scalar curvature. Finally, in our numerical examples, we apply the resulting geometry-based analysis algorithm to two model problems, where we present geometrical and topological effects of relevance in manifold learning.
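A frequency modulation map in the spirit of this abstract sends each parameter value to a sampled waveform whose instantaneous frequency depends on that parameter, so the waveforms trace out a low-dimensional "modulation manifold" inside a high-dimensional sample space. The sketch below is illustrative only (the carrier frequency, modulation depth, and quadratic phase law are our assumptions, not the paper's construction):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 256, endpoint=False)

def modulation_map(alpha, carrier=20.0, depth=5.0):
    """Map a scalar parameter to a frequency-modulated waveform in R^256."""
    # instantaneous frequency carrier + 2*depth*alpha*t, via its phase integral
    phase = 2 * np.pi * (carrier * t + depth * alpha * t ** 2)
    return np.sin(phase)

alphas = np.linspace(0.0, 1.0, 50)
manifold = np.array([modulation_map(a) for a in alphas])

# Consecutive parameter values give nearby waveforms: a minimal sanity
# check that the map traces out a continuous curve in R^256.
step_dists = np.linalg.norm(np.diff(manifold, axis=0), axis=1)
```

Curvature analysis of the kind the paper describes would then be carried out on this embedded curve (or, with a multi-dimensional parameter, surface) rather than on the raw signals.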