Results 1–10 of 30
On Learning, Representing and Generalizing a Task in a Humanoid Robot
IEEE Transactions on Systems, Man and Cybernetics, Part B (Special Issue), 2007
Cited by 239 (48 self)
We present a programming-by-demonstration framework for generically extracting the relevant features of a given task and for addressing the problem of generalizing the acquired knowledge to different contexts. We validate the architecture through a series of experiments, in which a human demonstrator teaches a humanoid robot simple manipulatory tasks. A probability-based estimation of the relevance is suggested by first projecting the motion data onto a generic latent space using principal component analysis. The resulting signals are encoded using a mixture of Gaussian/Bernoulli distributions (Gaussian mixture model/Bernoulli mixture model). This provides a measure of the spatiotemporal correlations across the different modalities collected from the robot, which can be used to determine a metric of the imitation performance. The trajectories are then generalized using Gaussian mixture regression. Finally, we analytically compute the trajectory which optimizes the imitation metric and use this to generalize the skill to different contexts.
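The Gaussian mixture regression step described above can be sketched in isolation. The two-component joint mixture over time t and a one-dimensional output x below is hand-specified for illustration (the paper fits such a model with EM after PCA projection); the conditional mean E[x | t] blends per-component linear regressions by their responsibilities.

```python
import numpy as np

# Minimal sketch of Gaussian mixture regression (GMR). The mixture
# parameters here are invented for illustration, not taken from the paper.
weights = np.array([0.5, 0.5])          # mixture weights pi_k
means   = np.array([[0.25, 0.0],        # per-component means over (t, x)
                    [0.75, 1.0]])
covs    = np.array([[[0.02, 0.0],       # per-component covariances
                     [0.0,  0.05]],
                    [[0.02, 0.0],
                     [0.0,  0.05]]])

def gmr(t):
    """Conditional mean of x given t under the joint Gaussian mixture."""
    h = np.empty(len(weights))
    m = np.empty(len(weights))
    for k in range(len(weights)):
        mu_t, mu_x = means[k]
        s_tt, s_xt = covs[k][0, 0], covs[k][1, 0]
        # responsibility of component k for this t (marginal over x)
        h[k] = weights[k] * np.exp(-0.5 * (t - mu_t) ** 2 / s_tt) / np.sqrt(s_tt)
        # per-component conditional mean: linear in t
        m[k] = mu_x + s_xt / s_tt * (t - mu_t)
    h /= h.sum()
    return float(h @ m)
```

With diagonal covariances each component's conditional mean reduces to its output mean, so the regression interpolates smoothly between 0 and 1 as t sweeps from the first component to the second.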
Gaussian mean shift is an EM algorithm
IEEE Trans. on Pattern Analysis and Machine Intelligence, 2005
Cited by 42 (4 self)
The mean-shift algorithm, based on ideas proposed by Fukunaga and Hostetler (1975), is a hill-climbing algorithm on the density defined by a finite mixture or a kernel density estimate. Mean-shift can be used as a nonparametric clustering method and has attracted recent attention in computer vision applications such as image segmentation or tracking. We show that, when the kernel is Gaussian, mean-shift is an expectation-maximisation (EM) algorithm, and when the kernel is non-Gaussian, mean-shift is a generalised EM algorithm. This implies that mean-shift converges from almost any starting point and that, in general, its convergence is of linear order. For Gaussian mean-shift we show: (1) the rate of linear convergence approaches 0 (superlinear convergence) for very narrow or very wide kernels, but is often close to 1 (thus extremely slow) for intermediate widths, and exactly 1 (sublinear convergence) for widths at which modes merge; (2) the iterates approach the mode along the local principal component of the data points from the inside of the convex hull of the data points; (3) the convergence domains are nonconvex and can be disconnected and show fractal behaviour. We suggest ways of accelerating mean-shift based on the EM interpretation.
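The fixed-point iteration the abstract analyses can be sketched in a few lines; the synthetic two-cluster data, bandwidth, and iteration count are assumptions for illustration.

```python
import numpy as np

# Gaussian mean-shift on a kernel density estimate: each step replaces x
# with the weighted mean of the data under a Gaussian kernel centred at x,
# which is exactly the EM update the paper identifies.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3.0, 0.3, 100),
                       rng.normal( 2.0, 0.3, 100)])

def mean_shift(x, bandwidth=0.5, steps=200):
    for _ in range(steps):
        w = np.exp(-0.5 * ((data - x) / bandwidth) ** 2)  # kernel weights
        x = float(w @ data / w.sum())                     # fixed-point update
    return x
```

Started near either cluster, the iterate climbs to the corresponding mode of the kernel density estimate; the paper's result says that, for a Gaussian kernel, each such step is exactly an EM step.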
Ant-based clustering and topographic mapping
Artificial Life, 2005
Cited by 28 (1 self)
Ant-based clustering and sorting is a nature-inspired heuristic first introduced as a model for explaining two types of emergent behavior observed in real ant colonies. More recently, it has been applied in a data-mining context to perform both clustering and topographic mapping. Early work demonstrated some promising characteristics of the heuristic but did not extend to a rigorous investigation of its capabilities. We describe an improved version, called ATTA, incorporating adaptive, heterogeneous ants, a time-dependent transporting activity, and a method (for clustering applications) that transforms the spatial embedding produced by the algorithm into an explicit partitioning. ATTA is then subjected to the most rigorous experimental evaluation of an ant-based clustering and sorting algorithm undertaken to date: we compare its performance with standard techniques for clustering and topographic mapping using a set of analytical evaluation functions and a range of synthetic and real data collections. Our results demonstrate the ability of ant-based clustering and sorting to automatically identify the number of clusters inherent in a data collection, and to produce high-quality solutions; indeed, we show that it is particularly robust for clusters of differing sizes and for overlapping clusters. The results obtained for topographic mapping are, however, disappointing. We provide evidence that the solutions generated by the ant algorithm are barely topology-preserving, and we explain in detail why results have, in spite of this, been misinterpreted (much more positively) in previous research.
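ATTA itself is considerably more elaborate, but the pick/drop rules that ant-based clustering builds on can be sketched with the classic Lumer-Faieta probabilities; the constants k1 and k2 are assumed values, and f denotes the perceived similarity of an item to its local neighbourhood.

```python
# Pick-up/drop heuristic underlying ant-based clustering (a sketch of the
# Lumer-Faieta rules that ATTA refines with adaptive, heterogeneous ants).
# k1 and k2 are illustrative constants, not the paper's settings.

def p_pick(f, k1=0.1):
    """Probability an unladen ant picks up an item: high when f is low."""
    return (k1 / (k1 + f)) ** 2

def p_drop(f, k2=0.15):
    """Probability a laden ant drops its item: high when f is high."""
    return (f / (k2 + f)) ** 2
```

Items poorly matched to their surroundings are picked up almost surely, while items in a similar neighbourhood are likely to be dropped, which is what makes clusters emerge.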
On the number of modes of a Gaussian mixture
2003
Cited by 25 (5 self)
We consider a problem intimately related to the creation of maxima under Gaussian blurring: the number of modes of a Gaussian mixture in D dimensions. To our knowledge, a general answer to this question is not known. We conjecture that if the components of the mixture have the same covariance matrix (or the same covariance matrix up to a scaling factor), then the number of modes cannot exceed the number of components. We demonstrate
Feature Subset Selection and Ranking for Data Dimensionality Reduction
Cited by 17 (0 self)
A new unsupervised forward orthogonal search (FOS) algorithm is introduced for feature selection and ranking. In the new algorithm, features are selected in a stepwise way, one at a time, by estimating the capability of each specified candidate feature subset to represent the overall features in the measurement space. A squared correlation function is employed as the criterion to measure the dependency between features, which makes the new algorithm easy to implement. The forward orthogonalization strategy, which combines good effectiveness with high efficiency, enables the new algorithm to produce efficient feature subsets with a clear physical interpretation.
Index Terms: Dimensionality reduction, feature selection, high-dimensional data.
Feature Selection for Descriptor-based Classification Models
Part II: Human Intestinal Absorption (HIA). J. Chem. Inf. Comput. Sci., 2003
Cited by 17 (3 self)
The paper describes different aspects of classification models based on molecular data sets, with a focus on feature selection methods. In particular, model quality and the avoidance of high variance on unseen data (overfitting) are discussed with respect to the feature selection problem. We present several standard approaches and modifications of our Genetic Algorithm based on Shannon Entropy Cliques (GASEC), together with its extension to classification problems using boosting.
A Comparison of Acoustic Features for Articulatory Inversion
Cited by 13 (3 self)
We study empirically the best acoustic parameterization for articulatory inversion (the problem of recovering the sequence of vocal tract shapes that produce a given acoustic speech signal). We compare all combinations of the following factors: 1) popular acoustic features such as MFCC and PLP, with and without dynamic features; 2) different short-time window lengths; 3) different levels of smoothing of the acoustic temporal trajectories. Experimental results on a real speech production database show consistent improvement when using features closely related to the vocal tract (in particular LSF), dynamic features, and a large window length and smoothing (which reduce the jaggedness of the acoustic trajectory). Further improvements are obtained with a 15 ms time delay between acoustic and articulatory frames. However, the improvement attained over other combinations is very small (at most 0.3 mm RMSE).
Index Terms: acoustic-to-articulatory mapping, articulatory inversion, acoustic features, MOCHA database
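Two of the factors compared above, dynamic (delta) features and trajectory smoothing, can be sketched directly; the regression window N and the moving-average width are assumed values, not the paper's settings.

```python
import numpy as np

# Dynamic features via the standard delta-regression formula, plus a
# simple moving average to reduce jaggedness of a feature trajectory.

def delta(feats, N=2):
    """Delta (velocity) features: local slope estimate over +/- N frames."""
    T = len(feats)
    padded = np.pad(feats, (N, N), mode="edge")
    denom = 2 * sum(n * n for n in range(1, N + 1))
    return np.array([sum(n * (padded[t + N + n] - padded[t + N - n])
                         for n in range(1, N + 1)) / denom
                     for t in range(T)])

def smooth(feats, width=5):
    """Moving-average smoothing of the temporal trajectory."""
    kernel = np.ones(width) / width
    return np.convolve(feats, kernel, mode="same")
```

On a linear ramp the delta estimate recovers the slope exactly away from the edges, and smoothing leaves a constant trajectory unchanged.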
A Support Vector Approach to the Acoustic-to-Articulatory Mapping
2005
Cited by 11 (1 self)
We report work on mapping the acoustic speech signal, parametrized using Mel Frequency Cepstral Analysis, onto electromagnetic articulography trajectories from the MOCHA database. We employ the machine learning technique of Support Vector Regression, in contrast to previous works that applied Neural Networks to the same task. Our results are comparable to those of these older attempts, even though, owing to training time considerations, we use a much smaller training set, derived by means of clustering the acoustic data.
Comparison of dimensionality reduction methods for wood surface inspection
In Proceedings of the 6th International Conference on Quality Control by Artificial Vision, 2003
Cited by 7 (1 self)
Dimensionality reduction methods for visualization map the original high-dimensional data, typically into two dimensions. The mapping should preserve the important information in the data and, in order to be useful, meet the needs of a human observer. We have proposed a self-organizing map (SOM) based approach for visual surface inspection. The method provides the advantages of unsupervised learning and an intuitive user interface that allows one to easily set and tune the class boundaries based on observations made on the visualization, for example, to adapt to changing conditions or materials. There are, however, some problems with a SOM. It does not preserve the true distances between data points, and it has a tendency to ignore rare samples in the training set at the expense of a more accurate representation of common samples. In this paper, some alternative methods to a SOM are evaluated. These methods, PCA, MDS, LLE, ISOMAP, and GTM, are used to reduce dimensionality in order to visualize the data. Their principal differences are discussed and their performances quantitatively evaluated in a few special classification cases, such as wood inspection using centile features. For the test material experimented with, SOM and GTM outperform the others when classification performance is considered. For data-mining kinds of applications, ISOMAP and LLE appear to be more promising methods.
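Of the mappings compared, PCA is the simplest to sketch: centre the data and project onto the top two right-singular vectors, matching the two-dimensional visualization setting described above. The random demo cloud is an assumption for illustration.

```python
import numpy as np

# PCA to two dimensions via the singular value decomposition: the columns
# of Vt are ordered by explained variance, so the first two rows give the
# principal projection directions.

def pca_2d(X):
    Xc = X - X.mean(axis=0)                # centre the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                   # scores on the top 2 components

rng = np.random.default_rng(2)
cloud = rng.normal(size=(100, 5))          # illustrative 5-D data
scores = pca_2d(cloud)
```

The first output axis always carries at least as much variance as the second, by the ordering of the singular values.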
Differential priors for elastic nets
In Proc. of the 6th Int. Conf. on Intelligent Data Engineering and Automated Learning (IDEAL’05), 2005
Cited by 7 (4 self)
The elastic net and related algorithms, such as generative topographic mapping, are key methods for discretized dimension-reduction problems. At their heart are priors that specify the expected topological and geometric properties of the maps. However, up to now, only a very small subset of possible priors has been considered. Here we study a much more general family originating from discrete, high-order derivative operators. We show theoretically that the form of the discrete approximation to the derivative used has a crucial influence on the resulting map. Using a new and more powerful iterative elastic net algorithm, we confirm these results empirically, and illustrate how different priors affect the form of simulated ocular dominance columns.
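The family of priors studied here can be sketched as quadratic penalties built from discrete derivative operators; the forward-difference stencil below is only one possible discrete approximation, and the paper's point is precisely that this choice matters.

```python
import numpy as np

# Build an order-th forward-difference operator D and use it to score the
# smoothness of a map y through the quadratic penalty ||D y||^2.

def diff_op(M, order):
    """(M - order) x M matrix applying the order-th forward difference."""
    D = np.eye(M)
    for _ in range(order):
        D = D[1:] - D[:-1]
    return D

def smoothness_penalty(y, order=2):
    Dy = diff_op(len(y), order) @ y
    return float(Dy @ Dy)
```

A first-order operator penalises departures from a constant map, a second-order operator departures from a linear one; higher orders encode correspondingly weaker priors.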