Results 11-20 of 710
A Framework for Robust Subspace Learning
International Journal of Computer Vision, 2003
Cited by 177 (10 self)
Many computer vision, signal processing and statistical problems can be posed as problems of learning low-dimensional linear or multilinear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications.
Representation learning: A review and new perspectives.
of IEEE Conf. Comp. Vision Pattern Recog. (CVPR), 2005
Cited by 173 (4 self)
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.
Incremental Online Learning in High Dimensions
Neural Computation, 2005
Cited by 164 (19 self)
Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high-dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high-dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it i) learns rapidly with second-order learning methods based on incremental training, ii) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, iii) adjusts its weighting kernels based only on local information in order to minimize the danger of negative interference of incremental learning, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of (possibly redundant) inputs, as shown in various empirical evaluations with up to 90-dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
Gaussian process dynamical models for human motion
IEEE Trans. Pattern Anal. Machine Intell., 2008
Cited by 158 (5 self)
We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamical and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces.
A generalization of principal component analysis to the exponential family
Advances in Neural Information Processing Systems, 2001
Cited by 155 (1 self)
Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponential family, Generalized linear models, and Bregman distances, to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss functions, and give examples on simulated data.
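To make the contrast between loss functions concrete, here is a small sketch that compares the squared loss implied by ordinary PCA with the Bernoulli negative log-likelihood that an exponential-family treatment would suggest for binary data. The function names are hypothetical and the code is only an illustration of the two per-entry losses, not the paper's algorithm; `theta` stands for the natural parameter of a single reconstructed entry.

```python
import math


def gaussian_loss(x, theta):
    """Squared loss (up to a constant), the loss implicitly used by standard PCA."""
    return 0.5 * (x - theta) ** 2


def bernoulli_loss(x, theta):
    """Negative log-likelihood of x in {0, 1} under a Bernoulli model with
    natural parameter theta: log(1 + exp(theta)) - x * theta.
    Written in a numerically stable form to avoid overflow for large |theta|."""
    log_partition = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return log_partition - x * theta


# Illustrative comparison for a binary observation x = 1.
for theta in (-5.0, 0.0, 5.0):
    print(theta, gaussian_loss(1, theta), bernoulli_loss(1, theta))
```

For x = 1 the Bernoulli loss keeps shrinking as theta grows, whereas the squared loss starts penalizing theta once it exceeds 1, which is one way to see why squared loss can be a poor fit for binary-valued data.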
EM Algorithms for PCA and SPCA
In Advances in Neural Information Processing Systems, 1998
Cited by 146 (1 self)
I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high-dimensional data. It is computationally very efficient in space and time. It also naturally accommodates missing information. I also introduce a new variant of PCA called sensible principal component analysis (SPCA) which defines a proper density model in the data space. Learning for SPCA is also done with an EM algorithm. I report results on synthetic and real data showing that these EM algorithms correctly and efficiently find the leading eigenvectors of the covariance of datasets in a few iterations using up to hundreds of thousands of datapoints in thousands of dimensions.
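The EM idea in the abstract above can be sketched for the simplest case of a single principal component. This is a minimal pure-Python illustration, assuming the zero-noise EM iteration in which the E-step projects each centered data point onto the current direction and the M-step refits the direction by least squares. The function name, the initial guess, and the fixed iteration count are assumptions made for illustration, not details from the paper.

```python
def em_pca_first_component(data, iterations=50):
    """Estimate the leading principal direction of `data` (a list of rows)
    with a one-component EM iteration. Returns a unit-length list."""
    n = len(data)
    dim = len(data[0])
    # Center the data so that the direction describes the covariance.
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in data]

    # Arbitrary initial guess (assumption); it must not be orthogonal
    # to the leading direction, or the iteration cannot recover it.
    c = [1.0] + [0.0] * (dim - 1)

    for _ in range(iterations):
        # E-step: latent coordinate of each point, x_i = (c . y_i) / (c . c).
        cc = sum(v * v for v in c)
        x = [sum(ci * yi for ci, yi in zip(c, y)) / cc for y in centered]
        # M-step: least-squares refit, c = sum_i x_i y_i / sum_i x_i^2.
        xx = sum(v * v for v in x)
        c = [sum(xi * y[j] for xi, y in zip(x, centered)) / xx
             for j in range(dim)]

    norm = sum(v * v for v in c) ** 0.5
    return [v / norm for v in c]


# Usage: points spread mainly along the direction (3, 4) with small
# perpendicular offsets; the estimate is defined only up to sign.
ts = [-2, -1, 0, 1, 2]
ss = [0.1, -0.2, 0.2, -0.2, 0.1]
points = [[3 * t - 4 * s, 4 * t + 3 * s] for t, s in zip(ts, ss)]
print(em_pca_first_component(points))
```

Each iteration touches the data only through inner products with the current direction, so the covariance matrix is never formed explicitly; this is the property that makes the approach attractive for large, high-dimensional collections.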
Face Transfer with Multilinear Models
To appear in SIGGRAPH, 2005
Cited by 145 (3 self)
Face Transfer is a method for mapping video-recorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target; the attributes are separably controllable. This supports
Multitask Gaussian Process Prediction
Cited by 144 (6 self)
In this paper we investigate multitask learning in the context of Gaussian Processes (GP). We propose a model that learns a shared covariance function on input-dependent features and a “free-form” covariance matrix over tasks. This allows for good flexibility when modelling inter-task dependencies while avoiding the need for large amounts of data for training. We show that under the assumption of noise-free observations and a block design, predictions for a given task only depend on its target values and therefore a cancellation of inter-task transfer occurs. We evaluate the benefits of our model on two practical applications: a compiler performance prediction problem and an exam score prediction task. Additionally, we make use of GP approximations and properties of our model in order to provide scalability to large data sets.
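The structure described above, a task covariance matrix combined with a shared input covariance function, can be sketched as a product of the two. The code below is an illustration under stated assumptions: a block design (every task observed at the same inputs), scalar inputs, and a squared-exponential input kernel, which is a choice made here for the example and not something the abstract specifies. The names `rbf` and `multitask_covariance` are hypothetical.

```python
import math


def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on scalar inputs (an assumed choice)."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def multitask_covariance(task_cov, inputs, lengthscale=1.0):
    """Build the joint covariance over (task, input) pairs.

    The entry for (task l, input i) versus (task k, input j) is
    task_cov[l][k] * rbf(inputs[i], inputs[j]), i.e. a Kronecker product
    of the task matrix with the input kernel matrix. Rows and columns
    are ordered task by task."""
    n_tasks = len(task_cov)
    n = len(inputs)
    size = n_tasks * n
    K = [[0.0] * size for _ in range(size)]
    for l in range(n_tasks):
        for i in range(n):
            for k in range(n_tasks):
                for j in range(n):
                    K[l * n + i][k * n + j] = (
                        task_cov[l][k] * rbf(inputs[i], inputs[j], lengthscale)
                    )
    return K


# Usage: two correlated tasks observed at the same two inputs.
task_cov = [[1.0, 0.8],
            [0.8, 1.0]]
K = multitask_covariance(task_cov, inputs=[0.0, 1.0])
for row in K:
    print(row)
```

The off-diagonal entries of `task_cov` are what allow observations from one task to inform predictions for another; setting them to zero would decouple the tasks into independent Gaussian processes.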
Nonnegative tensor factorization with applications to statistics and computer vision
In Proceedings of the International Conference on Machine Learning (ICML), 2005
Cited by 139 (5 self)
We derive algorithms for finding a nonnegative n-dimensional tensor factorization (n-NTF) which includes the nonnegative matrix factorization (NMF) as a particular case when n = 2. We motivate the use of n-NTF in three areas of data analysis: (i) connection to latent class models in statistics, (ii) sparse image coding in computer vision, and (iii) model selection problems. We derive a “direct” positive-preserving gradient descent algorithm and an alternating scheme based on repeated multiple rank-1 problems.