Results 1–10 of 10
Efficient spectral feature selection with minimum redundancy
In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2010.
Abstract

Cited by 31 (7 self)
Spectral feature selection identifies relevant features by measuring their capability of preserving sample similarity. It provides a powerful framework for both supervised and unsupervised feature selection, and has been proven effective in many real-world applications. One common drawback of most existing spectral feature selection algorithms is that they evaluate features individually and cannot identify redundant features. Since redundant features can have a significant adverse effect on learning performance, it is necessary to address this limitation for spectral feature selection. To this end, we propose a novel spectral feature selection algorithm that handles feature redundancy by adopting an embedded model. The algorithm is derived from a formulation based on sparse multi-output regression with an L2,1-norm constraint. We conduct a theoretical analysis of the properties of its optimal solutions, paving the way for designing an efficient path-following solver. Extensive experiments show that the proposed algorithm performs well at both selecting relevant features and removing redundancy.
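The core formulation, min_W ||XW − Y||_F² + λ Σ_i ||W[i,:]||₂, can be sketched with a plain proximal-gradient loop. This is a minimal illustrative sketch, not the paper's path-following solver; the function name and the final ranking step are assumptions of this example.

```python
import numpy as np

def l21_feature_selection(X, Y, lam=0.1, n_iter=200):
    """Sparse multi-output regression with an L2,1-norm penalty:
        min_W ||X W - Y||_F^2 + lam * sum_i ||W[i, :]||_2.
    Rows of W driven toward zero correspond to discarded features.
    Minimal proximal-gradient sketch (not the paper's solver)."""
    d, k = X.shape[1], Y.shape[1]
    W = np.zeros((d, k))
    # step size from the Lipschitz constant of the smooth part
    lr = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        G = 2 * X.T @ (X @ W - Y)          # gradient of the squared loss
        V = W - lr * G
        # row-wise soft thresholding = proximal operator of the L2,1 norm
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
        W = scale * V
    # rank features by the L2 norm of their coefficient rows
    scores = np.linalg.norm(W, axis=1)
    return W, np.argsort(scores)[::-1]
```

On synthetic data whose targets depend only on a few features, the top-ranked rows recover exactly those features.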
Multi-Label Transfer Learning with Sparse Representation
Abstract

Cited by 2 (0 self)
Abstract—Due to the visually polysemous barrier, videos and images may be annotated by multiple tags. Discovering the correlations among different tags can significantly help in predicting precise labels for videos and images. Many recent studies of multi-label learning construct a linear subspace embedding with encoded multi-label information, such that data points sharing many common labels tend to be close to each other in the embedded subspace. Motivated by advances in compressive sensing research, a sparse representation that selects a compact subset to describe the input data can be more discriminative. In this paper, we propose a sparse multi-label learning method to circumvent the visually polysemous barrier of multiple tags. Our approach learns a multi-label encoded sparse linear embedding space from a related dataset, and maps the target data into the learned representation space to achieve better annotation performance. Instead of using an l1-norm penalty (lasso) to induce a sparse representation, we propose to formulate multi-label learning as a penalized least squares optimization problem with an elastic-net penalty. By casting the video concept detection and image annotation tasks into a sparse multi-label transfer learning framework, ridge regression, lasso, the elastic net, and the multi-label extended sparse discriminant analysis methods are, respectively, explored and compared. Index Terms—Image annotation, multi-label learning, sparse representation, transfer learning, video concept detection.
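The penalized-least-squares core, 0.5||Xw − y||² + λ1||w||₁ + 0.5·λ2||w||², fit independently per label column, can be sketched by coordinate descent. This is a minimal sketch under those assumptions; the function names are illustrative, and the paper's transfer-learning pipeline and sparse discriminant analysis variants are not reproduced.

```python
import numpy as np

def soft(z, t):
    """Scalar soft-thresholding operator for the l1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net_multilabel(X, Y, lam1=0.1, lam2=0.1, n_sweeps=100):
    """Per-label elastic-net regression by coordinate descent:
        min_w 0.5||X w - y||^2 + lam1 ||w||_1 + 0.5 lam2 ||w||^2,
    solved independently for each label column of Y."""
    d = X.shape[1]
    col_sq = (X ** 2).sum(axis=0)
    W = np.zeros((d, Y.shape[1]))
    for k in range(Y.shape[1]):
        w = np.zeros(d)
        for _ in range(n_sweeps):
            for j in range(d):
                # partial residual excluding feature j's contribution
                r = Y[:, k] - X @ w + X[:, j] * w[j]
                w[j] = soft(X[:, j] @ r, lam1) / (col_sq[j] + lam2)
        W[:, k] = w
    return W
```

With λ2 = 0 this reduces to the lasso, and with λ1 = 0 to ridge regression, matching the baselines compared in the abstract.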
Towards quantifying vertex similarity in networks
 In Internet Mathematics
, 2014
Abstract

Cited by 2 (1 self)
Abstract. Vertex similarity is a major problem in network science with a wide range of applications. In this work we provide novel perspectives on finding (dis)similar vertices within a network and across two networks with the same number of vertices (graph matching). With respect to the former problem, we propose to optimize a geometric objective which allows us to express each vertex uniquely as a convex combination of a few extreme types of vertices. Our method has the important advantage of efficiently supporting several types of queries, such as “which other vertices are most similar to this vertex?”, by the use of appropriate data structures, and of mining interesting patterns in the network. With respect to the latter problem (graph matching), we propose the generalized condition number κ(LG, LH), a quantity widely used in numerical analysis, of the Laplacian matrix representations of G and H as a measure of graph similarity, where G and H are the graphs of interest. We show that this objective has a solid theoretical basis and propose a deterministic and a randomized graph alignment algorithm. We evaluate our algorithms on both synthetic and real data. We observe that our proposed methods achieve high-quality results and provide significant insights into the network structure.
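The similarity measure κ(LG, LH) can be sketched by restricting both Laplacians to the complement of their shared all-ones null space and taking the ratio of the extreme generalized eigenvalues. A minimal numpy sketch assuming both graphs are connected; identical graphs give κ = 1, and larger values indicate greater dissimilarity.

```python
import numpy as np

def laplacian(A):
    """Combinatorial graph Laplacian L = D - A of an adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def generalized_condition_number(LG, LH):
    """kappa(LG, LH) = lambda_max / lambda_min of the generalized
    eigenproblem LG x = lambda LH x, restricted to the complement of the
    shared all-ones null space (both graphs assumed connected)."""
    n = LG.shape[0]
    ones = np.ones((n, 1)) / np.sqrt(n)
    P = np.eye(n) - ones @ ones.T          # projector onto span(1)^perp
    U, s, _ = np.linalg.svd(P)
    V = U[:, :n - 1]                       # orthonormal basis of span(1)^perp
    A = V.T @ LG @ V
    B = V.T @ LH @ V
    # whiten by B^{-1/2}, then take ordinary symmetric eigenvalues
    w, Q = np.linalg.eigh(B)
    Bmh = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T
    eig = np.linalg.eigvalsh(Bmh @ A @ Bmh)
    return eig[-1] / eig[0]
```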
Efficient Implementations of Dimensionality Reduction Algorithms
Abstract
• One of the challenges in multi-label learning is how to capture the correlation among different labels in dimensionality reduction.
• Multi-label learning from high-dimensional data suffers from the curse of dimensionality.
• For large-scale problems, one of the key challenges is how to perform dimensionality reduction efficiently.

Contributions
• The design and analysis of dimensionality reduction algorithms for multi-label learning:
• Proposed a novel multi-label dimensionality reduction algorithm, called hypergraph spectral learning (HSL), which uses a hypergraph to capture label correlation (KDD’08).
• Extended Canonical Correlation Analysis (CCA) and elucidated key properties of CCA (TPAMI’11).
• Proposed two efficient algorithms to solve a class of dimensionality reduction algorithms:
• A direct least squares approach (ICML’09).
• A two-stage approach (KDD’10).

Hypergraph Spectral Learning
What is a hypergraph?
1. A hypergraph is a generalization of the traditional graph.
2. In a hypergraph, each hyperedge is a nonempty subset of the vertex set.
3. Intuitively, data points sharing many common labels tend to be close to each other in the embedded space.
4. CCA can be shown to be a special case of HSL.
Projection onto a 1-dimensional space using HSL
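The hypergraph construction can be sketched concretely: each label column of a binary label matrix defines one hyperedge containing the data points tagged with that label, and the standard normalized hypergraph Laplacian L = I − Dv^{-1/2} H W De^{-1} Hᵀ Dv^{-1/2} then encodes label correlation. A minimal sketch with unit hyperedge weights, assuming every point carries at least one label; HSL's subsequent spectral embedding step is not shown.

```python
import numpy as np

def hypergraph_laplacian(Y):
    """Normalized hypergraph Laplacian from an n x k binary label matrix Y,
    where each label (column) defines one hyperedge:
        L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.
    Unit hyperedge weights are assumed; every row of Y must be nonzero."""
    H = Y.astype(float)                    # n x k incidence matrix
    w = np.ones(H.shape[1])                # hyperedge weights
    dv = H @ w                             # vertex degrees
    de = H.sum(axis=0)                     # hyperedge degrees
    Dv_isqrt = np.diag(1.0 / np.sqrt(dv))
    S = Dv_isqrt @ H @ np.diag(w / de) @ H.T @ Dv_isqrt
    return np.eye(H.shape[0]) - S
```

The resulting L is symmetric positive semidefinite with Dv^{1/2}·1 in its null space, exactly like an ordinary normalized graph Laplacian, so standard spectral embedding machinery applies.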
Sparse Manifold Alignment
, 2012
Abstract
Previous approaches to manifold alignment are based on solving a (generalized) eigenvector problem. We propose a least squares formulation of a class of manifold alignment approaches, which has the potential of scaling better to real-world data sets. Furthermore, the least-squares formulation enables various regularization techniques to be readily incorporated to improve model sparsity and generalization ability. In particular, it enables using the l1-norm regularization framework to make previous manifold alignment algorithms more robust. The new approach can prune domain-dependent features automatically, helping to improve transfer learning. This extension significantly broadens the scope of manifold alignment techniques and leads to faster algorithms. We present detailed experiments to illustrate the approach in the domains of cross-lingual information retrieval and social network analysis.
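The eigenvector-to-regression idea can be sketched as follows: given a target embedding T (e.g. eigenvectors of the joint alignment graph), fit a sparse linear map by l1-regularized least squares so that domain-dependent features are pruned. A minimal ISTA sketch; the target T, the function name, and the regression stand-in for the eigenproblem are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sparse_alignment_map(X, T, lam=0.05, n_iter=300):
    """Fit a sparse linear map W so that X @ W approximates a given target
    embedding T, via ISTA on  ||X W - T||_F^2 + lam * ||W||_1.
    Rows of W that shrink to zero correspond to pruned features."""
    lr = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)   # safe step size
    W = np.zeros((X.shape[1], T.shape[1]))
    for _ in range(n_iter):
        W = W - lr * (2 * X.T @ (X @ W - T))     # gradient step on the loss
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)  # l1 prox
    return W
```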
Linear Dimensionality Reduction: Survey, Insights, and Generalizations
, 2015
Abstract
Linear dimensionality reduction methods are a cornerstone of analyzing high-dimensional data, due to their simple geometric interpretations and typically attractive computational properties. These methods capture many data features of interest, such as covariance, dynamical structure, correlation between data sets, input-output relationships, and the margin between data classes. Methods have been developed with a variety of names and motivations in many fields, and perhaps as a result the connections between all these methods have not been highlighted. Here we survey methods from this disparate literature as optimization programs over matrix manifolds. We discuss principal component analysis, factor analysis, linear multidimensional scaling, Fisher's linear discriminant analysis, canonical correlations analysis, maximum autocorrelation factors, slow feature analysis, sufficient dimensionality reduction, undercomplete independent component analysis, linear regression, distance metric learning, and more. This optimization framework gives insight into some rarely discussed shortcomings of well-known methods, such as the suboptimality of certain eigenvector solutions. Modern techniques for optimization over matrix manifolds enable a generic linear dimensionality reduction solver, which accepts as input data and an objective to be optimized, and returns, as output, an optimal low-dimensional projection of the data. This simple optimization framework further allows straightforward generalizations and novel variants of classical methods, which we demonstrate here by creating an orthogonal-projection canonical correlations analysis. More broadly, this survey and generic solver suggest that linear dimensionality reduction can move toward becoming a black-box, objective-agnostic numerical technology.
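The generic-solver view can be illustrated on the simplest objective: treat PCA as maximizing tr(MᵀCM) over the Stiefel manifold {M : MᵀM = I} by gradient steps followed by a QR retraction. A minimal sketch; a real manifold solver (e.g. Pymanopt) would handle step sizes, other retractions, and arbitrary objectives, which is exactly the survey's point.

```python
import numpy as np

def stiefel_pca(X, r, n_iter=500, lr=0.1):
    """PCA as manifold optimization: maximize tr(M^T C M) subject to
    M^T M = I_r, by gradient ascent with a QR retraction.  Equivalent to
    subspace iteration, so M converges to the top-r eigenvectors of C."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)                       # sample covariance
    rng = np.random.default_rng(0)
    M, _ = np.linalg.qr(rng.standard_normal((X.shape[1], r)))
    for _ in range(n_iter):
        M = M + lr * (C @ M)                      # Euclidean gradient step
        M, _ = np.linalg.qr(M)                    # retract onto the manifold
    return M
```

Swapping the objective (and its gradient) while keeping the retraction loop gives the other surveyed methods, which is what makes a black-box, objective-agnostic solver possible.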
Unifying Linear Dimensionality Reduction
, 2014
Abstract
Linear dimensionality reduction methods are a cornerstone of analyzing high-dimensional data, due to their simple geometric interpretations and typically attractive computational properties. These methods capture many data features of interest, such as covariance, dynamical structure, correlation between data sets, input-output relationships, and the margin between data classes. Methods have been developed with a variety of names and motivations in many fields, and perhaps as a result the deeper connections between all these methods have not been understood. Here we unify methods from this disparate literature as optimization programs over matrix manifolds. We discuss principal component analysis, factor analysis, linear multidimensional scaling, Fisher’s linear discriminant analysis, canonical correlations analysis, maximum autocorrelation factors, slow feature analysis, undercomplete independent component analysis, linear regression, and more. This optimization framework helps elucidate some rarely discussed shortcomings of well-known methods, such as the suboptimality of certain eigenvector solutions. Modern techniques for optimization over matrix manifolds enable a generic linear dimensionality reduction solver, which accepts as input data and an objective to be optimized, and returns, as output, an optimal low-dimensional projection of the data. This optimization framework further allows rapid development of novel variants of classical methods, which we demonstrate here by creating an orthogonal-projection canonical correlations analysis. More broadly, we suggest that our generic linear dimensionality reduction solver can move linear dimensionality reduction toward becoming a black-box, objective-agnostic numerical technology.
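For contrast with the manifold-optimization view, the classical eigenvector-style solution of one of the unified methods, canonical correlations analysis, fits in a few lines via the SVD of the whitened cross-covariance. A minimal sketch; the small ridge term is an assumption added here for numerical stability, not part of the classical formulation.

```python
import numpy as np

def cca(X, Y, r):
    """Classical CCA: canonical directions from the SVD of
    Cxx^{-1/2} Cxy Cyy^{-1/2}; singular values are the canonical
    correlations.  A tiny ridge keeps the whitening well-conditioned."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cxx = Xc.T @ Xc / len(X) + 1e-8 * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / len(Y) + 1e-8 * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / len(X)

    def isqrt(C):
        # inverse matrix square root of a symmetric PD matrix
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    U, s, Vt = np.linalg.svd(isqrt(Cxx) @ Cxy @ isqrt(Cyy))
    return isqrt(Cxx) @ U[:, :r], isqrt(Cyy) @ Vt[:r].T, s[:r]
```

Note that the returned direction matrices are not orthogonal in general; the orthogonal-projection CCA variant mentioned above constrains them to the Stiefel manifold instead.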
A Unified Framework for Probabilistic Component Analysis
Abstract
Abstract. We present a unifying framework which reduces the construction of probabilistic component analysis techniques to a mere selection of the latent neighbourhood, thus providing an elegant and principled framework for creating novel component analysis models as well as constructing probabilistic equivalents of deterministic component analysis methods. Under our framework, we unify many very popular and well-studied component analysis algorithms, such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Locality Preserving Projections (LPP) and Slow Feature Analysis (SFA), some of which have had no probabilistic equivalents in the literature thus far. We first define the Markov Random Fields (MRFs) which encapsulate the latent connectivity of the aforementioned component analysis techniques; subsequently, we show that the projection directions produced by PCA, LDA, LPP and SFA are also produced by the Maximum Likelihood (ML) solution of a single joint probability density function, composed by selecting one of the defined MRF priors while utilising a simple observation model. Furthermore, we propose novel Expectation Maximization (EM) algorithms exploiting the proposed joint PDF, and we generalize the proposed methodologies to arbitrary connectivities via parametrizable MRF products. Theoretical analysis and experiments on both simulated and real-world data show the usefulness of the proposed framework, by deriving methods which outperform state-of-the-art equivalents.
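The simplest member of this family, probabilistic PCA, has a closed-form maximum-likelihood solution (Tipping & Bishop): the noise variance is the mean of the discarded eigenvalues, and W = U_r(Λ_r − σ²I)^{1/2}. A minimal sketch under the linear-Gaussian observation model; the paper's MRF priors and EM algorithms are not reproduced here.

```python
import numpy as np

def ppca_ml(X, r):
    """Closed-form ML solution of probabilistic PCA:
        sigma2 = mean of the d - r discarded covariance eigenvalues,
        W      = U_r (Lambda_r - sigma2 I)^{1/2},
    so the model covariance W W^T + sigma2 I best matches the data."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(X)                 # sample covariance
    w, U = np.linalg.eigh(C)
    w, U = w[::-1], U[:, ::-1]             # sort eigenpairs descending
    sigma2 = w[r:].mean()                  # noise = avg discarded variance
    W = U[:, :r] @ np.diag(np.sqrt(w[:r] - sigma2))
    return W, sigma2
```

The EM algorithms proposed in the paper recover this same stationary point for the PCA-style MRF prior, but extend to the LDA, LPP and SFA neighbourhoods where no closed form exists.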