Results 1 - 10 of 102
A Survey on Transfer Learning
Abstract - Cited by 459 (24 self)
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as co-variate shift. We also explore some potential future issues in transfer learning research.
Geodesic flow kernel for unsupervised domain adaptation
- In CVPR, 2012
Abstract - Cited by 97 (6 self)
In real-world applications of visual recognition, many factors—such as pose, illumination, or image quality—can cause a significant mismatch between the source domain on which classifiers are trained and the target domain to which those classifiers are applied. As such, the classifiers often perform poorly on the target domain. Domain adaptation techniques aim to correct the mismatch. Existing approaches have concentrated on learning feature representations that are invariant across domains, and they often do not directly exploit low-dimensional structures that are intrinsic to many vision datasets. In this paper, we propose a new kernel-based method that takes advantage of such structures. Our geodesic flow kernel models domain shift by integrating an infinite number of subspaces that characterize changes in geometric and statistical properties from the source to the target domain. Our approach is computationally advantageous, automatically inferring important algorithmic parameters without requiring extensive cross-validation or labeled data from either domain. We also introduce a metric that reliably measures the adaptability between a pair of source and target domains. For a given target domain and several source domains, the metric can be used to automatically select the optimal source domain to adapt and avoid less desirable ones. Empirical studies on standard datasets demonstrate the advantages of our approach over competing methods.
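The paper derives the kernel in closed form by integrating over the whole geodesic; purely as an illustration of the idea, the flow can be approximated by sampling finitely many subspaces along the Grassmann geodesic between the two domains' PCA bases. The function names and the finite-sum approximation below are ours, not the authors':

```python
import numpy as np

def geodesic_point(Ps, Pt, t):
    """Point at time t on the Grassmann geodesic from span(Ps) to span(Pt).

    Ps, Pt: D x d matrices with orthonormal columns (e.g. PCA bases).
    Uses the standard geodesic formula; assumes Ps.T @ Pt is invertible.
    """
    D = Ps.shape[0]
    M = (np.eye(D) - Ps @ Ps.T) @ Pt @ np.linalg.inv(Ps.T @ Pt)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    theta = np.arctan(s)  # principal angles between the two subspaces
    return Ps @ Vt.T @ np.diag(np.cos(t * theta)) + U @ np.diag(np.sin(t * theta))

def gfk_finite(Ps, Pt, n_steps=20):
    """Finite-sum stand-in for the geodesic flow kernel G = integral of Phi(t) Phi(t).T dt."""
    D = Ps.shape[0]
    G = np.zeros((D, D))
    for t in np.linspace(0.0, 1.0, n_steps):
        Phi = geodesic_point(Ps, Pt, t)
        G += Phi @ Phi.T
    return G / n_steps
```

Similarities between points x and y are then measured as x.T @ G @ y; the published method computes G exactly instead of by sampling.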
Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation
Abstract - Cited by 19 (3 self)
Learning domain-invariant features is of vital importance to unsupervised domain adaptation, where classifiers trained on the source domain need to be adapted to a different target domain for which no labeled examples are available. In this paper, we propose a novel approach for learning such features. The central idea is to exploit the existence of landmarks, which are a subset of labeled data instances in the source domain that are distributed most similarly to the target domain. Our approach automatically discovers the landmarks and uses them to bridge the source to the target by constructing provably easier auxiliary domain adaptation tasks. The solutions of those auxiliary tasks form the basis to compose invariant features for the original task. We show how this composition can be optimized discriminatively without requiring labels from the target domain. We validate the method on standard benchmark datasets for visual object recognition and sentiment analysis of text. Empirical results show the proposed method outperforms the state-of-the-art significantly.
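The paper discovers landmarks by solving an optimization over instance weights; as a crude stand-in for the idea only, one can score each source point by its average kernel similarity to the target sample and keep the top scorers. The scoring heuristic and names below are ours, not the authors' algorithm:

```python
import numpy as np

def select_landmarks(Xs, Xt, m, gamma=0.5):
    """Pick m source points that look most 'target-like'.

    Heuristic stand-in for landmark selection: score each source instance
    by its mean RBF similarity to the target sample, keep the m best.
    """
    sq_dists = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=-1)
    scores = np.exp(-gamma * sq_dists).mean(axis=1)  # mean kernel value vs. target
    return np.argsort(scores)[::-1][:m]
```

The selected indices would then serve as the bridge set from which easier auxiliary adaptation tasks are built.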
Transfer Defect Learning
Abstract - Cited by 17 (3 self)
Many software defect prediction approaches have been proposed and most are effective in within-project prediction settings. However, for new projects or projects with limited training data, it is desirable to learn a prediction model by using sufficient training data from existing source projects and then apply the model to some target projects (cross-project defect prediction). Unfortunately, the performance of cross-project defect prediction is generally poor, largely because of feature distribution differences between the source and target projects. In this paper, we apply a state-of-the-art transfer learning approach, TCA, to make feature distributions in source and target projects similar. In addition, we propose a novel transfer defect learning approach, TCA+, by extending TCA. Our experimental results for eight open-source projects show that TCA+ significantly improves cross-project prediction performance. Index Terms—cross-project defect prediction, transfer learning, empirical software engineering
Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
Abstract - Cited by 15 (1 self)
Domain adaptation addresses the problem where data instances of a source domain have different distributions from those of a target domain, which occurs frequently in many real-life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross-domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal source domain to adapt to when multiple source domains are given. We present experiments on face recognition across pose, illumination and blur variations, and on cross-dataset object recognition, and report improved performance over the state of the art.
Unsupervised domain adaptation by domain invariant projection
- In IEEE International Conference on Computer Vision, 2013
Abstract - Cited by 14 (1 self)
Domain-invariant representations are key to addressing the domain shift problem where the training and test examples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some of the features may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: An unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and target domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a standard domain adaptation benchmark dataset.
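The abstract does not name the exact distance between empirical distributions; a standard choice in this line of work is the (biased) maximum mean discrepancy (MMD) estimate with an RBF kernel, sketched here as an assumption, applied after projecting the data:

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X (n x d) and Y (m x d)."""
    def k(A, B):
        # pairwise squared distances, then RBF kernel
        sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()
```

A projection W would then be sought so that rbf_mmd2(Xs @ W, Xt @ W) is small, which is the kind of objective the paper optimizes over low-dimensional projections.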
Least-Squares Probabilistic Classifier: A Computationally Efficient Alternative to Kernel Logistic Regression
- In Proceedings of International Workshop on Statistical Machine Learning for Speech Processing (IWSML2012), Kyoto
Abstract - Cited by 13 (7 self)
Human activity recognition from accelerometric data (e.g., obtained by smart phones) is gathering a great deal of attention since it can be used for various purposes such as remote health-care. However, since collecting labeled data is bothersome for new users, it is desirable to utilize data obtained from existing users. In this paper, we formulate this adaptation problem as learning under covariate shift, and propose a computationally efficient probabilistic classification method based on adaptive importance sampling. The usefulness of the proposed method is demonstrated in real-world human activity recognition.
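Under covariate shift, the standard correction is to weight each training example by the density ratio w(x) = p_target(x) / p_source(x); in the paper these weights come from adaptive importance sampling, while here they are simply assumed given. A minimal importance-weighted regularized least-squares fit, with our own naming, looks like:

```python
import numpy as np

def iw_ridge(X, y, w, lam=1e-3):
    """Importance-weighted ridge regression.

    Minimizes sum_i w_i * (y_i - x_i . theta)^2 + lam * ||theta||^2,
    with w_i the (given) importance weights p_target(x_i) / p_source(x_i).
    """
    Xw = X * w[:, None]  # row-wise weighting
    A = X.T @ Xw + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, Xw.T @ y)
```

With all weights equal to one this reduces to ordinary ridge regression; skewing the weights toward target-like training points is what corrects for the shift.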
Information-Theoretical Learning of Discriminative Clusters for Unsupervised Domain Adaptation
Abstract - Cited by 12 (0 self)
We study the problem of unsupervised domain adaptation, which aims to adapt classifiers trained on a labeled source domain to an unlabeled target domain. Many existing approaches first learn domain-invariant features and then construct classifiers with them. We propose a novel approach that jointly learns both. Specifically, while the method identifies a feature space where data in the source and the target domains are similarly distributed, it also learns the feature space discriminatively, optimizing an information-theoretic metric as a proxy to the expected misclassification error on the target domain. We show how this optimization can be effectively carried out with simple gradient-based methods and how hyperparameters can be cross-validated without demanding any labeled data from the target domain. Empirical studies on benchmark tasks of object recognition and sentiment analysis validated our modeling assumptions and demonstrated significant improvement of our method over competing ones in classification accuracies.
Dual transfer learning
- In Proceedings of the 12th SIAM International Conference on Data Mining, SDM, 2012
Abstract - Cited by 11 (3 self)
Transfer learning aims to leverage the knowledge in the source domain to facilitate the learning tasks in the target domain. It has attracted extensive research interest recently due to its effectiveness in a wide range of applications. The general idea of the existing methods is to utilize the common latent structure shared across domains as the bridge for knowledge transfer. These methods usually model the common latent structure by using either the marginal distribution or the conditional distribution. However, without exploring the duality between these two distributions, these single-bridge methods may not achieve optimal capability of knowledge transfer. In this paper, we propose a novel approach, Dual Transfer Learning (DTL), which simultaneously learns the marginal and conditional distributions, and exploits the duality between them in a principled way. The key idea behind DTL is that learning one distribution can help to learn the other. This duality property leads to mutual reinforcement when adapting both distributions across domains to transfer knowledge. The proposed method is formulated as an optimization problem based on joint nonnegative matrix tri-factorization (NMTF). The two distributions are learned from the decomposed latent factors that exhibit the duality property. An efficient alternating minimization algorithm is developed to solve the optimization problem with convergence guarantee. Extensive experimental results demonstrate that DTL is more effective than alternative transfer learning methods.
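The building block of DTL is nonnegative matrix tri-factorization, X ≈ F S G.T with all factors nonnegative. As a simplified single-matrix illustration only: the paper couples several such factorizations across domains and adds duality constraints, while the update rules below are the standard multiplicative ones, not the authors' exact algorithm:

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=300, seed=0, eps=1e-9):
    """Nonnegative matrix tri-factorization X ~ F @ S @ G.T.

    Plain multiplicative updates: each step rescales a factor by the
    ratio of the positive and negative parts of its gradient.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.uniform(size=(n, k1))
    S = rng.uniform(size=(k1, k2))
    G = rng.uniform(size=(m, k2))
    for _ in range(n_iter):
        F *= (X @ G @ S.T) / (F @ (S @ G.T @ G @ S.T) + eps)
        G *= (X.T @ F @ S) / (G @ (S.T @ F.T @ F @ S) + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ (G.T @ G) + eps)
    return F, S, G
```

In DTL, factors playing the role of F and G capture the shared latent structure across domains, with the marginal and conditional views tied through the shared factors.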
Quantifying and Transferring Contextual Information in Object Detection
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011
Abstract - Cited by 11 (7 self)
Context is critical for reducing the uncertainty in object detection. However, context modeling is challenging because there are often many different types of contextual information coexisting with different degrees of relevance to the detection of target object(s) in different images. It is therefore crucial to devise a context model to automatically quantify and select the most effective contextual information for assisting in detecting the target object. Nevertheless, the diversity of contextual information means that learning a robust context model requires a larger training set than learning the target object appearance model, which may not be available in practice. In this work, a novel context modeling framework is proposed without the need for any prior scene segmentation or context annotation. We formulate a polar geometric context descriptor for representing multiple types of contextual information. In order to quantify context, we propose a new maximum margin context (MMC) model to evaluate and measure the usefulness of contextual information directly and explicitly through a discriminant context inference method. Furthermore, to address the problem of context learning with limited data, we exploit the idea of transfer learning based on the observation that although two categories of objects can have very different visual appearance, there can be similarity in their context and/or the way contextual information helps to distinguish target objects from nontarget objects. To that end, two novel context transfer learning models are proposed which utilize training samples from source object classes to improve the learning of the context model for a target object class based on a joint maximum margin learning framework. Experiments are carried out on PASCAL VOC2005 and VOC2007 data sets, a luggage detection data set extracted from the i-LIDS data set, and a vehicle detection data set extracted from outdoor surveillance footage. Our results validate the effectiveness of the proposed models for quantifying and transferring contextual information, and demonstrate that they outperform related alternative context models. Index Terms—Context modeling, object detection, transfer learning.