Results 1  10
of
25
Domain Adaptation via Transfer Component Analysis
"... Domain adaptation solves a learning problem in a target domain by utilizing the training data in a different but related source domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we propose to find such a representation through a new learning met ..."
Abstract

Cited by 102 (18 self)
 Add to MetaCart
(Show Context)
Domain adaptation solves a learning problem in a target domain by utilizing the training data in a different but related source domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we propose to find such a representation through a new learning method, transfer component analysis (TCA), for domain adaptation. TCA tries to learn some transfer components across domains in a Reproducing Kernel Hilbert Space (RKHS) using Maximum Mean Discrepancy (MMD). In the subspace spanned by these transfer components, data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. The main contribution of our work is that we propose a novel feature representation in which to perform domain adaptation via a new parametric kernel using feature extraction methods, which can dramatically minimize the distance between domain distributions by projecting data onto the learned transfer components. Furthermore, our approach can handle large datsets and naturally lead to outofsample generalization. The effectiveness and efficiency of our approach in are verified by experiments on two realworld applications: crossdomain indoor WiFi localization and crossdomain text classification. 1
Dimensionality Reduction for Density Ratio Estimation in Highdimensional Spaces
 NEURAL NETWORKS, VOL.23, NO.1, PP.44–59
, 2010
"... The ratio of two probability density functions is becoming a quantity of interest these days in the machine learning and data mining communities since it can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, and feature selection. Recently, several met ..."
Abstract

Cited by 24 (18 self)
 Add to MetaCart
The ratio of two probability density functions is becoming a quantity of interest these days in the machine learning and data mining communities since it can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, and feature selection. Recently, several methods have been developed for directly estimating the density ratio without going through density estimation and were shown to work well in various practical problems. However, these methods still perform rather poorly when the dimensionality of the data domain is high. In this paper, we propose to incorporate a dimensionality reduction scheme into a densityratio estimation procedure and experimentally show that the estimation accuracy in highdimensional cases can be improved.
Direct Densityratio Estimation with Dimensionality Reduction via Leastsquares Heterodistributional Subspace Search
 NEURAL NETWORKS, VOL.24, NO.2, PP.183–198
, 2011
"... Methods for directly estimating the ratio of two probability density functions have been actively explored recently since they can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, and feature selection. In this paper, we develop a new method which inc ..."
Abstract

Cited by 23 (15 self)
 Add to MetaCart
Methods for directly estimating the ratio of two probability density functions have been actively explored recently since they can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, and feature selection. In this paper, we develop a new method which incorporates dimensionality reduction into a direct densityratio estimation procedure. Our key idea is to find a lowdimensional subspace in which densities are significantly different and perform density ratio estimation only in this subspace. The proposed method, D³LHSS (Direct Densityratio estimation with Dimensionality reduction via Leastsquares Heterodistributional Subspace Search), is shown to overcome the limitation of baseline methods.
A Densityratio Framework for Statistical Data Processing
 IPSJ TRANSACTIONS ON COMPUTER VISION AND APPLICATIONS, VOL.1, PP.183–208
, 2009
"... In statistical pattern recognition, it is important to avoid density estimation since density estimation is often more difficult than pattern recognition itself. Following this idea—known as Vapnik’s principle, a statistical data processing framework that employs the ratio of two probability density ..."
Abstract

Cited by 12 (7 self)
 Add to MetaCart
In statistical pattern recognition, it is important to avoid density estimation since density estimation is often more difficult than pattern recognition itself. Following this idea—known as Vapnik’s principle, a statistical data processing framework that employs the ratio of two probability density functions has been developed recently and is gathering a lot of attention in the machine learning and data mining communities. The purpose of this paper is to introduce to the computer vision community recent advances in density ratio estimation methods and their usage in various statistical data processing tasks such as nonstationarity adaptation, outlier detection, feature selection, and independent component analysis.
Constructive setting of the density ratio estimation problem and its rigorous solution (Technical Report 1306.0407). arXiv
, 2013
"... ar ..."
(Show Context)
Efficient invariant search for distributed information systems
 In ICDM’13
, 2013
"... Abstract—In today’s distributed information systems, a large amount of monitoring data such as log files have been collected. These monitoring data at various points of a distributed information system provide unparallel opportunities for us to characterize and track the information system via effec ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
(Show Context)
Abstract—In today’s distributed information systems, a large amount of monitoring data such as log files have been collected. These monitoring data at various points of a distributed information system provide unparallel opportunities for us to characterize and track the information system via effectively correlating all monitoring data across the distributed system. [1] proposed a concept named flow intensity to measure the intensity with which the monitoring data reacts to the volume of different user requests. The AutoRegressive model with eXogenous inputs (ARX) was used to quantify the relationship between each pair of flow intensity measured at various points across distributed systems. If such relationships hold all the time, they are considered as invariants of the underlying systems. Such invariants have been successfully used to characterize complex systems and support various system management tasks, such as system fault detection and localization. However, it is very timeconsuming to search the complete set of invariants of large scale systems and existing algorithms are not scalable for thousands of flow intensity measurements. To this end, in this paper, we develop effective pruning techniques based on the identified upper bounds. Accordingly, two efficient algorithms are proposed to search the complete set of invariants based on the pruning techniques. Finally we demonstrate the efficiency and effectiveness of our algorithms with both realworld and synthetic data sets.
Gaze Evidence for different activities in program understanding. 24 th Psychology of Programming Workshop
, 2012
"... Abstract We present an empirical study that illustrates the potential of dual eyetracking to detect successful understanding and social processes during pairprogramming. The gaze of forty pairs of programmers was recorded during a program understanding task. An analysis of the gaze transitions bet ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
(Show Context)
Abstract We present an empirical study that illustrates the potential of dual eyetracking to detect successful understanding and social processes during pairprogramming. The gaze of forty pairs of programmers was recorded during a program understanding task. An analysis of the gaze transitions between structural elements of the code, declarations of identifiers and expressions shows that pairs with better understanding do less systematic execution of the code and more “tracing ” of the data flow by alternating between identifiers and expressions. Interaction consists of moments where partners ’ attention converges on the same same part of the code and moments where it diverges. Moments of convergence are accompanied by more systematic execution of the code and less transitions among identifiers and expressions. 1
Direct Density Ratio Estimation with Dimensionality Reduction
"... Methods for directly estimating the ratio of two probability density functions without going through density estimation have been actively explored recently since they can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, conditional density estimation ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
(Show Context)
Methods for directly estimating the ratio of two probability density functions without going through density estimation have been actively explored recently since they can be used for various data processing tasks such as nonstationarity adaptation, outlier detection, conditional density estimation, feature selection, and independent component analysis. However, even the stateoftheart density ratio estimation methods still perform rather poorly in highdimensional problems. In this paper, we propose a new density ratio estimation method which incorporates dimensionality reduction into a density ratio estimation procedure. Our key idea is to identify a lowdimensional subspace in which the two densities corresponding to the denominator and the numerator in the density ratio are significantly different. Then the density ratio is estimated only within this lowdimensional subspace. Through numerical examples, we illustrate the effectiveness of the proposed method. 1
Changepoint detection of weak signals: How to use signal correlation?,” tech
, 2010
"... We explore the problem of how to exploit known signal temporal correlation is detecting weak signal. We model the signal as a time series with a known signal correlation structure, and proposed a novel maximum score test (MST) for weak signal detection. The MST avoids the computational expensive inv ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
We explore the problem of how to exploit known signal temporal correlation is detecting weak signal. We model the signal as a time series with a known signal correlation structure, and proposed a novel maximum score test (MST) for weak signal detection. The MST avoids the computational expensive inversion of covariance matrix in the maximum likelihood test. We develop analytic approximation of significant level and standardized power function of MST, which are shown to be very accurate by simulation. We compare the MST detector with the maximum likelihood detector without using signal correlation. 1
Density Ratio Estimation: A Comprehensive Review
, 2010
"... Density ratio estimation has attracted a great deal of attention in the statistics and machine learning communities since it can be used for solving various statistical data processing tasks such as nonstationarity adaptation, twosample test, outlier detection, independence test, feature selection ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Density ratio estimation has attracted a great deal of attention in the statistics and machine learning communities since it can be used for solving various statistical data processing tasks such as nonstationarity adaptation, twosample test, outlier detection, independence test, feature selection/extraction, independent component analysis, causal inference, and conditional probability estimation. When estimating the density ratio, it is preferable to avoid estimating densities since density estimation is known to be a hard problem. In this paper, we give a comprehensive review of density ratio estimation methods based on moment matching, probabilistic classification, and ratio matching.