Results 1–10 of 2,537,590
Correcting sample selection bias by unlabeled data
"... We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We prese ..."
Cited by 203 (11 self)
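The correction this line of work targets can be illustrated with plain importance weighting: reweight training points by the density ratio p_test(x)/p_train(x). (The paper itself proposes a method that avoids explicit density estimation; the densities below are assumed known purely for illustration.)

```python
import math
import random

def gauss_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

random.seseed = None  # (no-op; seeding is done below)
random.seed(0)

# Training inputs come from N(0, 1); the test distribution is N(1, 1).
train = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Importance weight w(x) = p_test(x) / p_train(x), computable here because
# both densities are assumed known (in practice they must be estimated).
weights = [gauss_pdf(x, 1.0) / gauss_pdf(x, 0.0) for x in train]

# The unweighted sample mean estimates the training mean (about 0); the
# self-normalized weighted mean recovers the test-distribution mean (about 1)
# from training data alone.
unweighted = sum(train) / len(train)
weighted = sum(w * x for w, x in zip(weights, train)) / sum(weights)
print(unweighted, weighted)
```

The self-normalized form (dividing by the sum of weights rather than by n) is the usual choice because it is invariant to a common scaling of the weights.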
Feature selection: Evaluation, application, and small sample performance
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
"... Abstract—A large number of algorithms have been proposed for feature subset selection. Our experimental results show that the sequential forward floating selection (SFFS) algorithm, proposed by Pudil et al., dominates the other algorithms tested. We study the problem of choosing an optimal feature s ..."
Cited by 457 (13 self)
... feature selection in small sample size situations. Index Terms: feature selection, curse of dimensionality, genetic algorithm, node pruning, texture models, SAR image classification.
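The SFFS procedure the snippet refers to alternates a greedy inclusion step with a conditional exclusion step that backtracks whenever dropping a feature beats the best subset previously found at the smaller size. A simplified sketch, with a toy scoring function standing in for a real criterion such as classifier accuracy:

```python
def sffs(features, score, k):
    """Sequential forward floating selection (simplified sketch).

    features: list of candidate feature ids
    score: function mapping a frozenset of features to a quality number
    k: target subset size
    """
    selected = []
    best = {}  # best score seen so far for each subset size
    while len(selected) < k:
        # Inclusion: add the single best remaining feature.
        f = max((f for f in features if f not in selected),
                key=lambda f: score(frozenset(selected + [f])))
        selected.append(f)
        best[len(selected)] = score(frozenset(selected))
        # Conditional exclusion: drop a feature if doing so strictly beats
        # the best subset previously found at the smaller size.
        while len(selected) > 2:
            worst = max(selected,
                        key=lambda f: score(frozenset(s for s in selected if s != f)))
            reduced = score(frozenset(s for s in selected if s != worst))
            if reduced > best.get(len(selected) - 1, float("-inf")):
                selected.remove(worst)
                best[len(selected)] = reduced
            else:
                break
    return selected

# Toy score: features 1, 2, 3 are useful; everything else slightly hurts.
def toy_score(s):
    useful = {1, 2, 3}
    return len(s & useful) - 0.1 * len(s - useful)

print(sffs([0, 1, 2, 3, 4], toy_score, 3))
```

Because the exclusion step only fires on a strict improvement over a recorded best, the floating search cannot cycle indefinitely on a bounded score.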
Sample Selection Models in R: Package sampleSelection
"... This introduction to the R package sampleSelection is a slightly modified version of Toomet and Henningsen (2008b), published in the Journal of Statistical Software. This paper describes the implementation of Heckman-type sample selection models in R. We discuss the sample selection problem as well ..."
Cited by 8 (2 self)
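The Heckman two-step logic behind such packages can be sketched directly (in Python rather than R; to keep the sketch short, the first-stage selection coefficients are taken as known instead of being estimated by probit, which is what the real procedure does):

```python
import math
import random

def norm_pdf(t):
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def norm_cdf(t):
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def ols(X, y):
    """Least-squares coefficients via Gauss-Jordan on the normal equations (small k)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for i in range(k):
        p = A[i][i]
        A[i] = [v / p for v in A[i]]
        for j in range(k):
            if j != i:
                A[j] = [a - A[j][i] * b for a, b in zip(A[j], A[i])]
    return [row[k] for row in A]

random.seed(0)
beta0, beta1, rho = 1.0, 2.0, 0.8

# Outcome y = beta0 + beta1*x + u is observed only when x + v > 0,
# with corr(u, v) = rho, so OLS on the observed sample is biased.
obs = []
for _ in range(50000):
    x = random.gauss(0, 1)
    v = random.gauss(0, 1)
    u = rho * v + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    if x + v > 0:  # selection equation (coefficients assumed known here)
        lam = norm_pdf(x) / norm_cdf(x)  # inverse Mills ratio at the index
        obs.append((x, lam, beta0 + beta1 * x + u))

naive = ols([[1.0, x] for x, _, y in obs], [y for _, _, y in obs])
heckit = ols([[1.0, x, lam] for x, lam, y in obs], [y for _, _, y in obs])
# naive[1] is pulled away from 2; heckit approximately recovers (beta0, beta1, rho)
print(naive[1], heckit)
```

Adding the inverse Mills ratio as an extra regressor is exactly the second step of the two-step estimator: it absorbs the selection-induced term E[u | selected] = rho * lambda(index).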
Sample selection for statistical parsing
In Proceedings of the ISMB BioLINK, 2007
"... Corpus-based statistical parsing relies on using large quantities of annotated text as training examples. Building this kind of resource is expensive and labor-intensive. This work proposes to use sample selection to find helpful training examples and reduce human effort spent on annotating less inf ..."
Cited by 49 (1 self)
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
International Joint Conference on Artificial Intelligence, 1995
"... We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), te ..."
Cited by 1249 (11 self)
... For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation ...
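The ten-fold stratified cross-validation the study recommends hinges on folds that preserve class proportions. A minimal sketch of such a splitter (a hypothetical helper, not code from the paper):

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Split example indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal indices round-robin, per class
    return folds

# 70 negatives and 30 positives: every fold keeps the 70/30 balance.
labels = [0] * 70 + [1] * 30
folds = stratified_kfold(labels, k=10)
print([sum(labels[i] for i in f) for f in folds])
```

Each fold is then held out once as the test set while the model is trained on the other nine; stratification keeps every test fold representative of the overall class distribution.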
Lag length selection and the construction of unit root tests with good size and power
Econometrica, 2001
"... It is widely known that when there are errors with a moving-average root close to −1, a high order augmented autoregression is necessary for unit root tests to have good size, but that information criteria such as the AIC and the BIC tend to select a truncation lag (k) that is very small. We conside ..."
Cited by 532 (14 self)
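The information criteria the paper critiques are simple to state. A sketch of BIC-based lag selection for an autoregression, here with iid errors, where plain BIC behaves well (the paper's point is that it underselects when there is a large negative moving-average root):

```python
import math
import random

def resid_var(X, y):
    """Residual variance of least squares, via Gauss-Jordan on the normal equations."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for i in range(k):
        p = A[i][i]
        A[i] = [v / p for v in A[i]]
        for j in range(k):
            if j != i:
                A[j] = [a - A[j][i] * b for a, b in zip(A[j], A[i])]
    b = [row[k] for row in A]
    return sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2
               for r, yi in zip(X, y)) / len(y)

random.seed(1)
T, kmax = 3000, 4
y = [0.0, 0.0]
for _ in range(200 + T):             # simulate an AR(2): true lag length is 2
    y.append(0.5 * y[-1] + 0.3 * y[-2] + random.gauss(0, 1))
y = y[200:]                          # drop burn-in

def bic(k):
    """BIC of an AR(k) fit with intercept, on a common effective sample."""
    rows = range(kmax, len(y))
    X = [[1.0] + [y[t - j] for j in range(1, k + 1)] for t in rows]
    s2 = resid_var(X, [y[t] for t in rows])
    return len(X) * math.log(s2) + (k + 1) * math.log(len(X))

best_k = min(range(1, kmax + 1), key=bic)
print("selected lag:", best_k)
```

All candidate lags are fit on the same effective sample so their likelihood terms are comparable; the log(n)-per-parameter penalty is what makes BIC conservative about extra lags.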
Propensity Score Matching Methods For Non-Experimental Causal Studies
2002
"... This paper considers causal inference and sample selection bias in non-experimental settings in which: (i) few units in the non-experimental comparison group are comparable to the treatment units; and (ii) selecting a subset of comparison units similar to the treatment units is difficult because uni ..."
Cited by 690 (3 self)
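The matching step this literature builds on can be sketched as nearest-neighbor matching on an already-estimated propensity score. (The paper's actual contribution concerns doing this well when overlap between groups is poor; the toy scores and outcomes below are made up.)

```python
def nn_match_att(treated, controls):
    """Nearest-neighbor propensity-score matching, with replacement.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Returns the matched estimate of the average treatment effect on the treated.
    """
    diffs = []
    for ps, y in treated:
        # match each treated unit to the control with the closest score
        ps_c, y_c = min(controls, key=lambda c: abs(c[0] - ps))
        diffs.append(y - y_c)
    return sum(diffs) / len(diffs)

# Toy data: outcome is 10*score + 2 for treated units, 10*score for controls,
# so the true effect on the treated is 2.
treated = [(0.8, 10.0), (0.6, 8.0), (0.7, 9.0)]
controls = [(0.81, 8.1), (0.59, 5.9), (0.33, 3.3), (0.71, 7.1)]
print(nn_match_att(treated, controls))  # close to 2, up to score mismatch
```

The far-off control at score 0.33 is simply never matched, which is the mechanism by which matching discards incomparable comparison units.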
Selective sampling using the Query by Committee algorithm
Machine Learning, 1997
"... We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the numbe ..."
Cited by 421 (7 self)
We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the number of queries. We show that, in particular, this exponential decrease holds for query learning of perceptrons.
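A minimal sketch of the two-member committee in action: two perceptrons started from different weights scan a random stream and request a label only where they disagree, so labeling effort concentrates on informative points. (The target vector and stream here are made up for illustration, and this is a simplification of the analyzed algorithm, which samples committee members from the version space.)

```python
import random

def perceptron_update(w, x, y):
    """Classic perceptron rule: update only when (x, y) is misclassified."""
    if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

random.seed(0)
target = [1.0, -0.5]                        # unknown concept: sign(<target, x>)
w1, w2 = [0.3, 1.0], [-1.0, 0.2]            # committee of two perceptrons
queries = 0
for _ in range(3000):                       # random stream of unlabeled inputs
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    if predict(w1, x) != predict(w2, x):    # committee disagrees: query the label
        y = 1 if sum(t * xi for t, xi in zip(target, x)) > 0 else -1
        w1 = perceptron_update(w1, x, y)
        w2 = perceptron_update(w2, x, y)
        queries += 1
print("labels queried:", queries)
```

Only the mistaken member actually moves on each query, so the two hypotheses converge toward each other and toward the target, and disagreements become rare — the stream is consumed almost entirely without labels.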
Bayesian Model Selection in Social Research (with Discussion by Andrew Gelman & Donald B. Rubin, and Robert M. Hauser, and a Rejoinder)
Sociological Methodology 1995, edited by Peter V. Marsden. Cambridge, Mass.: Blackwells.
"... It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a singl ..."
Cited by 548 (21 self)
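The BIC approximation to the Bayes factor that this line of argument advocates is easy to apply: compute BIC for each candidate regression and prefer the lower value, since the BIC difference approximates twice the log Bayes factor. A sketch comparing two nested regressions on simulated data (the helper and data are ad hoc illustrations):

```python
import math
import random

def rss(X, y):
    """Residual sum of squares of least squares (Gauss-Jordan, small k only)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for i in range(k):
        p = A[i][i]
        A[i] = [v / p for v in A[i]]
        for j in range(k):
            if j != i:
                A[j] = [a - A[j][i] * b for a, b in zip(A[j], A[i])]
    b = [row[k] for row in A]
    return sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2
               for r, yi in zip(X, y))

def bic(X, y):
    """Gaussian-regression BIC: n*log(RSS/n) + k*log(n)."""
    n, k = len(y), len(X[0])
    return n * math.log(rss(X, y) / n) + k * math.log(n)

random.seed(0)
n = 20000
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
y = [2.0 * x1 + random.gauss(0, 1) for x1, _ in data]   # x2 is irrelevant

small = bic([[1.0, x1] for x1, _ in data], y)
full = bic([[1.0, x1, x2] for x1, x2 in data], y)
print("prefer small model:", small < full)
```

Because the penalty per parameter grows like log(n), BIC resists adding the irrelevant regressor even at sample sizes where significance tests start flagging tiny, unimportant effects.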