Results 1–6 of 6
Noise-adaptive margin-based active learning for multi-dimensional data, arXiv Preprint
"... We present and analyze an adaptive marginbased algorithm that actively learns the optimal linear separator for multidimensional data. The algorithm has the capacity of adapting to unknown level of label noise in the underlying distribution, making it suitable for model selection under the active l ..."
Abstract

Cited by 3 (0 self)
We present and analyze an adaptive margin-based algorithm that actively learns the optimal linear separator for multi-dimensional data. The algorithm has the capacity to adapt to an unknown level of label noise in the underlying distribution, making it suitable for model selection in the active learning setting. Compared to other agnostic active learning algorithms, our proposed method is much simpler and achieves the optimal convergence rate in the query budget T and the data dimension d, up to logarithmic factors. Furthermore, our algorithm can handle classification loss functions other than the 0-1 loss, such as the hinge and logistic losses, and is hence computationally feasible.
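The margin-based strategy this abstract describes, querying labels only for points inside a shrinking band around the current hypothesis and updating with a surrogate loss such as the hinge, can be sketched as follows. This is a minimal illustrative sketch, not the paper's algorithm: the uniform sphere sampler, the band-shrinking schedule, and the perceptron-style hinge update are all simplifying assumptions.

```python
import math
import random

def margin_based_active_learning(oracle, dim, budget, band_decay=0.5, lr=0.1):
    """Illustrative margin-based active learner for a homogeneous linear
    separator. Points are drawn uniformly from the unit sphere, but a
    label is requested only when the point falls inside the current
    margin band around the hypothesis; misclassified queries trigger a
    perceptron-style (hinge-loss) update. `oracle(x)` returns a label
    in {-1, +1}."""
    w = [1.0] + [0.0] * (dim - 1)            # initial unit-norm hypothesis
    band = 1.0                               # current margin-band half-width
    queries = 0
    epoch = max(budget // 4, 1)              # shrink the band four times
    while queries < budget:
        # draw a uniform point on the unit sphere
        x = [random.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        margin = sum(wi * xi for wi, xi in zip(w, x))
        if abs(margin) > band:
            continue                         # outside the band: no query spent
        y = oracle(x)
        queries += 1
        if y * margin <= 0:                  # hinge violation: update, renormalize
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            wnorm = math.sqrt(sum(v * v for v in w))
            w = [v / wnorm for v in w]
        if queries % epoch == 0:
            band *= band_decay
    return w
```

With a noiseless oracle for a fixed separator, the hypothesis rotates toward the true one while the query band narrows; concentrating the query budget near the current decision boundary is the mechanism behind the budget- and dimension-dependent rates the abstract refers to.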
An Analysis of Active Learning With Uniform Feature Noise
"... In active learning, the user sequentially chooses values for feature X and an oracle returns the corresponding label Y. In this paper, we consider the effect of feature noise in active learning, which could arise either because X itself is being measured, or it is corrupted in transmission to the or ..."
Abstract

Cited by 2 (1 self)
In active learning, the user sequentially chooses values for the feature X and an oracle returns the corresponding label Y. In this paper, we consider the effect of feature noise in active learning, which could arise because X itself is measured with error, because X is corrupted in transmission to the oracle, or because the oracle returns the label of a noisy version of the query point. In statistics, feature noise is known as “errors in variables” and has been studied extensively in non-active settings. However, the effect of feature noise in active learning has not been studied before. We consider the well-known Berkson errors-in-variables model with additive uniform noise of width σ. Our simple but revealing setting is one-dimensional binary classification, where the goal is to learn a threshold (the point where the probability of a + label crosses half). We deal with regression functions that are antisymmetric in a region of size σ around the threshold and also satisfy Tsybakov’s margin condition around the threshold. We prove minimax lower and upper bounds which demonstrate that when σ is smaller than the minimax active/passive noiseless error derived in Castro & Nowak (2007), noise has no effect on the rates and one achieves the same noiseless rates. For larger σ, the unflattening of the regression function on convolution with uniform noise, along with its local antisymmetry around the threshold, together yield a behaviour where noise appears to be beneficial. Our key result is that active learning can buy significant improvement over a passive strategy even in the presence of feature noise.
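A minimal simulation of the Berkson feature-noise setting this abstract describes: a query at x is answered with the noiseless label of x + U(−σ, σ), so responses near the threshold become random. The repeated-query majority-vote bisection below is an illustrative strategy, not the paper's estimator.

```python
import random

def noisy_bisection(theta, sigma, depth=30, reps=201, seed=0):
    """Illustrative 1-D threshold search under Berkson feature noise:
    a query at x is answered with the noiseless label of
    x + U(-sigma, sigma), so answers near the threshold `theta` are
    random. A majority vote over repeated queries decides which side
    of the threshold the probe lies on; for probes further than sigma
    from theta the vote is exact."""
    rng = random.Random(seed)

    def oracle(x):
        z = x + rng.uniform(-sigma, sigma)   # feature noise hits the query point
        return 1 if z >= theta else -1

    lo, hi = 0.0, 1.0
    for _ in range(depth):
        mid = (lo + hi) / 2.0
        votes = sum(oracle(mid) for _ in range(reps))
        if votes > 0:                        # majority +1: threshold left of mid
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0
```

When σ is negligible this degenerates to plain bisection; for larger σ the vote is only noisy within a σ-neighborhood of the threshold, so the residual error stays a small fraction of σ, consistent with the abstract's claim that small feature noise does not change the rates.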
Quantile Search: A Distance-Penalized Active Learning Algorithm for Spatial Sampling
"... Abstract — Adaptive sampling theory has shown that, with proper assumptions on the signal class, algorithms exist to reconstruct a signal in Rd with an optimal number of samples. We generalize this problem to when the cost of sampling is not only the number of samples but also the distance traveled ..."
Abstract

Cited by 1 (1 self)
Adaptive sampling theory has shown that, with proper assumptions on the signal class, algorithms exist to reconstruct a signal in R^d with an optimal number of samples. We generalize this problem to the case where the cost of sampling is not only the number of samples but also the distance traveled between samples. This is motivated by our work studying regions of low oxygen concentration in the Great Lakes. We show that for one-dimensional threshold classifiers, a tradeoff between the number of samples and the distance traveled can be achieved using a generalization of binary search, which we refer to as quantile search. We derive the expected total sampling time for noiseless measurements and the expected number of samples for an extension to the noisy case. We illustrate our results in simulations relevant to our sampling application.
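A sketch of the samples-versus-travel tradeoff for one-dimensional threshold search in the noiseless case. The probe-placement rule below (sample a fraction m into the uncertainty interval from the end where the sampler currently sits) is a simplified reading of quantile search; m = 0.5 recovers plain binary search.

```python
def quantile_search(is_right, m=0.5, tol=1e-3):
    """Illustrative distance-penalized threshold search on [0, 1].
    Each probe is placed a fraction m into the current uncertainty
    interval from whichever end the sampler currently sits at, so the
    per-step travel is m times the interval width; m = 0.5 recovers
    binary search. `is_right(x)` is True iff x lies at or beyond the
    threshold. Returns (estimate, n_samples, total_travel)."""
    lo, hi = 0.0, 1.0
    pos = 0.0                                # physical position of the sampler
    samples, travel = 0, 0.0
    while hi - lo > tol:
        if abs(pos - lo) <= abs(pos - hi):   # probe from the nearer end
            x = lo + m * (hi - lo)
        else:
            x = hi - m * (hi - lo)
        travel += abs(x - pos)
        pos = x
        samples += 1
        if is_right(x):
            hi = x                           # threshold is at or left of x
        else:
            lo = x                           # threshold is right of x
    return (lo + hi) / 2.0, samples, travel
```

For a fixed threshold, a smaller m typically takes more samples but accumulates less total travel than m = 0.5, which is exactly the tradeoff between sample count and distance traveled that the abstract quantifies.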
Computational and statistical advances in testing and learning, 2015
"... This thesis makes fundamental computational and statistical advances in testing and estimation, making critical progress in theory and application of classical statistical methods like classification, regression and hypothesis testing, and understanding the relationships between them. Our work con ..."
Abstract
This thesis makes fundamental computational and statistical advances in testing and estimation, making critical progress in the theory and application of classical statistical methods like classification, regression and hypothesis testing, and in understanding the relationships between them. Our work connects multiple fields in often counterintuitive and surprising ways, leading to new theory, new algorithms, and new insights, and ultimately to a cross-fertilization of varied fields like optimization, statistics and machine learning. The first of three thrusts has to do with active learning, a form of sequential learning from feedback-driven queries that often has a provable statistical advantage over passive learning. We unify concepts from two seemingly different areas: active learning and stochastic first-order optimization. We use this unified view to develop new lower bounds for stochastic optimization using tools from active learning, and new algorithms for active learning using ideas from optimization. We also study the effect of feature noise, or errors-in-variables, on ...
Quantile Search: A Distance-Penalized Active Learning Algorithm for Spatial Sampling
"... Adaptive sampling theory has shown that, with proper assumptions on the signal class, algorithms exist to reconstruct a signal in Rd with an optimal number of samples. We generalize this problem to when the cost of sampling is not only the number of samples but also the distance traveled between sam ..."
Abstract
Adaptive sampling theory has shown that, with proper assumptions on the signal class, algorithms exist to reconstruct a signal in R^d with an optimal number of samples. We generalize this problem to the case where the cost of sampling is not only the number of samples but also the distance traveled between samples. This is motivated by our work studying regions of low oxygen concentration in the Great Lakes. We show that for one-dimensional threshold classifiers, a tradeoff between the number of samples and the distance traveled can be achieved using a generalization of binary search, which we refer to as quantile search. We derive the expected total sampling time for noiseless measurements and the expected number of samples for an extension to the noisy case. We illustrate our results in simulations relevant to our sampling application.
Noise-adaptive Margin-based Active Learning and Lower Bounds under Tsybakov Noise Condition
"... We present a simple noiserobust marginbased active learning algorithm to find homogeneous (passing the origin) linear separators and analyze its error convergence when labels are corrupted by noise. We show that when the imposed noise satisfies the Tsybakov low noise condition (Mammen, Tsybakov, ..."
Abstract
We present a simple noise-robust margin-based active learning algorithm to find homogeneous (passing through the origin) linear separators, and analyze its error convergence when labels are corrupted by noise. We show that when the imposed noise satisfies the Tsybakov low noise condition (Mammen, Tsybakov, and others 1999; Tsybakov 2004), the algorithm is able to adapt to an unknown level of noise and achieves the optimal statistical rate up to polylogarithmic factors. We also derive lower bounds for margin-based active learning algorithms under Tsybakov noise conditions (TNC) for the membership query synthesis scenario (Angluin 1988). Our result implies lower bounds for the stream-based selective sampling scenario (Cohn 1990) under TNC for some fairly simple data distributions. Quite surprisingly, we show that the sample complexity cannot be improved even if the underlying data distribution is as simple as the uniform distribution on the unit ball. Our proof involves the construction of a well-separated hypothesis set on the d-dimensional unit ball, along with carefully designed label distributions for the Tsybakov noise condition. Our analysis might provide insights for other forms of lower bounds as well.
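To make the Tsybakov low noise condition concrete, the sketch below builds a hypothetical label oracle whose label-flip probability grows as the query point approaches the decision boundary. The specific flip-rate formula is an illustrative choice, not taken from the paper.

```python
import random

def tsybakov_oracle(w, kappa, rng):
    """Hypothetical label oracle with a Tsybakov-style low-noise
    profile: the probability of flipping the true label at x is
    0.5 * (1 - min(1, |<w, x>|**(kappa - 1))), so labels are almost
    random on the decision boundary and almost clean far from it.
    kappa = 1 gives noiseless labels; larger kappa concentrates
    heavier noise near the boundary. Illustrative formula only."""
    def oracle(x):
        margin = sum(wi * xi for wi, xi in zip(w, x))
        y = 1 if margin >= 0 else -1
        if kappa <= 1:
            return y                          # noiseless labels
        flip = 0.5 * (1.0 - min(1.0, abs(margin) ** (kappa - 1)))
        return -y if rng.random() < flip else y
    return oracle
```

Querying this oracle near and far from the boundary shows why adaptivity matters: an algorithm that does not know kappa cannot tell in advance how many repeated queries a near-boundary point will need.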