A Universal Noise Cleaning Procedure via the Dual Learning Problem
Abstract:
We present a new scheme for cleaning a sample corrupted by labeling noise. Our scheme is universal in the sense that we make only general assumptions on the dual learning problem, so it is completely detached from the specifics of the primal problem itself. In a nutshell, we turn to the dual learning problem to exploit valuable information about the underlying structure of the primal one, which in turn provides the means to devise a simple "noise cleaning" mechanism using Membership Queries (MQ). We demonstrate the strength and applicability of the suggested method on a few learning problems of different natures. Of particular interest is the problem of learning in the restricted class of parity functions, where only k out of n bits are active. We show that in the MQ model we can outperform the recent result by Blum et al. [3] and handle k = O(n - c log(n) log log(n)). This also provides a sharp separation between our method and the Statistical Query (SQ) model. The suggested procedure works not only for classification problems but for regression problems as well. To this end, we present a uniform upper bound on the fat-shattering dimension of both the primal and dual problems, which is derived from a geometric property of classes of real-valued functions called type.
Citations
| 528 | Queries and concept learning – Angluin - 1988 |
| 201 | Efficient noise-tolerant learning from statistical queries – Kearns - 1993 |
| 183 | Learning from noisy examples – Angluin, Laird - 1988 |
| 126 | Learning in the presence of malicious errors – Kearns, Li - 1993 |
| 125 | Combinatorial Geometry – Pach, Agarwal - 1995 |
| 122 | Learning decision trees using the Fourier spectrum – Kushilevitz, Mansour - 1993 |
| 35 | Noise-tolerant learning, the parity problem, and the statistical query model – Blum, Kalai, et al. |
| 31 | Prediction-preserving reducibility – Pitt, Warmuth - 1990 |
| 30 | Probabilistic methods in the geometry of Banach spaces. Probability and analysis (Varenna – Pisier - 1985 |
| 22 | On the sample complexity of pac-learning using random and chosen examples – Eisenberg, Rivest - 1990 |
| 4 | Learnability in Banach Spaces with Reproducing Kernels – Mendelson - 1999 |
| 3 | Noise tolerant learning using early predictors – Fine, Gilad-Bachrach, et al. - 1999 |

