Everything Old Is New Again: A Fresh Look at Historical Approaches in Machine Learning
- PhD thesis, MIT, 2002
Cited by 68 (5 self)
Almost-Everywhere Algorithmic Stability and Generalization Error
- In UAI-2002: Uncertainty in Artificial Intelligence, 2002
Cited by 34 (6 self)
We introduce a new notion of algorithmic stability, which we call training stability.
On the relation between low density separation, spectral clustering and graph cuts
- Advances in Neural Information Processing Systems (NIPS) 19, 2006
Cited by 8 (1 self)
One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary. Keywords: Clustering, Semi-Supervised Learning
Stable transductive learning
- Proc. 19th Annual Conference on Computational Learning Theory, 2006
Cited by 5 (2 self)
We develop a new error bound for transductive learning algorithms. The slack term in the new bound is a function of a relaxed notion of transductive stability, which measures the sensitivity of the algorithm to most pairwise exchanges of training and test set points. Our bound is based on a novel concentration inequality for symmetric functions of permutations. We also present a simple sampling technique that can estimate, with high probability, the weak stability of transductive learning algorithms with respect to a given dataset. We demonstrate the usefulness of our estimation technique on a well-known transductive learning algorithm.
Almost-Everywhere Algorithmic Stability and Generalization Error
- Samuel Kutin and Partha Niyogi
We explore in some detail the notion of algorithmic stability as a viable framework for analyzing the generalization error of learning algorithms. We introduce the new notion of training stability of a learning algorithm and show that, in a general setting, it is sufficient for good bounds on generalization error. In the PAC setting, training stability is both necessary and sufficient for learnability. The approach based on training stability makes no reference to VC dimension or VC entropy. There is no need to prove uniform convergence, and generalization error is bounded directly via an extended McDiarmid inequality. As a result it potentially allows us to deal with a broader class of learning algorithms than Empirical Risk Minimization. We also explore the relationships among VC dimension, generalization error, and various notions of stability. Several examples of learning algorithms are considered.
Stability Results in Learning Theory
- © World Scientific Publishing Company, 2005
The problem of proving generalization bounds for the performance of learning algorithms can be formulated as a problem of bounding the bias and variance of estimators of the expected error. We show how various stability assumptions can be employed for this purpose. We provide a necessary and sufficient stability condition for bounding the bias and variance for the Empirical Risk Minimization algorithm, and various sufficient conditions for bounding bias and variance of estimators for general algorithms. We discuss settings in which it is possible to obtain exponential bounds, and we prove an extension of the bounded-difference inequality for “almost always” stable algorithms.
Ranking Categorical Features Using Generalization Properties
Feature ranking is a fundamental machine learning task with various applications, including feature selection and decision tree learning. We describe and analyze a new feature ranking method that supports categorical features with a large number of possible values. We show that existing ranking criteria rank a feature according to the training error of a predictor based on the feature. This approach can fail when ranking categorical features with many values. We propose the Ginger ranking criterion, which estimates the generalization error of the predictor associated with the Gini index. We show that for almost all training sets, the Ginger criterion produces an accurate estimate of the true generalization error, regardless of the number of values in a categorical feature. We also address the question of finding the optimal predictor that is based on a single categorical feature. It is shown that the predictor associated with the misclassification error criterion has the minimal expected generalization error. We bound the bias of this predictor with respect to the generalization error of the Bayes optimal predictor, and analyze its concentration properties. We demonstrate the efficiency of our approach for feature selection and for learning decision trees in a series of experiments with synthetic and natural data sets.

