Results 1 – 8 of 8
Selection of relevant features and examples in machine learning
Artificial Intelligence, 1997
Cited by 590 (2 self)
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.
General convergence results for linear discriminant updates
Machine Learning, 1997
Cited by 98 (0 self)
The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of “quasi-additive” algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including both Perceptron and Winnow, but also many new algorithms. Our proof hinges on analyzing a generic measure of progress construction that gives insight as to when and how such algorithms converge. Our measure of progress construction also permits us to obtain good mistake bounds for individual algorithms. We apply our unified analysis to new algorithms as well as existing algorithms. When applied to known algorithms, our method “automatically” produces close variants of existing proofs (recovering similar bounds), thus showing that, in a certain sense, these seemingly diverse results are fundamentally isomorphic. However, we also demonstrate that the unifying principles are more broadly applicable, and analyze a new class of algorithms that smoothly interpolate between the additive-update behavior of Perceptron and the multiplicative-update behavior of Winnow.
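The additive/multiplicative contrast this abstract describes can be sketched in a few lines. This is an illustrative mistake-driven linear-threshold loop, not the paper's unified quasi-additive framework; the learning rate and starting weights are assumptions:

```python
import numpy as np

def perceptron_update(w, x, y, eta=1.0):
    # Additive update on a mistake: w <- w + eta * y * x
    return w + eta * y * x

def winnow_update(w, x, y, eta=1.0):
    # Multiplicative update on a mistake: w_i <- w_i * exp(eta * y * x_i)
    return w * np.exp(eta * y * x)

def run_mistake_driven(update, X, Y):
    """Count mistakes of a linear-threshold learner that updates only on errors."""
    w = np.full(X.shape[1], 0.5)   # positive start (multiplicative updates need w > 0)
    mistakes = 0
    for x, y in zip(X, Y):
        yhat = 1 if w @ x >= 0 else -1   # linear-threshold prediction
        if yhat != y:                    # mistake-driven: update only on error
            mistakes += 1
            w = update(w, x, y)
    return mistakes
```

Both learners share the same prediction rule; only the update differs, which is exactly the axis along which the paper's quasi-additive class interpolates.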
Online Learning with Delayed Label Feedback
 In Proceedings of the 16th Annual International Conference on Algorithmic Learning Theory
Cited by 8 (1 self)
We generalize online learning to handle delays in receiving labels for instances. After receiving an instance x, the algorithm may need to make predictions on several new instances before the label for x is returned by the environment. We give two simple techniques for converting a traditional online algorithm into an algorithm for solving a delayed online problem. One technique is for instances generated by an adversary; the other is for instances generated by a distribution. We show how these techniques affect the original online mistake bounds by giving upper bounds and restricted lower bounds on the number of mistakes.
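One natural reduction consistent with the adversarial setting described above is to run delay + 1 independent copies of the base learner in round robin, so that each copy receives its own label before it must predict again. The class names and the FIFO-label assumption below are illustrative, not necessarily the paper's exact construction:

```python
from collections import deque

class DelayedWrapper:
    """Round-robin over (delay + 1) independent copies of a base online
    learner; with labels arriving exactly `delay` steps late, each copy
    effectively experiences an undelayed online problem."""

    def __init__(self, make_learner, delay):
        self.copies = [make_learner() for _ in range(delay + 1)]
        self.t = 0
        self.pending = deque()   # (copy index, instance) awaiting labels

    def predict(self, x):
        i = self.t % len(self.copies)
        self.pending.append((i, x))
        self.t += 1
        return self.copies[i].predict(x)

    def feedback(self, y):
        # Assumption: labels return in the order instances were issued (FIFO).
        i, x = self.pending.popleft()
        self.copies[i].update(x, y)
```

Under this reduction, each copy's standard mistake bound applies to its own subsequence, which is why the delayed bound degrades by roughly a factor of delay + 1.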
Transforming Linear-threshold Learning Algorithms into Multiclass Linear Learning Algorithms
2001
Cited by 4 (3 self)
In this paper, we present a new type of multiclass learning algorithm called a linear-max algorithm. Linear-max algorithms learn with a special type of attribute called a subexpert. A subexpert is a vector attribute that has a value for each output class. The goal of the multiclass algorithm is to learn a linear function combining the subexperts and to use this linear function to make correct class predictions. We prove that, in the online mistake-bounded model of learning, these multiclass learning algorithms have the same mistake bounds as a related two-class linear-threshold algorithm. We also show how subexperts can be used to solve more traditional problems composed of real-valued attributes. This leads to a natural extension of the algorithm to multiclass problems that contain both traditional attributes and subexperts.
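A rough sketch of the subexpert idea: each subexpert supplies a vote vector over the classes, and the learner predicts the class maximizing the weighted sum. The update rule below, a Winnow-style promotion of subexperts that ranked the true class above the predicted one, is an assumption for illustration, not the paper's algorithm:

```python
import numpy as np

def linearmax_predict(w, subexperts):
    """subexperts: (n, k) matrix -- each of the n subexperts gives a vote
    vector over the k classes; w: one weight per subexpert."""
    scores = w @ subexperts          # shape (k,): combined score per class
    return int(np.argmax(scores))

def promote_on_mistake(w, subexperts, y_true, y_pred, alpha=2.0):
    # Assumed multiplicative update: on a mistake, promote subexperts
    # that ranked the true class above the predicted class.
    if y_pred != y_true:
        better = subexperts[:, y_true] > subexperts[:, y_pred]
        w = np.where(better, alpha * w, w)
    return w
```

The argmax prediction is what makes this "linear-max": a single linear function over subexperts, with the maximizing coordinate as the predicted class.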
On Bayes Methods for On-Line Boolean Prediction
© 1998 Springer-Verlag New York Inc.
We examine a general Bayesian framework for constructing online prediction algorithms in the experts setting. These algorithms predict the bits of an unknown Boolean sequence using the advice of a finite set of experts. In this framework we use probabilistic assumptions on the unknown sequence to motivate prediction strategies. However, the relative bounds that we prove on the number of prediction mistakes made by these strategies hold for any sequence. The Bayesian framework provides a unified derivation and analysis of previously known prediction strategies, such as the Weighted Majority and Binomial Weighting algorithms. Furthermore, it provides a principled way of automatically adapting the parameters of Weighted Majority to the sequence, in contrast to previous ad hoc doubling techniques. Finally, we discuss the generalization of our methods to algorithms making randomized predictions.
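The deterministic Weighted Majority algorithm the abstract refers to can be sketched as follows (β = 1/2 is the classic demotion factor; the function name is ours):

```python
import numpy as np

def weighted_majority(expert_preds, labels, beta=0.5):
    """Deterministic Weighted Majority: predict the higher-weighted bit,
    then multiply the weights of the experts that erred by beta.

    expert_preds: (T, n) array of 0/1 predictions, one column per expert.
    labels: length-T sequence of true bits."""
    w = np.ones(expert_preds.shape[1])
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        yhat = 1 if w[preds == 1].sum() >= w[preds == 0].sum() else 0
        if yhat != y:
            mistakes += 1
        w[preds != y] *= beta          # demote every wrong expert
    return mistakes, w
```

With n experts and a best expert making m mistakes, the standard bound for β = 1/2 is O(m + log n) total mistakes; the paper's contribution is adapting such parameters to the sequence in a principled Bayesian way rather than by doubling tricks.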
Direct and indirect algorithms for online learning
www.elsevier.com/locate/tcs
unknown title
General convergence results for linear discriminant updates. The problem of learning linear discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of quasi-additive algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers much of this class, including both Perceptron and Winnow but also many novel algorithms. Our proof introduces a generic measure of progress that seems to capture much of when and how these algorithms converge. Using this measure, we develop a simple general technique for proving mistake bounds, which we apply to the new algorithms as well as existing algorithms. When applied to known algorithms, our technique “automatically” produces close variants of existing proofs (and we generally obtain the known bounds, to within constants), thus showing, in a certain sense, that these seemingly diverse results are fundamentally isomorphic.
Selection of relevant features and examples in machine learning
Artificial Intelligence, 1995
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area. © 1997 Elsevier Science B.V.