FeatureBoost: A Meta Learning Algorithm that Improves Model Robustness
- In Proceedings of the Seventeenth International Conference on Machine Learning
, 2000
Cited by 16 (1 self)
Most machine learning algorithms are lazy: they extract from the training set the minimum information needed to predict its labels. Unfortunately, this often leads to models that are not robust when features are removed or obscured in future test data. For example, a backprop net trained to steer a car typically learns to recognize the edges of the road, but does not learn to recognize other features such as the stripes painted on the road which could be useful when road edges disappear in tunnels or are obscured by passing trucks. The net learns the minimum necessary to steer on the training set. In contrast, human driving is remarkably robust as features become obscured. Motivated by this, we propose a framework for robust learning that biases induction to learn many different models from the same inputs. We present a meta algorithm for robust learning called FeatureBoost, and demonstrate it on several problems using backprop nets, k-nearest neighbor, and decision trees.
Logical Analysis of Binary Data with Missing Bits
, 1999
Cited by 10 (4 self)
We model a given pair of sets of positive and negative examples, each of which may contain missing components, as a partially defined Boolean function with missing bits (pBmb) (T, F), where T ⊆ {0, 1, *}^n and F ⊆ {0, 1, *}^n, and "*" stands for a missing bit. Then we consider the problem of establishing a Boolean function (an extension) f : {0, 1}^n → {0, 1} belonging to a given function class C, such that f is true (resp., false) for every vector in T (resp., in F). This is a fundamental problem, encountered in many areas such as learning theory, pattern recognition, example-based knowledge bases, logical analysis of data, knowledge discovery and data mining. In this paper, depending upon how to deal with missing bits, we formulate three types of extensions called robust, consistent and most robust extensions, for various classes of Boolean functions such as general, positive, Horn, threshold, decomposable and k-DNF. The complexity of the associated p...
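The truncated abstract names robust and consistent extensions without defining them. The sketch below assumes one common reading: a robust extension is correct under every way of filling the missing bits, while a consistent extension is correct under at least one filling. The marker `*`, the function names, and the assumption that each vector's missing bits are filled independently are illustrative choices, not taken from the paper.

```python
from itertools import product

MISSING = "*"  # arbitrary marker for a missing bit, chosen for this sketch


def completions(vector):
    """Yield every 0/1 vector obtained by filling the missing bits."""
    slots = [(0, 1) if bit == MISSING else (bit,) for bit in vector]
    return product(*slots)


def is_robust_extension(f, T, F):
    """Assumed definition: f is true on every completion of every vector
    in T and false on every completion of every vector in F."""
    return (all(f(c) for a in T for c in completions(a))
            and all(not f(c) for b in F for c in completions(b)))


def is_consistent_extension(f, T, F):
    """Assumed definition: each vector has at least one completion that
    f classifies correctly (missing bits filled independently per vector)."""
    return (all(any(f(c) for c in completions(a)) for a in T)
            and all(any(not f(c) for c in completions(b)) for b in F))


# Example: f depends only on the first bit.
first_bit = lambda x: x[0] == 1
print(is_robust_extension(first_bit, [(1, MISSING)], [(0, MISSING)]))
print(is_robust_extension(first_bit, [(MISSING, 1)], [(0, 0)]))
print(is_consistent_extension(first_bit, [(MISSING, 1)], [(0, 0)]))
```

The brute-force enumeration is exponential in the number of missing bits per vector, so it only serves to make the definitions concrete on small inputs.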
Multiple-Instance Learning of Real-Valued Geometric Patterns
- Annals of Mathematics and Artificial Intelligence
, 2000
Cited by 9 (2 self)
Recently, there has been a significant amount of research studying the multiple-instance learning model, yet all of this work has only considered this model when there are boolean labels. However, in many of the application areas for which the multiple-instance model fits, real-valued labels are more appropriate than boolean labels. In this paper we define and study a real-valued multiple-instance model in which each multiple-instance example is given a real-valued classification in [0, 1]. The real-valued classification indicates the degree to which the example satisfies the target concept. To provide additional structure to the resulting learning problem, we associate a real-valued label with each point in the multiple-instance example. These values are then combined using a real-valued aggregation operator to obtain the classification for the example. Motivated by the possible application of learning geometric patterns to problems in pattern recognition and scene classification (with...
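The aggregation step described in this abstract can be sketched as follows. The abstract does not say which aggregation operator the paper uses, so `max` is only an assumed default here, and the function name `bag_label` is hypothetical.

```python
def bag_label(point_labels, aggregate=max):
    """Combine per-point labels in [0, 1] into one label for the whole
    multiple-instance example (the "bag").

    `aggregate` is an assumption of this sketch: max generalizes the
    boolean "at least one point is positive" rule, but any operator
    mapping a list of values in [0, 1] to [0, 1] could be plugged in.
    """
    labels = list(point_labels)
    if not labels:
        raise ValueError("a multiple-instance example needs at least one point")
    for value in labels:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"label {value} is outside [0, 1]")
    return aggregate(labels)


print(bag_label([0.2, 0.9, 0.5]))                 # max of the point labels
print(bag_label([0.2, 0.9, 0.5], aggregate=min))  # a stricter alternative
```

With boolean labels restricted to 0 and 1, the `max` choice reduces to the usual multiple-instance rule that a bag is positive when any of its points is positive.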
A General Dimension for Exact Learning
- In Proc. 14th ACM Conference on Computational Learning Theory
, 2001
Cited by 8 (4 self)
We introduce a new combinatorial dimension that gives a good approximation of the number of queries needed to learn in the exact learning model, no matter what set of queries is used. This new dimension generalizes previous dimensions, providing upper and lower bounds for all sorts of queries, not just for example-based queries as in previous works. Our new approach also gives simpler proofs of previous results. We present specific applications of our general dimension for the case of unspecified attribute value queries, and show that unspecified attribute value membership and equivalence queries are not more powerful than standard membership and equivalence queries for the problem of learning DNF formulas. Work supported in part by the EC through the Esprit Program EU BRA program under project 20244 (ALCOM-IT), the EC Working Group EP27150 (NeuroColt II) and the Spanish government grant PB980937-C04-04. Part of this work was done while this author was still in LSI, UPC.
A New Abstract Combinatorial Dimension for Exact Learning via Queries
, 2001
Cited by 8 (5 self)
We introduce an abstract model of exact learning via queries that can be instantiated to all the query learning models currently in use, while being closer to them than previous unifying attempts. We present a characterization of those Boolean function classes learnable in this abstract model, in terms of a new combinatorial notion that we introduce, the abstract identification dimension. Then we prove that the particularization of our notion to specific known protocols such as equivalence, membership, and membership and equivalence queries results in exactly the same combinatorial notions currently known to characterize learning in these models, such as strong consistency dimension, extended teaching dimension, and certificate size. Our theory thus fully unifies all these characterizations. For models enjoying a specific property that we identify, the notion can be simplified while keeping the same characterizations. From our results we can derive combinatorial characterizations of all those other models for query learning proposed in the literature. We can also obtain the first polynomial-query learning algorithms for specific interesting problems such as learning DNF with proper subset and superset queries.
Facilitating CBR for Incompletely-Described Cases: Distance Metrics for Partial Problem Descriptions
- In Proceedings of the Seventh European Conference on Case-Based Reasoning
, 2004
Cited by 8 (4 self)
A fundamental problem for case-based reasoning systems is how to select relevant prior cases. Numerous strategies have been developed for determining the similarity of prior cases, given full descriptions of the problem at hand, and situation assessment methods have been developed for formulating appropriate initial case descriptions. However, in real-world applications, attempting to determine all relevant features of a new problem before retrieval may be impractical or impossible. Consequently, how to guide retrieval based on partial problem descriptions is an important question for CBR. This paper examines the problem of assessing similarity in partially-described cases. It proposes a set of similarity assessment strategies for handling missing information, evaluates their performance and efficiency on sample data sets, and discusses their tradeoffs.
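The abstract does not list the paper's strategies, so the sketch below only illustrates the general idea with two generic options: skip features the query leaves unspecified, or charge them a fixed penalty. The function names, the `missing_penalty` parameter, and the assumption that feature values are normalized to [0, 1] are all hypothetical.

```python
def partial_distance(query, case, missing_penalty=None):
    """Mean absolute difference between a partial query and a stored case.

    Features absent from `query` (or set to None) count as missing.
    missing_penalty=None skips them; a number charges that fixed
    distance instead. Both behaviours are illustrative assumptions.
    """
    total, counted = 0.0, 0
    for name, stored in case.items():
        value = query.get(name)
        if value is None:
            if missing_penalty is None:
                continue  # strategy 1: ignore the unspecified feature
            total += missing_penalty  # strategy 2: fixed penalty
        else:
            total += abs(value - stored)
        counted += 1
    # With nothing to compare, treat the case as maximally distant.
    return total / counted if counted else float("inf")


def retrieve(query, cases, k=1, missing_penalty=None):
    """Return the names of the k stored cases closest to the query."""
    ranked = sorted(
        cases,
        key=lambda name: partial_distance(query, cases[name], missing_penalty),
    )
    return ranked[:k]


cases = {"c1": {"a": 0.0, "b": 1.0}, "c2": {"a": 0.9, "b": 0.0}}
print(retrieve({"a": 0.5}, cases, k=2))
```

Skipping missing features favors cases that agree on what is known, while a penalty keeps sparsely described queries from matching everything equally well; which tradeoff is preferable depends on the data.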
DL-FOIL - Concept Learning in Description Logics
- In Proc. of ILP2008, volume 5194 of LNAI
, 2008
Cited by 7 (4 self)
In this paper we focus on learning concept descriptions expressed in Description Logics. After stating the learning problem in this context, a FOIL-like algorithm is presented that can be applied to general DL languages, discussing related theoretical aspects of learning with the inherent incompleteness underlying the semantics of this representation. Subsequently we present an experimental evaluation of the implementation of this algorithm performed on some real ontologies in order to empirically assess its performance.
On Learning in the Presence of Unspecified Attribute Values (Extended Abstract)
- In: Proceedings of the Twelfth Annual Conference on Computational Learning Theory
, 1999
Cited by 6 (0 self)
Nader H. Bshouty, David K. Wilson. Department of Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada T2N 1N4. Email: {bshouty, wilsond}@cpsc.ucalgary.ca
We continue the study of learning in the presence of unspecified attribute values (UAV) where some of the attributes of the examples may be unspecified [9, 4]. A UAV assignment x ∈ {0, 1, ?}^n, where ? indicates unspecified, is classified positive (negative) with respect to a Boolean function f if all possible assignments for the unspecified attributes result in a positive (negative) classification. Otherwise, the classification of x is ?. Given an example x ∈ {0, 1, ?}^n, the oracle UAV-MQ(x) responds with the classification of x with respect to the unknown target. Given a hypothesis h, the oracle UAV-EQ returns an example x ∈ {0, 1, ?}^n for which h(x) is incorrect, if such an example exists. The new contributions of this paper are as follows. First we define a new oracle called the ...
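The classification rule for UAV assignments can be made concrete with a brute-force sketch of what a UAV membership oracle would answer for a known function f. The name `uav_classify` and the encoding of the three answers as 1, 0, and "?" are assumptions of this sketch, and the enumeration is exponential in the number of unspecified attributes.

```python
from itertools import product

UNSPECIFIED = "?"


def uav_classify(f, x):
    """Classify a UAV assignment x (entries 0, 1 or "?") against f.

    Returns 1 if every completion of the unspecified attributes is
    positive, 0 if every completion is negative, and "?" otherwise.
    """
    slots = [(0, 1) if v == UNSPECIFIED else (v,) for v in x]
    outcomes = {bool(f(c)) for c in product(*slots)}
    if outcomes == {True}:
        return 1
    if outcomes == {False}:
        return 0
    return UNSPECIFIED


# Example target: the conjunction of the first two attributes.
conj = lambda a: a[0] and a[1]
print(uav_classify(conj, (1, 1)))    # fully specified positive example
print(uav_classify(conj, (0, "?")))  # negative whatever "?" becomes
print(uav_classify(conj, (1, "?")))  # depends on the unspecified value
```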
Structural Results about Exact Learning with Unspecified Attribute Values
, 1998
Cited by 6 (0 self)
This paper deals with the UAV learning model of Goldman, Kwek and Scott [7]. ("UAV" is the acronym for "Unspecified Attribute Values".) As in [7], we consider exact learning within the UAV framework, where the learner has to exactly identify an unknown target concept by means of UAV membership (UAV-MQs) and/or UAV equivalence queries (UAV-EQs or UAV-ARB-EQs, respectively). A smooth transition between exact learning in the UAV setting and standard exact learning is obtained by putting a fixed bound r on the number of unspecified attribute values per instance. For r = 0, we obtain the standard model. For r = n (the total number of attributes), we obtain the (unrestricted) UAV model. Between these extremes, we find the hierarchies (UAV-MQ_r)_{0≤r≤n}, (UAV-EQ_r)_{0≤r≤n}, and (UAV-ARB-EQ_r)_{0≤r≤n}. Our main results are as follows. We present lower bounds on the number of ARB-EQs and UAV-MQs in terms of the Vapnik-Chervonenkis dimension of the concept class. We show furthermore that a...
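To see how the bound r interpolates between the two extremes, one can count the instances over n attributes that have at most r unspecified values. This count is an illustration added here, not a result quoted from the paper, and the function name is hypothetical: choosing which k attributes are unspecified and fixing the remaining n - k bits gives C(n, k) * 2^(n - k) instances for each k.

```python
from math import comb


def count_instances(n, r):
    """Number of vectors in {0, 1, ?}^n with at most r unspecified entries."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return sum(comb(n, k) * 2 ** (n - k) for k in range(r + 1))


n = 3
for r in range(n + 1):
    print(r, count_instances(n, r))
```

For r = 0 the count is 2^n, the fully specified instances of the standard model, and for r = n it is 3^n, the whole unrestricted UAV instance space.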
A General Dimension for Query Learning
Cited by 4 (2 self)
We introduce a new combinatorial dimension that characterizes the number of queries needed to learn, no matter what set of queries is used. This new dimension generalizes previous dimensions, providing upper and lower bounds on the query complexity for all sorts of queries, not just for example-based queries as in previous works. Moreover, the new characterization is valid not only for exact learning but also for approximate learning. We present several ... Results from sections 4 and 5 were presented at COLT/EUROCOLT 2001 [4]; results from sections 7 and 8 were presented at ALT 2002 [24].

