The CN2 Induction Algorithm
Machine Learning, 1989
Cited by 682 (6 self)
Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present. Implementations of the CN2, ID3 and AQ algorithms are compared on three medical classification tasks.
Speaker recognition: A tutorial
Cited by 121 (1 self)
A tutorial on the design and development of automatic speaker-recognition systems is presented. Automatic speaker recognition is the use of a machine to recognize a person from a spoken phrase. These systems can operate in two modes: to identify a particular person or to verify a person’s claimed identity. Speech processing and the basic components of automatic speaker-recognition systems are shown and design tradeoffs are discussed. Then, a new automatic speaker-recognition system is given. This recognizer performs with 98.9% correct identification. Last, the performances of various systems are compared.
A Generic Grouping Algorithm and its Quantitative Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 51 (4 self)
This paper presents a generic method for perceptual grouping, and an analysis of its expected grouping quality. The grouping method is fairly general: it may be used for the grouping of various types of data features, and to incorporate different grouping cues, operating over feature sets of different sizes. The proposed method is divided into two parts: constructing a graph representation of the available perceptual grouping evidence, and then finding the "best" partition of the graph into groups. The first stage includes a cue enhancement procedure, which integrates the information available from multi-feature cues into very reliable bi-feature cues. Both stages are implemented using known statistical tools such as Wald's SPRT algorithm and the Maximum Likelihood criterion. The accompanying theoretical analysis of this grouping criterion quantifies intuitive expectations and predicts that the expected grouping quality increases with cue reliability. It also shows that investing more computational effort in the grouping algorithm leads to better grouping results. This analysis, which quantifies the grouping power of the Maximum Likelihood criterion, is independent of the grouping domain. To the best of our knowledge, such an analysis of a grouping process is given here for the first time. Three grouping algorithms, in three different domains, are synthesized as instances of the generic method. They demonstrate the applicability and generality of this grouping method.
Keywords: Perceptual Grouping, Grouping Analysis, Graph Clustering, Maximum Likelihood, Wald's SPRT, Performance Prediction, Generic Grouping Algorithm.
An Optimal Algorithm for Monte Carlo Estimation
1995
Cited by 41 (3 self)
A typical approach to estimating an unknown quantity $\mu_Z$ is to design an experiment that produces a random variable $Z$ distributed in $[0, 1]$ with $E[Z] = \mu_Z$, run this experiment independently a number of times, and use the average of the outcomes as the estimate. In this paper, we consider the case when no a priori information about $Z$ is known except that it is distributed in $[0, 1]$. We describe an approximation algorithm AA which, given $\epsilon$ and $\delta$, when running independent experiments with respect to any $Z$, produces an estimate that is within a factor $1 + \epsilon$ of $\mu_Z$ with probability at least $1 - \delta$. We prove that the expected number of experiments run by AA (which depends on $Z$) is optimal to within a constant factor for every $Z$. An announcement of these results appears in P. Dagum, D. Karp, M. Luby, S. Ross, "An optimal algorithm for Monte-Carlo Estimation (extended abstract)", Proceedings of the Thirty-sixth IEEE Symposium on Foundations of Computer Science, 1995, pp. 142-149 [3]. Section ...
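The "typical approach" this abstract starts from, averaging a fixed number of independent runs of an experiment with outcomes in [0, 1], can be sketched as follows. This is only the baseline estimator, not the paper's AA algorithm, and the names `monte_carlo_mean`, `experiment`, and `trials` are hypothetical choices made for illustration.

```python
import random
from typing import Callable


def monte_carlo_mean(experiment: Callable[[], float], trials: int) -> float:
    """Estimate E[Z] by averaging `trials` independent runs of `experiment`.

    Assumption: every outcome lies in [0, 1], as in the abstract.
    The number of trials is fixed in advance; the abstract's AA algorithm
    instead chooses how many experiments to run, which is not modeled here.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    total = 0.0
    for _ in range(trials):
        z = experiment()
        if not 0.0 <= z <= 1.0:
            raise ValueError("experiment outcome must lie in [0, 1]")
        total += z
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable

    def biased_coin() -> float:
        # Hypothetical experiment: returns 1.0 with probability 0.3.
        return 1.0 if rng.random() < 0.3 else 0.0

    print(monte_carlo_mean(biased_coin, 10_000))
```

With a fixed trial count, the quality of the estimate depends on the unknown mean, which is the gap the abstract says the AA algorithm addresses.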
WaldBoost - Learning for Time Constrained Sequential Detection
Proc. of the Conference on Computer Vision and Pattern Recognition, 2005
Cited by 41 (1 self)
In many computer vision classification problems, both error and time characterize the quality of a decision. We show that such problems can be formalized in the framework of sequential decision-making. If the false positive and false negative error rates are given, the optimal strategy in terms of the shortest average time to decision (number of measurements used) is Wald’s sequential probability ratio test (SPRT). We build on the optimal SPRT test and extend its capabilities to problems with dependent measurements. We show how to overcome the requirements of SPRT – (i) a priori ordered measurements and (ii) known joint probability density functions. We propose an algorithm with near optimal time and error rate trade-off, called WaldBoost, which integrates the AdaBoost algorithm for measurement selection and ordering and the joint probability density estimation with the optimal SPRT decision strategy. The WaldBoost algorithm is tested on the face detection problem. The results are superior to the state-of-the-art methods in the average evaluation time and comparable in detection rates.
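For readers unfamiliar with the SPRT that this abstract builds on, here is a minimal sketch of the classical test for independent Bernoulli measurements, using Wald's approximate thresholds. It is not WaldBoost: it assumes independent measurements with known probabilities, which is exactly the requirement the paper sets out to relax. The function name `sprt_bernoulli` and its parameters are hypothetical.

```python
import math
from typing import Iterable, Tuple


def sprt_bernoulli(
    observations: Iterable[int],
    p0: float,
    p1: float,
    alpha: float,
    beta: float,
) -> Tuple[str, int]:
    """Wald's SPRT for H0: P(x=1) = p0 versus H1: P(x=1) = p1.

    alpha is the target false positive rate (accepting H1 when H0 holds),
    beta the target false negative rate (accepting H0 when H1 holds).
    Returns the decision and the number of measurements used.
    """
    # Wald's approximate decision thresholds, in the log domain.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))

    log_ratio = 0.0  # running log-likelihood ratio log(P1 / P0)
    used = 0
    for x in observations:
        used += 1
        if x:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1.0 - p1) / (1.0 - p0))
        if log_ratio >= upper:
            return "H1", used
        if log_ratio <= lower:
            return "H0", used
    # Measurements ran out before either threshold was crossed.
    return "undecided", used


if __name__ == "__main__":
    print(sprt_bernoulli([1, 1, 1, 1, 1], p0=0.2, p1=0.8, alpha=0.05, beta=0.05))
```

The test stops as soon as the accumulated evidence crosses a threshold, so clear-cut inputs are decided after few measurements while ambiguous ones consume more.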
Induction in Noisy Domains
1994
Cited by 38 (5 self)
This paper examines the induction of classification rules from examples using real-world data. Real-world data is almost always characterized by two features, which are important for the design of an induction algorithm. Firstly, there is often noise present, for example, due to imperfect measuring equipment used to collect the data. Secondly, the description language is often incomplete, such that examples with identical descriptions in the language will not always be members of the same class. Many induction systems make the 'noiseless domain' assumption that the examples do not contain errors and the description language is complete, and consequently constrain their search for rules to those for which no counterexamples exist in the data used for induction. However, in real-world domains, correlations between attributes and classes in a data set are rarely without exceptions. To locate such correlations and induce rules describing them, it is also necessary to consider rules which may not classify all the training examples correctly. This paper first discusses some of the problems presented by noise and proposes a top-down induction algorithm for induction in real-world domains. Secondly, an experimental comparison of this algorithm with other induction systems is presented using three sets of real-world medical data.

