Learning from incomplete data
, 1994
Cited by 49 (0 self)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977)---both for the estimation of mixture components and for coping with the missing data.
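As a rough illustration of how an expectation step can cope with a missing feature, the sketch below computes the standard conditional mean of one missing coordinate given one observed coordinate under a single bivariate Gaussian. The function and parameter names are hypothetical, and the abstract does not state that the paper uses exactly this form; in a mixture, a quantity of this kind would be computed per component.

```python
def conditional_mean(x_obs, mean_obs, mean_mis, var_obs, cov_mis_obs):
    """Expected value of a missing feature given an observed one.

    Assumes a single bivariate Gaussian (illustrative only):
    E[x_mis | x_obs] = mean_mis + cov_mis_obs / var_obs * (x_obs - mean_obs)
    """
    return mean_mis + (cov_mis_obs / var_obs) * (x_obs - mean_obs)


# Observed feature is one standard deviation above its mean (3.0 vs 1.0, var 4.0),
# so the missing feature is shifted upward in proportion to the covariance.
filled = conditional_mean(x_obs=3.0, mean_obs=1.0, mean_mis=2.0,
                          var_obs=4.0, cov_mis_obs=2.0)
```

With zero covariance the estimate falls back to the unconditional mean of the missing feature.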
Feature-Based Face Recognition Using Mixture-Distance
, 1996
Cited by 48 (2 self)
We consider the problem of feature-based face recognition in the setting where only a single example of each face is available for training. The mixture-distance technique we introduce achieves a recognition rate of 95% on a database of 685 people in which each face is represented by 30 measured distances. This is currently the best recorded recognition rate for a feature-based system applied to a database of this size. By comparison, nearest neighbor search using Euclidean distance yields 84%. In our work a novel distance function is constructed based on local second order statistics as estimated by modeling the training data as a mixture of normal densities. We report on the results from mixtures of several sizes. We demonstrate that a flat mixture of mixtures performs as well as the best model and therefore represents an effective solution to the model selection problem. A mixture perspective is also taken for individual Gaussians to choose between first order (variance) and second ...
Progress-Based Regulation of Low-Importance Processes
- In Proceedings of the Seventeenth ACM Symposium on Operating Systems Principles
, 1999
Cited by 46 (1 self)
MS Manners is a mechanism that employs progress-based regulation to prevent resource contention with low-importance processes from degrading the performance of high-importance processes. The mechanism assumes that resource contention that degrades the performance of a high-importance process will also retard the progress of the low-importance process. MS Manners detects this contention by monitoring the progress of the low-importance process and inferring resource contention from a drop in the progress rate. This technique recognizes contention over any system resource, as long as the performance impact on contending processes is roughly symmetric. MS Manners employs statistical mechanisms to deal with stochastic progress measurements; it automatically calibrates a target progress rate, so no manual tuning is required; it supports multiple progress metrics from applications that perform several distinct tasks; and it orchestrates multiple low-importance processes to prevent measurement i...
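The idea of inferring contention from a drop in progress rate can be sketched as follows. This is a minimal illustration and not the MS Manners implementation: the function name, the fixed target rate, the 90% tolerance, and the doubling suspension time are all assumptions made for the example, and the statistical handling and automatic calibration described in the abstract are omitted.

```python
def regulate(rates, target, base=1.0, cap=64.0, tolerance=0.9):
    """Return the suspension time chosen after each progress measurement.

    Hypothetical policy: if the low-importance process progresses slower
    than tolerance * target, assume contention and suspend it, doubling
    the suspension on each consecutive slow measurement (up to cap).
    A measurement at or above the threshold clears the suspension.
    """
    suspension = 0.0
    decisions = []
    for rate in rates:
        if rate < tolerance * target:
            suspension = base if suspension == 0.0 else min(2.0 * suspension, cap)
        else:
            suspension = 0.0
        decisions.append(suspension)
    return decisions


# Progress halves for two measurements, then recovers.
decisions = regulate([10.0, 5.0, 5.0, 10.0], target=10.0)
```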
On-line EM Algorithm for the Normalized Gaussian Network
, 1999
Cited by 46 (6 self)
A Normalized Gaussian Network (NGnet) (Moody and Darken 1989) is a network of local linear regression units. The model softly partitions the input space by normalized Gaussian functions and each local unit linearly approximates the output within the partition. In this article, we propose a new on-line EM algorithm for the NGnet, which is derived from the batch EM algorithm (Xu, Jordan and Hinton 1995) by introducing a discount factor. We show that the on-line EM algorithm is equivalent to the batch EM algorithm if a specific scheduling of the discount factor is employed. In addition, we show that the on-line EM algorithm can be considered as a stochastic approximation method to find the maximum likelihood estimator. A new regularization method is proposed in order to deal with a singular input distribution. In order to manage dynamic environments, where the input-output distribution of data changes over time, unit manipulation mechanisms such as unit production, unit deletion...
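The role of a discount factor in an on-line average can be illustrated with a small sketch. It maintains a normalized, discounted running mean of a stream of values; the recursion and the names `discounted_mean` and `discount` are assumptions chosen for illustration, and the abstract does not give the exact update or schedule used in the paper. With the discount held at 1 the result coincides with the ordinary batch mean, which gives a simple picture of how an on-line scheme can reproduce a batch quantity.

```python
def discounted_mean(values, discount):
    """On-line weighted mean with a time-dependent discount factor.

    discount(t) is the factor applied at step t (t starts at 1).
    The effective sample count obeys n(t) = 1 + discount(t) * n(t - 1),
    and the mean moves toward each new value with step size 1 / n(t).
    """
    mean = 0.0
    count = 0.0
    for t, value in enumerate(values, start=1):
        count = 1.0 + discount(t) * count
        mean += (value - mean) / count
    return mean


batch_like = discounted_mean([1.0, 2.0, 3.0, 4.0], lambda t: 1.0)  # plain mean
forgetful = discounted_mean([0.0, 1.0], lambda t: 0.5)             # favors recent data
```

A discount below 1 weights recent samples more heavily, which is the behavior one would want when the data distribution changes over time.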
Accelerating EM for large databases
- Machine Learning
, 2001
Cited by 27 (1 self)
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires significant computational resources and has been dismissed as impractical for large databases. We present two approaches that significantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial E-steps for which we can use the results of Neal and Hinton (1998) to obtain the standard convergence guarantees of EM. The first approach is a version of the incremental EM, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically affects the efficiency of the algorithm. We provide a method for selecting a near-optimal block size. The second approach, which we call lazy EM, will, at scheduled iterations, evaluate the significance of each data case and then proceed for several iterations actively using only the significant cases. We demonstrate that both methods can significantly reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases. Keywords: Expectation Maximization Algorithm, incremental EM, lazy EM, online EM, data blocking, mixture models, clustering.
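A partial E-step over blocks can be sketched for a deliberately simple case: a one-dimensional Gaussian mixture with unit variances and equal mixing weights, where only the means are learned. The simplified model, the function names, and the fixed number of passes are assumptions for illustration; the block-size selection method mentioned in the abstract is not shown.

```python
import math


def block_stats(block, means):
    """E-step on one block: per-component sums of r and r * x."""
    k = len(means)
    counts, sums = [0.0] * k, [0.0] * k
    for x in block:
        logp = [-0.5 * (x - m) ** 2 for m in means]
        top = max(logp)  # subtract the maximum for numerical stability
        p = [math.exp(v - top) for v in logp]
        z = sum(p)
        for j in range(k):
            r = p[j] / z  # responsibility of component j for x
            counts[j] += r
            sums[j] += r * x
    return counts, sums


def incremental_em(data, means, block_size, passes=5):
    """Cycle through blocks, refreshing one block's statistics at a time."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    stats = [block_stats(b, means) for b in blocks]  # one full initial E-step
    k = len(means)
    total_n = [sum(s[0][j] for s in stats) for j in range(k)]
    total_sx = [sum(s[1][j] for s in stats) for j in range(k)]
    means = list(means)
    for _ in range(passes):
        for i, block in enumerate(blocks):
            new = block_stats(block, means)  # partial E-step
            for j in range(k):
                total_n[j] += new[0][j] - stats[i][0][j]
                total_sx[j] += new[1][j] - stats[i][1][j]
            stats[i] = new
            means = [total_sx[j] / total_n[j] for j in range(k)]  # M-step
    return means


fitted = incremental_em([-0.1, 0.0, 0.1, 9.9, 10.0, 10.1], [1.0, 9.0], block_size=2)
```

Each block visit costs only one block's worth of E-step work, while the M-step always uses statistics summed over all blocks.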
Sonar-Based Mapping With Mobile Robots Using EM
Cited by 26 (3 self)
This paper presents an algorithm for learning occupancy grid maps with mobile robots equipped with range finders, such as sonar sensors. Our approach employs the EM algorithm to solve the concurrent mapping and localization problem. To accommodate the spatial nature of range data, it relies on a two-layered representation of maps, where global maps are composed from a collection of small, local maps. To avoid local minima during likelihood maximization, a softmax version of the M step is proposed that is gradually annealed to the exact maximum. Experimental results demonstrate that our approach is well suited for constructing large maps of typical indoor environments using sensors as inaccurate as sonars.
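The effect of annealing a softmax toward an exact maximum can be shown with a generic temperature-controlled softmax. This is only an illustration of the general idea: the scores, the temperature schedule, and the function name are hypothetical, and the abstract does not specify how the softmax enters the paper's M step.

```python
import math


def softmax_weights(scores, temperature):
    """Softmax over scores; lower temperature concentrates weight on the maximum."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 3.0]
# Hypothetical annealing schedule from nearly uniform to nearly hard maximum.
schedule = [softmax_weights(scores, t) for t in (100.0, 1.0, 0.01)]
```

At high temperature every candidate contributes almost equally; as the temperature falls, the weights approach an indicator of the best-scoring candidate.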
Local Linear Perceptrons for Classification
- IEEE Transactions on Neural Networks
, 1996
Cited by 23 (4 self)
A structure composed of local linear perceptrons for approximating global class discriminants is investigated. Such local linear models may be combined in a cooperative or competitive way. In the cooperative model, a weighted sum of the outputs of the local perceptrons is computed where the weight is a function of the distance between the input and the position of the local perceptron. In the competitive model, the cost function dictates a mixture model where only one of the local perceptrons gives output. Learning of the local models' positions and the linear mappings they implement are coupled and both supervised. We show that this is preferable to the uncoupled case where the positions are trained in an unsupervised manner before the separate, supervised training of mappings. We use goodness criteria based on the cross-entropy and give learning equations for both the cooperative and competitive cases. The coupled and uncoupled versions of cooperative and competitive approaches are c...
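The two ways of combining local linear units can be sketched as below. Each unit is a hypothetical `(position, weights, bias)` triple; the Gaussian distance weighting, the `width` parameter, and the nearest-unit rule for the competitive case are assumptions chosen for illustration, and the training equations from the paper are not shown.

```python
import math


def local_outputs(x, units):
    """Linear output w . x + b of every local unit."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for _, w, b in units]


def squared_distances(x, units):
    return [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c, _, _ in units]


def cooperative(x, units, width=1.0):
    """Weighted sum of unit outputs; weights decay with distance to each position."""
    d2 = squared_distances(x, units)
    nearest = min(d2)  # shift by the minimum for numerical stability
    gates = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(gates)
    outputs = local_outputs(x, units)
    return sum(g / total * y for g, y in zip(gates, outputs))


def competitive(x, units):
    """Only one unit responds; here, the one whose position is closest to x."""
    d2 = squared_distances(x, units)
    winner = d2.index(min(d2))
    return local_outputs(x, units)[winner]


units = [((0.0, 0.0), (1.0, 0.0), 0.0),
         ((10.0, 0.0), (0.0, 1.0), 5.0)]
blended = cooperative((5.0, 2.0), units)   # equidistant: average of both units
single = competitive((0.0, 0.0), units)    # only the first unit responds
```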
A Modular Q-Learning Architecture for Manipulator Task Decomposition
- In Proceedings of the Eleventh International Conference on Machine Learning
, 1994
Cited by 22 (1 self)
Compositional Q-Learning (CQ-L) (Singh 1992) is a modular approach to learning to perform composite tasks made up of several elemental tasks by reinforcement learning. Skills acquired while performing elemental tasks are also applied to solve composite tasks. Individual skills compete for the right to act and only winning skills are included in the decomposition of the composite task. We extend the original CQ-L concept in two ways: (1) a more general reward function, and (2) the agent can have more than one actuator. We use the CQ-L architecture to acquire skills for performing composite tasks with a simulated two-linked manipulator having large state and action spaces. The manipulator is a non-linear dynamical system and we require its end-effector to be at specific positions in the workspace. Fast function approximation in each of the Q-modules is achieved through the use of an array of Cerebellar Model Articulation Controller (CMAC) (Albus 1975) structures.
Unsupervised Neural Network Learning Procedures . . .
, 1996
Cited by 21 (1 self)
In this article, we review unsupervised neural network learning procedures which can be applied to the task of preprocessing raw data to extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems.

