Detection of Unexploded Ordnance via Efficient Semisupervised and Active Learning
"... Abstract—Semisupervised learning and active learning are considered for unexploded ordnance (UXO) detection. Semisupervised learning algorithms are designed using both labeled and unlabeled data, where here labeled data correspond to sensor signatures for which the identity of the buried item (UXO/n ..."
Abstract

Cited by 10 (2 self)
 Add to MetaCart
(Show Context)
Abstract—Semi-supervised learning and active learning are considered for unexploded ordnance (UXO) detection. Semi-supervised learning algorithms are designed using both labeled and unlabeled data, where here labeled data correspond to sensor signatures for which the identity of the buried item (UXO/non-UXO) is known; for unlabeled data, one only has access to the corresponding sensor data. Active learning is used to define which unlabeled signatures would be most informative for improving the classifier design if the associated label could be acquired (for UXO sensing, the label is acquired by excavation). A graph-based semi-supervised algorithm is applied, which employs the idea of a Markov random walk on a graph, thereby exploiting knowledge of the data manifold (where the manifold is defined by both the labeled and unlabeled data). The algorithm is used to infer labels for the unlabeled data, providing a probability that a given unlabeled signature corresponds to a buried UXO. An efficient active-learning procedure is developed for this algorithm, based on a mutual information measure. In this manner, one initially performs excavation with the purpose of acquiring labels to improve the classifier; once this active-learning phase is completed, the resulting semi-supervised classifier is applied to the remaining unlabeled signatures to quantify the probability that each such item is a UXO. Example classification results are presented for an actual UXO site, based on electromagnetic induction and magnetometer data. Performance is assessed in comparison to other semi-supervised approaches, as well as to supervised algorithms. Index Terms—Detectors, electromagnetic induction (EMI).
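The core graph-based idea in this abstract, labels diffusing from labeled to unlabeled points over a manifold defined by all the data, can be sketched as an absorbing-random-walk (harmonic) computation. This is a minimal generic illustration, not the paper's exact algorithm; the chain graph and all names below are made-up toy choices.

```python
import numpy as np

def propagate_labels(W, y_labeled, labeled_idx):
    """Infer P(label = 1) for unlabeled nodes via an absorbing random walk.

    W           : (n, n) symmetric nonnegative affinity matrix
    y_labeled   : binary labels (0/1) for the labeled nodes
    labeled_idx : indices of the labeled nodes in W
    Returns (unlabeled indices, harmonic-solution probabilities).
    """
    n = W.shape[0]
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    P = W / W.sum(axis=1, keepdims=True)          # row-stochastic transition matrix
    P_uu = P[np.ix_(unlabeled_idx, unlabeled_idx)]
    P_ul = P[np.ix_(unlabeled_idx, labeled_idx)]
    f_l = np.asarray(y_labeled, dtype=float)
    # Absorption probabilities: f_u = (I - P_uu)^(-1) P_ul f_l
    f_u = np.linalg.solve(np.eye(len(unlabeled_idx)) - P_uu, P_ul @ f_l)
    return unlabeled_idx, f_u

# Toy chain graph 0-1-2-3-4: node 0 labeled 0, node 4 labeled 1.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
idx, probs = propagate_labels(W, y_labeled=[0, 1], labeled_idx=np.array([0, 4]))
print(idx, probs)   # probabilities rise linearly toward the node labeled 1
```

On the chain the harmonic solution interpolates linearly between the two labeled endpoints, which is the behavior that makes these methods follow the data manifold on real affinity graphs.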
On classification with incomplete data
 IEEE Transactions on Pattern Analysis and Machine Intelligence
"... Abstract—We address the incompletedata problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing anal ..."
Abstract

Cited by 9 (2 self)
 Add to MetaCart
Abstract—We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both Expectation-Maximization (EM) and Variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semi-supervised case by incorporating graph-based regularization. The semi-supervised algorithm utilizes all available data—both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown. Index Terms—Classification, incomplete data, missing data, supervised learning, semi-supervised learning, imperfect labeling.
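The integration-over-missing-features idea can be illustrated with a simplified stand-in: the paper integrates analytically under a GMM, whereas the sketch below uses a single Gaussian for the conditional density and Monte Carlo averaging of the logistic output. All weights and numbers are assumed toy values, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_with_missing(w, b, x, miss, mu, Sigma, n_draws=2000):
    """P(y = 1 | x_obs): integrate a logistic model over p(x_miss | x_obs).

    Simplified stand-in for the paper's analytic GMM integration:
    a single-Gaussian conditional and Monte Carlo averaging.
    w, b      : logistic regression weights/bias (assumed pre-trained)
    x         : feature vector with np.nan in the missing slots
    miss      : boolean mask marking the missing features
    mu, Sigma : mean/covariance of the feature distribution
    """
    obs = ~miss
    # Gaussian conditional: x_m | x_o ~ N(mu_m + K (x_o - mu_o), S_mm - K S_mo^T)
    S_oo = Sigma[np.ix_(obs, obs)]
    S_mo = Sigma[np.ix_(miss, obs)]
    S_mm = Sigma[np.ix_(miss, miss)]
    K = S_mo @ np.linalg.inv(S_oo)
    cond_mu = mu[miss] + K @ (x[obs] - mu[obs])
    cond_S = S_mm - K @ S_mo.T
    draws = rng.multivariate_normal(cond_mu, cond_S, size=n_draws)
    x_full = np.tile(x, (n_draws, 1))
    x_full[:, miss] = draws
    p = 1.0 / (1.0 + np.exp(-(x_full @ w + b)))   # sigmoid per completed draw
    return p.mean()                                # expected class probability

# Toy example: two correlated features, the second one missing.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
w, b = np.array([0.5, 1.5]), 0.0
p = predict_with_missing(w, b, np.array([1.0, np.nan]), np.array([False, True]), mu, Sigma)
```

Because the two features are positively correlated, observing a high first feature pulls the conditional density of the missing one upward, and the averaged prediction lands well above 0.5 without any single imputed value ever being committed to.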
SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION BASED ON A MARKOV RANDOM FIELD AND SPARSE MULTINOMIAL LOGISTIC REGRESSION
"... This paper introduces a new semisupervised classification and segmentation approach tailored to hyperspectral images. The posterior distributions of the classes are modeled by the multinomial logistic regression. The contextual information inherent to the spatial configuration of the image pixels i ..."
Abstract

Cited by 7 (5 self)
 Add to MetaCart
(Show Context)
This paper introduces a new semi-supervised classification and segmentation approach tailored to hyperspectral images. The posterior distributions of the classes are modeled by multinomial logistic regression. The contextual information inherent to the spatial configuration of the image pixels is modeled by a Multi-Level Logistic (MLL) Markov–Gibbs random field. The multinomial logistic regressors, assumed to be random vectors with independent Laplacian components, are learned using the recently introduced LORSAL algorithm. The maximum a posteriori (MAP) segmentation is computed via the α-Expansion algorithm, a powerful graph-cut-based approach to integer optimization. The effectiveness of the proposed methodology is illustrated by classifying simulated and real data sets. Comparisons with state-of-the-art methods are also included.
ON MULTI-VIEW LEARNING WITH ADDITIVE MODELS
 SUBMITTED TO THE ANNALS OF APPLIED STATISTICS
"... In many scientific settings, data can be naturally partitioned into variable groupings called views. Common examples include environmental (1st view) and genetic information (2nd view) in ecological applications, chemical (1st view) and biological (2nd view) data in drug discovery. Viewed data also ..."
Abstract

Cited by 5 (2 self)
 Add to MetaCart
In many scientific settings, data can be naturally partitioned into variable groupings called views. Common examples include environmental (1st view) and genetic information (2nd view) in ecological applications, and chemical (1st view) and biological (2nd view) data in drug discovery. Multi-view data also occur in text analysis and proteomics applications, where one view consists of a graph with observations as the vertices and a weighted measure of pairwise similarity between observations as the edges. Further, in several of these applications the observations can be partitioned into two sets: one where the response is observed (labeled) and the other where the response is not (unlabeled). The problem of simultaneously addressing multi-view data and incorporating unlabeled observations in training is referred to as multi-view transductive learning. In this work we introduce and study a comprehensive generalized fixed-point additive modeling framework for multi-view transductive learning, where any view is represented by a linear smoother. The problem of view selection is discussed using a modified generalized Akaike Information Criterion, which provides an approach for testing the contribution of each view. An efficient implementation is provided for fitting these models with both backfitting and local-scoring type algorithms adjusted to semi-supervised graph-based learning. The proposed technique is assessed on both synthetic and real data sets and is shown to be competitive with state-of-the-art co-training and graph-based techniques.
Clustering, Dimensionality Reduction and Side Information
, 2006
"... Recent advances in sensing and storage technology have created many highvolume, highdimensional data sets in pattern recognition, machine learning, and data mining. Unsupervised learning can provide generic tools for analyzing and summarizing these data sets when there is no welldefined notion of ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
(Show Context)
Recent advances in sensing and storage technology have created many high-volume, high-dimensional data sets in pattern recognition, machine learning, and data mining. Unsupervised learning can provide generic tools for analyzing and summarizing these data sets when there is no well-defined notion of classes. The purpose of this thesis is to study some of the open problems in two main areas of unsupervised learning, namely clustering and (unsupervised) dimensionality reduction. Instance-level constraints on objects, an example of side information, are also considered to improve the clustering results. Our first contribution is a modification of the isometric feature mapping (ISOMAP) algorithm for the case where the input data, instead of being all available simultaneously, arrive sequentially from a data stream. ISOMAP is representative of a class of nonlinear dimensionality reduction algorithms that are based on the notion of a manifold. Both the standard ISOMAP and the landmark version of ISOMAP are considered. Experimental results on synthetic data as well as real-world images demonstrate that the modified algorithm can maintain an accurate low-dimensional representation of the data in an efficient manner. We study the problem of feature selection in model-based clustering when the number of clusters ...
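The standard batch ISOMAP pipeline that the thesis's streaming variant builds on (k-nearest-neighbor graph, geodesic distances by shortest paths, classical MDS) can be sketched as follows; the arc data set and parameter choices are made-up toy values.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, n_neighbors=6, n_components=2):
    """Batch ISOMAP: kNN graph -> geodesic distances -> classical MDS."""
    D = squareform(pdist(X))
    n = len(X)
    # Keep only each point's k nearest neighbors (inf marks a non-edge).
    G = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, nbrs] = D[i, nbrs]
    G = np.minimum(G, G.T)                         # symmetrize the graph
    geo = shortest_path(G, method="D")             # geodesic distance matrix
    # Classical MDS on the squared geodesic distances.
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (geo ** 2) @ H                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

# Toy manifold: slightly perturbed points along a unit arc.
t = np.linspace(0, np.pi, 40)
X = np.c_[np.cos(t), np.sin(t), 0.01 * np.sin(5 * t)]
Y = isomap(X)
```

The first embedding coordinate recovers the arc-length parameter `t` up to sign and centering, which is exactly the "unrolling" behavior the streaming modification has to maintain as new points arrive.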
Active Learning of Features and Labels
"... Cotraining improves multiview classifier learning by enforcing internal consistency between the predicted classes of unlabeled objects based on different views (different sets of features for characterizing the same object). In some applications, due to the cost involved in data acquisition, only ..."
Abstract

Cited by 4 (2 self)
 Add to MetaCart
(Show Context)
Co-training improves multi-view classifier learning by enforcing internal consistency between the predicted classes of unlabeled objects based on different views (different sets of features for characterizing the same object). In some applications, due to the cost involved in data acquisition, only a subset of features may be obtained for many unlabeled objects. Observing additional features of objects that were earlier incompletely characterized increases the data available for co-training, hence improving the classification accuracy. This paper addresses the problem of active learning of features: which additional features should be acquired for incompletely characterized objects in order to maximize the accuracy of the learned classifier? Our method, which extends previous techniques for the active learning of labels, is experimentally shown to be effective in a real-life multi-sensor mine detection problem.
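A common baseline for deciding which object's extra features to acquire is predictive uncertainty: probe the object whose current class probability is most ambiguous. This is a generic heuristic sketch, not the paper's exact acquisition criterion, and the probability values below are invented.

```python
import numpy as np

def pick_object_to_probe(probs):
    """Return the index of the incompletely characterized object whose
    current class probability is most uncertain (maximum binary entropy).

    Generic uncertainty heuristic: acquiring the extra feature for the
    most ambiguous object is expected to change the classifier the most.
    """
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    H = -(p * np.log(p) + (1 - p) * np.log(1 - p))   # entropy per object
    return int(np.argmax(H))

# Hypothetical P(mine) predictions for four objects missing a sensor view:
choice = pick_object_to_probe([0.05, 0.48, 0.93, 0.70])
print(choice)   # → 1, the prediction closest to 0.5
```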
EXPLOITING SPATIAL INFORMATION IN SEMI-SUPERVISED HYPERSPECTRAL IMAGE SEGMENTATION
"... We present a new semisupervised segmentation algorithm suited to hyperspectral images, which takes full advantage of the spectral and spatial information available in the scenes. We mainly focus on problems involving very few labeled samples and a larger set of unlabeled samples. A multinomial logi ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
(Show Context)
We present a new semi-supervised segmentation algorithm suited to hyperspectral images, which takes full advantage of the spectral and spatial information available in the scenes. We mainly focus on problems involving very few labeled samples and a larger set of unlabeled samples. A multinomial logistic regression (MLR) is used to model the posterior class probability distributions, whereas a multi-level logistic (MLL) prior is adopted to model the spatial information present in the class label images. The multinomial logistic regressors are learnt using an expectation-maximization (EM)-type algorithm, where the class labels of the unlabeled samples are treated as unobserved random variables. The expectation step of the EM algorithm is computed using belief propagation (BP). In the maximization step, we compute the maximum a posteriori (MAP) estimate of the multinomial logistic regressors. For the segmentation, we compute both the MAP solution and the maximizer of the posterior marginals (MPM) provided by the belief propagation algorithm. We show, using the well-known AVIRIS Indian Pines data, that both solutions exhibit state-of-the-art performance. Index Terms—Semi-supervised classification, belief propagation, expectation maximization, hyperspectral segmentation, integer optimization.
Using Local Dependencies within Batches to Improve Large Margin Classifiers, Journal of Machine Learning Research 10
, 2009
"... Most classification methods assume that the samples are drawn independently and identically from an unknown data generating distribution, yet this assumption is violated in several real life problems. In order to relax this assumption, we consider the case where batches or groups of samples may have ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
Most classification methods assume that the samples are drawn independently and identically from an unknown data-generating distribution, yet this assumption is violated in several real-life problems. In order to relax this assumption, we consider the case where batches or groups of samples may have internal correlations, whereas the samples from different batches may be considered to be uncorrelated. This paper introduces three algorithms for classifying all the samples in a batch jointly: one based on a probabilistic formulation, and two based on mathematical programming. Experiments on three real-life computer-aided diagnosis (CAD) problems demonstrate that the proposed algorithms are significantly more accurate than a naive support vector machine which ignores the correlations among the samples.