Results 1 - 10
of
258
Semantic Texton Forests for Image Categorization and Segmentation
"... We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and tes ..."
Abstract
-
Cited by 304 (14 self)
- Add to MetaCart
We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and test, especially compared with k-means clustering and nearest-neighbor assignment of feature descriptors. The nodes in the trees provide (i) an implicit hierarchical clustering into semantic textons, and (ii) an explicit local classification estimate. Our second contribution, the bag of semantic textons, combines a histogram of semantic textons over an image region with a region prior category distribution. The bag of semantic textons is computed over the whole image for categorization, and over local rectangular regions for segmentation. Including both histogram and region prior allows our segmentation algorithm to exploit both textural and semantic context. Our third contribution is an image-level prior for segmentation that emphasizes those categories that the automatic categorization believes to be present. We evaluate on two datasets including the very challenging VOC 2007 segmentation dataset. Our results significantly advance the state-of-the-art in segmentation accuracy, and furthermore, our use of efficient decision forests gives at least a five-fold increase in execution speed.
Tree-based batch mode reinforcement learning
- Journal of Machine Learning Research
, 2005
"... Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (xt,ut,rt,xt+1) where xt denotes the system state a ..."
Abstract
-
Cited by 222 (40 self)
- Add to MetaCart
(Show Context)
Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (xt,ut,rt,xt+1) where xt denotes the system state at time t, ut the control action taken, rt the instantaneous reward obtained and xt+1 the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree, tree bagging) and two newly proposed ensemble algorithms, namely extremely and totally randomized trees. We study their performances on several examples and find that the ensemble methods based on regression trees perform well in extracting relevant information about the optimal control policy from sets of four-tuples. In particular, the totally randomized trees give good results while ensuring the convergence of the sequence, whereas by relaxing the convergence constraint even better accuracy results are provided by the extremely randomized trees.
F.: Fast discriminative visual codebooks using randomized clustering forests
- In: Advances in Neural Information Processing Systems
, 2006
"... Some of the most effective recent methods for content-based image classifica-tion work by extracting dense or sparse local image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating his-tograms of the resulting “visual word ” codes over the image, ..."
Abstract
-
Cited by 157 (4 self)
- Add to MetaCart
(Show Context)
Some of the most effective recent methods for content-based image classifica-tion work by extracting dense or sparse local image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating his-tograms of the resulting “visual word ” codes over the image, and classifying these with a conventional classifier such as an SVM. Large numbers of descriptors and large codebooks are needed for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests – ensembles of randomly created clustering trees – and show that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks. 1
Segmentation and recognition using structure from motion point clouds
- In ECCV
, 2008
"... Abstract. We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2 ..."
Abstract
-
Cited by 123 (11 self)
- Add to MetaCart
(Show Context)
Abstract. We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors. Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance. input video frame reconstructed 3D point cloud automatic segmentation Fig. 1. The proposed algorithm uses 3D point clouds estimated from videos such as the pictured driving sequence (with ground truth inset). Having trained on point clouds from other driving sequences, our new motion and structure features, based purely on the point cloud, perform 11-class semantic segmentation of each test frame. The colors in the ground truth and inferred segmentation indicate category labels. 2 1
Random subwindows for robust image classification
- In CVPR
, 2005
"... We present a novel, generic image classification method based on a recent machine learning algorithm (ensembles of extremely randomized decision trees). Images are classified using randomly extracted subwindows that are suitably normalized to yield robustness to certain image transformations. Our me ..."
Abstract
-
Cited by 94 (15 self)
- Add to MetaCart
(Show Context)
We present a novel, generic image classification method based on a recent machine learning algorithm (ensembles of extremely randomized decision trees). Images are classified using randomly extracted subwindows that are suitably normalized to yield robustness to certain image transformations. Our method is evaluated on four very different, publicly available datasets (COIL-100, ZuBuD, ETH-80, WANG). Our results show that our automatic approach is generic and robust to illumination, scale, and viewpoint changes. An extension of the method is proposed to improve its robustness with respect to rotation changes. 1.
Learning visual similarity measures for comparing never seen objects
- Proc. IEEE CVPR
, 2007
"... In this paper we propose and evaluate an algorithm that learns a similarity measure for comparing never seen objects. The measure is learned from pairs of training images labeled “same ” or “different”. This is far less informative than the commonly used individual image labels (e.g. “car model X”), ..."
Abstract
-
Cited by 84 (0 self)
- Add to MetaCart
(Show Context)
In this paper we propose and evaluate an algorithm that learns a similarity measure for comparing never seen objects. The measure is learned from pairs of training images labeled “same ” or “different”. This is far less informative than the commonly used individual image labels (e.g. “car model X”), but it is cheaper to obtain. The proposed algorithm learns the characteristic differences between local descriptors sampled from pairs of “same ” and “different” images. These differences are vector quantized by an ensemble of extremely randomized binary trees, and the similarity measure is computed from the quantized differences. The extremely randomized trees are fast to learn, robust due to the redundant information they carry and they have been proved to be very good clusterers. Furthermore, the trees efficiently combine different feature types (SIFT and geometry). We evaluate our innovative similarity measure on four very different datasets and consistantly outperform the state-of-the-art competitive approaches. 1.
Randomized clustering forests for image classification
- Pattern Analysis and Machine Intelligence
"... Abstract—This paper introduces three new contributions to the problems of image classification and image search. First, we propose a new image patch quantization algorithm. Other competitive approaches require a large code book and the sampling of many local regions for accurate image description, a ..."
Abstract
-
Cited by 82 (5 self)
- Add to MetaCart
(Show Context)
Abstract—This paper introduces three new contributions to the problems of image classification and image search. First, we propose a new image patch quantization algorithm. Other competitive approaches require a large code book and the sampling of many local regions for accurate image description, at the expense of a prohibitive processing time. We introduce Extremely Randomized Clustering Forests—ensembles of randomly created clustering trees—that are more accurate, much faster to train and test, and more robust to background clutter compared to state-of-the-art methods. Second, we propose an efficient image classification method that combines ERC-Forests and saliency maps very closely with image information sampling. For a given image, a classifier builds a saliency map online, which it uses for classification. We demonstrate speed and accuracy improvement in several state-of-the-art image classification tasks. Finally, we show that our ERC-Forests are used very successfully for learning distances between images of never-seen objects. Our algorithm learns the characteristic differences between local descriptors sampled from pairs of the “same ” or “different ” objects, quantizes these differences with ERC-Forests, and computes the similarity from this quantization. We show significant improvement over state-of-the-art competitive approaches. Index Terms—Randomized trees, image classification, object recognition, similarity measure. Ç 1
Y.: Descriptor based methods in the wild
- In: Faces in Real-Life Images Workshop in ECCV. (2008) (b) Similarity Scores based on Background Samples
"... Abstract. Recent methods for learning similarity between images have presented impressive results in the problem of pair matching (same/notsame classification) of face images. In this paper we explore how well this performance carries over to the related task of multi-option face identification, spe ..."
Abstract
-
Cited by 69 (13 self)
- Add to MetaCart
Abstract. Recent methods for learning similarity between images have presented impressive results in the problem of pair matching (same/notsame classification) of face images. In this paper we explore how well this performance carries over to the related task of multi-option face identification, specifically on the Labeled Faces in the Wild (LFW) image set. In addition, we seek to compare the performance of similarity learning methods to descriptor based methods. We present the following results: (1) Descriptor-Based approaches that efficiently encode the appearance of each face image as a vector outperform the leading similarity based method in the task of multi-option face identification. (2) Straightforward use of Euclidean distance on the descriptor vectors performs somewhat worse than the similarity learning methods on the task of pair matching. (3) Adding a learning stage, the performance of descriptor based methods matches and exceeds that of similarity methods on the pair matching task. (4) A novel patch based descriptor we propose is able to improve the performance of the successful Local Binary Pattern (LBP) descriptor in both multi-option identification and same/not-same classification. 1
Yahoo! Learning to Rank Challenge Overview
, 2011
"... Learning to rank for information retrieval has gained a lot of interest in the recent years but there is a lack for large real-world datasets to benchmark algorithms. That led us to publicly release two datasets used internally at Yahoo! for learning the web search ranking function. To promote these ..."
Abstract
-
Cited by 66 (6 self)
- Add to MetaCart
Learning to rank for information retrieval has gained a lot of interest in the recent years but there is a lack for large real-world datasets to benchmark algorithms. That led us to publicly release two datasets used internally at Yahoo! for learning the web search ranking function. To promote these datasets and foster the development of state-of-the-art learning to rank algorithms, we organized the Yahoo! Learning to Rank Challenge in spring 2010. This paper provides an overview and an analysis of this challenge, along with a detailed description of the released datasets.
C.: Structured Forests for Fast Edge Detection
"... Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image pa ..."
Abstract
-
Cited by 64 (1 self)
- Add to MetaCart
(Show Context)
Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image patches to learn both an accurate and computation-ally efficient edge detector. We formulate the problem of predicting local edge masks in a structured learning frame-work applied to random decision forests. Our novel ap-proach to learning decision trees robustly maps the struc-tured labels to a discrete space on which standard infor-mation gain measures may be evaluated. The result is an approach that obtains realtime performance that is orders of magnitude faster than many competing state-of-the-art approaches, while also achieving state-of-the-art edge de-tection results on the BSDS500 Segmentation dataset and NYU Depth dataset. Finally, we show the potential of our approach as a general purpose edge detector by showing our learned edge models generalize well across datasets. 1.