• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Shape quantization and recognition with randomized trees (1997)

by Y Amit, D Geman
Venue:Neur. Comp
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 104
Next 10 →

Random forests

by Leo Breiman, E. Schapire - Machine Learning , 2001
"... Abstract. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the fo ..."
Abstract - Cited by 785 (2 self) - Add to MetaCart
Abstract. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ∗∗∗, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

Boosting Image Retrieval

by Kinh Tieu , Paul Viola , 2000
"... We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual “causes ” and that images which are visually similar share causes. We p ..."
Abstract - Cited by 217 (4 self) - Add to MetaCart
We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual “causes ” and that images which are visually similar share causes. We propose a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure (in our implementation there are over 45,000 highly selective features). At query time a user selects a few example images, and a technique known as “boosting ” is used to learn a classification function in this feature space. By construction, the boosting procedure learns a simple classifier which only relies on 20 of the features. As a result a very large database of images can be scanned rapidly, perhaps a million images per second. Finally we will describe a set of experiments performed using our retrieval system on a database of 3000 images.

Object retrieval with large vocabularies and fast spatial matching

by James Philbin, Michael Isard, Josef Sivic, Andrew Zisserman - In Proc. IEEE Conf. on Computer Vision and Pattern Recognition , 2007
"... In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our ..."
Abstract - Cited by 139 (14 self) - Add to MetaCart
In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photo-sharing site, Flickr [3], using Oxford landmarks as queries. Building an image-feature vocabulary is a major time and performance bottleneck, due to the size of our dataset. To address this problem we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees which we show outperforms the current state-of-the-art on an extensive

Keypoint recognition using randomized trees

by Vincent Lepetit - IEEE Trans. Pattern Anal. Mach. Intell
"... In many 3–D object-detection and pose-estimation problems, run-time performance is of critical importance. However, there usually is time to train the system, which we will show to be very useful. Assuming that several registered images of the target object are available, we developed a keypoint-bas ..."
Abstract - Cited by 87 (15 self) - Add to MetaCart
In many 3–D object-detection and pose-estimation problems, run-time performance is of critical importance. However, there usually is time to train the system, which we will show to be very useful. Assuming that several registered images of the target object are available, we developed a keypoint-based approach that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem. This shifts much of the computational burden to a training phase, without sacrificing recognition performance. As a result, the resulting algorithm is robust, accurate, and fast-enough for frame-rate performance. This reduction in run-time computational complexity is our first contribution. Our second contribution is to show that, in this context, a simple and fast keypoint detector suffices to support detection and tracking even under large perspective and scale variations. While earlier methods require a detector that can be expected to produce very repeatable results in general, which usually is very time-consuming, we simply find the most repeatable object keypoints for the specific target object during the training phase. We have incorporated these ideas into a real-time system that detects planar, non-planar, and deformable objects. It then estimates the pose of the rigid ones and the deformations of the others.

A Computational Model for Visual Selection

by Yali Amit, Donald Geman - NEURAL COMPUTATION , 1999
"... We propose a computational model for detecting and localizing instances from an object class in static grey level images. We divide detection into visual selection and final classification, concentrating on the former: Drastically reducing the number of candidate regions which require further, usual ..."
Abstract - Cited by 77 (14 self) - Add to MetaCart
We propose a computational model for detecting and localizing instances from an object class in static grey level images. We divide detection into visual selection and final classification, concentrating on the former: Drastically reducing the number of candidate regions which require further, usually more intensive, processing, but with a minimum of computation and missed detections. Bottom-up processing is based on local groupings of edge fragments constrained by loose geometrical relationships. They have no a priori semantic or geometric interpretation. The role of training is to select special groupings which are moderately likely at certain places on the object but rare in the background. We show that the statistics in both populations are stable. The candidate regions are those which contain global arrangements of several local groupings. Whereas our model was not conceived to explain brain functions, it does cohere with evidence about the functions of neurons in V1 and V2, such ...

Randomized trees for real-time keypoint recognition

by Vincent Lepetit, Pascal Lagger, Pascal Fua - In CVPR , 2005
"... In earlier work, we proposed treating wide baseline matching of feature points as a classification problem, in which each class corresponds to the set of all possible views of such a point. We used a K-mean plus Nearest Neighbor classifier to validate our approach, mostly because it was simple to im ..."
Abstract - Cited by 75 (4 self) - Add to MetaCart
In earlier work, we proposed treating wide baseline matching of feature points as a classification problem, in which each class corresponds to the set of all possible views of such a point. We used a K-mean plus Nearest Neighbor classifier to validate our approach, mostly because it was simple to implement. It has proved effective but still too slow for real-time use. In this paper, we advocate instead the use of randomized trees as the classification technique. It is both fast enough for real-time performance and more robust. It also gives us a principled way not only to match keypoints but to select during a training phase those that are the most recognizable ones. This results in a real-time system able to detect and position in 3D planar, non-planar, and even deformable objects. It is robust to illuminations changes, scale changes and occlusions. 1.

Semantic Texton Forests for Image Categorization and Segmentation

by Jamie Shotton, Matthew Johnson, Roberto Cipolla
"... We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and tes ..."
Abstract - Cited by 72 (6 self) - Add to MetaCart
We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and test, especially compared with k-means clustering and nearest-neighbor assignment of feature descriptors. The nodes in the trees provide (i) an implicit hierarchical clustering into semantic textons, and (ii) an explicit local classification estimate. Our second contribution, the bag of semantic textons, combines a histogram of semantic textons over an image region with a region prior category distribution. The bag of semantic textons is computed over the whole image for categorization, and over local rectangular regions for segmentation. Including both histogram and region prior allows our segmentation algorithm to exploit both textural and semantic context. Our third contribution is an image-level prior for segmentation that emphasizes those categories that the automatic categorization believes to be present. We evaluate on two datasets including the very challenging VOC 2007 segmentation dataset. Our results significantly advance the state-of-the-art in segmentation accuracy, and furthermore, our use of efficient decision forests gives at least a five-fold increase in execution speed.

Coarse-to-Fine Face Detection

by Francois Fleuret, Donald Geman , 2001
"... We study visual selection: Detect and roughly localize all instances of a generic object class, such as a face, in a greyscale scene, measuring performance in terms of computation and false alarms. Our approach is sequential testing which is coarse-to-fine in both in the exploration of poses and th ..."
Abstract - Cited by 69 (11 self) - Add to MetaCart
We study visual selection: Detect and roughly localize all instances of a generic object class, such as a face, in a greyscale scene, measuring performance in terms of computation and false alarms. Our approach is sequential testing which is coarse-to-fine in both in the exploration of poses and the representation of objects. All the tests are binary and indicate the presence or absence of loose spatial arrangements of oriented edge fragments. Starting from training examples, we recursively find larger and larger arrangements which are “decomposable,” which implies the probability of an arrangement appearing on an object decays slowly with its size. Detection means finding a sufficient number of arrangements of each size along a decreasing sequence of pose cells. At the beginning, the tests are simple and universal, accommodating many poses simultaneously, but the false alarm rate is relatively high. Eventually, the tests are more discriminating, but also more complex and dedicated to specific poses. As a result, the spatial distribution of processing is highly skewed and detection is rapid, but at the expense of (isolated) false alarms which, presumably, could be eliminated with localized, more intensive, processing.

Joint Induction of Shape Features and Tree Classifiers

by Donald Geman, Yali Amit, Ken Wilder - IEEE Trans. PAMI , 1997
"... We introduce a very large family of binary features for two-dimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classi cation trees. There is a feature for every possible geometric arrangement of local topographic code ..."
Abstract - Cited by 66 (6 self) - Add to MetaCart
We introduce a very large family of binary features for two-dimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classi cation trees. There is a feature for every possible geometric arrangement of local topographic codes. The arrangements express coarse constraints on relative angles and distances among the code locations and are nearly invariant to substantial a ne and non-linear deformations. They are also partially ordered, which makes it possible to narrow the search for informative ones at each node of the tree. Di erent trees correspond to di erent aspects of shape. They are statistically weakly dependent due to randomization and are aggregated in a simple way. Adapting the algorithm to a shape family is then fully automatic once training samples are provided. As an illustration, we classify handwritten digits from the NIST database � the error rate is:7%.

Fast keypoint recognition in ten lines of code

by Mustafa Özuysal, Pascal Fua, Vincent Lepetit - In Proc. IEEE Conference on Computing Vision and Pattern Recognition , 2007
"... While feature point recognition is a key component of modern approaches to object detection, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. In this paper, we show that formulating the problem in a Naive Bayesian classification framework ma ..."
Abstract - Cited by 50 (6 self) - Add to MetaCart
While feature point recognition is a key component of modern approaches to object detection, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. In this paper, we show that formulating the problem in a Naive Bayesian classification framework makes such preprocessing unnecessary and produces an algorithm that is simple, efficient, and robust. Furthermore, it scales well to handle large number of classes. To recognize the patches surrounding keypoints, our classifier uses hundreds of simple binary features and models class posterior probabilities. We make the problem computationally tractable by assuming independence between arbitrary sets of features. Even though this is not strictly true, we demonstrate that our classifier nevertheless performs remarkably well on image datasets containing very significant perspective changes. 1.
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University