Rapid object detection using a boosted cascade of simple features (2001)

by P Viola, M Jones

Results 1 - 10 of 3,282

Face Recognition: A Literature Survey

by W. Zhao, R. Chellappa, P. J. Phillips, A. Rosenfeld, 2000
Cited by 1398 (21 self)
This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition,

Object class recognition by unsupervised scale-invariant learning

by R. Fergus, P. Perona, A. Zisserman - In CVPR, 2003
Cited by 1127 (50 self)
We present a method to learn and recognize object class models from unlabeled and unsegmented cluttered scenes in a scale invariant manner. Objects are modeled as flexible constellations of parts. A probabilistic representation is used for all aspects of the object: shape, appearance, occlusion and relative scale. An entropy-based feature detector is used to select regions and their scale within the image. In learning, the parameters of the scale-invariant object model are estimated. This is done using expectation-maximization in a maximum-likelihood setting. In recognition, this model is used in a Bayesian manner to classify images. The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).

Citation Context

...g categories, as opposed to specific objects (e.g. [6, 9, 11]), has recently gained some attention in the machine vision literature [1, 2, 3, 4, 13, 14, 19] with an emphasis on the detection of faces [12, 15, 16]. There is broad agreement on the issue of representation: object categories are represented as collection of features, or parts, each part has a distinctive appearance and spatial position. Different...

Visual categorization with bags of keypoints

by Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray - In Workshop on Statistical Learning in Computer Vision, ECCV, 2004
Cited by 1005 (14 self)
We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naïve Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.
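The vector quantization step named in this abstract can be illustrated with a minimal sketch. It assumes a codebook of cluster centres has already been learned (for example by k-means, which is not shown) and that each image patch is described by a plain list of numbers. The function names `nearest_codeword` and `bag_of_keypoints` are hypothetical and are not taken from the paper.

```python
def nearest_codeword(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor
    (squared Euclidean distance). The codebook is assumed to be given."""
    best_index = 0
    best_dist = float("inf")
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bag_of_keypoints(descriptors, codebook):
    """Count how many descriptors fall on each codeword.
    The resulting histogram ignores where in the image each patch was found."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_codeword(descriptor, codebook)] += 1
    return histogram


# Toy example: two 2-D codewords and four patch descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [9.0, 9.0], [0.0, 2.0], [11.0, 10.0]]
histogram = bag_of_keypoints(descriptors, codebook)
```

A histogram of this kind is what would then be passed to a classifier such as Naïve Bayes or an SVM. Because it discards patch positions, it matches the abstract's remark that no geometric information is exploited.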

SURF: Speeded Up Robust Features

by Herbert Bay, Tinne Tuytelaars, Luc Van Gool - ECCV
Cited by 897 (12 self)
In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF’s strong performance.

Citation Context

...h increases not only the matching speed, but also the robustness of the descriptor. In order to make the paper more self-contained, we succinctly discuss the concept of integral images, as defined by [23]. They allow for the fast implementation of box type convolution filters. The entry of an integral image IΣ(x) at a location x = (x, y) represents the sum of all pixels in the input image I of a rectangu...
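The integral image described in this excerpt can be sketched as follows. This is an illustrative version, not code from either paper: the image is assumed to be a list of rows of numbers, and the table is padded with a zero first row and column (a common convention, assumed here) so that any box sum needs exactly four lookups. The names `integral_image` and `box_sum` are hypothetical.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table.
    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1;
    the extra zero row and column are a padding convention assumed here."""
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in columns x0..x1-1 and rows y0..y1-1,
    using four table lookups regardless of the box size."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)         # whole image
lower_right = box_sum(ii, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

Since the cost of `box_sum` does not depend on the box size, box-type filters of any scale can be evaluated in constant time per position, which is the property the excerpt refers to.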

Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories

by Li Fei-fei, 2004
Cited by 784 (16 self)
Current computational approaches to learning visual object categories require thousands of training images, are slow, cannot learn in an incremental manner and cannot incorporate prior information into the learning process. In addition, no algorithm presented in the literature has been tested on more than a handful of object categories. We present a method for learning object categories from just a few training images. It is quick and it uses prior information in a principled way. We test it on a dataset composed of images of objects belonging to 101 widely varied categories. Our proposed method is based on making use of prior information, assembled from (unrelated) object categories which were previously learnt. A generative probabilistic model is used, which represents the shape and appearance of a constellation of features belonging to the object. The parameters of the model are learnt incrementally in a Bayesian manner. Our incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum-likelihood. The incremental and batch versions have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible. Both Bayesian methods outperform maximum likelihood on small training sets.

LabelMe: A Database and Web-Based Tool for Image Annotation

by B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman, 2008
Cited by 679 (46 self)
We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare against existing state of the art datasets used for object recognition and detection. Also, we show how to extend the dataset to automatically enhance object labels with WordNet, discover object parts, recover a depth ordering of objects in a scene, and increase the number of labels using minimal user supervision and images from the web.

Citation Context

...that humans can recognize about 30000 entry-level object categories. Recent work in computer vision has shown impressive results for the detection and recognition of a few different object categories [42, 16, 22]. However, the size and contents of existing datasets, among other factors, limit current methods from scaling to thousands of object categories. Research in object detection and recognition would ben...

An extended set of Haar-like features for rapid object detection

by Rainer Lienhart, Jochen Maydt - IEEE ICIP
Cited by 577 (4 self)
Recently Viola et al. [5] have introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper we introduce a novel set of rotated Haar-like features. These novel features significantly enrich the simple features of [5] and can also be calculated efficiently. With these new rotated features our sample face detector shows off on average a 10% lower false alarm rate at a given hit rate. We also present a novel post optimization procedure for a given boosted cascade improving on average the false alarm rate further by 12.5%.

Citation Context

...et of Haar-like Features for Rapid Object Detection Rainer Lienhart and Jochen Maydt Intel Labs, Intel Corporation, Santa Clara, CA 95052, USA Rainer.Lienhart@intel.com ABSTRACT Recently Viola et al. [5] have introduced a rapid object detection scheme based on a boosted cascade of simple features. In this paper we introduce a novel set of rotated haar-like features, which significantly enrich this ba...

Detecting Pedestrians Using Patterns of Motion and Appearance

by Paul Viola, Michael J. Jones, Daniel Snow - In ICCV, 2003
Cited by 575 (3 self)
This paper describes a pedestrian detection system that integrates image intensity information with motion information. We use a detection style algorithm that scans a detector over two consecutive frames of a video sequence. The detector is trained (using AdaBoost) to take advantage of both motion and appearance information to detect a walking person. Past approaches have built detectors based on motion information or detectors based on appearance information, but ours is the first to combine both sources of information in a single detector. The implementation described runs at about 4 frames/second, detects pedestrians at very small scales (as small as 20x15 pixels), and has a very low false positive rate.

TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object . . .

by J. Shotton, J. Winn, C. Rother, A. Criminisi - In ECCV, 2006
Cited by 426 (17 self)
This paper proposes a new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently. The learned model is used for automatic visual recognition and semantic segmentation of photographs. Our discriminative model exploits novel features, based on textons, which jointly model shape and texture. Unary classification and feature selection is achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating these classifiers in a conditional random field. Efficient training

80 million tiny images: a large dataset for non-parametric object and scene recognition

by Antonio Torralba, Rob Fergus, William T. Freeman - IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 421 (18 self)
Abstract not found

Citation Context

...n indexing an image using nonvisual versus visual cues. Fig. 10 also shows the results obtained when running a frontal face detector (an OpenCV implementation of Viola and Jones boosted cascade [27], [41]). We run the face detector on the original high-resolution images. Note that the performance of our approach working on 32 × 32 images is comparable to that of the dedicated face detector on highresolu...
