Local features and kernels for classification of texture and object categories: a comprehensive study
International Journal of Computer Vision, 2007
Cited by 211 (21 self)
Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ² distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on four texture and five object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance via extensive tests on the PASCAL database, for which ground-truth object localization information is available. Our experiments demonstrate that image representations based on distributions of local features are surprisingly effective for classification of texture and object images under challenging real-world conditions, including significant intra-class variations and substantial background clutter.
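To make the χ² comparison of histograms concrete, here is a minimal sketch in Python. It is not the authors' code. It assumes the common form of the χ² distance (half the sum of squared differences divided by bin sums) and an exponential kernel with a hypothetical `scale` parameter; the function names are illustrative.

```python
import math

def chi2_distance(h, k):
    """Chi-square distance between two equal-length histograms.

    Assumed form: 0.5 * sum((h_i - k_i)^2 / (h_i + k_i)),
    skipping bins where both histograms are empty.
    """
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        s = a + b
        if s > 0:
            total += (a - b) ** 2 / s
    return 0.5 * total

def chi2_kernel(h, k, scale=1.0):
    """Exponential kernel built on the distance; `scale` is a hypothetical
    normalisation constant that would be tuned on training data."""
    return math.exp(-chi2_distance(h, k) / scale)
```

A kernel matrix computed this way over all training pairs could then be passed to any SVM implementation that accepts precomputed kernels.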
The pyramid match kernel: Efficient learning with sets of features
Journal of Machine Learning Research, 2007
Cited by 55 (6 self)
In numerous domains it is useful to represent a single example by the set of the local features or parts that comprise it. However, this representation poses a challenge to many conventional machine learning techniques, since sets may vary in cardinality and elements lack a meaningful ordering. Kernel methods can learn complex functions, but a kernel over unordered set inputs must somehow solve for correspondences—generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function called the pyramid match that measures partial match similarity in time linear in the number of features. The pyramid match maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in order to find implicit correspondences based on the finest resolution histogram cell where a matched pair first appears. We show the pyramid match yields a Mercer kernel, and we prove bounds on its error relative to the optimal partial matching cost. We demonstrate our algorithm on both classification and regression tasks, including object recognition, 3-D human pose inference, and time of publication estimation for documents, and we show that the proposed method is accurate and significantly more efficient than current approaches.
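The multi-resolution histogram intersection described above can be sketched as follows. This is a simplified, unnormalised illustration and not the paper's implementation. It assumes bin side lengths that double at each level and a weight of 1/2^i for matches first found at level i; `histogram`, `intersection`, and `pyramid_match` are illustrative names.

```python
from collections import Counter

def histogram(points, side):
    """Count points per grid cell for cells of the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)

def intersection(h1, h2):
    """Histogram intersection: number of points matched within shared cells."""
    return sum(min(count, h2[cell]) for cell, count in h1.items())

def pyramid_match(x, y, levels):
    """Weighted sum of matches that are new at each pyramid level.

    Level i uses cells of side 2**i (an assumption); matches first formed
    in finer cells receive larger weights.
    """
    score = 0.0
    previous = 0
    for i in range(levels + 1):
        side = 2 ** i
        current = intersection(histogram(x, side), histogram(y, side))
        score += (current - previous) / side  # new matches at this level
        previous = current
    return score
```

Each level needs one pass over both feature sets, so the cost grows linearly with the number of features for a fixed number of levels.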
Hyperfeatures - multilevel local coding for visual recognition
In ECCV, 2006
Cited by 42 (1 self)
Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and have good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics at scales larger than their local input patches. We present a new multilevel visual representation, ‘hyperfeatures’, that is designed to remedy this. The starting point is the familiar notion that to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments – a process that can be formalized as comparison (e.g. vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. This process converts local collections of image descriptor vectors into somewhat less local histogram vectors – higher-level but spatially coarser descriptors. We observe that as the output is again a local descriptor vector, the process can be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We formulate the hyperfeatures model and study its performance under several different image coding methods including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
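The code-then-aggregate step, and the fact that its output can be fed back in, can be illustrated with a small Python sketch. It makes strong simplifying assumptions that are not from the paper: descriptors lie along a 1-D sequence rather than an image grid, coding is hard vector quantization, the codebooks are given rather than learned, and aggregation uses a sliding window of fixed length. All names are illustrative.

```python
def nearest(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(vector, codebook[j])),
    )

def code_level(descriptors, codebook, window):
    """Quantize each descriptor, then pool labels over sliding windows.

    Returns one normalised histogram per window position; these histograms
    are themselves local descriptors, so the function can be applied again.
    """
    labels = [nearest(v, codebook) for v in descriptors]
    pooled = []
    for start in range(len(labels) - window + 1):
        hist = [0.0] * len(codebook)
        for label in labels[start:start + window]:
            hist[label] += 1.0 / window
        pooled.append(hist)
    return pooled

# Two levels: raw descriptors -> level-1 histograms -> level-2 histograms.
level1 = code_level([[0.0], [0.1], [1.0], [0.9]], [[0.0], [1.0]], window=2)
level2 = code_level(level1, [[1.0, 0.0], [0.0, 1.0]], window=2)
```

Each level covers a larger spatial extent than the one before, which is the sense in which the representation becomes higher-level but spatially coarser.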
A discriminative approach to robust visual place recognition
In Proc. IROS’06
Cited by 27 (11 self)
An important competence for a mobile robot system is the ability to localize and perform context interpretation. This is required to perform basic navigation and to facilitate local specific services. Usually localization is performed based on a purely geometric model. Through use of vision and place recognition, a number of opportunities open up in terms of flexibility and association of semantics to the model. To achieve this, the paper presents an appearance-based method for place recognition. The method is based on a large margin classifier in combination with a rich global image descriptor. The method is robust to variations in illumination and minor scene changes. The method is evaluated across several different cameras and under changes in time of day and weather conditions. The results clearly demonstrate the value of the approach.
Unifying discriminative visual codebook generation with classifier training for object category recognition
CVPR, 2008
Cited by 18 (1 self)
The idea of representing images using a bag of visual words is currently popular in object category recognition. Since this representation is typically constructed using unsupervised clustering, the resulting visual words may not capture the desired information. Recent work has explored the construction of discriminative visual codebooks that explicitly consider object category information. However, since the codebook generation process is still disconnected from that of classifier training, the set of resulting visual words, while individually discriminative, may not be those best suited for the classifier. This paper proposes a novel optimization framework that unifies codebook generation with classifier training. In our approach, each image feature is encoded by a sequence of “visual bits” optimized
A statistical approach to material classification using image patch exemplars
2006
Cited by 16 (1 self)
In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighbourhoods (starting from as small as 3×3 pixels square), and that this outperforms classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighbourhoods. We develop novel texton-based representations which are suited to modelling this joint neighbourhood distribution for MRFs. The representations are learnt from training images, and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed, and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter-bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all the textures present in the Microsoft Textile database as well as the San Francisco outdoor dataset. We conclude with discussions on why features based on compact neighbourhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to the source image patches from which they were derived.
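One way to picture a texton representation over raw 3×3 neighbourhoods is the sketch below. It is an assumption-laden illustration, not one of the paper's three representations: the textons are supplied by the caller (in practice they would be learnt from training images, for example by clustering), patches are compared by squared Euclidean distance, and the image is a plain list of rows. The function names are hypothetical.

```python
def patches(image, n=3):
    """All n-by-n patches of a 2-D list image, flattened row by row."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(n) for dc in range(n)]
        for r in range(rows - n + 1)
        for c in range(cols - n + 1)
    ]

def texton_histogram(image, textons, n=3):
    """Normalised frequency of the nearest texton over all image patches."""
    counts = [0] * len(textons)
    for patch in patches(image, n):
        nearest = min(
            range(len(textons)),
            key=lambda t: sum((a - b) ** 2 for a, b in zip(patch, textons[t])),
        )
        counts[nearest] += 1
    total = sum(counts)
    return [count / total for count in counts]
```

A novel image could then be labelled by comparing its histogram with stored class histograms under some distribution distance, which is a choice this sketch leaves open.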
Class-specific material categorisation
In Proceedings of the International Conference on Computer Vision, 2005
Cited by 11 (1 self)
Although a considerable amount of work has been published on material classification, relatively little of it studies situations with considerable variation within each class. Many experiments use the exact same sample, or different patches from the same image, for training and test sets. Thus, such studies are vulnerable to effectively recognising one particular sample of a material as opposed to the material category. In contrast, this paper places firm emphasis on the capability to generalise to previously unseen instances of materials. We adopt an appearance-based strategy, and conduct experiments on a new database which contains several samples of each of eleven material categories, imaged under a variety of pose, illumination and scale conditions. Together, these sources of intra-class variation provide a stern challenge indeed for recognition. Somewhat surprisingly, the difference in performance between various state-of-the-art texture descriptors proves rather small in this task. On the other hand, we clearly demonstrate that very significant gains can be achieved via different SVM-based classification techniques. Selecting appropriate kernel parameters proves crucial. This motivates a novel recognition scheme based on a decision tree. Each node contains an SVM to split one class from all others with a kernel parameter optimal for that particular node. Hence, each decision is made using a different, optimal, class-specific metric. Experiments show the superiority of this approach over several state-of-the-art classifiers.
Locally invariant fractal features for statistical texture classification
In ICCV, 2007
Cited by 11 (2 self)
We address the problem of developing discriminative, yet invariant, features for texture classification. Texture variations due to changes in scale are amongst the hardest to handle. One of the most successful methods of dealing with such variations is based on choosing interest points and selecting their characteristic scales [Lazebnik et al. PAMI 2005]. However, selecting a characteristic scale can be unstable for many textures. Furthermore, the reliance on an interest point detector and the inability to evaluate features densely can be serious limitations. Fractals present a mathematically well-founded alternative to dealing with the problem of scale. However, they have not become popular as texture features due to their lack of discriminative power. This is primarily because: (a) fractal-based classification methods have avoided statistical characterisations of textures (which is essential for accurate analysis) by using global features; and (b) fractal dimension features are unable to distinguish between key texture primitives such as edges, corners and uniform regions. In this paper, we overcome these drawbacks and develop local fractal features that are evaluated densely. The features are robust as they do not depend on choosing interest points or characteristic scales. Furthermore, it is shown that the local fractal dimension is invariant to local bi-Lipschitz transformations whereas its extension is able to correctly distinguish between fundamental texture primitives. Textures are characterised statistically by modelling the full joint PDF of these features. This allows us to develop a texture classification framework which is discriminative, robust and achieves state-of-the-art performance as compared to affine invariant and fractal-based methods.
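A densely evaluated local fractal dimension can be approximated in the way sketched below. This is an illustrative estimate under stated assumptions, not the paper's formulation: the measure of a neighbourhood is taken to be the sum of intensities in a square window clipped to the image, the dimension is estimated as the least-squares slope of log-measure against log-radius over a few hypothetical radii, and the window mass is assumed to be positive.

```python
import math

def least_squares_slope(xs, ys):
    """Slope of the best-fit line through the points (xs[i], ys[i])."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def window_mass(image, row, col, radius):
    """Sum of intensities in a square window, clipped at the image border."""
    rows, cols = len(image), len(image[0])
    return sum(
        image[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    )

def local_fractal_dimension(image, row, col, radii=(1, 2, 3, 4)):
    """Estimate how the local mass scales with radius around one pixel.

    Assumes every window has positive mass so the logarithm is defined.
    """
    xs = [math.log(radius) for radius in radii]
    ys = [math.log(window_mass(image, row, col, radius)) for radius in radii]
    return least_squares_slope(xs, ys)
```

Because the estimate depends only on a pixel's neighbourhood, it can be computed at every pixel without an interest point detector or a characteristic scale.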
Discriminative Cluster Refinement: Improving Object Category Recognition Given Limited Training Data
Cited by 4 (1 self)
A popular approach to problems in image classification is to represent the image as a bag of visual words and then employ a classifier to categorize the image. Unfortunately, a significant shortcoming of this approach is that the clustering and classification are disconnected. Since the clustering into visual words is unsupervised, the representation does not necessarily capture the aspects of the data that are most useful for classification. More seriously, the semantic relationship between clusters is lost, causing the overall classification performance to suffer. We introduce “discriminative cluster refinement” (DCR), a method that explicitly models the pairwise relationships between different visual words by exploiting their co-occurrence information. The assigned class labels are used to identify the co-occurrence patterns that are most informative for object classification. DCR employs a maximum-margin approach to generate an optimal kernel matrix for classification. One important benefit of DCR is that it integrates smoothly into existing bag-of-words information retrieval systems by employing the set of visual words generated by any clustering method. While DCR could improve a broad class of information retrieval systems, this paper focuses on object category recognition. We present a direct comparison with a state-of-the-art method on the PASCAL 2006 database and show that cluster refinement results in a significant improvement in classification accuracy given a small number of training examples.

